# Michael Data

##### Interpretation of Probability

$P(a)$ or $P(A = a)$
* $a$ is a proposition that is true or false
* Frequentist or Bayesian interpretation of the value?

Frequentist: $P(a)$ is the relative frequency with which $a$ occurs in repeated trials.
Bayesian: $P(a)$ is a degree of belief of an “agent” that proposition $a$ is true.
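As a rough illustration of the frequentist reading, the sketch below simulates repeated trials and reports the relative frequency of an event. The true probability (0.3), the trial counts, and the seed are arbitrary assumptions chosen for the example, not values from these notes.

```python
import random

def relative_frequency(p_true, n_trials, seed=0):
    """Fraction of n_trials simulated trials in which the event occurs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.random() < p_true for _ in range(n_trials))
    return hits / n_trials

# Hypothetical event with probability 0.3, observed over more and more trials.
estimates = {n: relative_frequency(0.3, n) for n in (10, 1000, 100000)}
```

With more trials the estimate should settle near 0.3. A Bayesian agent could instead state a belief of 0.3 without any repeated trials being available.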

##### Properties of Random Variables

Values of the variable are mutually exclusive and exhaustive.

#### Discrete

The variable takes discrete values, and the probabilities sum to one: $\sum_i P(a_i) = 1$. Discrete distributions can be represented as a table of probabilities, one for each value. Alternatively, the distribution can be represented by some function of the values.
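The two representations might look like this in code. Both the three-valued table and the geometric function are made-up examples, and exact fractions are used only so the sums can be checked without rounding.

```python
from fractions import Fraction

# Table representation: one probability per value (hypothetical numbers).
table = {"a1": Fraction(1, 2), "a2": Fraction(1, 3), "a3": Fraction(1, 6)}

# Function representation: P(X = k) = (1/2)^k for k = 1, 2, 3, ...
def geometric_pmf(k):
    return Fraction(1, 2) ** k

table_total = sum(table.values())
# The function covers infinitely many values, so only a partial sum is computed.
partial_total = sum(geometric_pmf(k) for k in range(1, 51))
```

The table sums to exactly 1, while the partial sum of the function falls short of 1 by the mass of the remaining tail.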

#### Continuous

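The variable takes continuous values, and the density integrates to one: $\int p(x)\,dx = 1$. The distribution is typically represented with a density function. The height of the density curve has no upper bound: $p(x)$ can exceed 1 as long as the total area under the curve is 1.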
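A quick way to see that density height is not capped at 1 is a uniform density on a narrow interval. The interval $[0, 0.1]$ and the midpoint-rule integrator below are assumptions made for this sketch.

```python
def uniform_density(x, low=0.0, high=0.1):
    """Density of a uniform distribution on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def midpoint_integral(f, a, b, n=10000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

peak = uniform_density(0.05)                          # height of the curve
area = midpoint_integral(uniform_density, 0.0, 0.1)   # total probability
```

Here the curve is ten times taller than 1, yet the area under it is still 1.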

##### Sets of Random Variables

e.g. consider discrete random variables $A$, $B$, $C$. Each takes $m$ different values.

$P(A, B, C)$ is a joint distribution. It can be represented as a table with a probability for each combination of values, $m^3$ entries in total, and the entries sum to 1.
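A minimal sketch of such a table, assuming $m = 3$ and an arbitrary weighting scheme invented for the example:

```python
from fractions import Fraction
from itertools import product

m = 3
values = range(m)

# Hypothetical unnormalized weight for each combination (a, b, c).
weights = {(a, b, c): 1 + a + 2 * b + 3 * c
           for a, b, c in product(values, repeat=3)}
total = sum(weights.values())

# Joint table P(A, B, C): one entry per combination of values.
joint = {abc: Fraction(w, total) for abc, w in weights.items()}

n_entries = len(joint)
table_total = sum(joint.values())
```

The table has one entry per combination of the three variables, so its size grows as $m^3$.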

##### Conditional Distributions

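e.g. $P(A \mid B = b, C = c)$ can now be represented as a table of $m$ values, because the values of $B$ and $C$ have been fixed. The conditioned variables are either known or assumed to have those values. The table still sums to one over the values of $A$: $\sum_a P(a \mid b, c) = 1$.

If we were to plot $P(a \mid b, c)$ for a fixed $a$ across different values of $b$ or $c$, the result is not a distribution or density function and does not have to integrate or sum to 1.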
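The difference between summing over the variable of interest and summing over a conditioning variable can be checked on a small table. The binary joint below is a made-up example; `conditional` is a hypothetical helper that renormalizes one slice of it.

```python
from fractions import Fraction
from itertools import product

values = (0, 1)
# Hypothetical joint P(A, B, C): weight 1 + a + 2b + 3c, normalized to sum to 1.
weights = {(a, b, c): 1 + a + 2 * b + 3 * c
           for a, b, c in product(values, repeat=3)}
total = sum(weights.values())
joint = {abc: Fraction(w, total) for abc, w in weights.items()}

def conditional(a, b, c):
    """P(A = a | B = b, C = c), from renormalizing the slice with B = b, C = c."""
    slice_total = sum(joint[(a2, b, c)] for a2 in values)
    return joint[(a, b, c)] / slice_total

# Summing over A with B and C fixed gives a proper distribution.
sum_over_a = sum(conditional(a, 0, 0) for a in values)
# Summing over B with A and C fixed does not have to give 1.
sum_over_b = sum(conditional(0, b, 0) for b in values)
```

In this table the first sum is exactly 1, while the second is 16/21.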

##### Factorization or Chain Rule

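How do we get from a joint distribution to conditional distributions?

$$P(A, B, C) = P(A \mid B, C)\, P(B, C)$$

recursively:

$$P(A, B, C) = P(A \mid B, C)\, P(B \mid C)\, P(C)$$

But order doesn't matter: any ordering of the variables gives a valid factorization, e.g. $P(A, B, C) = P(C \mid A, B)\, P(B \mid A)\, P(A)$.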
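The following sketch checks two orderings of the chain rule on a small made-up joint table. Because the conditionals are computed from the joint itself, the check mostly illustrates the bookkeeping; `marginal`, `chain_cba`, and `chain_abc` are names invented for this example.

```python
from fractions import Fraction
from itertools import product

values = (0, 1)
# Hypothetical joint P(A, B, C), stored as a table keyed by (a, b, c).
weights = {(a, b, c): 1 + a + 2 * b + 3 * c
           for a, b, c in product(values, repeat=3)}
total = sum(weights.values())
joint = {abc: Fraction(w, total) for abc, w in weights.items()}

def marginal(keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    out = {}
    for abc, p in joint.items():
        key = tuple(abc[i] for i in keep)
        out[key] = out.get(key, 0) + p
    return out

p_c, p_bc = marginal((2,)), marginal((1, 2))
p_a, p_ab = marginal((0,)), marginal((0, 1))

def chain_cba(a, b, c):
    """P(a | b, c) * P(b | c) * P(c)."""
    return (joint[(a, b, c)] / p_bc[(b, c)]) * (p_bc[(b, c)] / p_c[(c,)]) * p_c[(c,)]

def chain_abc(a, b, c):
    """P(c | a, b) * P(b | a) * P(a)."""
    return (joint[(a, b, c)] / p_ab[(a, b)]) * (p_ab[(a, b)] / p_a[(a,)]) * p_a[(a,)]

# Both orderings should reproduce every entry of the joint table.
orders_agree = all(
    chain_cba(*abc) == joint[abc] == chain_abc(*abc) for abc in joint
)
```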

#### Deriving Bayes' Rule

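Factorization provides an easy derivation for Bayes' rule.
Let's represent $P(A, B)$ with both orderings: $P(A \mid B)\, P(B) = P(B \mid A)\, P(A)$. Dividing by $P(B)$, assuming $P(B) > 0$, gives

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$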
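A worked numeric instance of Bayes' rule may help. The diagnostic-test setting and all three input probabilities are hypothetical and chosen only to keep the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical numbers for a diagnostic test; none come from the notes.
p_disease = Fraction(1, 100)              # prior P(D)
p_pos_given_disease = Fraction(9, 10)     # likelihood P(+ | D)
p_pos_given_healthy = Fraction(1, 20)     # false-positive rate P(+ | not D)

# Denominator P(+), summing over the two cases D and not D.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule: P(D | +) = P(+ | D) P(D) / P(+).
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
```

With these inputs the posterior works out to 2/13, far below the 9/10 likelihood, because the prior is small.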

##### Law of Total Probability

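How can we get from a joint distribution to unconditional a.k.a. marginal distributions?
For discrete random variables, sum out the extra variables:

$$P(A) = \sum_b P(A, b) = \sum_b P(A \mid b)\, P(b)$$

For continuous random variables, integrate out the extra variables:

$$p(x) = \int p(x, y)\, dy = \int p(x \mid y)\, p(y)\, dy$$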
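For the discrete case, summing out a variable is a short loop over the joint table. The two-variable table below (sky condition and temperature) is a made-up example, and both helper functions are hypothetical names.

```python
from fractions import Fraction

# Hypothetical joint P(Sky, Temp) as a table keyed by (sky, temp).
joint = {
    ("sun", "warm"): Fraction(2, 5),
    ("sun", "cold"): Fraction(1, 5),
    ("rain", "warm"): Fraction(1, 10),
    ("rain", "cold"): Fraction(3, 10),
}

def marginal_sky(sky):
    """P(Sky = sky), summing the joint over every value of Temp."""
    return sum(p for (s, _temp), p in joint.items() if s == sky)

def marginal_sky_via_conditionals(sky):
    """The same marginal written as the sum over t of P(sky | t) * P(t)."""
    total = Fraction(0)
    for temp in ("warm", "cold"):
        p_temp = sum(p for (_s, t), p in joint.items() if t == temp)
        p_sky_given_temp = joint[(sky, temp)] / p_temp
        total += p_sky_given_temp * p_temp
    return total

p_sun = marginal_sky("sun")
p_rain = marginal_sky("rain")
```

Both routes give the same marginal, and the marginal probabilities again sum to 1.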

##### Conditional Independence

A conditional independence assumption allows reduction of the amount of information required when storing or working with a joint distribution.

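e.g. for discrete random variables $A$, $B$, $C$, $D$ which each take $m$ values:
By factorization,

$$P(A, B, C, D) = P(A \mid B, C, D)\, P(B \mid C, D)\, P(C \mid D)\, P(D)$$

This uses at least $m^4$ terms, since the table for $P(A \mid B, C, D)$ alone has $m^4$ entries!

We can assume that $A$ is conditionally independent of $C$ and $D$ given $B$:

$$P(A \mid B, C, D) = P(A \mid B)$$

Now we factorize like this:

$$P(A, B, C, D) = P(A \mid B)\, P(B \mid C, D)\, P(C \mid D)\, P(D)$$

Now the largest table of the factorized representation, $P(B \mid C, D)$, has $m^3$ terms.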
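The saving can be tallied by counting table entries for each factor. This sketch assumes $m = 10$ and counts raw entries only, ignoring the fact that normalization makes some entries redundant; `table_sizes` is a hypothetical helper.

```python
def table_sizes(m, vars_per_factor):
    """Entries in each factor's table, given how many variables it involves."""
    return [m ** n_vars for n_vars in vars_per_factor]

m = 10  # assumed number of values per variable

# Variables involved in each factor of the two factorizations of P(A, B, C, D).
full = table_sizes(m, [4, 3, 2, 1])     # P(A|B,C,D), P(B|C,D), P(C|D), P(D)
reduced = table_sizes(m, [2, 3, 2, 1])  # P(A|B),     P(B|C,D), P(C|D), P(D)

largest_full = max(full)
largest_reduced = max(reduced)
```

Under this assumption the largest table shrinks from $m^4$ to $m^3$ entries, and the total across all factors drops as well.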

#### Genetics Example

G = maternal grandmother's genes
M = mother's genes
Y = your own genes

Y and G are conditionally independent, given M.

#### Weather Example

I = Irvine temperature
B = Beijing temperature
M = month of year

I and B are conditionally independent, given M.
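To make the weather example concrete, the sketch below builds a joint table in which the month drives both temperatures, then checks independence with and without conditioning. Every number is invented, months are collapsed to two seasons, and the joint is constructed to satisfy the conditional independence, so the code illustrates the property rather than testing it on data.

```python
from fractions import Fraction as F
from itertools import product

temps = ("cold", "warm")
p_m = {"winter": F(1, 2), "summer": F(1, 2)}             # P(M)
p_warm_irvine = {"winter": F(1, 5), "summer": F(4, 5)}    # P(I = warm | M)
p_warm_beijing = {"winter": F(1, 10), "summer": F(9, 10)} # P(B = warm | M)

def p_given_m(p_warm, value, month):
    """P(temperature = value | M = month) for a two-valued temperature."""
    return p_warm[month] if value == "warm" else 1 - p_warm[month]

# Joint built as P(M) P(I | M) P(B | M), keyed by (irvine, beijing, month).
joint = {
    (i, b, mo): p_m[mo]
    * p_given_m(p_warm_irvine, i, mo)
    * p_given_m(p_warm_beijing, b, mo)
    for i, b, mo in product(temps, temps, p_m)
}

p_i_warm = sum(p for (i, _b, _mo), p in joint.items() if i == "warm")
p_b_warm = sum(p for (_i, b, _mo), p in joint.items() if b == "warm")
p_both_warm = sum(p for (i, b, _mo), p in joint.items()
                  if i == "warm" and b == "warm")

# Without conditioning on the month, I and B are dependent.
marginally_independent = p_both_warm == p_i_warm * p_b_warm

def independent_given(month):
    """Check P(i, b | month) == P(i | month) P(b | month) for all i, b."""
    p_month = sum(p for (_i, _b, mo), p in joint.items() if mo == month)
    for i, b in product(temps, repeat=2):
        p_ib = joint[(i, b, month)] / p_month
        p_i = sum(joint[(i, b2, month)] for b2 in temps) / p_month
        p_b = sum(joint[(i2, b, month)] for i2 in temps) / p_month
        if p_ib != p_i * p_b:
            return False
    return True

conditionally_independent = all(independent_given(mo) for mo in p_m)
```

Knowing that Irvine is warm makes summer more likely, which in turn makes a warm Beijing more likely. Once the month is known, the Irvine temperature carries no further information about Beijing.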

##### Linear Correlation

This is a scaled covariance, and useful…