Michael Data

==Bivariate Transformation==
Identify supports $A$ and $B$ such that the transformation is a 1-to-1 map from $A$ onto $B$. If the transformation is not 1-to-1 on all of $A$, then identify a partition $A_1, A_2, \ldots$ of $A$ such that there is a 1-to-1 map from each $A_i$ onto $B$, and sum the contributions of the pieces. (C&B 162)

$ f_{U,V}(u,v) = f_{X,Y}(h_1(u,v),h_2(u,v)) \, |J| $, where $x = h_1(u,v)$ and $y = h_2(u,v)$ is the inverse transformation. (C&B 157)


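$ J = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} $
$J$ is the determinant of the matrix of partial derivatives ($ad - bc$ for a 2x2 matrix), and $ |J| $ is its absolute value.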
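As a small check of the change-of-variables formula, the sketch below takes an assumed example that is not from these notes: $X$ and $Y$ independent standard normals, with $U = X + Y$ and $V = X - Y$. The function names are illustrative. The inverse map is $x = (u+v)/2$, $y = (u-v)/2$, so $J = -1/2$, and the resulting density should equal the product of two $N(0, 2)$ densities.

```python
import math


def f_xy(x, y):
    """Joint density of two independent standard normals (assumed example)."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)


def f_uv(u, v):
    """Density of (U, V) = (X + Y, X - Y) via f_xy(h1, h2) * |J|."""
    # Inverse transformation: x = h1(u, v), y = h2(u, v)
    x = (u + v) / 2.0
    y = (u - v) / 2.0
    # Jacobian determinant: dx/du * dy/dv - dx/dv * dy/du
    jac = 0.5 * (-0.5) - 0.5 * 0.5
    return f_xy(x, y) * abs(jac)


def normal_pdf(t, var):
    """Density of N(0, var) evaluated at t."""
    return math.exp(-t * t / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


if __name__ == "__main__":
    u, v = 0.7, -1.3
    # The two printed numbers should agree up to rounding.
    print(f_uv(u, v), normal_pdf(u, 2.0) * normal_pdf(v, 2.0))
```

The joint density factoring into a function of $u$ times a function of $v$ also shows that $U$ and $V$ are independent in this particular example.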

Sum of two Poisson

If $X \sim \text{Poisson}(\theta)$ and $Y \sim \text{Poisson}(\lambda)$ and $X$ and $Y$ are independent, then $X + Y \sim \text{Poisson}(\theta + \lambda)$.
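The result can be checked numerically by convolving the two probability mass functions. This is a minimal sketch; the function names and the rates used in the usage lines are illustrative choices.

```python
import math


def poisson_pmf(k, rate):
    """P(K = k) for K ~ Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def pmf_of_sum(n, theta, lam):
    """P(X + Y = n) for independent X ~ Poisson(theta), Y ~ Poisson(lam),
    computed by summing over all ways to split n between X and Y."""
    return sum(poisson_pmf(k, theta) * poisson_pmf(n - k, lam) for k in range(n + 1))


if __name__ == "__main__":
    theta, lam = 1.5, 2.5
    for n in range(5):
        # Each pair of numbers should agree up to rounding.
        print(n, pmf_of_sum(n, theta, lam), poisson_pmf(n, theta + lam))
```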

Computational Tricks

Distribution Trick

Simplify an integral into the form of a known distribution, so it integrates to 1.
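One illustration of the trick, under the assumption that the Gamma density is parametrized as $\frac{x^{a-1} e^{-x/b}}{\Gamma(a) b^a}$: since that density integrates to 1, $\int_0^{\infty} x^{a-1} e^{-x/b} dx = \Gamma(a) b^a$ with no integration by parts. The sketch below compares this closed form with a plain midpoint-rule integral; the helper names and the truncation point are illustrative choices.

```python
import math


def gamma_kernel_integral(a, b):
    """Closed form from the distribution trick: Gamma(a) * b**a."""
    return math.gamma(a) * b ** a


def midpoint_integral(func, lower, upper, steps=200_000):
    """Plain midpoint-rule approximation of the integral of func on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


if __name__ == "__main__":
    a, b = 3.0, 2.0
    # The upper limit 80 stands in for infinity; the integrand is negligible beyond it.
    numeric = midpoint_integral(lambda x: x ** (a - 1) * math.exp(-x / b), 0.0, 80.0)
    print(numeric, gamma_kernel_integral(a, b))
```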

Geometric Series

See Wikipedia
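Used to derive the moment-generating function of the geometric distribution. The Poisson moment-generating function instead uses the exponential series $\sum_{k=0}^{\infty} \frac{x^k}{k!} = e^x$.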

Definition of e

$\displaystyle e = \lim_{n\to\infty} (1 + \frac{1}{n})^n $

Gamma Function

$ \Gamma (x) = \displaystyle \int_0^{\infty} e^{-t} t^{x-1} dt $

Binomial Coefficients

${n\choose k} = \frac{n!}{k!(n-k)!}$

$(x + y)^n = \displaystyle \sum_{k=0}^n {n \choose k} x^{n-k}y^k$
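A quick check of the binomial theorem with exact integer arithmetic, using `math.comb` from the Python standard library (Python 3.8 or later). The function name is illustrative.

```python
import math


def binomial_expansion(x, y, n):
    """Right-hand side of the binomial theorem: sum of C(n, k) * x^(n-k) * y^k."""
    return sum(math.comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))


if __name__ == "__main__":
    # Both sides of (2 + 3)^5; both print 3125.
    print(binomial_expansion(2, 3, 5), (2 + 3) ** 5)
```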

Covariance and Correlation

$cov(x,y) = E[(x-E(x))(y - E(y))]$

Pearson's Correlation Coefficient

$corr(x,y) = \frac{cov(x,y)}{\sigma(x)\sigma(y)}$
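The two definitions translate directly into code. The sketch below uses population (divide by $n$) versions on a small made-up data set; the function names and the data are illustrative.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance: average of (x - E(x)) * (y - E(y))."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Pearson's correlation: cov(x, y) / (sigma(x) * sigma(y))."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]  # exactly linear in xs, so the correlation is 1
    print(covariance(xs, ys), correlation(xs, ys))
```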

Chebyshev's Inequality

$P(g(x) \geq k) \leq \frac{E(g(x))}{k} $ for a nonnegative function $g$ and $k > 0$
$P(|x - E(x)| \geq k\sigma) \leq \frac{1}{k^2} $

“can be applied to completely arbitrary distributions (unknown except for mean and variance)”
Use to show that certain limits go to zero in order to satisfy conditions for a Central Limit Theorem.
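To see how loose the bound can be, the sketch below compares it with an exact tail probability for an assumed example, $X \sim \text{Exponential}(1)$, which has mean 1 and standard deviation 1. The function names are illustrative.

```python
import math


def exact_tail(k):
    """P(|X - 1| >= k) for X ~ Exponential(1), which has mean 1 and sigma 1."""
    upper = math.exp(-(1.0 + k))  # P(X >= 1 + k)
    # P(X <= 1 - k) is zero once 1 - k is no longer positive.
    lower = 1.0 - math.exp(-(1.0 - k)) if k < 1.0 else 0.0
    return upper + lower


def chebyshev_bound(k):
    """Chebyshev's upper bound on P(|X - E(X)| >= k * sigma)."""
    return 1.0 / (k * k)


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0, 3.0):
        print(k, exact_tail(k), chebyshev_bound(k))
```

The bound holds for every $k$ but is far from tight here, which is the price of assuming nothing beyond the mean and variance.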

Central Limit Theorems

Given independent, identically distributed samples of a random variable with finite mean $\mu$ and variance $\sigma^2$:


for the Mean

$\sqrt{n}(S_n - \mu) \rightarrow N(0,\sigma^2)$ in distribution.
Equivalently, for large $n$ the sample mean $S_n = \bar{X}$ of $n$ samples is approximately distributed as $N(\mu,\frac{\sigma^2}{n})$, that is, with standard deviation $\frac{\sigma}{\sqrt{n}}$.

for the Sum

For large $n$, the sum of samples $\sum X$ is approximately distributed as $N(n\mu, n\sigma^2)$, that is, with standard deviation $\sqrt{n} \sigma$.
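A small simulation makes the scaling concrete. This sketch assumes Uniform(0, 1) samples (mean $1/2$, variance $1/12$) and a fixed seed; the function name and the sample sizes are illustrative. The spread of the sample means should be close to $\sigma / \sqrt{n}$.

```python
import math
import random
import statistics


def sample_means(n, reps, seed=0):
    """Return `reps` sample means, each over n Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]


if __name__ == "__main__":
    n = 30
    means = sample_means(n, reps=5000)
    sigma = math.sqrt(1.0 / 12.0)
    # Empirical centre and spread next to the values the theorem suggests.
    print(statistics.fmean(means), 0.5)
    print(statistics.pstdev(means), sigma / math.sqrt(n))
```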


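The random variables $X_i$ have to be independent, but not necessarily identically distributed. Let $X_i$ have mean $\mu_i$ and variance $\sigma_i^2$, and let $s_n^2 = \sum_{i=1}^n \sigma_i^2$.
Lyapunov's condition: for some $\delta > 0$,

$$\displaystyle \lim_{n \to \infty} \frac{1}{s_n^{2+\delta}} \sum_{i=1}^n E[|X_i - \mu_i |^{2+\delta} ] = 0$$

When Lyapunov's condition is satisfied, the standardized sum converges in distribution:

$$\displaystyle \frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i) \to N(0,1)$$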
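As a rough illustration, the sketch below evaluates the quantity in the standard form of Lyapunov's condition, $\frac{1}{s_n^{2+\delta}} \sum_{i=1}^n E[|X_i - \mu_i|^{2+\delta}]$ with $s_n^2 = \sum_{i=1}^n \sigma_i^2$, for an assumed example of i.i.d. Bernoulli($p$) variables. The function name and the choice $\delta = 1$ are illustrative. In this case the ratio decays like $n^{-\delta/2}$, so the condition holds.

```python
import math


def lyapunov_ratio(n, p, delta=1.0):
    """Lyapunov ratio for n i.i.d. Bernoulli(p) variables (assumed example)."""
    # E|X - p|^(2 + delta): X = 1 with probability p, X = 0 with probability 1 - p.
    abs_moment = p * (1.0 - p) ** (2.0 + delta) + (1.0 - p) * p ** (2.0 + delta)
    # s_n is the standard deviation of the sum of the n variables.
    s_n = math.sqrt(n * p * (1.0 - p))
    return n * abs_moment / s_n ** (2.0 + delta)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, lyapunov_ratio(n, p=0.3))
```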


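The random variables do not need to be identically distributed, as long as they satisfy Lindeberg's condition.

Lindeberg's Condition
Here $S_n = \sum_{i=1}^n x_i$ denotes the sum of independent mean-zero random variables $x_i$ with variances $\sigma_i^2$, and $s_n^2 = \sum_{i=1}^n \sigma_i^2$ is the variance of $S_n$. For every $\epsilon > 0$:
$\displaystyle \lim_{n \to \infty} \frac{1}{s_n^2} \sum_{i = 1}^{n} E[|x_i|^2 \, 1_{[|x_i| > \epsilon s_n]}] = 0$

If Lindeberg's Condition is satisfied, then $\frac{S_n}{s_n} \to N(0,1)$ in distribution
and $\displaystyle \lim_{n \to \infty} \max_{1 \leq i \leq n} \frac{\sigma_i^2}{s_n^2} = 0$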

Multivariate Normal Conditional distributions

If $\mu$ and $\Sigma$ are partitioned as follows

$\boldsymbol\mu = \begin{bmatrix} \boldsymbol\mu_1 \\ \boldsymbol\mu_2 \end{bmatrix} \quad$ with sizes $\begin{bmatrix} q \times 1 \\ (N-q) \times 1 \end{bmatrix}$

$\boldsymbol\Sigma = \begin{bmatrix} \boldsymbol\Sigma_{11} & \boldsymbol\Sigma_{12} \\ \boldsymbol\Sigma_{21} & \boldsymbol\Sigma_{22} \end{bmatrix} \quad$ with sizes $\begin{bmatrix} q \times q & q \times (N-q) \\ (N-q) \times q & (N-q) \times (N-q) \end{bmatrix}$

then the distribution of $x_1$ conditional on $x_2 = a$ is multivariate normal:

$(x_1|x_2 = a) \sim N(\bar{\boldsymbol\mu}, \overline{\boldsymbol\Sigma})$ where

$\bar{\boldsymbol\mu} = \boldsymbol\mu_1 + \boldsymbol\Sigma_{12} \boldsymbol\Sigma_{22}^{-1} \left( \mathbf{a} - \boldsymbol\mu_2 \right)$

and covariance matrix

$\overline{\boldsymbol\Sigma} = \boldsymbol\Sigma_{11} - \boldsymbol\Sigma_{12} \boldsymbol\Sigma_{22}^{-1} \boldsymbol\Sigma_{21}$.

This matrix is the Schur complement of $\Sigma_{22}$ in $\Sigma$. This means that to calculate the conditional covariance matrix, one inverts the overall covariance matrix, drops the rows and columns corresponding to the variables being conditioned upon, and then inverts back to get the conditional covariance matrix. Here $\boldsymbol\Sigma_{22}^{-1}$ is the generalized inverse of $\boldsymbol\Sigma_{22}$.

Note that knowing that $x_2 = a$ alters the variance, though the new variance does not depend on the specific value of $a$. Perhaps more surprisingly, the mean is shifted by $\boldsymbol\Sigma_{12} \boldsymbol\Sigma_{22}^{-1} \left(\mathbf{a} - \boldsymbol\mu_2 \right)$.

Compare this with the situation of not knowing the value of $a$, in which case $x_1$ would have distribution
$\mathcal{N}_q \left(\boldsymbol\mu_1, \boldsymbol\Sigma_{11} \right)$.

An interesting fact derived in order to prove this result is that the random vectors $\mathbf{x}_2$ and $\mathbf{y}_1=\mathbf{x}_1-\boldsymbol\Sigma_{12}\boldsymbol\Sigma_{22}^{-1}\mathbf{x}_2$ are independent.

The matrix $\Sigma_{12}\Sigma_{22}^{-1}$ is known as the matrix of regression coefficients.

In the bivariate case where x is partitioned into $X_1$ and $X_2$, the conditional distribution of $X_1$ given $X_2$ is

$X_1|X_2=x \ \sim\ \mathcal{N}\left(\mu_1+\frac{\sigma_1}{\sigma_2}\rho(x-\mu_2),\, (1-\rho^2)\sigma_1^2\right). $

where $\rho$ is the correlation coefficient between $X_1$ and $X_2$.
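In the bivariate case every block is a scalar, so the general formulas can be checked against the $\rho$ form by hand. The sketch below does this for made-up parameter values, and also checks the "invert, drop, invert back" description of the Schur complement. All function names and numbers are illustrative.

```python
def conditional_from_blocks(mu1, mu2, s11, s12, s22, a):
    """Mean and variance of x1 given x2 = a, using scalar covariance blocks."""
    cond_mean = mu1 + s12 / s22 * (a - mu2)
    cond_var = s11 - s12 * s12 / s22  # Schur complement of s22
    return cond_mean, cond_var


def conditional_from_rho(mu1, mu2, sigma1, sigma2, rho, x):
    """Same quantities written with standard deviations and correlation."""
    cond_mean = mu1 + sigma1 / sigma2 * rho * (x - mu2)
    cond_var = (1.0 - rho * rho) * sigma1 ** 2
    return cond_mean, cond_var


def conditional_var_by_inversion(s11, s12, s22):
    """Invert the 2x2 covariance, keep the (1,1) entry, invert back."""
    det = s11 * s22 - s12 * s12
    precision_11 = s22 / det  # (1,1) entry of the inverse covariance matrix
    return 1.0 / precision_11


if __name__ == "__main__":
    sigma1, sigma2, rho = 2.0, 3.0, 0.5
    s11, s22, s12 = sigma1 ** 2, sigma2 ** 2, rho * sigma1 * sigma2
    print(conditional_from_blocks(1.0, -1.0, s11, s12, s22, a=2.0))
    print(conditional_from_rho(1.0, -1.0, sigma1, sigma2, rho, x=2.0))
    print(conditional_var_by_inversion(s11, s12, s22))
```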

Bivariate conditional expectation

In the case

$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)$

the following result holds

$E(X_1 | X_2 > z) = \rho \frac{\phi(z)}{\Phi(-z)}$,

where the final ratio here is called the inverse Mills ratio.
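The formula can be checked by simulation. In the sketch below, $\phi$ and $\Phi$ are taken to be the standard normal density and distribution function, computed with `statistics.NormalDist` from the Python standard library. The pair is generated as $X_1 = \rho X_2 + \sqrt{1-\rho^2}\, W$ with $W$ an independent standard normal, which is one common construction and an assumption of this example. Function names, seed, and parameter values are illustrative.

```python
import math
import random
from statistics import NormalDist


def inverse_mills_mean(rho, z):
    """Closed form: rho * phi(z) / Phi(-z)."""
    std = NormalDist()
    return rho * std.pdf(z) / std.cdf(-z)


def simulated_mean(rho, z, draws=200_000, seed=0):
    """Monte Carlo estimate of E(X1 | X2 > z) for a standard bivariate normal."""
    rng = random.Random(seed)
    total, kept = 0.0, 0
    for _ in range(draws):
        x2 = rng.gauss(0.0, 1.0)
        if x2 > z:
            # X1 has unit variance and correlation rho with X2.
            x1 = rho * x2 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            total += x1
            kept += 1
    return total / kept


if __name__ == "__main__":
    # The two numbers should be close, up to Monte Carlo error.
    print(simulated_mean(0.6, 0.5), inverse_mills_mean(0.6, 0.5))
```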

Delta Method

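For a sequence of random variables $Y_n$ (for example the sample mean $\bar{X}_n$ of $n$ samples) which satisfies $\sqrt{n}(Y_n - \theta) \to N(0,\sigma^2)$ in distribution, and a function $g(\cdot)$ that is differentiable at $\theta$ with $g'(\theta) \neq 0$:
Use the first-order Taylor-series approximation $g(Y_n) \approx g(\theta) + g'(\theta)(Y_n - \theta)$ to obtain $\sqrt{n}(g(Y_n) - g(\theta)) \to N(0,\sigma^2 [g'(\theta)]^2)$ without calculating the exact distribution of $g(Y_n)$.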
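A simulation sketch of the standard delta-method result, in which the limiting variance is $\sigma^2 [g'(\theta)]^2$. The example is assumed rather than taken from these notes: $Y_n$ is the mean of $n$ Exponential(1) draws, so $\theta = 1$ and $\sigma^2 = 1$, and $g(y) = y^2$, so $g'(\theta) = 2$ and the limiting variance is 4. Function names, seed, and sample sizes are illustrative.

```python
import math
import random
import statistics


def scaled_errors(n, reps, seed=0):
    """Draws of sqrt(n) * (g(Y_n) - g(theta)) with g(y) = y**2 and theta = 1."""
    rng = random.Random(seed)
    theta = 1.0
    errors = []
    for _ in range(reps):
        # Y_n is the mean of n Exponential(1) draws.
        y_n = sum(rng.expovariate(1.0) for _ in range(n)) / n
        errors.append(math.sqrt(n) * (y_n ** 2 - theta ** 2))
    return errors


def delta_method_variance(sigma2, g_prime_at_theta):
    """Limiting variance from the delta method: sigma^2 * g'(theta)^2."""
    return sigma2 * g_prime_at_theta ** 2


if __name__ == "__main__":
    errors = scaled_errors(n=100, reps=3000)
    # Empirical variance next to the delta-method value.
    print(statistics.pvariance(errors), delta_method_variance(1.0, 2.0))
```

For finite $n$ the empirical variance sits slightly above the limit, since the Taylor approximation drops higher-order terms.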