Exact inference in $p(x)$ is hard, so choose an easy **proposal distribution** $q(x) \in Q$ and formulate inference as an optimization problem: minimize a “distance” between $q$ and $p$.

Projection problem: given $p$, find the distribution in the family $Q$ that is closest to $p$.

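Optimization objectives:
KL divergence: $\mathrm{KL}(q \| p) = \sum_x q(x) \log \frac{q(x)}{p(x)}$.
I-projection: $\arg\min_{q \in Q} \mathrm{KL}(q \| p)$. M-projection: $\arg\min_{q \in Q} \mathrm{KL}(p \| q)$.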
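
Because the KL divergence is not symmetric, the I-projection (which minimizes $\mathrm{KL}(q \| p)$) and the M-projection (which minimizes $\mathrm{KL}(p \| q)$) generally pick different members of $Q$. The following is a minimal sketch of that asymmetry for discrete distributions; the helper `kl` and the example vectors `p` and `q` are illustrative assumptions, not part of these notes.

```python
import math

def kl(a, b):
    """KL(a || b) for two discrete distributions given as equal-length lists.

    Assumes b[i] > 0 wherever a[i] > 0; terms with a[i] == 0 contribute 0.
    """
    total = 0.0
    for ai, bi in zip(a, b):
        if ai > 0:
            total += ai * math.log(ai / bi)
    return total

# Hypothetical target p and candidate approximation q over three states.
p = [0.5, 0.4, 0.1]
q = [0.4, 0.4, 0.2]

kl_qp = kl(q, p)  # the quantity an I-projection would minimize over q
kl_pq = kl(p, q)  # the quantity an M-projection would minimize over q
print("KL(q || p) =", kl_qp)
print("KL(p || q) =", kl_pq)
```

The two printed values differ, which is why the direction of the divergence has to be stated whenever a “closest” distribution is requested.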

The log-partition function $\log Z$ can be expressed in terms of the free energy and the KL divergence.
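
One way to write this out, assuming $p(x) = \tilde{p}(x) / Z$ and using $F[\tilde{p}, q]$ as a label (an assumption here) for the entropy-plus-expected-log-potential term, is to expand the definition of the KL divergence:

$$
\mathrm{KL}(q \| p) = \sum_x q(x) \log q(x) - \sum_x q(x) \log \tilde{p}(x) + \log Z
$$

$$
\log Z = \underbrace{H[q(x)] + \sum_x q(x) \log \tilde{p}(x)}_{F[\tilde{p}, q]} + \mathrm{KL}(q \| p)
$$

Since $\mathrm{KL}(q \| p) \ge 0$, $F[\tilde{p}, q]$ is a lower bound on $\log Z$, and maximizing it over $q \in Q$ is equivalent to minimizing $\mathrm{KL}(q \| p)$. Under the physics sign convention the free energy is the negative of $F$.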

#### Mean Field

[Mean field](http://en.wikipedia.org/wiki/Mean_field) assumes a proposal distribution in the form of a Gibbs distribution: an exponential function of the “means of the neighbors”.
The goal is to solve $\max_{q \in Q} H[q(x)] + \sum_x q(x) \log \tilde{p}(x)$.

Update step:
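
For the fully factorized case $q(x) = \prod_i q_i(x_i)$, the coordinate update is commonly written as follows. The notation $x_{-i}$ (all variables except $x_i$) is an assumption introduced here, and this is a sketch of the standard form rather than necessarily the one intended in these notes:

$$
q_i(x_i) \propto \exp\Big( \sum_{x_{-i}} \prod_{j \neq i} q_j(x_j) \, \log \tilde{p}(x_i, x_{-i}) \Big)
$$

Each marginal $q_i$ is updated in turn while the others are held fixed.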

The updates are guaranteed to converge to a stationary point of the objective, but that point is not necessarily a local optimum.

Naive Mean Field: $q$ is a product of independent marginals, $q(x) = \prod_i q_i(x_i)$.
Structured Mean Field: $q$ has some low tree-width structure.
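
As an illustration of the naive case, here is a minimal sketch on a small pairwise model with $x_i \in \{-1, +1\}$ and $\log \tilde{p}(x) = \sum_i h_i x_i + \sum_{i<j} J_{ij} x_i x_j$. The model, the parameters `h` and `J`, and all function names are assumptions chosen for the example. For this model the coordinate update reduces to $m_i = \tanh(h_i + \sum_j J_{ij} m_j)$, where $m_i$ is the mean of $x_i$ under $q$.

```python
import itertools
import math

def log_p_tilde(x, h, J):
    """Unnormalized log-probability; x may hold +/-1 spins or real-valued means."""
    n = len(x)
    pair = sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return sum(h[i] * x[i] for i in range(n)) + pair

def naive_mean_field(h, J, sweeps=500, tol=1e-12):
    """Coordinate updates on the means m_i of a fully factorized q."""
    n = len(h)
    m = [0.0] * n
    for _ in range(sweeps):
        delta = 0.0
        for i in range(n):
            # Local field: own bias plus couplings times the neighbors' means.
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            new = math.tanh(field)
            delta = max(delta, abs(new - m[i]))
            m[i] = new
        if delta < tol:
            break
    return m

def objective(m, h, J):
    """H[q] + E_q[log p_tilde] for the factorized q with means m."""
    entropy = 0.0
    for mi in m:
        for prob in ((1 + mi) / 2, (1 - mi) / 2):
            if prob > 0:
                entropy -= prob * math.log(prob)
    # Under a factorized q, E[x_i x_j] = m_i * m_j, so the expected
    # log-potential equals log_p_tilde evaluated at the means.
    return entropy + log_p_tilde(m, h, J)

# Hypothetical three-node chain.
h = [0.5, 0.0, -0.2]
J = [[0.0, 0.3, 0.0],
     [0.3, 0.0, 0.3],
     [0.0, 0.3, 0.0]]

m = naive_mean_field(h, J)
bound = objective(m, h, J)
log_z = math.log(sum(math.exp(log_p_tilde(x, h, J))
                     for x in itertools.product((-1, 1), repeat=len(h))))
print("means:", m)
print("mean-field objective:", bound, "<= log Z:", log_z)
```

The brute-force `log_z` is only feasible because the example has three variables; it is included to show that the optimized objective stays below the true log-partition function.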

#### Junction Tree

Choose q to be a junction tree of p. This can produce exact inference.

“Construct lagrangian, stationary points, do a bit of math…”

##### Weighted Mini-Bucket Elimination

Check out Hölder's inequality.
Weight each bucket, subject to the constraint that all weights sum to one.
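
For positive factors $f_r$ over a shared variable and weights $w_r > 0$ with $\sum_r w_r = 1$, Hölder's inequality gives $\sum_x \prod_r f_r(x) \le \prod_r \big( \sum_x f_r(x)^{1/w_r} \big)^{w_r}$. The sketch below checks this numerically for two made-up factors; the function names and values are assumptions for illustration and do not implement the full elimination algorithm.

```python
def exact_sum(factors):
    """sum_x prod_r f_r(x) for factors tabulated over the same variable."""
    total = 0.0
    for values in zip(*factors):
        prod = 1.0
        for v in values:
            prod *= v
        total += prod
    return total

def holder_bound(factors, weights):
    """prod_r (sum_x f_r(x)^(1/w_r))^(w_r); assumes positive factors and weights summing to one."""
    bound = 1.0
    for f, w in zip(factors, weights):
        bound *= sum(v ** (1.0 / w) for v in f) ** w
    return bound

# Two hypothetical factors over one variable with three states.
f1 = [0.2, 1.5, 0.7]
f2 = [1.1, 0.3, 0.9]

exact = exact_sum([f1, f2])
for w1 in (0.1, 0.5, 0.9):
    bound = holder_bound([f1, f2], [w1, 1.0 - w1])
    print("w1 =", w1, "bound =", bound, ">= exact =", exact)
```

Every weight setting gives a valid upper bound, but the tightness varies, which is why the weights are treated as parameters to optimize.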

##### Approximate free energy

##### Variational upper bounds