is hard, so choose an easy **proposal distribution** q. Formulate inference as an optimization problem: minimize the “distance” between q and p.

Projection problem: given p, find the distribution q in a family of distributions Q that is closest to p.

Optimization functions

KL Divergence
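
As a minimal sketch of the quantity being minimized, the snippet below computes the KL divergence between two discrete distributions. The function name `kl_divergence` and the example vectors are illustrative assumptions, not part of the notes. It also shows that the divergence is not symmetric, which is why “distance” is in quotes above.

```python
import math

def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions given as equal-length lists."""
    if len(q) != len(p):
        raise ValueError("q and p must have the same length")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # a term with q(x) = 0 contributes nothing
        if pi == 0.0:
            return math.inf  # q puts mass where p has none
        total += qi * math.log(qi / pi)
    return total

# Illustrative distributions (assumed, not from the notes).
p = [0.5, 0.4, 0.1]
q = [0.4, 0.4, 0.2]
print("KL(q || p) =", kl_divergence(q, p))
print("KL(p || q) =", kl_divergence(p, q))
```

The two printed values differ, so the order of the arguments matters when choosing an objective.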

I-projection and M-projection
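
The notes do not spell out the two projections. Assuming the usual textbook definitions, they differ only in the order of the arguments to the KL divergence:

```latex
q^{I} = \arg\min_{q \in Q} \mathrm{KL}(q \,\|\, p)
\qquad \text{(I-projection)}
\\
q^{M} = \arg\min_{q \in Q} \mathrm{KL}(p \,\|\, q)
\qquad \text{(M-projection)}
```

Under these definitions the I-projection only needs expectations under q, which is the easy distribution, so it is the one typically used for variational inference. The M-projection needs expectations under p, which is the hard part to begin with.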

The log-partition function can be expressed in terms of free energy and the KL divergence.
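
A short derivation of that relation, assuming p is written as an unnormalized measure over a partition function, p(x) = p̃(x)/Z, and taking the free energy (energy functional) with the sign convention F = E_q[log p̃] + H(q). Some texts use the negative of this quantity, so the signs below are a convention rather than the notes' own definition.

```latex
\mathrm{KL}(q \,\|\, p)
  = \mathbb{E}_q[\log q(x)] - \mathbb{E}_q[\log \tilde{p}(x)] + \log Z
  = -F[\tilde{p}, q] + \log Z,
\qquad
F[\tilde{p}, q] = \mathbb{E}_q[\log \tilde{p}(x)] + H(q)
\\
\Longrightarrow \quad \log Z = F[\tilde{p}, q] + \mathrm{KL}(q \,\|\, p) \;\ge\; F[\tilde{p}, q]
```

Since log Z does not depend on q, minimizing KL(q‖p) over q is the same as maximizing F, and F is a lower bound on log Z because the KL divergence is nonnegative.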

#### Mean Field

[Mean field](http://en.wikipedia.org/wiki/Mean_field) assumes a proposal distribution in the form of a Gibbs distribution whose factors are exponential functions of the “means of the neighbors”.

Goal is to optimize .

Update step:

Guaranteed to converge to a stationary point, but not necessarily a local optimum.

Naive Mean Field: q is a set of independent marginals.
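
To make the coordinate updates concrete, here is a minimal sketch for one specific model, chosen as an assumption for illustration: an Ising model with spins in {-1, +1} and p̃(x) = exp(Σ h_i x_i + Σ_{i<j} J_ij x_i x_j). For that model, the standard naive mean field fixed point is m_i = tanh(h_i + Σ_j J_ij m_j), where m_i is the mean of spin i under q. The names `naive_mean_field`, `h` and `J` are hypothetical, and this is not necessarily the update the notes had in mind.

```python
import math

def naive_mean_field(h, J, iters=200, tol=1e-10):
    """Coordinate updates for an Ising model with spins in {-1, +1}.

    h[i] is the field on spin i; J[i][j] is the symmetric coupling.
    Returns the means m[i] = E_q[x_i] under a fully factorized q.
    """
    n = len(h)
    m = [0.0] * n  # start from the uniform distribution on every spin
    for _ in range(iters):
        max_change = 0.0
        for i in range(n):
            # Effective field: own bias plus couplings times neighbor means.
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            new_mean = math.tanh(field)
            max_change = max(max_change, abs(new_mean - m[i]))
            m[i] = new_mean
        if max_change < tol:
            break  # reached a fixed point, up to the tolerance
    return m

# Illustrative 3-spin chain (assumed values).
h = [0.5, 0.0, -0.2]
J = [[0.0, 0.3, 0.0],
     [0.3, 0.0, 0.3],
     [0.0, 0.3, 0.0]]
print(naive_mean_field(h, J))
```

Each update depends on the other variables only through their current means, which is the “means of the neighbors” structure mentioned above. With all couplings set to zero the result is exact, since p itself is then a product of independent marginals.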

Structured Mean Field: q has some low tree-width structure.

#### Junction Tree

Choose q to be a junction tree of p. This can produce exact inference.

“Construct lagrangian, stationary points, do a bit of math…”

##### Weighted Mini-Bucket Elimination

Check out Hölder's inequality.

Weight each bucket, subject to the constraint that all weights sum to one.
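
A small numerical sketch of how Hölder's inequality gives a bound here, assuming two nonnegative factors over a single shared variable, split into two mini-buckets with weights w1 + w2 = 1. The inequality then reads Σ_x f1(x) f2(x) ≤ (Σ_x f1(x)^(1/w1))^w1 · (Σ_x f2(x)^(1/w2))^w2. The function names and the tables below are illustrative assumptions.

```python
def weighted_sum(f, w):
    """Power sum (sum_x f(x)^(1/w))^w for a nonnegative table f and weight w > 0."""
    return sum(v ** (1.0 / w) for v in f) ** w

def weighted_mini_bucket_bound(f1, f2, w1):
    """Upper bound on sum_x f1(x) * f2(x) using weights w1 and w2 = 1 - w1."""
    w2 = 1.0 - w1
    return weighted_sum(f1, w1) * weighted_sum(f2, w2)

# Illustrative factor tables over one variable with three states (assumed).
f1 = [1.0, 2.0, 0.5]
f2 = [0.3, 1.5, 2.0]
exact = sum(a * b for a, b in zip(f1, f2))

weights = (0.1, 0.3, 0.5, 0.7, 0.9)
bounds = [weighted_mini_bucket_bound(f1, f2, w1) for w1 in weights]
for w1, bound in zip(weights, bounds):
    print(f"w1={w1:.1f}  bound={bound:.4f}  exact={exact:.4f}")
```

Every choice of weights gives a valid upper bound, and different weights give bounds of different tightness, so the weights can themselves be optimized. As a weight goes to zero, its power sum approaches a max, which corresponds to the bound of ordinary mini-bucket elimination.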

##### Approximate free energy

##### Variational upper bounds