Belief Propagation is a method for performing inference on tree-decomposed dependency graphs such as Markov Networks and Bayesian Networks ^{1)}.

If a cycle exists after constructing a minimal dual join graph, belief propagation can iteratively pass messages along nodes in the cycle. This is **Loopy Belief Propagation**. Convergence is not guaranteed, and is a difficult problem.

Can be optimized through the use of Cutset Conditioning, Bucket Elimination and Tree Search Spaces.

##### Variable Elimination

A fundamental operation is **Variable Elimination**. To eliminate a variable from the graph, update the CPT of all of its neighbors by marginalizing the variable out. That variable has then “propagated its belief” to its neighbors.

##### Hybrid Methods

To balance time and space costs, want to use a combination of variable elimination and other methods.

#### Condition-Elimination Interleave

Do Variable Elimination on variables which already have connected parents.

#* This removes easy variables without increasing induced width

Use

VEC to “remove” a variable of high degree.

GOTO step 1

##### Approximate Schemes

#### Iterative Belief Propagation

Can iteratively run belief propagation on an arc-minimal join graph. Accuracy is increased the vewer cycles that remain in the graph. Can use superbuckets-style combination of clusters together to create a more tree-like structure at the cost of computation time.

Arc-minimal join graph is a join graph that contains no cycles of a common variable between scope clusters.

Can achieve local minima of KL-divergence.

#### Iterative Join-Graph Propagation

Useful for Importance Sampling.

If it converges, achieves “pairwise consistency” a.k.a. “calibration”. A marginal within a cluster will be consistent with any other cluster. Pairwise consistency implies the normalizing constants for each cluster will be unique because they contain different scopes. It also implies symmetric messages between connected nodes, because each node provides a marginal over the separator set. By definition of convergence, symmetry also implies pairwise consistency.

$\~p(x,y,z) = \frac{p(x,y)p(y,z) }{ \sum_{x,y,z} (p(x,y)p(x,z))} = \frac{… }{ \sum_{x,y} p(x,y)p(y) }$