Michael Data

Dependency Graphs include undirected possibly-cyclic Markov networks, and DAG Bayesian networks. Induced dependencies cannot be represented by undirected graphs, but can be represented by directed graphs.
Inference can be performed through Belief Propagation, Tree Searching, or Sampling methods.

Representations
  • An I-map or independency map is a graph with connections between dependent propositions.
    • Ind(Model) ← Ind(Graph)
    • Tends to be more sparse than a D-map, and used more frequently
    • A minimal I-map is a Markov network
    • A minimal I-map DAG is a Bayesian network
  • A D-map or dependency map is a graph with connections between independent propositions.
    • Ind(Model) → Ind(Graph)
    • Also called a relevance network?
    • Tends to be very dense
  • A P-map is both a D-map and and I-map
    • Doesn't exist for all distributions

So a full graph is a trivial I-map, and an empty graph is a trivial D-map.

Properties
  • Symmetry
    • $I(X,Z,Y) \to I(Y,Z,X)$
  • decomposition
    • $I(X,Z,YW) \to I(X,Z,Y)$ and $I(X,Z,W)$
  • weak union
    • $I(X,Z,Y \cup W) \to I(X,Z \cup W, Y)$
  • contraction
    • $I(X,Z,Y)$ and $I(X,Z \cup Y, W) \to I(X,Z,Y \cup W)$

The previous four properties hold for a probabilistic model

  • intersection
    • $I(X,Z \cup W,Y)$ and $I(X,Z \cup Y,W) \to I(X,Z,YW)$

A strictly positive probabilistic model will also have intersection.

  • strong union
    • $I(X,Z,Y) \to I(X,ZW, Y)$
  • transitivity
    • $I(X,Z,Y) \to \exists t  : I(X,Z,t)$ or $I(t,Z,Y)$
  • Semigraphoid
    • Any three-place relation that satisfies the first four properties
  • Graphoid
    • Any three-place relation that satisfies the first five properties
    • A model that is Chordal is decomposable
    • Chordal graphs can be expressed as Bayesian networks or as Markov networks
    • The edges of a chordal graph can be directed acyclically so every pair of converging edges are from adjacent vertices.
    • A join tree exists which contains a vertex for each clique of the chordal graph
Testing I-Mapness

For each edge in the graph, check that there is no dependence between the two variables when conditioned on their respective remaining edges.

  • Show $<X|Z|Y>_D \to I(X,Z,Y)_M$ for every three disjoint sets of vertices X,Y,Z in DAG D and model M
Finding an Ordering for Small Induced Width

The problem is NP-complete.
Trees have width and induced-width of 1.
Chordal graphs have (width = induced-width).

Greedy algorithms

min width

This is an optimal algorithm for the width, but not for induced width
Given a graph G = (V,E) , V = {v1,…vn}, produce a min-width ordering of the nodes

  • for j = n to 1 by -1 do
    1. r ← a node in G with smallest degree
    2. put r in position j
    3. connect r's neighbors
      1. E ← E $\cup$ {($v_i$,$v_j$) | ($v_i$,r) $\in$ E, ($v_j$,r) $\in$ E}
    4. remove r
      1. G ← (G-r)

min induced-width

Also called min field

  1. Moralize the graph
  2. run min width

max cardinality

Runs in time O(|V| + |E|), faster than the others.

min-fill

In practice, is superior to min-width and max-cardinality.

  • for j = n to 1 by -1 do
    1. r ← a node in V with smallest fill edges for its parents
    2. put r in position j
    3. connect r's neighbors: E ← E union …

"anytime min-width"

(Gogate and Dechter)

Construction

Gibbs' Potential is for constructing a complete and consistent graph model while preserving the dependency structure of an arbitrary graph G:

  1. Identify the cliques of G
  2. For each clique, assign a nonnegative compatibility function which measures the relative degree of compatibility associated with each value assignment
    • Not necessarily a probability
  3. Multiply the clique probabilities together
  4. Normalize

A probability distribution formed in this way is a Markov Field.

Primal Graphs

An undirected graph that has variables as vertices and an edge connecting two variables that appear in the scope of the same function.

Hypergraphs

A hypergraph is a set of vertices and corresponding “hyperedges”. A Hyperedge is a generalization of a traditional graph edge, in that it represents any sub-set of vertices rather than just a pair.

For graphical models, a hyperedge is used to represent each function. It connects all variables in the scope of that function.

Dual Graphs

A graph that represents each function scope by a node and associates labeled edges between all nodes that share variables in their scopes. The edge is labeled by shared variables between nodes. The dual graph of any Bayesian network has a minimal-arc dual join tree.

Join Graphs

A Join Graph is any connected sub-graph of a Dual Graph. A join graph that is a tree is a join-tree a.k.a. “junction tree”. Join trees can be constructed through Tree Decomposition.

Pseudo-Trees

A tree constructed from a graph such that all back-edges connect a node to its ancestor in the tree. For search, smaller-height pseudo-tree is better. DFS-trees are a subset of psuedo-trees. Chain graphs are trivial pseudo-trees of maximum height.

Given a tree-decomposition whose tree-width is $w^{*}$, there exists a pseudo-tree T of G whose depth $m$ satisfies $m \leq w^{*} \log{n}$.

The width of a pseudo-tree is defined by a forward-order elimination.

Can be generated from a bucket-tree.

Any AND-OR Tree based on a pseudo-tree is sound and complete.