Michael Data

[http://en.wikipedia.org/wiki/Kruskal%27s_algorithm Kruskal's algorithm] finds the maximum spanning tree in a graph.

  1. sort edges in descending order by weight
  2. initialize a forest F with a one-vertex tree for each node of the input graph
  3. for each edge in the sorted order:

#* If it connects two different trees in F:
#** Add the edge to F

  1. return F
data structure steps

data: Maintain a forest
update: add an edge
query: are two vertices in the same tree?

Equivalent to [http://en.wikipedia.org/wiki/Disjoint-set_data_structure disjoint set union problem].
data: partition of nodes into disjoint sets
update: merge two sets (forming their union)
query: do two nodes belong to the same set? (equivalent to `which set contains a given node` twice)

Useful to use union find 'find' for the tree membership queries. Represent all nodes in a tree as belonging to the same subset of nodes in union-find. the 'union' operation is good for adding an edge. Simply use the union operation on the subsets of each tree edge to connect.

Performance

On a graph with n vertices and m edges, the entire Kruskal algorithm will take time = Sort(m) + m queries + (n-1) unions.

Overall running time is $O(E \log E)$ or $O(E \log V)$.