Michael Data

[http://en.wikipedia.org/wiki/Priority_queue Priority Queues] maintain a set of items along with priority values for each item. The values follow some ordering but are not unique to each item.

  • * Add
  • * Remove
  • * Find highest priority item
  • Often remove
  • * Change priority of item

Nice way to build a priority queue is to use a Binary Heap.


Dijkstra's Algorithm

Run [http://en.wikipedia.org/wiki/Dijkstra%27s_Algorithm Dijkstra's Algorithm] for shortest path in a graph from a starting vertex s.

  • * Maintain “tentative” distance D[v] for each vertex from s. (initially D[s]=0 and D[~s]=inf)
  • * Initialize a priority queue fwith vertice priorities given by D
  • * While the queue is nonempty:
  • find and remove the highest priority vertex v
  • for each edge v→w:
  • D[w] = min{D[w] , D[v]+length(v→w)}

Because priorities are only ever changed to become smaller, the “bubble up” procedure of the binary tree heap is helpful.

Because there are more priority changes than deletions, Dijkstra's algorithm can be sped up by optimizing the change-priority operation at the expense of the delete operation. One way to do this is to use a complete k-ary tree instead of a binary tree. Even for general binary heaps, increasing k slightly above 2 frequently results in real-world improvements due to memory locality.

  • time for delete-min:
  • height of the tree = log_k (n) = log(n) / log(k)
  • time per iteration = O(k)
  • total = O[k*log(n)/log(k)]
  • time for change-priority:
  • time per iteration = O(1)
  • total = O[log(n)/log(k)]

For Dijkstra, total time for n vertices and m edges is $O[m \frac{1}{k} \log(n) + n \frac{k}{\log(k)} \log(n)]$. Heuristic for optimizing this is to choose k that balances the two terms… $k=\frac{m}{n}$ for total time $\frac{m \log n}{\log \frac{m}{n}}$

Heap Sort

(Basic version)
Given a set of values to be sorted:

  1. Make items into a priority queue Q
  2. While Q is nonempty:
    1. find and remove highest priority item x
    2. output x

(in-place version)

  1. Make items into priority queue
  2. While Q is nonempty:
    1. find and remove highest priority item x
    2. put x at the start of the output list

MinHash Sketches

The sketch of a dynamic set of items in a Min Hash with add and remove operations:

  1. Create two priority queues Q1 and Q2

#* Q1 contains k items with smallest hash
# Want to prioritize for retrieval of item with largest hash, so it can be replaced if a smaller one is discovered
#* Q2 contains remaining items
Want to find items with smallest hashes

  1. Add items to the set by comparing to max(Q1)

#* if the item has a smaller hash, replace max(Q1) with the new item and place it in Q2
#* else add the new item to Q2