Michael Data

May want to do certain search operations over priority values.
Want to do priority-queue type operations: what is the next-largest integer?

  • success(q)
  • Find input object with key > q, as close as possible
  • if q > top key: abort, no such item
  • if z ⇐ bottom key, return bottom item

van Embde Boas Tree

Recursive tree structure for integer searching. Base case is any other search method. Current iteration stores min, max values and a hash function into $O(i)$ recursive structures. Record with key $x$ goes into subset $S_{hi(x)}$.
Time bound depends on N, the range of possible keys,

e.g. van Emde Boas tree [Kaas, Zijkstia '76] which are O(log log N) per operation:

  • i-bit binary number x
  • (i = 8) 11011110
  • upper half of x (i/2 bits, hi(x)):
  • x » (i/2) == 1101
  • lower half of x (i/2 bits, lo(x)):
  • x & 1)
1 « (i/2)-1) == 1110
  • VEB(i):
  • if i is a constant:
  • store a list of keys-record pairs
  • otherwise:
  • keep records with min and max keys separate
  • in instance variables top,bottom
  • partition remaining records into subsets Si
  • record with key x goes into Si( hi(x) )
  • for each non-empty subset Si:
  • create recursive data structure VEB(i/2) “Li”
  • stores records of Si but where every record
  • only uses the low half of its keys: lo(x)
  • store dictionary D
  • D(i) maps instance i to Li
  • create recursive data structure VEB(i/2), called H
  • success(q)
  • if D[hi(q)] is nonempty:
  • if lo(q) ⇐ D[hi(q)]'s top key:
  • We have a nonempty recursive structure with the same high half
  • with at least one element which could be >= q
  • return D[hi(q)].success(lo(q))
  • try:
  • return H.successor(hi(q)+1).bottom
  • except:
  • return top
  • insert(object x, key k):
  • if 0 items already:
  • top = x
  • bottom = x
  • if 1 item already:
  • update top or bottom to x
  • if more than 1 item:
  • check for updating top and bottom
  • if D[hi(k)] nonempty:
  • store x,lo(k) in D[hi(k)]
  • else:
  • make new VEB(i/2) L for …
  • deletion is a little more complicated.
  • If it is the top or bottom,
  • the new top or bottom has to be found and removed

=== Fusion Tree ===
Time bound depends only on n, the number of items in the structure.

Fusion tree is a [http://en.wikipedia.org/wiki/B-Tree B-tree] with B = $\log(n)^{1/4}$, but each node is stored as a compressed trie of its keys as binary number. Plus clever binary table-lookup tricks and arithmetic to make it fast.

B-tree: balanced search tree with greater than 2 children per node. The number of children is between B and 2B. Its height is $O(\frac{\log n }{ \log B})$.
  • * O(log n / log B) steps to search
  • each step may use O(log B) to find the correct child
  • * update is also slow
  • each step may use O(B) time
  • * search or update uses only O(log n / log B) block I/O operations

Each node contains a number of keys within a set range. For $n$ keys within the node, there will be $n+1$ child references to sub-trees defined by the values of the $n$ keys.