Michael Data

Given data points in some space, serve queries about which points are in a subset of that space.
Different data structures will handle different query shapes.
Types of queries:

  • * counting number of points in a region
  • * listing points in a region
  • * function of points in a region
  • highest priority
  • average

A range query $q$ is decomposable if $q = x \cup y \to \text{query}(Q) = \text{query}(X) \cup \text{query}(Y)$. If the query function is closed under the union operation.

Dynamic Single-Dimensional Decomposable Range Queries

A simple version of the range query problem space.
Dynamic property means that it supports insertion and deletion of data points.
Data have real-number keys as well as other associated values.
A range is an interval $L \leq key \leq R$.
The result of a query is a function of associate values, has a range over data keys.

Augmented Binary Search Tree

This solution maintains data keys in a balanced binary search tree.
Then each tree node is augmented by $\otimes$.
For each node x: augment(x)= value(x) $\otimes$ augment(x.left) $\otimes$ augment(x.right)

Use the augment function to update augmented values after insertion, deletion, or rotation. For insertion and deletion, have to update everything back to the root. For rotation, only have to recompute augmented value for the new lower node.

Querying:
Perform a binary search to identify nodes closest to but within the query bounds. Use the augmented values of nodes encountered during the searches in order to reconstruct the query function over the correct range. The augmented values help prevent us from requiring to traverse the entire range in question.

Complexity

When $\otimes$ is $O(1)$, insertion and deletion updates are $O(\log n)$.

Retrieval is $O(\log n + k)$ for an unaugmented binary search tree which has to traverse the entire range.
Retrieval in a tree augmented with range count is …

Range Minimum

Equivalent to the Lowest Common Ancestor problem. Can be solved with linear space requirements and constant query time.
What is the minimum value in a subarray? Given a a static array $A[0,\dots,n-1]$. What is $min(A[i]\dots A[j])$ for given $0 \leq i \leq j \lt n$

Binary Tree Solution

Make a binary tree with the elements of A at its leaves. Each internal node stores the minimum of its 2 children. O(n) space and O(log n ) time, and it allows updates.

Table Solution

Build a table T such that T[i,j] stores the query results for query(i,j).
Takes O(n^2) space, O(1) query time.

Hybrid Solution

If you take two overlapping range queries, the query of their union is the minimum of their query values. Optimize the table solution by stoing every possible range covered by 2 smaller ranges.
Define an array U[i,k] = query(i,i+2^k - 1).
Now U[i,0] = A[i]
U[i,x] (x > 0) = min(U[i,x-1] , U[i+2^{x-1},x-1]).

  • for x = 0,1,2,… floor(log(n)):
  • for i = 0,1,… (n-2^x):
  • if x== 0:
  • U[i,x] = A[i]
  • else
  • U[i,x] = min(U[i,x-1] , U[i + 2^{x-1},x-1])
  • O(n log n ) time and O(n log n ) space
  • Query time is then O(1)

Conversion to Lowest Common Ancestor

Given range-minimum data: $A[0,\dots,n-1]$.
Construct a Cartesian tree out of $A$. The root is the minimum of the array, then recurse for two new sub-arrays.

  • Start with one-node tree for i=0
  • for i = 1,2,3,…(n-1):
  • walk up the tree from node (i-1) until finding a value x ⇐ A[i]
  • splice A[i] in as right child of x
Lowest Common Ancestor

Equivalent to the Range Minimum problem.
Given a rooted tree and two nodes (x,y), what is the closest node that is an ancestor of both of them. (each node is an ancestor of itself).

Cartesian Tree Solution

  • Label each node by its distance from the root.
  • Double each edge weight of the tree.
  • Write down sequence of (level,node) pairs in sequence given by an Euler tour of the tree.
  • Store a dictionary of (node , first position in tour) pairs. Use a Cartesian tree to store this.
  • LCA(x,y) = node at min(first pos(x) , first pos(y))

[http://en.wikipedia.org/wiki/Cartesian_tree Cartesian Tree] construction:

  • From a sequence of numbers
  • create a node for the rightmost number
  • for each value x from left to right after the first:
  • search for y, the first value smaller than x
  • in the path from the previously created node to the root
  • make the current right-child of y into the left-child of x
  • make x the right-child of y
  • if no other nodes were smaller than x, make the root a left-child of x
  • given existing tree N
  • root = min weight edge of N
  • remove that edge and split N into two subtrees
  • recurse on each subtree

Applications

Distance(a,b) = distance(a,mouth) + distance(b,mouth) - 2*distance(LCA(a,b),mouth).
The Lowest-Common Ancestor lookup allows for this quick distance formula.

Computer Networks

bandwidth(x,y) = minimum edge bandwidth on path from x to y.
Construct a “Cartesian tree”, a binary tree in which internal nodes correspond to edges of N, and leaves correspond to vertices. Left/right child orientation doesn't matter.

Conversion to Range Minimum

Write down the depth of each entry of an Euler tour of A into a new array A'.
-Note that each adjacent pair in A' differs by at most 1.

To find LCA, find first occurrence of each node in A'. Range minimum on this range in A' provides the LCA node.

  • Optimization for the U-matrix not noted here:
  • Convert A' to B: blocks of k = floor(log_2 (n) / 2, each with a binary value that indicate whether the previous pair increased or decreased by 1.
  • -There are 2n/logn blocks but only sqrt(n) different patterns.
  • For each pattern in B, build a table of k^2 query answers
  • For each block, store a pointer to its table.
  • Each table is 2n/logn by logn, and so stores linear amount of data over all.

Can handle all possible range queries by combining the table of range solutions with an array of block minima.