Tuesday, January 27, 2015

Order statistics binary search tree

In the past few days, Peng Li (http://allenlipeng47.com/PersonalPage/index/view/117/nkey) and I have embargoed a journey in the Trees (aka forest). Tree is a wonderful data structure, probably the most important data structure along with HashTable. In the last few blogs I talked about "Segment Tree" and its implementation. I also talked about "Skip List", which is the alternative to binary tree.

Now here is one more! Order statistics binary search tree. This is probably one of the simplest high order binary search tree. It is just like binary search tree, except we store additional information in the node, usually this additional information can be computed recursively and bubble-up approach by taking advantaging of the tree's inherent characteristics. 

For example, we can store the size of subtree rooted at each node. 

Node.size = Node.left.size + Node.right.size

Once we have that data, then we can support the following two operations in O(lgN) time.

Select(i) — find the i'th smallest element stored in the tree

Rank(x) – find the rank of element x in the tree, i.e. its index in the sorted list of elements of the tree

public class OrderStatisticTreeQ {
public class Node {
int data;
int size;
Node left;
Node right;
}

/*Select(i) — find the i'th smallest element stored in the tree
Rank(x) – find the rank of element x in the tree, i.e. its index in the sorted list of elements of the tree
*/
//k starts at 0
public int findKthSmallest(Node root, int k){
int level = k;
Node curr = root;
while(curr!=null){
int leftSize = (curr.left ==null? 0 : curr.left.size);
  if(level == leftSize )
return curr.data;
  if(leftSize > level)
curr = curr.left;
  else{
curr = curr.right;
level = level - (leftSize+1);
  }
}
return -1;
}

//index starts at 0
public int findIndexFor(Node root, int value){
Node curr = root;
int passed = 0;
while(curr!=null){
if(curr.data == value){
return (curr.left==null?0:curr.left.size) + passed;
}else if(curr.data > value){
curr = curr.left;
}else{
passed += (curr.left==null?0:curr.left.size) + 1;
curr = curr.right;
}
}
return -1;
}
}

6 comments:

  1. I added a Node(int data, int size) construction method. It doesn't pass following case:
    public static void main(String[] args) {
    Node n4 = new Node(4, 3);
    Node n2 = new Node(2, 1);
    Node n5 = new Node(5, 2);
    Node n6 = new Node(6, 1);
    n4.left = n2; n4.right = n5;
    n5.right = n6;
    int index = findIndexFor(n4, 2); //result is -1, not right
    int smallest = findKthSmallest(n4, 2); //throw NullPointerException
    }

    1. I think the following should be changed from:
    return (curr.left == null ? 0 : curr.left.size) + passed - 1;
    to:
    return (curr.left == null ? 0 : curr.left.size) + passed;

    2. The following 2 lines may throw NullPointerException:
    level = k - (curr.left.size+1);
    passed = curr.left.size + 1;

    ReplyDelete
    Replies
    1. also it should be "passed += curr.left.size + 1;"

      Delete
    2. the code was not well thought, i should spend more time on it before i released it. Now i corrected both.

      Delete
    3. hehe. I think NullPointerException is still there. Based on my test case, you should try this: findIndexFor(n4, 100);

      Delete
    4. i think it was this line throw NPE. i should do null checks.
      passed += (curr.left==null?0:curr.left.size) + 1;

      Delete