Friday, November 13, 2015

max gap problem and pigoen hole principle

Source: http://allenlipeng47.com/PersonalPage/index/view/199/nkey

Given a unsorted array with n elements. How can we find the largest gap between consecutive numbers of sorted version in O(n)?
For example, we have a unsorted array
9 19 13 12 33 41 22
the sorted version is
9 12 13 19 22 33 41
the output of the algorithm should be 33-22 = 11

  
One way to do it is to using counting sort or radix sort if the integer range is known and small compared to size of array. 

Another way to do it is to using pigeon hole principle, divide up the k numbers to k+2 equal-size buckets, so at least there will be some empty buckets, and we know the max gap will be larger than the size of buckets and coming from the difference between the min and max of each bucket.

Here is the link to the algorithm:
maximum-gap-program

 /*
     * maximum gap algorithm using pigeon hole principle
     * without losing generality, assume the numbers are integers
     *
     * note the interval is left inclusive and right exclusive
     *    
        if the numbers can not be evenly divided up between min and max
         then we may need one extra interval.
         and pigeon hole principle still holds
    */
    public static int maxGap(int[] nums){
        int min = nums[0], max = nums[0];
        for(int k : nums){
            min = Math.min(min, k);
            max = Math.max(max, k);
        }
       
        int size = nums.length;
        int interval = (max-min)/(size-1);
        /*
         * we will create either n-1 or n pigeon holes
         */
        int count = size-1;
        if(interval*count+ min <=max)
            count++;
       
        Integer[] mins = new Integer[count];
        Integer[] maxs = new Integer[count];
        mins[0] = min; maxs[count-1] = max;
       
        for(int k : nums){
            int index = (k-min)/interval;
            mins[index] = mins[index] == null ? k : (Math.min(k, mins[index]));
            maxs[index] = maxs[index] == null ? k : (Math.max(k, maxs[index]));
        }
       
        int maxGap = maxs[0] - mins[0];
        int pre = maxs[0];
        for(int i = 1; i<count; i++){
            if(mins[i]==null)
                continue;
            maxGap = Math.max(mins[i] - pre, maxGap);
            maxGap = Math.max(maxs[i]-mins[i], maxGap);
            pre = maxs[i];
        }
       
        return maxGap;
    }

2 comments:

  1. I think its a universal method. But I don't think you need to calculate the mins[i]-pre for each pairs. The maxGap must happen on the 2 sides of an empty bucket. We only check the pre and min[i] between empty bucket. So, find the empty bucket is an essence.

    ReplyDelete