Tuesday, January 6, 2015

Find a straight line with maximum points in 2D space

I came across this post in stackoverflow.com,

http://stackoverflow.com/questions/4179581/what-is-the-most-efficient-algorithm-to-find-a-straight-line-that-goes-through-m

Given n points on 2D spaces, each point's x, y coordinates are intergers. Find a line which has maximum number of points on it.

This problem tests our knowledge about two concepts: 1) relative prime or calculate GCD; 2) how hash table works, if we override equals(), we must override hashCode(). Again two simple and fundamental knowledges can solve this seemingly complex problems.

The algorithm has those steps:

Given list of n points, saying points[k] where k = 0, 1, ..., n-1
max: the maximum number of points on a line
1.  for each point points[i], loop through any other points in the list points[j], where j!=i
2.  representing the slope of the line by calculating a pair of numbers, (points[j].y-points[i].y,  points[j].x-points[i].x)
     if(points[j].x-points[i].x==0) which is a vertical line, then form a pair of numbers (1, 0)
     else then divided by the GCD of the original pair, to form a pair of relative prime numbers
 
     use the above pair as slope to create new Line object. Note the starting point is fixed which is points[i]

3. use line object as key, store the list of points, e.g. points[j] as value in a hash table, see below for implementation on Line class' hasCode() and equals() methods
4. update the max


/*line class using pair of relative prime numbers 
*to represent the line from a pair of points
 */
static class Line {
int dx, dy;
public Line(int x, int y){
if(x==0){//take care of vertical line
dx = 0; dy = 1;
} else {
int g = gcd(x, y);
dx = x/g; 
dy = y/g;
if(dy<0){//this will make sure (-1, -1) and (1, 1) maps to the same line
dy = dy*-1; dx = dx*-1;
}
}
        }
@Override
public int hashCode(){
return dx*31+dy;
}
@Override
public boolean equals(Object line){
if(line == this)
return true;
if(line instanceof Line){
return this.dx == ((Line)line).dx && this.dy == ((Line)line).dy;
}
return false;
        }
        public static int gcd(int x, int y){
if(y==0)
return x;
x = Math.abs(x);
y = Math.abs(y);
int r = 0;
for(; (r=x%y)!=0; x=y, y=r){}
return y;
}
}
static class Point{
int x, y;
public Point(int x, int y){this.x = x; this.y = y;}
}
/*
* code loop through each points with any other points, and form a line object, store the line 
* and associated points in a hashtable
* the line is defined as pair of relative prime (p1.x-p.x, p1.y-p.y)
*/
static Set<Point> findMaxPointsLine(Point[] points){
int max = Integer.MAX_VALUE;
Line maxLine = null;
Set<Point> maxPoints = new HashSet<Point>();
for(int i=0; i<points.length; i++){
//map store all of the lines go through points[i] and its associated points on the line, 
//besides the points[i] itself
Map<Line, HashSet<Point>> myLines = new HashMap<Line, HashSet<Point>>();
for(int j=0; j!=i && j<points.length; j++){
Line line = new Line(points[j].x - points[i].x, points[j].y - points[j].y);
if(!myLines.containsKey(line)){
HashSet<Point> set = new HashSet<Point>();
set.add(points[j]);
myLines.put(line, set);
}else{
myLines.get(line).add(points[j]);
}
if(myLines.get(line).size()>max){
max = myLines.get(line).size();
maxLine = line;
}
}
// add point[i] itself
  myLines.get(maxLine).add(points[i]);
maxPoints = myLines.get(maxLine);
}
return maxPoints;
}



6 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. For example, for p0=(0,0), p1=(1,1), p2=(0,1), p3=(1,2), we test line1=(p0, p1) and line2=(p2, p3). According to your Line() constructor, line1 and line2 will considered as the same line. They will be put in the same place in myLines of HashMap. But actually, we know they are 2 different lines. They should be processed differently.

    ReplyDelete
    Replies
    1. p0(0, 0) is fixed, now we test line 1(p0, p1), line 2(p0, p1), line 3(p0, p3), and ask the question is they are the same line object. based on hashCode() and equals(), there will be three keys in the hash table, so three different lines.

      then next time, we fixed p1, then test lines: (p1, p0), (p1, p2), (p1, p3).

      next time, we fixed p2, test lines (p2, p0) (p2, p1), (p2, p3)

      the final output should be maxPoints = 2

      Delete
  3. Does the O(n^2) time complexity mean, that at maximum, we would store n! lines in hashmap, right?

    ReplyDelete
  4. nope, it is O(n^2) time, but inside the outer loop, the hash is instantiated as empty:
    //map store all of the lines go through points[i] and its associated points on the line,
    //besides the points[i] itself
    Map> myLines = new HashMap>();

    ReplyDelete