"Behind every stack of books there is a flood of knowledge."
The Hough transform is a powerful tool that lets you identify lines. Not just lines, in fact, but other shapes as well. In this article, I’ll talk about the mechanics behind the Hough transform. It involves a bit of math, but only elementary concepts you learned in school.
We’ll work with lines only, though the technique can be easily extended to other shapes.
Let’s say you take a snapshot of a pole. You figure out the edge pixels (using the Canny edge detector, the Sobel operator, or some other method). Now you want a geometrical representation of the pole’s edge. You want to know its slope, its intercept, and so on. But right now the “edge” is just a sequence of pixels.
You could loop through all the pixels and somehow figure out the slope and intercept, but that is a difficult task. Images are never perfect.
So you want some mechanism that gives more weight to pixels that already lie along a line. This is exactly what the Hough transform does.
It lets each point in the image “vote”. And because of the mathematical properties of the transform, this “voting” lets us figure out the prominent lines in the image.
A line is a collection of points, and managing a collection of points is tougher than managing a single point. So the first thing we learn is how to represent a line as a single point, without losing any information about it.
This is done through the m-c space.
As shown in the picture above, every line has two quantities associated with it: the slope m and the intercept c. With these two numbers you can describe a line completely, as y = mx + c.
So the parameter space, or the mc space, was devised. Every line in the xy space is equivalent to a single point in the mc space. Neat, eh?
Now onto the next step. Consider a point, say (xa, ya), in the xy space. What would its representation in the mc space be?
For starters, you could guess that infinitely many lines pass through a single point. So, for every line passing through (xa, ya), there would be a point in the mc space.
And you’re correct there, because of the following. Every line through (xa, ya) satisfies ya = m·xa + c. Rearranging for c gives:

c = (-xa)·m + ya

For a fixed point (xa, ya), this is itself the equation of a line in the mc space, with slope -xa and intercept ya.
So, a point in the xy space is equivalent to a line in the mc space.
The Hough transform is all about doing what we just learned: converting points in the xy space to lines in the mc space.
You take an edge-detected image, and for every non-black point, you draw the corresponding line in the mc space. Obviously, some lines will intersect, and these intersections mark the parameters of lines in the original image.
The following picture will clarify the idea:
The points 1, 2, 3, and 4 are represented as various lines in the mc space. And the intersection of these lines corresponds to the original line.
OpenCV already comes with a function to perform Hough transforms. It lets you choose between different variants of the transform: the standard Hough transform, the probabilistic Hough transform, and the multi-scale Hough transform. Here I’ll get into the technical details of getting the function cvHoughLines2 to work. It expects and returns parameters in a certain format.
The cvHoughLines2 function goes like this:
CvSeq* cvHoughLines2(CvArr* image, void* line_storage, int method, double rho, double theta, int threshold, double param1=0, double param2=0);
We’ll go into each parameter in detail.
image is the image you want to run the Hough transform on. This has to be an 8-bit, single-channel binary image. Though you can supply a grayscale image, it will be treated as a binary image (every non-zero pixel counts as an edge pixel).
line_storage is the place where this function stores its results. This can be either a CvMemStorage structure or a matrix with N rows. More on this parameter later.
method is either CV_HOUGH_STANDARD, CV_HOUGH_PROBABILISTIC, or CV_HOUGH_MULTI_SCALE. As you can guess, they select the standard Hough transform, the probabilistic Hough transform, and the multi-scale Hough transform respectively.
rho and theta set the desired accuracy (the accumulator resolution) of the transform. rho is in pixels and theta is in radians. The smaller the values, the finer the transform, but the more time it takes. Usually, values of 1 and 0.01 are sufficient. (Internally, OpenCV parameterises lines in the polar form ρ = x·cosθ + y·sinθ rather than the mc form, since the polar form can represent vertical lines.)
threshold determines which lines are returned. Each candidate line gathers a particular number of “votes”, and this parameter sets the minimum number of votes needed to qualify as a line. See the discussion of voting above for more on this.
param1 and param2 are used differently by each method. For the standard transform they are unused (set them to 0). For the probabilistic transform they are the minimum line length and the maximum gap between segments lying on the same line. For the multi-scale transform they are the divisors applied to rho and theta when refining the result.
Getting results out of this function depends on the line_storage parameter. You have two options: supply a CvMat matrix or supply a CvMemStorage structure.
This one is straightforward. You give it a matrix, and the function will populate this matrix with its results. For different method values, the matrix must have different formats: for CV_HOUGH_STANDARD and CV_HOUGH_MULTI_SCALE it must be of type CV_32FC2 (each row receives a (ρ, θ) pair), while for CV_HOUGH_PROBABILISTIC it must be of type CV_32SC4 (each row receives a segment’s endpoints (x1, y1, x2, y2)).
The function will set the number of rows of the matrix to the number of lines detected, and it will return NULL.
If you provide a memory storage instead, the function will return a CvSeq* sequence. Using this sequence, you can access the parameters of each detected line:
float* currentLine = (float*) cvGetSeqElem(line_seq, index);
Now let’s move from lines to circles. A circle can be described completely with three pieces of information: the center (a, b) and the radius R. (The center consists of two parts, hence a total of three.) In parametric form:
x = a + Rcosθ
y = b + Rsinθ
As θ varies from 0 to 360 degrees, a complete circle of radius R is generated.
So with the circle Hough transform, we expect to find the triplets (a, b, R) that are most probably circles in the image. That is, we want to find three parameters, so the parameter space is 3D… meaning things can get ugly if you don’t tread slowly. Out-of-memory errors are common even if your programming language uses virtual memory.
So we’ll start simple.
To begin, assume you’re looking for circles of a particular radius; that is, R is known. The equation of each circle is:
x = a + Rcosθ
y = b + Rsinθ
So, every point in the xy space will be equivalent to a circle in the ab space (R isn’t a parameter, we already know it). This is because on rearranging the equations, we get:
a = x1 – Rcosθ
b = y1 – Rsinθ
for a particular point (x1, y1). And θ sweeps from 0 to 360 degrees.
So, the flow of events is something like this:

1. Detect edges in the image.
2. For every edge pixel (x, y), and for every θ from 0 to 360 degrees, compute a = x – Rcosθ and b = y – Rsinθ, and cast a vote at (a, b).
3. The points in the ab space with the most votes are the circle centers.
Here’s an example:
We’d like to find circles in this image. First, detect edges to get an image something like this:
I used a Sobel operator to get the edge image. And finally, for every white pixel in the image above, you draw a circle in the ab space. So, the ab space looks something like this:
The horizontal axis is the a axis, and the vertical axis is the b axis. The brighter a spot, the more votes were cast at that point. And more votes imply a greater probability of the point being a center.
In the above image, you can see the centers clearly. And these points can be easily extracted.
Here’s a superimposed image that might help you understand the idea even better:
In the image above, three random points were chosen, and circles of radius R were drawn around them (the red, blue, and green circles). Then, votes are cast at the pixels of these circles. Simple as that.
Note that the technique worked even though the entire circle’s perimeter was not visible. The two circles overlapped, and yet they were detected as separate circles.
When the radius is not known, the simplest solution is to just guess. Assume R = 1 and run the algorithm. Then assume R = 2 and run it again. Assume R = 3… and so on.
What’s the upper limit of R? A safe limit is the length of the diagonal of the image: no circle that fits within the image can have a radius greater than the diagonal.
So, you’ll end up with a 3D parameter space. Each horizontal plane would be equivalent to a 2D parameter space where R is known.
You’ll also end up with a double cone around each center. Here’s an example. Think of this like a CAT scan, going through the different slices of the ab planes:
The accuracy depends on the number of accumulator cells you have. If you have cells for 0, 0.01, 0.02, 0.03, and so on, you’ll get better results than with 0, 1, 2, 3. But at the same time, the amount of memory required increases.
Sometimes, spurious circles are detected. This can happen if the image has a lot of circles close to each other. The problem can be overcome by simply checking whether a circle actually exists at each of the highest-voted centers.