Approaches to the correspondence problem can be broadly classified into two categories: the intensity-based matching and the feature-based matching techniques. In the first category, the matching process is applied directly to the intensity profiles of the two images, while in the second, features are first extracted from the images and the matching process is applied to the features.
As shown in the previous section, the epipolar lines coincide with the horizontal scanlines if the cameras are parallel, so corresponding points in the two images must lie on the same horizontal scanline. Such stereo configurations reduce the search for correspondences from two dimensions (the entire image) to one dimension. In fact, a close look at the intensity profiles from corresponding rows of the image pair reveals that the two profiles differ only by a horizontal shift and a local foreshortening. Fig. 5(a) and (b) depict images taken with a camera that undergoes a displacement in the horizontal direction; the image pair therefore corresponds to a parallel camera setup. Two black lines are marked at rows 80 and 230 in both images. Fig. 5(c) and (d) show the intensity profiles of row 80 of the two images, and Fig. 5(e) and (f) show those of row 230.
The similarity between the one-dimensional intensity profiles of the two images suggests an optimization process would be suitable. Indeed, Barn in 1987 attempted matching the parallel stereo images using simulated annealing. He defined an energy function Eij as:
where I_L(i,j) denotes the intensity value of the left image at the i-th row and j-th column, and I_R(i,k) denotes the intensity value of the right image at the same row but at the k-th column; D(i,j) is the disparity value (or horizontal shift in this case) at position (i,j) of the left image.
The above is clearly a constrained optimization problem in which the only constraint used is a minimum change of disparity values. This constraint is commonly known as the continuity constraint (see Section ).
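The energy equation itself did not survive in the text above, so the following is only a sketch of one form consistent with the surrounding definitions: a data term |I_L(i,j) - I_R(i, j + D(i,j))| plus a weighted smoothness term lam * |grad D(i,j)|. The weight `lam`, the sign convention k = j + D(i,j), the forward-difference gradient, and the skipping of pixels whose match falls outside the image are all assumptions made for illustration, not details taken from the cited work.

```python
def matching_energy(left, right, disparity, lam=1.0):
    """Total energy of a disparity map for a parallel stereo pair.

    left, right, disparity are lists of rows (same shape).
    Assumed form per pixel: |I_L(i,j) - I_R(i, j + D(i,j))| + lam * |grad D(i,j)|.
    """
    rows, cols = len(left), len(left[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            k = j + disparity[i][j]          # matching column in the right image
            if not 0 <= k < cols:
                continue                     # match outside the image: pixel skipped (assumption)
            data = abs(left[i][j] - right[i][k])
            # Forward differences approximate the disparity gradient.
            dx = disparity[i][j + 1] - disparity[i][j] if j + 1 < cols else 0
            dy = disparity[i + 1][j] - disparity[i][j] if i + 1 < rows else 0
            total += data + lam * (abs(dx) + abs(dy))
    return total
```

An optimizer such as simulated annealing would propose changes to `disparity` and accept or reject them according to the change in this total; that search loop is omitted here.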
Robe later incorporated the use of a multiresolution scheme (see Section ) together with a smoothness constraint similar to that of Barn into the constrained optimization process. In addition to the horizontal shift of corresponding pixels, they also allowed the corresponding pixels to undergo vertical shift (i.e. disparity in the vertical direction), so their matching method is not restricted to only parallel stereo images. The energy function to be minimized, as expected, is more complicated than the one given above.
The advantage of intensity profile matching is that it outputs a dense disparity map and, consequently, a dense depth (or range) map. Unfortunately, as with constrained optimization problems in general, whether the system converges to the global minimum remains an open problem, although, as reported by Robe92, the multiresolution scheme helped, to a certain extent, to speed up convergence and avoid local minima.
An alternative approach in intensity-based stereo matching, commonly known as the window-based method, is to match only those regions in the images that are "interesting", for instance, regions that contain high variation of intensity values in the horizontal, vertical, and diagonal directions. Moravec's simple interest operator (1979) detects such regions (which correspond to grey-level corners) in the image pair, and it has been widely used in many stereo matching systems (e.g. the SRI STEREOSYS system (Hann, 1985)). After the interesting regions are detected, a simple correlation scheme is applied in the matching process; a match is assigned to regions that are highly correlated in the two images.
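The idea of an interest operator can be sketched as follows. This is a minimal version in the spirit of Moravec's operator, not a reproduction of the original: for each pixel it compares a small window with copies of itself shifted horizontally, vertically, and along the two diagonals, and keeps the smallest sum of squared differences. The window half-width `half` and the absence of thresholding or non-maximum suppression are simplifying assumptions.

```python
def moravec_interest(image, half=1):
    """Return an interest map: min over four shift directions of the windowed SSD.

    image is a list of rows of grey levels. Border pixels are left at 0.
    """
    rows, cols = len(image), len(image[0])
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]   # horizontal, vertical, two diagonals
    margin = half + 1                             # keeps shifted windows inside the image
    interest = [[0] * cols for _ in range(rows)]
    for r in range(margin, rows - margin):
        for c in range(margin, cols - margin):
            ssds = []
            for dr, dc in shifts:
                ssd = 0
                for y in range(r - half, r + half + 1):
                    for x in range(c - half, c + half + 1):
                        diff = image[y + dr][x + dc] - image[y][x]
                        ssd += diff * diff
                ssds.append(ssd)
            # A flat region or a straight edge has at least one direction with a
            # small SSD, so the minimum is large only at corner-like points.
            interest[r][c] = min(ssds)
    return interest
```

On a synthetic image containing a bright square, the map is zero in flat areas and along the straight edges of the square, and positive at its corner, which is the behaviour the text describes.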
The problem associated with this window-based approach is that the size of the correlation windows must be carefully chosen. If the correlation windows are too small, the intensity variation within them will not be distinctive enough, and many false matches may result. If they are too large, resolution is lost, since neighbouring image regions with different disparities will be combined in the measurement. Worse, the two windows may not correlate unless the disparity within them is constant, which suggests that the multiresolution scheme is again appropriate. Unfortunately, the most serious shortcoming of the window-based approach, its sensitivity to differences in foreshortening, may sometimes render the approach useless. Fig. 6 shows a segment MN in the scene projecting onto the two images at segments mn and m'n', respectively. Because of the large difference between the orientations of the retinal planes and the scene plane, segment mn is much longer than segment m'n'. The windows in the two images would give the best correlation measure only if the window used in the first image has the size of mn and the window used in the second image has the size of m'n'. Such variation of window size to compensate for foreshortening is not possible without knowledge of the scene. This poses a chicken-and-egg problem, since the whole point is to recover shape using correlation matching.
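To make the role of the window size concrete, here is a minimal sketch of correlation matching along a scanline for a parallel stereo pair. It uses normalized cross-correlation as the similarity measure, which is one common choice and an assumption here; the function names and the parameters `half` (window half-width) and `max_disp` (search range) are hypothetical. Note that the same fixed-size window is used in both images, which is exactly the assumption that foreshortening violates.

```python
import math

def window(image, row, col, half):
    """Flatten the (2*half+1)-square window centred at (row, col)."""
    return [image[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def ncc(a, b):
    """Normalized cross-correlation of two equal-length lists, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0   # flat windows carry no information

def best_match_on_scanline(left, right, row, col, half, max_disp):
    """Search the same row of the right image for the best-correlated window."""
    ref = window(left, row, col, half)
    cols = len(right[0])
    best_d, best_score = None, -2.0
    for d in range(-max_disp, max_disp + 1):
        c = col + d
        if c - half < 0 or c + half >= cols:
            continue                        # window would leave the image
        score = ncc(ref, window(right, row, c, half))
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score
```

With a small `half` the reference window may resemble several candidates almost equally well; with a large `half` the window spans pixels of different disparity and the peak is blurred, which mirrors the trade-off described above.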
In the feature-based approach, the image pair is first preprocessed by an operator to extract features that are stable under a change of viewpoint; the matching process is then applied to the attributes associated with the detected features. The obvious question is what type of features one should use. Edge elements, corners, line segments, and curve segments are features that are robust against a change of perspective, and they have been widely used in stereo vision work. Edge elements and corners are easy to detect, but may suffer from occlusion; line and curve segments require extra computation time, but are more robust against occlusion (they are longer and so are less likely to be completely occluded). Higher-level image features such as circles, ellipses, and polygonal regions have also been used as features for stereo matching; these features are, however, restricted to images of indoor scenes.
Most feature-based stereo matching systems are not restricted to a single type of feature; instead, a collection of feature types is incorporated. For instance, the system proposed by Weng in 1988 combines intensity, edges, and corners to form multiple attributes for matching; Lim and Binford (1987), on the other hand, used a hierarchy of features ranging from edges and curves to surfaces and bodies (2-D regions) for high-level attribute matching.
It should be noted that, due to image noise, the end-points (and thus also the mid-points) of line segments are normally not reliably detected, so a stereo matching process that relies on the coordinates of these points does not produce a good reconstruction of 3-D coordinates. In fact, for a pair of matching line segments, any point on the first line segment can correspond to any point on the second line segment, and this ambiguity can only be resolved if the end-points of the two line segments are known exactly. Using line segments as features for matching also has the following drawbacks:
Polygonal regions are very high-level features and could be costly to extract.
Stereo matching is a very difficult search procedure. In order to minimize false matches, some matching constraints must be imposed. Below is a list of the commonly used constraints.
By introducing one more camera into the system, the ambiguity involved in matching can be further reduced. Given a feature point m in the first image and potential matches m' in the second image and m'' in the third image, an epipolar line can be constructed in the third image using m and the epipolar geometry between the first and third images; another epipolar line can be constructed in the third image using m' and the epipolar geometry between the second and third images. The two epipolar lines must intersect at m'' if m and m' are a correct match.
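The trinocular check described above can be sketched with homogeneous coordinates: each epipolar line in the third image is a matrix-vector product, and the intersection of two lines is their cross product. The fundamental-matrix convention used here (F13 maps a point of image 1 to its epipolar line in image 3, and likewise F23) and all function names are assumptions for illustration. The toy matrices at the end come from three hypothetical cameras with identity intrinsics that differ only by translation, in which case the fundamental matrix reduces to the skew-symmetric matrix of the relative translation.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mat_vec(M, v):
    """Product of a 3x3 matrix (list of rows) with a 3-vector."""
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))

def skew(t):
    """Skew-symmetric matrix such that mat_vec(skew(t), x) == cross(t, x)."""
    return [[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]]

def predict_in_third_image(F13, F23, m1, m2):
    """Intersect the two epipolar lines induced in image 3 by m1 and m2."""
    line_from_m1 = mat_vec(F13, m1)
    line_from_m2 = mat_vec(F23, m2)
    x, y, w = cross(line_from_m1, line_from_m2)
    if abs(w) < 1e-12:
        return None            # parallel or identical lines: no unique intersection
    return (x / w, y / w)

def is_consistent(F13, F23, m1, m2, m3, tol):
    """True if candidate m3 = (x, y) lies within tol of the predicted point."""
    predicted = predict_in_third_image(F13, F23, m1, m2)
    if predicted is None:
        return False
    return math.hypot(predicted[0] - m3[0], predicted[1] - m3[1]) <= tol

# Toy setup (hypothetical): camera k sees a scene point X at X + t_k.
t2, t3 = (-1, 0, 0), (0, -1, 0)
F13 = skew(t3)
F23 = skew((t3[0] - t2[0], t3[1] - t2[1], t3[2] - t2[2]))
m1, m2 = (2, 1, 5), (1, 1, 5)  # homogeneous projections of X = (2, 1, 5)
```

In this toy setup the scene point projects to (0.4, 0.0) in the third image, and that is where the two lines meet. The check fails when the two epipolar lines are parallel or coincide, in which case the third camera gives no extra information for that pair.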
Other criteria may also be used in the update process, e.g. the geometric support as proposed by Ayache and Faverjon (1987).
The coarse-to-fine multiresolution matching scheme works as follows:
An image pyramid of edges can be obtained by convolving the images with Laplacian of Gaussian filters of different widths (σ), followed by detection of zero-crossings (Fig. 8). An image pyramid of grey-level images can be obtained by applying a smoothing operation followed by sub-sampling (i.e. reducing the image resolution by a factor of 2). Such an image pyramid is sometimes referred to as the processing cone (Fig. 9). The SRI STEREOSYS system described later used this smooth-and-sub-sample scheme.
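The smooth-and-sub-sample construction of a grey-level pyramid can be sketched in a few lines. The 3x3 mean filter with replicated borders is an assumption chosen for brevity (a Gaussian kernel is the more usual choice), and the function names are hypothetical.

```python
def smooth(image):
    """3x3 mean filter; border pixels are handled by replicating the edge."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += image[rr][cc]
            out[r][c] = total / 9.0
    return out

def subsample(image):
    """Keep every second row and column, halving the resolution."""
    return [row[::2] for row in image[::2]]

def build_pyramid(image, levels):
    """Level 0 is the input image; each further level is smoothed, then halved."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(subsample(smooth(pyramid[-1])))
    return pyramid
```

A coarse-to-fine matcher would start at the last (coarsest) level of this list and use the disparities found there, scaled by 2, to restrict the search at the next finer level.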
Source:
http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT11/lect11.html