Scale Invariant Feature Transform by David Lowe Short Explanation of The Approach by Michela Lecca
What is SIFT ?
SIFT is an algorithm developed by David Lowe for the extraction of interest points from gray-level images. It was first presented in 1999 and is described in full in D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, Int. Journal of Computer Vision, 2004. A C++ implementation is available on the net: http://www.vlfeat.org/~vedaldi/code/siftpp.html
The input is a gray-level image. The output is a list of 2D points on the image, each associated with a vector of low-level descriptors. These points are called keypoints, and their descriptors are invariant to rescaling, in-plane rotation, noise addition and, in some cases, changes of illuminant.
Keypoints provide a local image description. They are used to find visual correspondences between different images.
SIFT: Applications
Image Alignment
Image Correspondences
Object Recognition
Work Flow
IMAGE
SCALE-SPACE IMAGE REPRESENTATION
KEYPOINTS COMPUTATION BY DoG
KEYPOINTS ORIENTATION
SIFT DESCRIPTOR
Scale-Space Representation
SIFT describes an image, or a portion of it, by interest points detected on a pyramid (scale-space) representation of the image. At each level of the pyramid the image is rescaled (sub-sampled) and smoothed by a Gaussian filter.
Scale-Space Representation
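The SIFT scale-space image representation consists of a set of N octaves defined by two parameters $s$ and $\sigma$. Let $I$ be the input image. Each octave is an ordered set of $s + 3$ images such that
$$L(x, y, k^j \sigma) = G(x, y, k^j \sigma) * I_i(x, y), \qquad j = 0, \dots, s + 2,$$
with $k = 2^{1/s}$ and $I_i$ the i-th sub-sample of $I$. Here $G(x, y, \sigma)$ is a Gaussian kernel with standard deviation $\sigma$, $*$ denotes convolution, and $i = 0, \dots, N - 1$.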
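The construction of the octaves can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not Lowe's implementation: the function names, the default sigma = 1.6, the 3-sigma kernel truncation and the factor-2 sub-sampling by pixel skipping are choices made here for the example, and each level is blurred directly from the sub-sampled input rather than incrementally.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalised to sum 1."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian smoothing; borders are handled by edge replication."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(np.asarray(image, dtype=float), r, mode="edge")
    # Filter along the rows first, then along the columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def build_octaves(image, n_octaves=3, s=2, sigma=1.6):
    """Return n_octaves lists, each holding s + 3 images G(k^j sigma) * I_i."""
    k = 2.0 ** (1.0 / s)
    base = np.asarray(image, dtype=float)
    octaves = []
    for _ in range(n_octaves):
        octaves.append([gaussian_blur(base, sigma * k ** j) for j in range(s + 3)])
        base = base[::2, ::2]  # next sub-sample of the input (assumed factor 2)
    return octaves
```

Calling `build_octaves(img, n_octaves=3, s=2)` on a 32x32 image would give three octaves of five images each, of sizes 32x32, 16x16 and 8x8.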
SIFT Octaves
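Suppose s = 2. Each octave then contains s + 3 = 5 smoothed images.
The interest points extracted by SIFT are corners, i.e. discontinuity points of the gradient function. They are detected with the Difference of Gaussians (DoG). The computation of the DoG in each octave is very fast and efficient: it is obtained by subtracting each smoothed image of the octave from the next one.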
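The subtraction step is simple enough to show directly. The sketch below assumes an octave is given as a list of equally sized NumPy arrays, ordered by increasing smoothing; the function name is chosen here for illustration.

```python
import numpy as np

def difference_of_gaussians(octave):
    """Subtract each smoothed image from the next one in the octave.

    An octave with s + 3 images yields s + 2 DoG images.
    """
    return [octave[j + 1] - octave[j] for j in range(len(octave) - 1)]
```

With s = 2, an octave of five smoothed images produces four DoG images, and no additional filtering is needed beyond what the octave already contains.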
Keypoints Computation
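The keypoints are the extrema of the DoG functions, i.e. the points whose DoG value is larger or smaller than that of all their neighbours, both in the same DoG image and in the two adjacent DoG images of the octave.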
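A brute-force search for such extrema might look as follows. It is a sketch that assumes the DoG images of one octave are stacked into an array of shape (scales, height, width) and that each sample is compared with its 26 neighbours in a 3x3x3 cube, as in Lowe's paper; the function name and the optional `threshold` parameter are introduced here for illustration.

```python
import numpy as np

def local_extrema(dog, threshold=0.0):
    """Return (scale, y, x) of samples that are strict extrema of their 3x3x3 neighbourhood."""
    dog = np.asarray(dog, dtype=float)
    n, h, w = dog.shape
    keypoints = []
    for j in range(1, n - 1):          # first and last DoG images have no scale neighbour
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = dog[j, y, x]
                if abs(v) <= threshold:
                    continue           # skip samples with negligible response
                cube = dog[j - 1:j + 2, y - 1:y + 2, x - 1:x + 2]
                others = np.delete(cube.ravel(), 13)  # index 13 is the centre sample
                if v > others.max() or v < others.min():
                    keypoints.append((j, y, x))
    return keypoints
```

The triple loop is written for clarity rather than speed; a practical implementation would vectorise the comparison.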
Keypoints Computation
The location of the extrema is refined by a parabolic fit. Due to the repeated Gaussian filtering, many extrema have low contrast. These keypoints are not robust to noise and are generally not relevant for the description of the image. Two filters are used to discard the keypoints with low contrast and those lying on edges, which are not discriminative for the image. This step is achieved by approximating the DoG gradient with its Taylor polynomial truncated at the first order.
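The refinement and the two filters can be sketched as below. This is a simplified illustration, not the full procedure: it works on a single 2-D DoG image (Lowe's fit also includes the scale dimension), uses finite differences for the gradient and Hessian, and takes the thresholds 0.03 (for pixel values in [0, 1]) and r = 10 from Lowe's paper as assumed defaults. The function name is hypothetical.

```python
import numpy as np

def refine_and_filter(D, y, x, contrast_thr=0.03, r=10.0):
    """Return (accepted, offset, value) for the extremum at (y, x) of the 2-D DoG image D."""
    # Finite-difference gradient g and Hessian H of D at (y, x).
    dx = (D[y, x + 1] - D[y, x - 1]) / 2.0
    dy = (D[y + 1, x] - D[y - 1, x]) / 2.0
    dxx = D[y, x + 1] - 2.0 * D[y, x] + D[y, x - 1]
    dyy = D[y + 1, x] - 2.0 * D[y, x] + D[y - 1, x]
    dxy = (D[y + 1, x + 1] - D[y + 1, x - 1]
           - D[y - 1, x + 1] + D[y - 1, x - 1]) / 4.0
    g = np.array([dx, dy])
    H = np.array([[dxx, dxy], [dxy, dyy]])
    det = dxx * dyy - dxy ** 2
    if det <= 0:
        # Flat direction or curvatures of opposite sign: treated as an edge.
        return False, None, None
    # Zero of the first-order Taylor polynomial of the gradient: g + H @ offset = 0.
    offset = -np.linalg.solve(H, g)
    value = D[y, x] + 0.5 * g.dot(offset)      # DoG value at the refined location
    low_contrast = abs(value) < contrast_thr
    edge_like = (dxx + dyy) ** 2 / det >= (r + 1.0) ** 2 / r
    return (not low_contrast and not edge_like), offset, value
```

The edge test compares the ratio of squared trace to determinant of the Hessian with a bound that depends only on r, so the principal curvatures never have to be computed explicitly.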
SIFT descriptors
Each keypoint is now codified as a triplet (x, y, s), where (x, y) is its position and s its scale. Let N be a neighbourhood of the keypoint in the smoothed image at scale s.
The orientation of the gradient of the points in N is represented by a histogram H with 36 bins. The peak of H is assigned to (x, y, s), so that the keypoint is now described by a vector (x, y, s, q), where q is the orientation of the peak of H. If H has several peaks q1, ..., qn, several keypoints (x, y, s, q1), ..., (x, y, s, qn) are generated.
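The orientation assignment can be illustrated with a short sketch. It assumes the neighbourhood N is given as a small 2-D array, weights each vote by the gradient magnitude only (Lowe's paper also applies a Gaussian window), and treats as additional peaks the local maxima of H that reach 80% of the highest one, a rule taken from Lowe's paper rather than from the text above. The function name and parameters are introduced here for illustration.

```python
import numpy as np

def dominant_orientations(patch, n_bins=36, peak_ratio=0.8):
    """Return the orientations (degrees, bin centres) of the peaks of the histogram H."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)                    # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bins = (angle // (360.0 / n_bins)).astype(int) % n_bins
    H = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    if H.max() == 0:
        return []                                  # flat patch: no orientation
    peaks = []
    for b in range(n_bins):
        left, right = H[(b - 1) % n_bins], H[(b + 1) % n_bins]
        if H[b] >= peak_ratio * H.max() and H[b] > left and H[b] > right:
            peaks.append((b + 0.5) * 360.0 / n_bins)
    return peaks
```

Each returned orientation q would yield one keypoint (x, y, s, q), so a neighbourhood with two strong gradient directions produces two keypoints at the same position and scale.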
SIFT descriptors
For each keypoint P, a square region R around P is considered and partitioned into 4 x 4 sub-regions. A histogram with 8 bins represents the gradient orientation of the points in each sub-region of R. The final descriptor associated with P is a vector that concatenates the histograms of the sub-regions of R. The descriptor vector has (4 x 4) x 8 = 128 entries.
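The layout of the 128-entry vector can be made concrete with a simplified sketch. It assumes R is a square array whose side is a multiple of 4, and it leaves out several steps of Lowe's full descriptor (rotation of R to the keypoint orientation, Gaussian weighting, interpolation between bins, clipping of large entries). The final normalisation to unit length follows Lowe's paper; the function name is hypothetical.

```python
import numpy as np

def sift_like_descriptor(region, grid=4, n_bins=8):
    """Concatenate grid x grid orientation histograms (n_bins each) of a square region R."""
    region = np.asarray(region, dtype=float)
    side = region.shape[0]
    if region.shape != (side, side) or side % grid != 0:
        raise ValueError("region must be square with side divisible by grid")
    gy, gx = np.gradient(region)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bins = (angle // (360.0 / n_bins)).astype(int) % n_bins
    cell = side // grid
    histograms = []
    for cy in range(grid):
        for cx in range(grid):
            rows = slice(cy * cell, (cy + 1) * cell)
            cols = slice(cx * cell, (cx + 1) * cell)
            # Magnitude-weighted orientation histogram of one sub-region.
            h = np.bincount(bins[rows, cols].ravel(),
                            weights=magnitude[rows, cols].ravel(),
                            minlength=n_bins)
            histograms.append(h)
    descriptor = np.concatenate(histograms)        # (4 x 4) x 8 = 128 entries by default
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor
```

For a 16x16 region the default parameters give sixteen 4x4 sub-regions and a vector of length 128.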
Matching
Lowe proposes a method for matching the keypoints. Let R and Q be the lists of keypoints of two images I1 and I2.
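As an illustration, the sketch below implements the nearest-neighbour matching with distance-ratio test described in Lowe's paper. It assumes the descriptors of R and Q are stored as the rows of two arrays; the function name and the default ratio of 0.8 are assumptions made for this example, and the exhaustive search stands in for the approximate nearest-neighbour search a real system would use.

```python
import numpy as np

def match_keypoints(desc_r, desc_q, ratio=0.8):
    """Return pairs (i, j): descriptor i of R is matched to descriptor j of Q."""
    desc_r = np.asarray(desc_r, dtype=float)
    desc_q = np.asarray(desc_q, dtype=float)
    matches = []
    if len(desc_q) < 2:
        return matches                 # the ratio test needs two candidates
    for i, d in enumerate(desc_r):
        dist = np.linalg.norm(desc_q - d, axis=1)   # Euclidean distances to all of Q
        first, second = np.argsort(dist)[:2]
        # Accept only if the best candidate is clearly closer than the second best.
        if dist[first] < ratio * dist[second]:
            matches.append((i, int(first)))
    return matches
```

A keypoint whose two closest candidates are almost equally distant is left unmatched, which discards ambiguous correspondences at the cost of a few correct ones.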
References
[SIFT] D. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. Int. Journal of Computer Vision, 2004.
[GLOH] K. Mikolajczyk and C. Schmid. A Performance Evaluation of Local Descriptors. IEEE Trans. Pattern Anal. Mach. Intell., Vol. 27, No. 10, pp. 1615-1630, 2005.
[SURF] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool. SURF: Speeded Up Robust Features. Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346-359, 2008.
[PCA-SIFT] Y. Ke and R. Sukthankar. PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. Computer Vision and Pattern Recognition, 2004.