Deep Learning Reforms Image Matching: A Survey and Outlook
By: Shihua Zhang, Zizhuo Li, Kaining Zhang and more
Potential Business Impact:
Teaches computers to see and understand pictures better.
Image matching establishes correspondences between two views of a scene in order to recover 3D structure and camera geometry. It is a cornerstone of computer vision and underpins a wide range of applications, including visual localization, 3D reconstruction, and simultaneous localization and mapping (SLAM). Traditional pipelines, composed of a "detector-descriptor, feature matcher, outlier filter, and geometric estimator", falter in challenging scenarios. Recent deep-learning advances have significantly boosted both robustness and accuracy. This survey comprehensively reviews how deep learning has incrementally transformed the classical image matching pipeline. Our taxonomy closely follows the traditional pipeline in two key aspects: i) the replacement of individual steps with learnable alternatives, including the learnable detector-descriptor, outlier filter, and geometric estimator; and ii) the merging of multiple steps into end-to-end learnable modules, encompassing the middle-end sparse matcher, the end-to-end semi-dense/dense matcher, and the pose regressor. We first examine the design principles, advantages, and limitations of both aspects, and then benchmark representative methods on relative pose recovery, homography estimation, and visual localization tasks. Finally, we discuss open challenges and outline promising directions for future research. By systematically categorizing and evaluating deep learning-driven strategies, this survey offers a clear overview of the evolving image matching landscape and highlights key avenues for further innovation.
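To make the "feature matcher" stage of the classical pipeline concrete, here is a minimal sketch of one common hand-crafted baseline: mutual nearest-neighbor matching combined with a ratio test. It is not taken from the survey. The function names, the toy descriptors, and the 0.8 ratio threshold are all assumptions chosen for illustration, and real systems would run on descriptors produced by an actual detector-descriptor and pass the result on to an outlier filter and geometric estimator.

```python
import math


def l2(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_two(query, candidates):
    """Return (best_index, best_distance, second_best_distance) for one query."""
    dists = sorted((l2(query, c), j) for j, c in enumerate(candidates))
    best_dist, best_idx = dists[0]
    second_dist = dists[1][0] if len(dists) > 1 else math.inf
    return best_idx, best_dist, second_dist


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match descriptors of image A to image B (hypothetical helper).

    A pair (i, j) is kept only if
      1. j is the nearest neighbor of i and i is the nearest neighbor of j
         (mutual check), and
      2. the best distance is clearly smaller than the second best
         (ratio test; the 0.8 default is an assumed, illustrative value).
    """
    if not desc_a or not desc_b:
        return []
    forward = [nearest_two(d, desc_b) for d in desc_a]
    backward = [nearest_two(d, desc_a)[0] for d in desc_b]
    matches = []
    for i, (j, best, second) in enumerate(forward):
        is_mutual = backward[j] == i
        is_distinctive = best < ratio * second
        if is_mutual and is_distinctive:
            matches.append((i, j))
    return matches


# Toy 3-D descriptors standing in for real feature descriptors.
desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
desc_b = [[0.0, 0.9, 0.1], [0.95, 0.0, 0.05], [0.0, 0.0, 1.0]]
print(match_descriptors(desc_a, desc_b))
```

In this toy input the third descriptor of image A is almost equally close to two descriptors of image B, so the ratio test discards it as ambiguous. Ambiguous cases like this are one reason a separate outlier-filtering stage follows the matcher in the classical pipeline.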