Handling Multiple Hypotheses in Coarse-to-Fine Dense Image Matching
By: Matthieu Vilain, Rémi Giraud, Yannick Berthoumieu and more
Potential Business Impact:
Finds better matches between two pictures.
Dense image matching aims to find a correspondent for every pixel of a source image in a partially overlapping target image. State-of-the-art methods typically rely on a coarse-to-fine mechanism where a single correspondent hypothesis is produced per source location at each scale. In challenging cases -- such as at depth discontinuities or when the target image is a strong zoom-in of the source image -- the correspondents of neighboring source locations are often widely spread, and predicting a single correspondent hypothesis per source location at each scale may lead to erroneous matches. In this paper, we investigate the idea of predicting multiple correspondent hypotheses per source location at each scale instead. We consider a beam search strategy to propagate multiple hypotheses at each scale and propose integrating these multiple hypotheses into cross-attention layers, resulting in a novel dense matching architecture called BEAMER. BEAMER learns to preserve and propagate multiple hypotheses across scales, making it significantly more robust than state-of-the-art methods, especially at depth discontinuities or when the target image is a strong zoom-in of the source image.
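To make the single-hypothesis failure mode concrete, the sketch below runs a toy beam search over a two-level score pyramid for one source location. It is only an illustration of the general idea, not the BEAMER architecture: the abstract describes learned cross-attention layers, whereas this sketch assumes precomputed similarity maps. The function name `beam_search_match`, the `score_pyramid` layout, the factor-2 upsampling between scales, and all score values are assumptions made for the example.

```python
from typing import List, Tuple

Grid = List[List[float]]  # grid[y][x]: similarity of target cell (y, x) to the source location


def beam_search_match(score_pyramid: List[Grid], beam_width: int) -> List[Tuple[float, int, int]]:
    """Keep up to `beam_width` target-cell hypotheses for one source location.

    Assumptions (hypothetical, for illustration only): score_pyramid is ordered
    coarsest first, and each scale doubles the resolution of the previous one,
    so cell (y, x) has the four children (2y, 2x) ... (2y + 1, 2x + 1).
    """
    coarse = score_pyramid[0]
    candidates = [(coarse[y][x], y, x)
                  for y in range(len(coarse))
                  for x in range(len(coarse[0]))]
    # Keep the best `beam_width` hypotheses at the coarsest scale.
    beam = sorted(candidates, reverse=True)[:beam_width]

    for scores in score_pyramid[1:]:
        # Expand every surviving hypothesis into its 2x2 block of children.
        children = set()
        for _, y, x in beam:
            for dy in (0, 1):
                for dx in (0, 1):
                    children.add((2 * y + dy, 2 * x + dx))
        # Re-score the children at this scale and prune back to the beam width.
        candidates = [(scores[y][x], y, x) for (y, x) in children]
        beam = sorted(candidates, reverse=True)[:beam_width]
    return beam


# Toy data: the coarse scale slightly prefers the left cell, but the
# strongest fine-scale match lies under the right cell.
pyramid = [
    [[0.6, 0.5]],
    [[0.2, 0.1, 0.1, 0.9],
     [0.1, 0.3, 0.1, 0.1]],
]

single = beam_search_match(pyramid, beam_width=1)  # [(0.3, 1, 1)]
multi = beam_search_match(pyramid, beam_width=2)   # [(0.9, 0, 3), (0.3, 1, 1)]
print(single, multi)
```

With a beam width of 1, the early coarse decision cannot be undone and the search ends at a weak match; with a beam width of 2, the second hypothesis survives long enough for the strong fine-scale match to be found. In this toy version hypotheses are ranked by the current-scale score only, which is a simplification chosen for brevity.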
Similar Papers
RoMa v2: Harder Better Faster Denser Feature Matching
CV and Pattern Recognition
Makes computers see 3D scenes more accurately.