MV-TAP: Tracking Any Point in Multi-View Videos
By: Jahyeok Koo, Inès Hyeonsu Kim, Mungyeom Kim, and more
Potential Business Impact:
Tracks moving points more reliably across many camera views.
Multi-view camera systems enable rich observations of complex real-world scenes, and understanding dynamic objects in multi-view settings has become central to various applications. In this work, we present MV-TAP, a novel point tracker that estimates point trajectories across multi-view videos of dynamic scenes by leveraging cross-view information. MV-TAP utilizes camera geometry and a cross-view attention mechanism to aggregate spatio-temporal information across views, enabling more complete and reliable trajectory estimation in multi-view videos. To support this task, we construct a large-scale synthetic training dataset and real-world evaluation sets tailored for multi-view tracking. Extensive experiments demonstrate that MV-TAP outperforms existing point-tracking methods on challenging benchmarks, establishing an effective baseline for advancing research in multi-view point tracking.
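The cross-view aggregation step can be pictured as attention over the view axis: each view's per-point features query the corresponding features from all other views, so a point occluded in one camera can borrow evidence from cameras where it remains visible. Below is a minimal PyTorch sketch of that idea; the class name `CrossViewAttention`, the tensor layout, and all hyperparameters are illustrative assumptions rather than MV-TAP's actual implementation, and the camera-geometry cues the abstract mentions are omitted for brevity.

```python
# Hypothetical sketch of cross-view attention for multi-view point tracking.
# Names, shapes, and hyperparameters are assumptions, not MV-TAP's real code.
import torch
import torch.nn as nn


class CrossViewAttention(nn.Module):
    """Fuses per-point features across camera views with self-attention."""

    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, V, N, T, C) -- batch, views, points, frames, channels.
        B, V, N, T, C = feats.shape
        # Treat each (point, frame) pair as a sequence of V view tokens.
        x = feats.permute(0, 2, 3, 1, 4).reshape(B * N * T, V, C)
        out, _ = self.attn(x, x, x)   # every view attends to all views
        x = self.norm(x + out)        # residual connection + layer norm
        # Restore the original (B, V, N, T, C) layout.
        return x.reshape(B, N, T, V, C).permute(0, 3, 1, 2, 4)


if __name__ == "__main__":
    feats = torch.randn(2, 4, 16, 8, 128)  # 2 clips, 4 views, 16 points, 8 frames
    fused = CrossViewAttention()(feats)
    print(fused.shape)                     # torch.Size([2, 4, 16, 8, 128])
```

Attending only over the view axis keeps the sequence length at V tokens per query, so the cost stays modest even for many tracked points; a full system would presumably interleave this with per-view temporal modeling and geometry-aware positional cues.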
Similar Papers
Multi-View 3D Point Tracking
CV and Pattern Recognition
Tracks moving things in 3D with few cameras.
MVTOP: Multi-View Transformer-based Object Pose-Estimation
CV and Pattern Recognition
Helps robots see objects from many angles.
Look Around and Pay Attention: Multi-camera Point Tracking Reimagined with Transformers
CV and Pattern Recognition
Tracks objects better using many cameras.