Visual Place Recognition for Large-Scale UAV Applications
By: Ioannis Tsampikos Papapetros, Ioannis Kansizoglou, Antonios Gasteratos
Potential Business Impact:
Helps drones find their way using many pictures.
Visual Place Recognition (vPR) plays a crucial role in Unmanned Aerial Vehicle (UAV) navigation, enabling robust localization across diverse environments. Despite significant advancements, aerial vPR faces two unique challenges: the limited availability of large-scale, high-altitude datasets, which restricts model generalization, and the inherent rotational ambiguity of UAV imagery. To address these challenges, we introduce LASED, a large-scale aerial dataset of approximately one million images, systematically sampled from 170,000 unique locations throughout Estonia over a decade, offering extensive geographic and temporal diversity. Its structured design ensures clear place separation, significantly enhancing model training for aerial scenarios. Furthermore, we propose the integration of steerable Convolutional Neural Networks (CNNs) to explicitly handle rotational variance, leveraging their inherent rotational equivariance to produce robust, orientation-invariant feature representations. Our extensive benchmarking demonstrates that models trained on LASED achieve significantly higher recall than those trained on smaller, less diverse datasets, highlighting the benefits of extensive geographic coverage and temporal diversity. Moreover, steerable CNNs effectively address the rotational ambiguity inherent in aerial imagery, consistently outperforming conventional convolutional architectures and achieving on average a 12% recall improvement over the best-performing non-steerable network. By combining structured, large-scale datasets with rotation-equivariant neural networks, our approach significantly enhances model robustness and generalization for aerial vPR.
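The rotation-invariance idea can be illustrated with a minimal sketch. This is not the paper's steerable-CNN method, which builds equivariance into the convolution filters themselves; it is a simplified stand-in showing the underlying principle that pooling a descriptor over the C4 rotation orbit (0°, 90°, 180°, 270°) yields an orientation-invariant representation. The `toy_embed` function below is a hypothetical placeholder for a learned CNN embedding.

```python
import numpy as np

def toy_embed(img: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for a CNN embedding: simple intensity and
    # gradient statistics. Note gy/gx swap under a 90-degree rotation,
    # so this embedding alone is NOT rotation-invariant.
    gy = np.abs(np.diff(img, axis=0)).mean()
    gx = np.abs(np.diff(img, axis=1)).mean()
    return np.array([img.mean(), img.std(), gy, gx])

def orbit_pooled_descriptor(img: np.ndarray) -> np.ndarray:
    # Pool the embedding over the C4 orbit: rotating the input only
    # permutes the orbit, so the elementwise max is unchanged.
    return np.max([toy_embed(np.rot90(img, k)) for k in range(4)], axis=0)

rng = np.random.default_rng(0)
patch = rng.random((64, 64))
d1 = orbit_pooled_descriptor(patch)
d2 = orbit_pooled_descriptor(np.rot90(patch))
assert np.allclose(d1, d2)  # descriptor unchanged under 90-degree rotation
```

Steerable CNNs generalize this orbit-pooling intuition: instead of rotating the input and pooling at the end, their filters transform predictably under rotation at every layer, so invariance can be obtained once at the output without the fourfold compute overhead.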
Similar Papers
Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition
CV and Pattern Recognition
Helps cars find their way using pictures and scans.
DSFormer: A Dual-Scale Cross-Learning Transformer for Visual Place Recognition
CV and Pattern Recognition
Helps robots find their way in new places.
CQVPR: Landmark-aware Contextual Queries for Visual Place Recognition
CV and Pattern Recognition
Helps cameras find places by seeing surroundings.