Strategic Vantage Selection for Learning Viewpoint-Agnostic Manipulation Policies
By: Sreevishakh Vasudevan, Som Sagar, Ransalu Senanayake
Potential Business Impact:
Teaches robots to grab things from any angle.
Vision-based manipulation has shown remarkable success, achieving promising performance across a range of tasks. However, these manipulation policies often fail to generalize beyond their training viewpoints, a persistent obstacle to perspective-agnostic manipulation, especially in settings where the camera is expected to move at runtime. Although collecting data from many angles seems a natural solution, such a naive approach is resource-intensive and degrades manipulation policy performance due to excessive and unstructured visual diversity. This paper proposes Vantage, a framework that systematically identifies and integrates data from optimal perspectives to train robust, viewpoint-agnostic policies. By formulating viewpoint selection as a continuous optimization problem, we iteratively fine-tune policies on a few vantage points. We leverage Bayesian optimization to efficiently navigate the infinite space of potential camera configurations, balancing exploration of novel views against exploitation of high-performing ones and thereby ensuring data collection from a minimal number of effective viewpoints. We empirically evaluate this framework on diverse standard manipulation tasks using multiple policy learning methods, demonstrating that fine-tuning with data from strategic camera placements yields substantial performance gains, with average improvements of up to 46.19% over fixed, random, or heuristic-based strategies.
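To make the optimization loop concrete, here is a minimal sketch of Bayesian optimization over camera poses, assuming a spherical parameterization (azimuth, elevation, radius) around the workspace. The bounds, the Matern kernel, the candidate-sampling acquisition maximizer, and the `evaluate_viewpoint` objective (standing in for the paper's collect-data / fine-tune / evaluate cycle) are all illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Camera pose as (azimuth, elevation, radius); bounds are placeholders.
BOUNDS = np.array([[0.0, 2 * np.pi],   # azimuth (rad)
                   [0.1, np.pi / 2],   # elevation (rad)
                   [0.5, 1.5]])        # radius (m)

def evaluate_viewpoint(theta):
    """Hypothetical objective: success rate of the policy after
    fine-tuning on data collected from camera pose `theta`.
    In practice this wraps data collection, fine-tuning, and rollout
    evaluation; stubbed here with a synthetic smooth function."""
    az, el, r = theta
    return np.exp(-((az - np.pi) ** 2) / 2) * np.cos(el) / r  # placeholder

def expected_improvement(X_cand, gp, best_y, xi=0.01):
    """EI acquisition: trades off exploring uncertain viewpoints
    against exploiting ones predicted to perform well."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best_y - xi) / sigma
    return (mu - best_y - xi) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
# Seed the surrogate with a few random viewpoints, then iterate BO.
X = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(5, 3))
y = np.array([evaluate_viewpoint(x) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(15):  # small budget: few, effective viewpoints
    gp.fit(X, y)
    cand = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(512, 3))
    x_next = cand[np.argmax(expected_improvement(cand, gp, y.max()))]
    X = np.vstack([X, x_next])
    y = np.append(y, evaluate_viewpoint(x_next))

print("Best viewpoint (az, el, r):", X[np.argmax(y)], "score:", y.max())
```

The key design choice this sketch reflects is the exploration-exploitation balance mentioned in the abstract: the expected-improvement acquisition favors poses the surrogate is uncertain about as well as poses predicted to yield high policy performance, so only a handful of viewpoint evaluations are needed.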
Similar Papers
Learning Activity View-invariance Under Extreme Viewpoint Changes via Curriculum Knowledge Distillation
CV and Pattern Recognition
Teaches computers to understand videos from any angle.
NVSPolicy: Adaptive Novel-View Synthesis for Generalizable Language-Conditioned Policy Learning
Robotics
Robots see more, learn faster, and do tasks better.
Zero-Shot Visual Generalization in Robot Manipulation
Robotics
Robots learn to do tasks in new places.