Vision-Conditioned Variational Bayesian Last Layer Dynamics Models
By: Paul Brunzema, Thomas Lew, Ray Zhang, and more
Agile control of robotic systems often requires anticipating how the environment affects system behavior. For example, a driver must perceive the road ahead to anticipate available friction and plan actions accordingly. Achieving such proactive adaptation within autonomous frameworks remains a challenge, particularly under rapidly changing conditions. Traditional modeling approaches often struggle to capture abrupt variations in system behavior, while adaptive methods are inherently reactive and may adapt too late to ensure safety. We propose a vision-conditioned variational Bayesian last-layer dynamics model that leverages visual context to anticipate changes in the environment. The model first learns nominal vehicle dynamics and is then fine-tuned with feature-wise affine transformations of latent features, enabling context-aware dynamics prediction. The resulting model is integrated into an optimal controller for vehicle racing. We validate our method on a Lexus LC500 racing through water puddles. With vision conditioning, the system completed all 12 attempted laps under varying conditions. In contrast, all baselines without visual context consistently lost control, demonstrating the importance of proactive dynamics adaptation in high-performance applications.
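The architecture described above combines two standard ingredients: feature-wise affine (FiLM-style) modulation of latent features by a visual context, and a Bayesian linear last layer that yields a predictive mean and variance. A minimal NumPy sketch of this structure is given below; all network shapes, initializations, and the `FiLMBayesLastLayer` class itself are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

class FiLMBayesLastLayer:
    """Hypothetical sketch: a base MLP maps state-action inputs to latent
    features, a context network maps a visual embedding to per-feature
    scale/shift (FiLM), and a Bayesian linear last layer produces a
    predictive mean and variance."""

    def __init__(self, in_dim, ctx_dim, hid, out_dim, noise_var=0.01):
        # Base feature extractor (nominal dynamics backbone).
        self.W1 = rng.normal(0.0, 0.3, (hid, in_dim))
        self.b1 = np.zeros(hid)
        # Context network: visual embedding -> (gamma, beta) per feature.
        self.Wc = rng.normal(0.0, 0.3, (2 * hid, ctx_dim))
        self.bc = np.zeros(2 * hid)
        # Bayesian last layer: weight mean M and posterior covariance S.
        self.M = rng.normal(0.0, 0.3, (out_dim, hid))
        self.S = 0.1 * np.eye(hid)
        self.noise_var = noise_var

    def features(self, x, ctx):
        h = relu(self.W1 @ x + self.b1)
        gamma, beta = np.split(self.Wc @ ctx + self.bc, 2)
        return gamma * h + beta  # feature-wise affine modulation

    def predict(self, x, ctx):
        phi = self.features(x, ctx)
        mean = self.M @ phi
        # Predictive variance from last-layer weight uncertainty + noise.
        var = phi @ self.S @ phi + self.noise_var
        return mean, var

# Example: 4-dim state-action input, 8-dim visual context, 3-dim next-state.
model = FiLMBayesLastLayer(in_dim=4, ctx_dim=8, hid=16, out_dim=3)
mean, var = model.predict(rng.normal(size=4), rng.normal(size=8))
```

The predictive variance is what a downstream optimal controller could use to trade off speed against model uncertainty when conditions (e.g. a visible puddle) change the context embedding.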