LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture
By: Kazuki Atsuta, Kohei Honda, Hiroyuki Okuda, and more
Potential Business Impact:
Makes self-driving cars safer and smarter.
This paper proposes a novel framework integrating a Large Vision-Language Model (LVLM) with Model Predictive Control (MPC) that delivers both task scalability and safety for Autonomous Driving (AD). LVLMs excel at high-level task planning across diverse driving scenarios; however, because these foundation models are not designed specifically for driving, their reasoning does not account for the feasibility of low-level motion planning, raising concerns about safety and smooth task switching. This paper integrates an LVLM with MPC Builder, which automatically generates MPCs on demand from the symbolic task commands issued by the LVLM while ensuring optimality and safety. The generated MPCs support the execution or rejection of LVLM-driven task switches by feeding back the feasibility of each commanded task and by producing task-switching-aware MPCs. Our approach provides a safe, flexible, and adaptable control framework, bridging the gap between cutting-edge foundation models and reliable vehicle operation. We demonstrate its effectiveness in a simulation experiment, showing that the system handles highway driving safely and effectively while retaining the flexibility and adaptability of LVLMs.
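The collaboration pattern the abstract describes, where the LVLM proposes a symbolic task, MPC Builder instantiates a controller for it, and that controller's feasibility decides whether the switch is executed or rejected, can be illustrated with a short sketch. The following is a minimal, hypothetical Python illustration of that loop, not the paper's implementation: the names (TaskCommand, build_mpc, try_task_switch), the toy headway-based feasibility rule, and the 15 m safety margin are all assumptions for exposition.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class TaskCommand:
    """Symbolic task command proposed by the LVLM (e.g., 'lane_change_left')."""
    name: str
    target_speed: float  # desired speed in m/s

@dataclass
class MPC:
    """Stand-in for an on-demand generated MPC: a task plus a feasibility test."""
    task: TaskCommand
    is_feasible: Callable[[dict], bool]

def build_mpc(task: TaskCommand) -> MPC:
    """Hypothetical MPC Builder: a real builder would assemble cost terms and
    constraints; this toy version encodes only a headway-based feasibility rule."""
    def feasible(state: dict) -> bool:
        if task.name.startswith("lane_change"):
            return state["gap_to_neighbor_m"] > 15.0  # assumed safety margin
        return True
    return MPC(task=task, is_feasible=feasible)

def try_task_switch(proposal: TaskCommand, state: dict,
                    active: Optional[MPC]) -> MPC:
    """Execute or reject an LVLM-driven task switch via MPC feasibility feedback."""
    candidate = build_mpc(proposal)
    if candidate.is_feasible(state):
        print(f"switch accepted: {proposal.name}")
        return candidate
    print(f"switch rejected as infeasible: {proposal.name}; keeping current task")
    return active if active is not None else build_mpc(
        TaskCommand("lane_keep", state["speed_mps"]))

# Toy highway scenario: the LVLM proposes a lane change, but the gap is too
# small, so the switch is rejected and the lane-keeping MPC stays active.
state = {"speed_mps": 27.0, "gap_to_neighbor_m": 8.0}
active = try_task_switch(TaskCommand("lane_keep", 27.0), state, None)
active = try_task_switch(TaskCommand("lane_change_left", 30.0), state, active)
```

The rejection path is the safety-relevant part of this pattern: when the LVLM's high-level suggestion cannot be realized, the vehicle stays under the previously feasible controller rather than attempting an unsafe maneuver.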
Similar Papers
Vision-Language Model Predictive Control for Manipulation Planning and Trajectory Generation
Robotics
Robots understand goals and act better.
Toward Automatic Safe Driving Instruction: A Large-Scale Vision Language Model Approach
CV and Pattern Recognition
Helps cars watch drivers and roads for safety.
Real-Time LPV-Based Non-Linear Model Predictive Control for Robust Trajectory Tracking in Autonomous Vehicles
Robotics
Helps self-driving cars follow their planned paths accurately.