Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling
By: Xiaolong Tang, Meina Kan, Shiguang Shan, and more
Potential Business Impact:
Teaches self-driving cars to plan safe, rule-following routes.
Safe and feasible trajectory planning is essential for real-world autonomous driving systems. However, existing learning-based planning methods often rely on expert demonstrations, which not only lack explicit safety awareness but also risk inheriting unsafe behaviors, such as speeding, from suboptimal human driving data. Inspired by the success of large language models, we propose Plan-R1, a novel two-stage framework that formulates trajectory planning as a sequential prediction task guided by explicit planning principles such as safety, comfort, and traffic-rule compliance. In the first stage, we train an autoregressive trajectory predictor via next-motion-token prediction on expert data. In the second stage, we design rule-based rewards (e.g., collision avoidance, speed limits) and fine-tune the model with Group Relative Policy Optimization (GRPO), a reinforcement learning strategy, to align its predictions with these planning principles. Experiments on the nuPlan benchmark demonstrate that Plan-R1 significantly improves planning safety and feasibility, achieving state-of-the-art performance. Our code will be made public soon.
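The second stage is the methodological core: instead of learning a value function, GRPO samples a group of candidate trajectories per scene, scores each with the rule-based rewards, and turns within-group reward normalization into a per-trajectory advantage for a clipped policy-gradient update. The sketch below illustrates that loop in PyTorch; it is a minimal sketch under stated assumptions (a toy MotionPolicy, a dummy rule_reward, a hypothetical token vocabulary and horizon), not the authors' implementation.

```python
# Minimal sketch of Stage 2 (rule-based rewards + GRPO), assuming a
# Stage-1 autoregressive motion-token model. All names and numbers here
# (MotionPolicy, rule_reward, VOCAB, HORIZON, GROUP, ...) are illustrative
# stand-ins, not the paper's released code.
import torch
import torch.nn as nn

VOCAB = 256    # assumed size of the discrete motion-token vocabulary
BOS = VOCAB    # extra begin-of-sequence token
HORIZON = 16   # assumed number of motion tokens per planned trajectory
GROUP = 8      # G: trajectories sampled per scene (the "group" in GRPO)

class MotionPolicy(nn.Module):
    """Toy stand-in for the Stage-1 autoregressive trajectory predictor."""
    def __init__(self, d=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB + 1, d)
        self.rnn = nn.GRU(d, d, batch_first=True)
        self.head = nn.Linear(d, VOCAB)

    def logits(self, tokens):                    # tokens: (B, T) of ints
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                      # (B, T, VOCAB)

@torch.no_grad()
def sample_group(policy):
    """Roll out GROUP candidate trajectories for one scene context."""
    tokens = torch.full((GROUP, 1), BOS, dtype=torch.long)
    logps = []
    for _ in range(HORIZON):
        dist = torch.distributions.Categorical(
            logits=policy.logits(tokens)[:, -1])
        tok = dist.sample()                      # (GROUP,)
        logps.append(dist.log_prob(tok))
        tokens = torch.cat([tokens, tok[:, None]], dim=1)
    return tokens[:, 1:], torch.stack(logps, dim=1)   # (G, T) each

def rule_reward(trajs):
    """Placeholder for the paper's rule-based rewards (collision avoidance,
    speed limits, comfort, ...). Here: a dummy smoothness bonus."""
    return -trajs.float().diff(dim=1).abs().mean(dim=1)   # (G,)

policy = MotionPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

trajs, logp_old = sample_group(policy)
rewards = rule_reward(trajs)

# Group-relative advantage: normalize rewards within the sampled group,
# so no learned value function (critic) is needed.
adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)   # (G,)

# Recompute per-token log-probs under the current policy and apply the
# clipped PPO-style surrogate with the shared group-relative advantage.
# (GRPO as usually described also adds a KL penalty toward the Stage-1
# reference model; omitted here for brevity.)
inputs = torch.cat(
    [torch.full((GROUP, 1), BOS, dtype=torch.long), trajs[:, :-1]], dim=1)
logp_new = torch.distributions.Categorical(
    logits=policy.logits(inputs)).log_prob(trajs)            # (G, T)
ratio = (logp_new - logp_old).exp()
clipped = torch.clamp(ratio, 0.8, 1.2)
loss = -torch.min(ratio * adv[:, None], clipped * adv[:, None]).mean()

opt.zero_grad()
loss.backward()
opt.step()
```

In a real pipeline the group would be conditioned on the driving scene, the reward would combine the collision-avoidance, speed-limit, and comfort terms named in the abstract, and a KL term toward the Stage-1 model would keep the fine-tuned policy close to expert behavior.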
Similar Papers
Towards Bio-Inspired Robotic Trajectory Planning via Self-Supervised RNN
Robotics
Teaches robot arms to move to new spots.
CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-scale Reinforcement Learning in Autonomous Driving
Robotics
Helps self-driving cars plan safer, smarter routes.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Robotics
Makes self-driving cars safely avoid crashes.