
RoaD: Rollouts as Demonstrations for Closed-Loop Supervised Fine-Tuning of Autonomous Driving Policies

Published: December 1, 2025 | arXiv ID: 2512.01993v1

By: Guillermo Garcia-Cobo, Maximilian Igl, Peter Karkus and more

BigTech Affiliations: NVIDIA

Potential Business Impact:

Teaches self-driving cars to learn from their own mistakes.

Business Areas:

Autonomous Vehicles Transportation

Autonomous driving policies are typically trained via open-loop behavior cloning of human demonstrations. However, such policies suffer from covariate shift when deployed in closed loop, leading to compounding errors. We introduce Rollouts as Demonstrations (RoaD), a simple and efficient method to mitigate covariate shift by leveraging the policy's own closed-loop rollouts as additional training data. During rollout generation, RoaD incorporates expert guidance to bias trajectories toward high-quality behavior, producing informative yet realistic demonstrations for fine-tuning. This approach enables robust closed-loop adaptation with orders of magnitude less data than reinforcement learning, and avoids restrictive assumptions of prior closed-loop supervised fine-tuning (CL-SFT) methods, allowing broader application domains, including end-to-end driving. We demonstrate the effectiveness of RoaD on WOSAC, a large-scale traffic simulation benchmark, where it performs similarly to or better than the prior CL-SFT method, and in AlpaSim, a high-fidelity neural reconstruction-based simulator for end-to-end driving, where it improves driving score by 41% and reduces collisions by 54%.
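The general loop the abstract describes (roll the policy out in closed loop, bias the rollout toward expert behavior, then reuse the rollout as supervised training data) can be illustrated with a toy sketch. This is not the paper's implementation: the 1-D lane-offset dynamics, the linear policy, the "sample candidates and keep the one closest to the expert reference" guidance rule, and every function name below are assumptions made purely for illustration.

```python
import random


def policy_action(gain, x):
    # Hypothetical linear policy: steer proportionally against the lane offset x.
    return -gain * x


def guided_rollout(gain, x0, steps, num_candidates, noise, rng):
    """Closed-loop rollout of the policy with a simple form of expert guidance.

    At each step we sample several candidate actions around the policy output
    and keep the one whose next state lands closest to the expert reference
    (assumed here to be the lane center, x = 0).
    """
    demos = []
    x = x0
    for _ in range(steps):
        candidates = [
            policy_action(gain, x) + rng.gauss(0.0, noise)
            for _ in range(num_candidates)
        ]
        best = min(candidates, key=lambda a: abs(x + a))
        demos.append((x, best))  # the rollout step becomes a demonstration
        x = x + best             # toy dynamics: offset changes by the action
    return demos


def fit_gain(demos):
    # Supervised fine-tuning step: least-squares fit of a = -gain * x.
    num = sum(x * a for x, a in demos)
    den = sum(x * x for x, _ in demos)
    return -num / den if den > 0 else 0.0


def road_style_finetune(gain, rounds=5, seed=0):
    # Alternate between generating guided rollouts and refitting the policy.
    rng = random.Random(seed)
    for _ in range(rounds):
        demos = []
        for _ in range(20):
            x0 = rng.uniform(-2.0, 2.0)
            demos += guided_rollout(gain, x0, steps=10,
                                    num_candidates=8, noise=0.3, rng=rng)
        gain = fit_gain(demos)
    return gain
```

Calling `road_style_finetune(0.3)` starts from a policy that under-corrects. In this toy setup the fitted gain should drift toward the expert's behavior over the rounds, because every training pair comes from a state the policy itself visited.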

Bootstrapping Reinforcement Learning with Sub-optimal Policies for Autonomous Driving

Robotics

Teaches self-driving cars to learn faster.

4 Sep 2025 0

87%

RaC: Robot Learning for Long-Horizon Tasks by Scaling Recovery and Correction

Robotics

Teaches robots to fix their own mistakes.

9 Sep 2025 1

87%

OpenREAD: Reinforced Open-Ended Reasoning for End-to-End Autonomous Driving with LLM-as-Critic

CV and Pattern Recognition

Teaches cars to think and drive better.

1 Dec 2025 2


Country of Origin

🇺🇸 United States

Page Count

12 pages
