From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving

Published: August 9, 2025 | arXiv ID: 2508.07029v2

By: Antonio Guillen-Perez

Potential Business Impact:

Teaches self-driving cars to avoid crashes.

Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
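The abstract names Conservative Q-Learning but does not spell out the objective. The sketch below shows the general shape of a CQL loss for a single transition. It is an illustration under stated assumptions, not the paper's implementation: it assumes a small discrete action set (a driving policy may well use continuous controls, where the log-sum-exp term is usually estimated from sampled actions), and the function names, `gamma`, and `alpha` are hypothetical choices made for this example.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_loss(q_values, next_q_values, action, reward, done,
             gamma=0.99, alpha=1.0):
    """Illustrative per-transition CQL loss for discrete actions.

    q_values:      Q(s, a) for every action a in the current state
    next_q_values: target-network Q(s', a') for every next action a'
    action:        index of the action recorded in the offline dataset
    gamma, alpha:  assumed discount and penalty weight (not from the paper)
    """
    # Ordinary one-step TD target; no bootstrapping on terminal states.
    target = reward + (0.0 if done else gamma * max(next_q_values))
    td_term = 0.5 * (q_values[action] - target) ** 2

    # Conservative penalty: pushes Q down across all actions and back up
    # on the dataset action, so unseen actions are not overvalued.
    penalty = logsumexp(q_values) - q_values[action]

    return td_term + alpha * penalty, td_term, penalty


if __name__ == "__main__":
    # Q is flat across three actions, so the penalty is log(3).
    total, td, pen = cql_loss([1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                              action=0, reward=1.0, done=True)
    print(total, td, pen)
```

The penalty is never negative and shrinks as the dataset action's value rises above the alternatives, which is the mechanism behind the "conservative value function" the abstract describes: actions the expert never took are kept from looking attractive, so the policy is steered away from out-of-distribution states.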

From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving

Machine Learning (CS)

Teaches self-driving cars to learn from mistakes.

9 Aug 2025 2

90%

Behavior-Adaptive Q-Learning: A Unifying Framework for Offline-to-Online RL

Machine Learning (CS)

Helps robots learn safely from past mistakes.

5 Nov 2025 1

89%

Benchmarking Offline Reinforcement Learning for Emotion-Adaptive Social Robotics

Robotics

Teaches robots to understand feelings from old data.

21 Sep 2025 0


Repos / Data Links

github.com

Page Count

13 pages
