Data Scaling Laws for End-to-End Autonomous Driving
By: Alexander Naumann, Xunjiang Gu, Tolga Dimlioglu, and more
Potential Business Impact:
Shows how much more data self-driving cars need to learn better.
Autonomous vehicle (AV) stacks have traditionally relied on decomposed approaches, with separate modules handling perception, prediction, and planning. However, this design introduces information loss during inter-module communication, increases computational overhead, and can lead to compounding errors. To address these challenges, recent works have proposed architectures that integrate all components into an end-to-end differentiable model, enabling holistic system optimization. This shift emphasizes data engineering over software integration, offering the potential to enhance system performance by simply scaling up training resources. In this work, we evaluate the performance of a simple end-to-end driving architecture on internal driving datasets ranging in size from 16 to 8192 hours with both open-loop metrics and closed-loop simulations. Specifically, we investigate how much additional training data is needed to achieve a target performance gain, e.g., a 5% improvement in motion prediction accuracy. By understanding the relationship between model performance and training dataset size, we aim to provide insights for data-driven decision-making in autonomous driving development.
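The abstract's core question, how much additional data buys a target performance gain, is typically answered by fitting a power-law scaling curve to measured errors and inverting it. The sketch below illustrates that idea; it is not the paper's method, and the dataset sizes, error values, and fitted coefficients are hypothetical placeholders, not results from the paper.

```python
import numpy as np

# Hypothetical open-loop error measurements at increasing dataset sizes
# (hours of driving data). These numbers are illustrative, not from the paper.
hours = np.array([16, 64, 256, 1024, 4096, 8192], dtype=float)
error = np.array([0.92, 0.71, 0.55, 0.43, 0.34, 0.30])

# Fit a power law error(N) = a * N^(-b) via linear regression in log-log
# space: log(error) = log(a) - b * log(N).
slope, intercept = np.polyfit(np.log(hours), np.log(error), deg=1)
a, b = np.exp(intercept), -slope
print(f"fitted power law: error(N) = {a:.3f} * N^(-{b:.3f})")

# How much data would a 5% relative error reduction from the largest
# measured point require? Invert the power law: N = (error / a)^(-1/b).
target_error = 0.95 * error[-1]
needed_hours = (target_error / a) ** (-1.0 / b)
print(f"target error {target_error:.3f} -> ~{needed_hours:.0f} hours "
      f"({needed_hours / hours[-1]:.2f}x the largest dataset)")
```

Because the exponent b is small in such fits, a modest relative gain can demand a large multiple of the current dataset, which is exactly the kind of trade-off the paper quantifies for data-driven planning decisions.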
Similar Papers
Scaling Laws of Motion Forecasting and Planning -- Technical Report
Machine Learning (CS)
Makes self-driving cars predict and plan better.
Statistical Analysis and End-to-End Performance Evaluation of Traffic Models for Automotive Data
Networking and Internet Architecture
Makes self-driving cars share information faster.
PAVE: An End-to-End Dataset for Production Autonomous Vehicle Evaluation
CV and Pattern Recognition
Tests self-driving cars for safer real-world driving.