Diffusion-Based Generation and Imputation of Driving Scenarios from Limited Vehicle CAN Data

Published: September 15, 2025 | arXiv ID: 2509.12375v1

By: Julian Ripper, Ousama Esbel, Rafael Fietzek and more

Potential Business Impact:

Makes car data better for training computers.

Business Areas:

Autonomous Vehicles, Transportation

Training deep learning methods on small time series datasets that also include corrupted samples is challenging. Diffusion models have been shown to be effective at generating realistic synthetic data and at correcting corrupted samples through imputation. In this context, this paper focuses on generating synthetic yet realistic samples of automotive time series data. We show that denoising diffusion probabilistic models (DDPMs) can effectively solve this task by applying them to a challenging vehicle CAN dataset with long-term data and a limited number of samples. To this end, we propose a hybrid generative approach that combines autoregressive and non-autoregressive techniques. We evaluate our approach with two recently proposed DDPM architectures for time series generation, for which we propose several improvements. To evaluate the generated samples, we propose three metrics that quantify physical correctness and test track adherence. Our best model even outperforms the training data in terms of physical correctness while showing plausible driving behavior. Finally, we use our best model to successfully impute physically implausible regions in the training data, thereby improving the data quality.
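For readers unfamiliar with DDPMs, the following is a minimal sketch of the standard forward (noising) process and of the mask-based merge commonly used when diffusion models impute a region of a signal. It is not the paper's implementation: the abstract does not specify the noise schedule, the architectures, or how imputation is carried out, so the linear schedule, the toy speed signal, and the helper names (`linear_betas`, `cumulative_alphas`, `q_sample`, `merge_known`) are all assumptions made for illustration. The random `candidate` values stand in for the output of a trained denoiser.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, common in DDPM examples)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta[s]) for s <= t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar[t]) * x0, (1 - alpha_bar[t]) * I)."""
    scale = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def merge_known(generated, known_noised, mask):
    """Keep noised ground truth where mask is True, generated values elsewhere."""
    return [k if m else g for g, k, m in zip(generated, known_noised, mask)]


betas = linear_betas(1000)
alpha_bar = cumulative_alphas(betas)
rng = random.Random(0)

# Hypothetical single-channel signal (e.g. a normalized speed trace).
speed = [math.sin(0.1 * i) for i in range(64)]
# Assume samples 20..29 are implausible and should be imputed.
mask = [not (20 <= i < 30) for i in range(64)]

noised = q_sample(speed, 500, alpha_bar, rng)
# Placeholder for a denoiser's output at the same step (random here).
candidate = [rng.gauss(0.0, 1.0) for _ in speed]
merged = merge_known(candidate, noised, mask)
```

In a full sampler this merge would be repeated at every reverse step, so that the trusted parts of the recording constrain what the model fills into the masked region.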

LiDAR Point Cloud Image-based Generation Using Denoising Diffusion Probabilistic Models

CV and Pattern Recognition

Makes self-driving cars see better in bad weather.

23 Sep 2025 1

90%

Synthetic Power Flow Data Generation Using Physics-Informed Denoising Diffusion Probabilistic Models

Machine Learning (CS)

Creates realistic power data for smart grids.

24 Apr 2025 0

89%

TIMED: Adversarial and Autoregressive Refinement of Diffusion-Based Time Series Generation

Machine Learning (CS)

Creates realistic fake time data for predictions.

23 Sep 2025 1

Page Count

6 pages
