Diffusion-Based Generation and Imputation of Driving Scenarios from Limited Vehicle CAN Data
By: Julian Ripper, Ousama Esbel, Rafael Fietzek and more
Potential Business Impact:
Makes car data better for training computers.
Training deep learning methods on small time series datasets that also contain corrupted samples is challenging. Diffusion models have been shown to be effective at generating realistic synthetic data and at correcting corrupted samples through imputation. In this context, this paper focuses on generating synthetic yet realistic samples of automotive time series data. We show that denoising diffusion probabilistic models (DDPMs) can effectively solve this task by applying them to a challenging vehicle CAN dataset with long-term data and a limited number of samples. To this end, we propose a hybrid generative approach that combines autoregressive and non-autoregressive techniques. We evaluate our approach with two recently proposed DDPM architectures for time series generation, for which we propose several improvements. To evaluate the generated samples, we propose three metrics that quantify physical correctness and test track adherence. Our best model outperforms even the training data in terms of physical correctness while showing plausible driving behavior. Finally, we use our best model to successfully impute physically implausible regions in the training data, thereby improving the data quality.
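To make the imputation idea concrete, the following is a minimal sketch of one common way a DDPM can be conditioned on observed samples: the standard forward noising step, plus a mask that keeps a re-noised copy of the trusted time steps and lets the model fill in the rest. This replacement-based masking is an assumption chosen for illustration and is not necessarily the scheme used in the paper. The function names, the linear beta schedule, and the array shapes are likewise hypothetical.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, for illustration)."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alphas_cumprod, rng):
    """DDPM forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    noise = rng.standard_normal(x0.shape)
    abar_t = alphas_cumprod[t]
    return np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * noise

def merge_known(x_generated, x0, mask, t, alphas_cumprod, rng):
    """Combine observed and generated regions at diffusion step t.

    mask == 1 marks trusted time steps (kept, noised to level t);
    mask == 0 marks implausible regions that the model should impute.
    """
    x_known = q_sample(x0, t, alphas_cumprod, rng)
    return mask * x_known + (1.0 - mask) * x_generated
```

In a full sampler, `merge_known` would be applied after every reverse step, with `x_generated` coming from a trained denoising network; that network is omitted here.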
Similar Papers
LiDAR Point Cloud Image-based Generation Using Denoising Diffusion Probabilistic Models
CV and Pattern Recognition
Makes self-driving cars see better in bad weather.
Synthetic Power Flow Data Generation Using Physics-Informed Denoising Diffusion Probabilistic Models
Machine Learning (CS)
Creates realistic power data for smart grids.
TIMED: Adversarial and Autoregressive Refinement of Diffusion-Based Time Series Generation
Machine Learning (CS)
Creates realistic fake time data for predictions.