ColliderML: The First Release of an OpenDataDetector High-Luminosity Physics Benchmark Dataset
By: Doğa Elitez, Paul Gessinger, Daniel Murnane and more
We introduce ColliderML, a large, open, experiment-agnostic dataset of fully simulated and digitised proton-proton collisions in High-Luminosity Large Hadron Collider conditions ($\sqrt{s} = 14$ TeV, mean pile-up $\mu = 200$). ColliderML provides one million events across ten Standard Model and Beyond Standard Model processes, plus extensive single-particle samples, all produced with modern next-to-leading-order matrix element calculation and showering, realistic per-event pile-up overlay, a validated OpenDataDetector geometry, and standard reconstructions. The release fills a major gap for machine learning (ML) research on detector-level data and is provided on the ML-friendly Hugging Face platform. We present the physics coverage and the generation, simulation, digitisation and reconstruction pipeline, describe the data format and access, and present initial collider physics benchmarks.
Similar Papers
Real-Time Analysis of Unstructured Data with Machine Learning on Heterogeneous Architectures
High Energy Physics - Experiment
Helps find tiny particles faster with less power.
A First Full Physics Benchmark for Highly Granular Calorimeter Surrogates
High Energy Physics - Experiment
Makes particle detectors faster and more accurate.
Real-Time Analysis of Unstructured Data with Machine Learning on Heterogeneous Architectures
Data Analysis, Statistics and Probability
Helps find tiny particles faster with less power.