FM4NPP: A Scaling Foundation Model for Nuclear and Particle Physics
By: David Park, Shuhang Li, Yi Huang, and more
Potential Business Impact:
Helps scientists analyze particle collision data faster.
Large language models have revolutionized artificial intelligence by enabling large, generalizable models trained through self-supervision. This paradigm has inspired the development of scientific foundation models (FMs). However, applying this capability to experimental particle physics is challenging due to the sparse, spatially distributed nature of detector data, which differs dramatically from natural language. This work addresses whether an FM for particle physics can scale and generalize across diverse tasks. We introduce a new dataset of more than 11 million particle collision events, along with a suite of downstream tasks and labeled data for evaluation. We propose a novel self-supervised training method for detector data and demonstrate its neural scalability with models of up to 188 million parameters. With frozen weights and task-specific adapters, this FM consistently outperforms baseline models across all downstream tasks, and it adapts to new tasks in a data-efficient manner. Further analysis reveals that the representations extracted by the FM are task-agnostic but can be specialized via a single linear mapping for each downstream task.
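The last point, that frozen task-agnostic representations can be specialized through a single linear mapping, corresponds to a linear-probe style of adaptation: freeze the foundation model and train only a small head per task. Below is a minimal PyTorch sketch of that setup, assuming a pretrained encoder named backbone that maps detector data to feat_dim-dimensional features; the names and interfaces are illustrative assumptions, not the paper's actual code.

import torch
import torch.nn as nn

class LinearAdapter(nn.Module):
    # Task-specific linear head over a frozen foundation-model encoder.
    # Illustrative sketch only: "backbone" stands in for the pretrained FM.
    def __init__(self, backbone: nn.Module, feat_dim: int, out_dim: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze the FM weights
        self.head = nn.Linear(feat_dim, out_dim)  # the single linear mapping

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            z = self.backbone(x)  # task-agnostic representation
        return self.head(z)       # specialized per downstream task

Adapting to a new task then means training only model.head.parameters() while the full encoder (up to 188 million parameters in the paper's largest model) stays fixed, which is consistent with the data-efficient adaptation the abstract reports.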
Similar Papers
Towards a Physics Foundation Model
Machine Learning (CS)
Simulates many physics problems with one program.
Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition
Machine Learning (CS)
Computers now discover science on their own.
An Evaluation of Representation Learning Methods in Particle Physics Foundation Models
Machine Learning (CS)
Teaches computers to understand tiny particles better.