Fine-Tuning Vision-Language Models for Neutrino Event Analysis in High-Energy Physics Experiments
By: Dikshant Sagar, Kaiwen Yu, Alejandro Yankelevich, et al.
Potential Business Impact:
Helps scientists identify neutrino interactions from detector images.
Recent progress in large language models (LLMs) has shown strong potential for multimodal reasoning beyond natural language. In this work, we explore the use of a fine-tuned Vision-Language Model (VLM), based on LLaMA 3.2, for classifying neutrino interactions from pixelated detector images in high-energy physics (HEP) experiments. We benchmark its performance against an established CNN baseline used in experiments like NOvA and DUNE, evaluating metrics such as classification accuracy, precision, recall, and AUC-ROC. Our results show that the VLM not only matches or exceeds CNN performance but also enables richer reasoning and better integration of auxiliary textual or semantic context. These findings suggest that VLMs offer a promising general-purpose backbone for event classification in HEP, paving the way for multimodal approaches in experimental neutrino physics.
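The abstract names the model family (LLaMA 3.2) and the evaluation metrics, but not the training recipe. As a rough illustration only, the sketch below shows how a Llama 3.2 vision checkpoint on Hugging Face might be prompted to label a detector image, and how the four reported metrics could be computed with scikit-learn. The model id, class names, prompt, and both helper functions are assumptions for illustration, not the authors' pipeline (which fine-tunes the model rather than prompting it zero-shot).

```python
# Illustrative sketch only: prompt-based event labeling with a Llama 3.2
# vision checkpoint, plus the four metrics named in the abstract.
# MODEL_ID, CLASSES, the prompt, and both helpers are hypothetical.
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

MODEL_ID = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed checkpoint
CLASSES = ["muon neutrino cc", "electron neutrino cc", "neutral current"]

model = MllamaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

def classify_event(image):
    """Ask the VLM to name the interaction type shown in one detector image."""
    messages = [{"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Classify this neutrino event as one of: "
                                 + ", ".join(CLASSES) + ". Answer with the label only."},
    ]}]
    prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(image, prompt, add_special_tokens=False,
                       return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=10)
    # Decode only the newly generated tokens, not the echoed prompt.
    reply = processor.decode(out[0][inputs["input_ids"].shape[-1]:],
                             skip_special_tokens=True).lower()
    return next((c for c in CLASSES if c in reply), CLASSES[-1])

def report(y_true, y_pred, y_score):
    """Metrics from the abstract. y_score is an (n_samples, n_classes) array
    of per-class probabilities, which AUC-ROC needs in the multiclass setting."""
    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred, average="macro"))
    print("recall   :", recall_score(y_true, y_pred, average="macro"))
    print("auc-roc  :", roc_auc_score(y_true, y_score,
                                      multi_class="ovr", labels=CLASSES))
```

In practice the paper fine-tunes the VLM rather than prompting it; a parameter-efficient recipe such as LoRA over the same checkpoint would be a natural fit, but the abstract does not specify one.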
Similar Papers
Adapting Vision-Language Models for Neutrino Event Classification in High-Energy Physics
Machine Learning (CS)
Helps scientists find tiny particles in pictures.
Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis
CV and Pattern Recognition
Helps doctors understand cancer treatment images better.