FeynTune: Large Language Models for High-Energy Theory
By: Paul Richmond, Prarit Agarwal, Borun Chowdhury, and more
Potential Business Impact:
Helps scientists understand complex physics faster.
We present specialized Large Language Models for theoretical High-Energy Physics, obtained as 20 fine-tuned variants of the 8-billion-parameter Llama-3.1 model. Each variant was trained on arXiv abstracts (through August 2024) from different combinations of the hep-th, hep-ph, and gr-qc categories. For a comparative study, we also trained models on datasets containing abstracts from disparate fields such as the q-bio and cs categories. All models were fine-tuned using two distinct Low-Rank Adaptation (LoRA) approaches and varying dataset sizes, and all outperformed the base model on hep-th abstract-completion tasks. We compare performance against leading commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek) and derive insights for further developing specialized language models for High-Energy Theoretical Physics.
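The abstract describes LoRA fine-tuning of an 8-billion-parameter Llama-3.1 base model on arXiv abstracts. Below is a minimal sketch of that kind of setup using the Hugging Face transformers/peft stack; the checkpoint name, LoRA rank, hyperparameters, and the local abstracts file are illustrative assumptions, not the authors' exact configuration.

    # Sketch: LoRA fine-tuning of Llama-3.1-8B on a plain-text file of arXiv abstracts.
    # Hyperparameters and the dataset file "hep_abstracts.txt" are assumptions for illustration.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                              TrainingArguments, DataCollatorForLanguageModeling)

    base = "meta-llama/Llama-3.1-8B"          # 8B base model, as in the paper
    tokenizer = AutoTokenizer.from_pretrained(base)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

    # Low-Rank Adaptation: train small rank-r update matrices on attention projections
    # instead of the full weights.
    lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)

    # Hypothetical local file with one abstract per line (e.g. hep-th/hep-ph/gr-qc).
    data = load_dataset("text", data_files={"train": "hep_abstracts.txt"})
    tokenized = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                         batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="feyntune-lora",
                               per_device_train_batch_size=2,
                               num_train_epochs=1,
                               learning_rate=2e-4),
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

Evaluation in the paper is on hep-th abstract completion; in a setup like the one above, that would amount to prompting the adapted model with the opening of a held-out abstract and scoring the continuation against the original.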
Similar Papers
Towards EnergyGPT: A Large Language Model Specialized for the Energy Sector
Computation and Language
EnergyGPT understands and writes about energy better.
Low-Resource Fine-Tuning for Multi-Task Structured Information Extraction with a Billion-Parameter Instruction-Tuned Model
Computation and Language
Small AI learns to find info cheaply.
Evolution of Meta's Llama Models and Parameter-Efficient Fine-Tuning of Large Language Models: A Survey
Artificial Intelligence
Makes smart computer programs learn faster and better.