FeynTune: Large Language Models for High-Energy Theory

Paul Richmond; Prarit Agarwal; Borun Chowdhury; Vasilis Niarchos; Constantinos Papageorgakis

arXiv:2508.03716·cs.CL·March 2, 2026

TL;DR

FeynTune introduces specialized LLMs fine-tuned on arXiv abstracts from high-energy physics. The fine-tuned models outperform the base Llama-3.1 model on hep-th abstract completion, are compared against leading commercial LLMs, and yield insights for future domain-specific model development.

Contribution

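The paper develops and evaluates 20 fine-tuned variants of the 8-billion-parameter Llama-3.1 model tailored to theoretical high-energy physics. All variants improve on the base model for hep-th abstract completion, and their performance is compared against commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek).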

Findings

01. Fine-tuned models outperform base Llama-3.1 on hep-th abstract completion tasks.

02. Models trained on physics abstracts outperform those trained on disparate fields.

03. The results yield insights for the future development of domain-specific language models.

Abstract

We present specialized Large Language Models for theoretical High-Energy Physics, obtained as 20 fine-tuned variants of the 8-billion parameter Llama-3.1 model. Each variant was trained on arXiv abstracts (through August 2024) from different combinations of hep-th, hep-ph and gr-qc. For a comparative study, we also trained models on datasets that contained abstracts from disparate fields such as the q-bio and cs categories. All models were fine-tuned using two distinct Low-Rank Adaptation fine-tuning approaches and varying dataset sizes, and outperformed the base model on hep-th abstract completion tasks. We compare performance against leading commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek) and derive insights for further developing specialized language models for High-Energy Theoretical Physics.
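The abstract names Low-Rank Adaptation (LoRA) without describing it. As a minimal sketch of the generic LoRA idea only (not either of the two specific approaches used in the paper, which this page does not detail), a frozen weight matrix W is augmented by a trainable low-rank product, W_eff = W + (alpha / r) * B * A. The matrix sizes, the rank r, the scale alpha, and all function names below are illustrative assumptions, not the paper's settings.

```python
import random


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(A)  # A has shape (r, k); B has shape (d, r)
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * u for w, u in zip(w_row, u_row)]
            for w_row, u_row in zip(W, BA)]


def trainable_params(d, k, r):
    """Parameter counts: (full fine-tuning, rank-r adapter) for a d x k weight."""
    return d * k, r * (d + k)


# Toy sizes chosen for illustration (assumption, not taken from the paper).
d, k, r, alpha = 6, 4, 2, 4.0
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]     # frozen base weight
A = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]  # trainable, small random init
B = [[0.0] * r for _ in range(d)]                               # trainable, zero init

# With B initialised to zero the adapter contributes nothing,
# so the adapted layer starts out identical to the base layer.
W_eff = lora_effective_weight(W, A, B, alpha)

# Parameter counts for an illustrative 4096 x 4096 weight at rank 8.
full, lora = trainable_params(4096, 4096, 8)
```

Because only A and B are updated, the number of trained parameters per adapted matrix drops from d * k to r * (d + k), which is what makes fine-tuning many variants of an 8-billion-parameter model practical.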

Code & Models

Models

Hugging Face: HWresearch/LLM4HEP (model)
