TL;DR
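FeynTune presents 20 fine-tuned variants of the 8-billion-parameter Llama-3.1 model, trained on high-energy physics arXiv abstracts. All variants outperform the base model on hep-th abstract completion, and the paper benchmarks them against commercial LLMs to draw lessons for future domain-specific models.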
Contribution
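The paper develops and evaluates 20 fine-tuned variants of the 8-billion-parameter Llama-3.1 model tailored to theoretical high-energy physics. The variants outperform the base model on hep-th abstract completion and are compared against leading commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek).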
Findings
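All fine-tuned models outperform base Llama-3.1 on hep-th abstract completion tasks.
Models trained on abstracts from disparate fields (q-bio, cs) serve as a comparison for those trained on hep-th, hep-ph and gr-qc.
The comparison with commercial LLMs yields insights for developing further specialized language models for high-energy theoretical physics.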
Abstract
We present specialized Large Language Models for theoretical High-Energy Physics, obtained as 20 fine-tuned variants of the 8-billion parameter Llama-3.1 model. Each variant was trained on arXiv abstracts (through August 2024) from different combinations of hep-th, hep-ph and gr-qc. For a comparative study, we also trained models on datasets that contained abstracts from disparate fields such as the q-bio and cs categories. All models were fine-tuned using two distinct Low-Rank Adaptation fine-tuning approaches and varying dataset sizes, and outperformed the base model on hep-th abstract completion tasks. We compare performance against leading commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek) and derive insights for further developing specialized language models for High-Energy Theoretical Physics.
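The abstract names Low-Rank Adaptation (LoRA) but does not say which two variants were used. As a generic illustration only, the sketch below shows the standard LoRA idea: the base weight W stays frozen and a low-rank update (alpha / r) * B A is learned on top of it. The matrix sizes, the rank `r`, the scaling `alpha`, and the helper names are assumptions chosen for readability, not the paper's configuration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the weight a LoRA-adapted layer applies.

    w: frozen d_out x d_in base weight
    a: trainable r x d_in matrix
    b: trainable d_out x r matrix
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def count_trainable(d_out, d_in, r):
    """Trainable parameters: full fine-tuning vs. a rank-r LoRA adapter."""
    return d_out * d_in, r * (d_in + d_out)

# Toy sizes for illustration only (hypothetical, not taken from the paper).
d_out, d_in, r, alpha = 4, 6, 2, 4.0
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]  # B starts at zero, so the adapter is a no-op at first

w_eff = lora_effective_weight(w, a, b, alpha)
full, lora = count_trainable(d_out, d_in, r)
```

With B initialized to zero, the adapted layer starts out identical to the base model, and only A and B change during training. The parameter saving is negligible at these toy sizes but grows quickly as the layer dimensions become much larger than the rank.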
