Fine-Tuning Small Reasoning Models for Quantum Field Theory
Nathaniel S. Woodward, Zhiqi Gao, Yurii Kvasiuk, Kendrick M. Smith, Frederic Sala, Moritz Münchmeyer

TL;DR
This study fine-tunes small (7B-parameter) reasoning models on Quantum Field Theory (QFT) problems, builds a data generation pipeline for verifiable training problems, and analyzes how reasoning changes under reinforcement learning and supervised fine-tuning.
Contribution
The paper introduces a data generation pipeline that creates synthetic problems and adapts existing human-authored ones for training, together with benchmarks for fine-tuning small models on QFT reasoning and an analysis of how reasoning improves.
Findings
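Small (7B-parameter) reasoning models show improved accuracy on QFT reasoning after fine-tuning.
The data pipeline generates synthetic problems (over 2,500 for QFT) and adapts human-authored problems from arXiv and standard pedagogical resources.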
Analysis reveals how reasoning errors evolve during training.
Abstract
Despite the growing application of Large Language Models (LLMs) to theoretical physics, there is little academic exploration into how domain-specific physics reasoning ability develops while training these models. To investigate this, we perform the first academic fine-tuning study of small (7B-parameter) reasoning models dedicated specifically to theoretical physics. Because open-source verifiable training data required to train such capabilities is scarce, we developed a robust data generation pipeline that can both create synthetic problems and make existing human-authored problems suitable for model training. Selecting Quantum Field Theory (QFT) as our primary domain, we generated over 2,500 synthetic problems alongside a curated collection of human-adapted problems sourced from arXiv and standard pedagogical resources. We conduct both Reinforcement Learning (RL) and Supervised…
