Fine-Tuning Small Reasoning Models for Quantum Field Theory

Nathaniel S. Woodward; Zhiqi Gao; Yurii Kvasiuk; Kendrick M. Smith; Frederic Sala; Moritz Münchmeyer

arXiv:2604.18936·cs.LG·April 22, 2026

TL;DR

This study fine-tunes small (7B-parameter) reasoning models on quantum field theory problems, develops a data pipeline for verifiable training problems, and analyzes how reasoning improves under reinforcement learning and supervised fine-tuning.

Contribution

It introduces a novel data generation pipeline and benchmarks for fine-tuning small models on QFT reasoning, with comprehensive analysis of reasoning improvements.

Findings

01

Models show improved reasoning accuracy after fine-tuning.

02

The data pipeline both generates synthetic problems and adapts existing human-authored problems for model training.

03

Analysis reveals how reasoning errors evolve during training.

Abstract

Despite the growing application of Large Language Models (LLMs) to theoretical physics, there is little academic exploration into how domain-specific physics reasoning ability develops while training these models. To investigate this, we perform the first academic fine-tuning study of small (7B-parameter) reasoning models dedicated specifically to theoretical physics. Because open-source verifiable training data required to train such capabilities is scarce, we developed a robust data generation pipeline that can both create synthetic problems and make existing human-authored problems suitable for model training. Selecting Quantum Field Theory (QFT) as our primary domain, we generated over 2,500 synthetic problems alongside a curated collection of human-adapted problems sourced from arXiv and standard pedagogical resources. We conduct both Reinforcement Learning (RL) and Supervised…

Code & Models

Datasets

nswoodward/VerifiableQFT
dataset · 2.0k dl
