Memorize Theorems, Not Instances: Probing SFT Generalization through Mathematical Reasoning
Ruiying Peng, Mengyu Yang, Jing Lei, Xiaohui Li, Xueyu Wu, Xinlei Chen

TL;DR
This paper introduces Theorem-SFT, a fine-tuning method that supervises how theorems are applied rather than what answers look like, improving reasoning generalization by +8.8% on MATH and +20.27% on GeoQA.
Contribution
Theorem-SFT reorients supervision towards explicit theorem application, reducing reliance on spurious correlations and enhancing reasoning generalization across models and benchmarks.
Findings
+8.8% on the MATH benchmark with Theorem-SFT (LLaMA3.2-3B-Instruct)
+20.27% on the GeoQA benchmark with Theorem-SFT (Qwen2.5-VL-7B-Instruct), without modality-specific re-training
Fine-tuning MLP layers alone matches full-layer performance
Abstract
Supervised Fine-Tuning (SFT) is widely used for task-specific adaptation, yet recent work shows it systematically undermines reasoning generalization. We argue the root cause is not memorization itself, but its target: vanilla SFT drives models to exploit and memorize spurious surface correlations in problem-solution pairs, leaving them brittle to superficial input variations. To address this, we propose Theorem-SFT, which reorients supervision toward explicit theorem application by teaching models how rules are invoked rather than what answers look like. Theorem-SFT yields consistent gains across benchmarks and model families: +8.8% on MATH (LLaMA3.2-3B-Instruct) and +20.27% on GeoQA (Qwen2.5-VL-7B-Instruct) without modality-specific re-training. Fine-tuning MLP layers alone matches full-layer performance, implicating feed-forward components as the primary locus of reasoning rules.…
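The MLP-only result can be illustrated with a minimal sketch of how one might choose which parameters stay trainable. This is not the authors' code: the helper `mlp_only_trainable`, the `.mlp.` name marker, and the example parameter names (written in the style of Hugging Face LLaMA checkpoints) are all assumptions made for illustration.

```python
from typing import Iterable, List


def mlp_only_trainable(param_names: Iterable[str], marker: str = ".mlp.") -> List[str]:
    """Return the parameter names that would stay trainable under MLP-only tuning.

    Assumption: feed-forward weights are identified by a ".mlp." substring in
    their names; other architectures may use a different naming scheme.
    """
    return [name for name in param_names if marker in name]


# Hypothetical parameter names, for illustration only.
names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.mlp.up_proj.weight",
    "model.layers.0.mlp.down_proj.weight",
    "model.layers.0.input_layernorm.weight",
]

trainable = mlp_only_trainable(names)
# Everything else (embeddings, attention, norms) would be frozen.
frozen = [name for name in names if name not in trainable]
```

With a PyTorch model, the same selection could be applied by iterating over `model.named_parameters()` and setting `param.requires_grad` to whether the name appears in `trainable`.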
