Reliable Use of Lemmas via Eligibility Reasoning and Section-Aware Reinforcement Learning

Zhikun Xu; Xiaodong Yu; Ben Zhou; Jiang Liu; Jialian Wu; Ze Wang; Ximeng Sun; Hao Chen; Zicheng Liu

arXiv:2602.00998 · cs.CL · February 3, 2026

TL;DR

This paper introduces RULES, a reinforcement learning framework that improves large language models' ability to correctly judge the usefulness of lemmas by formalizing the task and incorporating section-aware loss masking, leading to more robust lemma validation.

Contribution

The paper proposes a novel structured prediction approach with section-aware reinforcement learning to enhance lemma judgment accuracy in language models.

Findings

01 Consistent in-domain improvements over baseline models.

02 Enhanced robustness against perturbations that break applicability.

03 Maintained or slightly improved end-to-end task performance.

Abstract

Recent large language models (LLMs) perform strongly on mathematical benchmarks yet often misapply lemmas, importing conclusions without validating assumptions. We formalize lemma-judging as a structured prediction task: given a statement and a candidate lemma, the model must output a precondition check and a conclusion-utility check, from which a usefulness decision is derived. We present RULES, which encodes this specification via a two-section output and trains with reinforcement learning plus section-aware loss masking to assign penalty to the section responsible for errors. Training and evaluation draw on diverse natural language and formal proof corpora; robustness is assessed with a held-out perturbation suite; and end-to-end evaluation spans competition-style, perturbation-aligned, and theorem-based problems across various LLMs. Results show consistent…
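
The two-section output and the idea of penalizing only the section at fault can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the output format, the rule that derives usefulness, or the masking details, so the section tags (`<precondition>`, `<utility>`), the PASS/FAIL convention, the rule "useful only if both checks pass", and every function name below are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical section tags; the abstract does not give the real format.
SECTION_PATTERN = re.compile(
    r"<precondition>(?P<pre>.*?)</precondition>\s*"
    r"<utility>(?P<util>.*?)</utility>",
    re.DOTALL,
)


@dataclass
class Judgment:
    precondition_ok: bool    # do the lemma's assumptions hold for the statement?
    conclusion_useful: bool  # does the lemma's conclusion help prove the statement?

    @property
    def useful(self) -> bool:
        # Assumed derivation rule: both checks must pass.
        return self.precondition_ok and self.conclusion_useful


def _verdict(section_text: str) -> bool:
    # Assumed convention: each section ends with the word PASS or FAIL.
    return section_text.strip().upper().endswith("PASS")


def parse_judgment(output: str) -> Optional[Judgment]:
    """Parse a two-section model output; return None if it is malformed."""
    match = SECTION_PATTERN.search(output)
    if match is None:
        return None
    return Judgment(_verdict(match["pre"]), _verdict(match["util"]))


def sections_to_penalize(pred: Judgment, gold: Judgment) -> Set[str]:
    """Name the sections whose verdict disagrees with the reference label."""
    wrong = set()
    if pred.precondition_ok != gold.precondition_ok:
        wrong.add("precondition")
    if pred.conclusion_useful != gold.conclusion_useful:
        wrong.add("utility")
    return wrong


def token_penalty_mask(token_sections: List[str], wrong: Set[str]) -> List[int]:
    """Return 1 for tokens inside an erroneous section, 0 elsewhere."""
    return [1 if section in wrong else 0 for section in token_sections]
```

For example, if the model marks the precondition as failing while the reference says both checks pass, `sections_to_penalize` returns only `{"precondition"}`, and the mask zeroes out the tokens of the utility section so that a penalty would fall on the precondition section alone.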

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms