CoSineVerifier: Tool-Augmented Answer Verification for Computation-Oriented Scientific Questions
Ruixiang Feng, Zhenwei An, Yuntao Wen, Ran Le, Yiming Jia, Chen Yang, Zongchao Chen, Lisi Chen, Shen Gao, Shuo Shang, Yang Song, Tao Zhang

TL;DR
CoSineVerifier is a tool-augmented answer verifier that uses external computational tools to improve verification accuracy in scientific question answering, especially in algebra and physics domains, outperforming existing methods.
Contribution
The paper introduces CoSineVerifier, a novel tool-augmented verifier with a two-stage training pipeline that enhances verification in computation-oriented scientific questions.
Findings
Achieves state-of-the-art results on VerifyBench-Hard and SCI-Bench.
Outperforms existing verifiers in RLVR tasks on AIME'24 and AIME'25.
Demonstrates strong generalization across STEM and reasoning tasks.
Abstract
Answer verification methods are widely employed in language model training pipelines spanning data curation, evaluation, and reinforcement learning with verifiable rewards (RLVR). While prior work focuses on developing unified verifiers applicable across multiple reasoning scenarios, significant challenges remain in computation-oriented scientific domains, such as algebraic equivalence checking and physical constant substitution. In this paper, we introduce CoSineVerifier, a tool-augmented verifier that leverages external executors to perform precise computations and symbolic simplifications. CoSineVerifier enables robust verification that goes beyond simple semantic matching. We propose a novel two-stage pipeline, which begins with cold-start fine-tuning, followed by multi-turn reinforcement learning with tool integration. Extensive experiments conducted on STEM subjects, general QA, and long-form…
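To make the two computation-oriented cases named in the abstract concrete, here is a minimal sketch of the kind of check an external executor could run. It is not the paper's implementation: the helper names (`evaluate`, `numerically_equivalent`, `matches_reference`), the sampling range, and the tolerances are all assumptions chosen for illustration, and numerical sampling stands in for the symbolic simplification the abstract mentions.

```python
import math
import random

# Hypothetical whitelist of names an answer expression may use (an assumption).
_SAFE_NAMES = {n: getattr(math, n) for n in ("sqrt", "sin", "cos", "exp", "log", "pi", "e")}

def evaluate(expr: str, x: float) -> float:
    """Evaluate an expression in x using only the whitelisted math names.
    Note: eval is used for brevity; it is not a hardened sandbox."""
    return eval(expr, {"__builtins__": {}}, {**_SAFE_NAMES, "x": x})

def numerically_equivalent(expr_a: str, expr_b: str, trials: int = 20,
                           rel_tol: float = 1e-9, seed: int = 0) -> bool:
    """Algebraic equivalence check by sampling: compare both expressions
    at random points and reject on the first disagreement."""
    rng = random.Random(seed)
    checked = 0
    for _ in range(trials):
        x = rng.uniform(0.5, 2.0)
        try:
            a, b = evaluate(expr_a, x), evaluate(expr_b, x)
        except (ValueError, ZeroDivisionError, OverflowError):
            continue  # skip points outside either expression's domain
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12):
            return False
        checked += 1
    return checked > 0  # require at least one successful comparison

def matches_reference(candidate: float, reference: float, rel_tol: float = 1e-2) -> bool:
    """Physical constant substitution: accept a rounded numeric answer
    that agrees with the computed reference within a relative tolerance."""
    return math.isclose(candidate, reference, rel_tol=rel_tol)

SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by SI definition

if __name__ == "__main__":
    print(numerically_equivalent("(x + 1)**2", "x**2 + 2*x + 1"))
    print(numerically_equivalent("x**2", "2*x"))
    # Rest energy of 1 g of mass, E = m * c**2, against a rounded answer.
    print(matches_reference(9.0e13, 1e-3 * SPEED_OF_LIGHT**2))
```

A plain string comparison would reject `(x + 1)**2` against `x**2 + 2*x + 1`, and `9.0e13` against the unrounded value of `m * c**2`. Delegating the arithmetic to an executor is what lets a verifier accept both.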
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
