VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Hao Peng; Yunjia Qi; Xiaozhi Wang; Bin Xu; Lei Hou; Juanzi Li

arXiv:2506.09942 · cs.CL · June 12, 2025

TL;DR

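This paper introduces VerIF, a verification method that combines rule-based code verification with LLM-based verification to supply rewards for reinforcement learning of instruction-following large language models. Models trained with it show significant improvements on several representative instruction-following benchmarks.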

Contribution

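The paper proposes VerIF, a verification approach for RL in instruction following that combines rule-based and LLM-based checks. It also constructs VerInstruct, a dataset of approximately 22,000 instances with associated verification signals, and shows that RL training with VerIF improves two models.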

Findings

01. Models trained with VerIF outperform previous methods on instruction-following benchmarks.

02. Models trained with VerIF generalize well to unseen constraints.

03. The approach preserves the models' general capabilities.

Abstract

Reinforcement learning with verifiable rewards (RLVR) has become a key technique for enhancing large language models (LLMs), with verification engineering playing a central role. However, best practices for RL in instruction following remain underexplored. In this work, we explore the verification challenge in RL for instruction following and propose VerIF, a verification method that combines rule-based code verification with LLM-based verification from a large reasoning model (e.g., QwQ-32B). To support this approach, we construct a high-quality instruction-following dataset, VerInstruct, containing approximately 22,000 instances with associated verification signals. We apply RL training with VerIF to two models, achieving significant improvements across several representative instruction-following benchmarks. The trained models reach state-of-the-art performance among models of…
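The hybrid verification idea described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Constraint` type, the helper names, the fraction-satisfied reward, and the `stub_judge` placeholder are hypothetical and are not taken from the paper or the thu-keg/verif repository. In practice the judge would be a call to a large reasoning model such as QwQ-32B.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical types for illustration only (not the VerIF codebase).
RuleCheck = Callable[[str], bool]            # deterministic check on a response
LLMJudge = Callable[[str, str, str], bool]   # (instruction, response, constraint) -> satisfied?


@dataclass
class Constraint:
    description: str
    rule: Optional[RuleCheck] = None  # None means "defer to the LLM judge"


def max_words(limit: int) -> RuleCheck:
    """Rule-based check: the response has at most `limit` whitespace-separated words."""
    return lambda response: len(response.split()) <= limit


def contains_keyword(keyword: str) -> RuleCheck:
    """Rule-based check: the response mentions `keyword` (case-insensitive)."""
    return lambda response: keyword.lower() in response.lower()


def hybrid_reward(instruction: str, response: str,
                  constraints: List[Constraint], judge: LLMJudge) -> float:
    """Return the fraction of constraints satisfied.

    Assumption: constraints are weighted equally; the paper's actual
    aggregation may differ.
    """
    if not constraints:
        return 0.0
    satisfied = 0
    for c in constraints:
        if c.rule is not None:
            ok = c.rule(response)                          # code verification
        else:
            ok = judge(instruction, response, c.description)  # LLM verification
        satisfied += int(ok)
    return satisfied / len(constraints)


def stub_judge(instruction: str, response: str, constraint: str) -> bool:
    """Placeholder standing in for a reasoning-model call; a toy heuristic only."""
    return "please" in response.lower()


if __name__ == "__main__":
    constraints = [
        Constraint("Use at most 5 words.", max_words(5)),
        Constraint("Mention the word 'cat'.", contains_keyword("cat")),
        Constraint("Use a polite tone."),  # no rule, so the judge decides
    ]
    print(hybrid_reward("Ask someone to feed the cat.",
                        "Please feed the cat", constraints, stub_judge))
```

Passing the judge in as a callable keeps the rule-based path testable without a model, and lets the LLM-backed verifier be swapped in only for constraints that code cannot check.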

Code & Models

Repositories

thu-keg/verif (PyTorch, official)

Videos

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following (Underline)

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms