CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
Tao Zhang, Chenglin Zhu, Yanjun Shen, Wenjing Luo, Yan Zhang, Hao Liang, Tao Zhang, Fan Yang, Mingan Lin, Yujing Qiao, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

TL;DR
CFBench is a large-scale benchmark designed to evaluate LLMs on their ability to comprehensively follow diverse real-world constraints across multiple NLP tasks, addressing limitations of previous fragmented assessments.
Contribution
The paper introduces CFBench, a comprehensive and systematic benchmark with 1,000 samples covering diverse constraints and scenarios, along with an advanced evaluation methodology for LLMs.
Findings
Current LLMs show significant room for improvement in constraints following.
The benchmark reveals varying performance across different constraint types.
The proposed evaluation methodology aligns more closely with user perceptions of constraint adherence.
Abstract
The adeptness of Large Language Models (LLMs) in comprehending and following natural language instructions is critical for their deployment in sophisticated real-world applications. Existing evaluations mainly focus on fragmented constraints or narrow scenarios, but they overlook the comprehensiveness and authenticity of constraints from the user's perspective. To bridge this gap, we propose CFBench, a large-scale Comprehensive Constraints Following Benchmark for LLMs, featuring 1,000 curated samples that cover more than 200 real-life scenarios and over 50 NLP tasks. CFBench meticulously compiles constraints from real-world instructions and constructs an innovative systematic framework for constraint types, which includes 10 primary categories and over 25 subcategories, and ensures each constraint is seamlessly integrated within the instructions. To make certain that the evaluation of…
Taxonomy
Topics: Semantic Web and Ontologies · Mathematics, Computing, and Information Processing
Methods: Focus
