IntentGrasp: A Comprehensive Benchmark for Intent Understanding

Yuwei Yin; Chuyuan Li; Giuseppe Carenini

arXiv:2605.06832·cs.CL·May 11, 2026

TL;DR

IntentGrasp is a new comprehensive benchmark for evaluating and improving the intent understanding capabilities of large language models across diverse domains.

Contribution

It introduces a training set of 262,759 instances, two evaluation sets (the All Set and the smaller, more challenging Gem Set), and a fine-tuning method, Intentional Fine-Tuning (IFT), for improving LLMs' intent understanding.

Findings

01

Most LLMs perform poorly on intent understanding, with scores below 60% on the All Set and below 25% on the Gem Set.

02

IFT improves intent understanding scores by over 30 F1 points.

03

Models trained with IFT generalize well across different domains.
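
For readers unfamiliar with the F1 points cited in Finding 02, the following is a minimal sketch of a set-based F1 score over intent labels. It is an assumption for illustration only: this page does not state the paper's exact scoring protocol, and the function names `instance_f1` and `mean_f1` and the sample labels are hypothetical.

```python
from typing import Iterable, Sequence


def instance_f1(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """Set-based F1 between predicted and gold intent labels for one test case.

    Assumption: labels are compared as exact strings; the benchmark's real
    matching rules may differ.
    """
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set and not gold_set:
        return 1.0  # nothing to predict and nothing predicted
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions: Sequence[Iterable[str]], golds: Sequence[Iterable[str]]) -> float:
    """Average per-instance F1 over a test set, reported on a 0-100 scale."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not predictions:
        return 0.0
    scores = [instance_f1(p, g) for p, g in zip(predictions, golds)]
    return 100.0 * sum(scores) / len(scores)
```

Under this sketch, a gain of "30 F1 points" would mean the value returned by `mean_f1` rises by 30 on the 0-100 scale, for example from 40.0 to 70.0.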

Abstract

Accurately understanding the intent behind speech, conversation, and writing is crucial to the development of helpful Large Language Model (LLM) assistants. This paper introduces IntentGrasp, a comprehensive benchmark for evaluating the intent understanding capability of LLMs. Derived from 49 high-quality, open-licensed corpora spanning 12 diverse domains, IntentGrasp is constructed through source dataset curation, intent label contextualization, and task format unification. IntentGrasp contains a large-scale training set of 262,759 instances and two evaluation sets: an All Set of 12,909 test cases and a more balanced and challenging Gem Set of 470 cases. Extensive evaluations on 20 LLMs across 7 families (including frontier models such as GPT-5.4, Gemini-3.1-Pro, and Claude-Opus-4.7) demonstrate unsatisfactory performance, with scores below 60% on the All Set and below 25% on the Gem Set.…

Code & Models

Repositories

yuweiyin/IntentGrasp
github

Datasets

yuweiyin/IntentGrasp
dataset · 1.2k downloads
