When and What to Ask: AskBench and Rubric-Guided RLVR for LLM Clarification

Jiale Zhao; Ke Fang; Lu Cheng

arXiv:2602.11199 · cs.CL · April 22, 2026

TL;DR

This paper introduces AskBench, an interactive benchmark for evaluating LLM clarification abilities, and proposes rubric-guided RLVR to improve task accuracy and interaction quality.

Contribution

It presents AskBench, an interactive benchmark for systematically evaluating LLM clarification, and proposes rubric-guided reinforcement learning with verifier-based rewards (RLVR), which uses structured rubrics to encourage targeted clarification.
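As a rough illustration of how a rubric might be folded into a verifier-based reward, the sketch below blends a correctness check on the final answer with a weighted rubric score. This is not the paper's reward: the `RubricItem` fields, the rubric item names, and the `answer_weight` blend are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RubricItem:
    """One rubric criterion (hypothetical structure, not from the paper)."""
    name: str
    weight: float
    satisfied: bool


def rubric_reward(answer_correct: bool,
                  items: List[RubricItem],
                  answer_weight: float = 0.5) -> float:
    """Blend answer correctness with the weighted share of satisfied rubric items.

    The 50/50 default blend is an assumption chosen for illustration only.
    """
    total = sum(item.weight for item in items)
    met = sum(item.weight for item in items if item.satisfied)
    rubric_score = met / total if total else 0.0
    return answer_weight * float(answer_correct) + (1.0 - answer_weight) * rubric_score


# Hypothetical rubric for one dialogue: the model asked about the missing
# detail but also asked one unnecessary question.
items = [
    RubricItem("asked_about_missing_detail", weight=2.0, satisfied=True),
    RubricItem("avoided_unnecessary_questions", weight=1.0, satisfied=False),
]
reward = rubric_reward(answer_correct=True, items=items)
```

Under these assumed weights, a correct answer with two thirds of the rubric weight satisfied scores higher than a correct answer that ignored the rubric, which is the kind of pressure toward targeted clarification the contribution describes.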

Findings

01. AskBench evaluates LLM clarification in multi-turn interactions.

02. Rubric-guided RLVR improves LLM accuracy and rubric adherence.

03. Models generalize well to unseen domains.

Abstract

Large language models (LLMs) often respond even when prompts omit critical details or include misleading information, leading to hallucinations or reinforced misconceptions. We study how to evaluate and improve LLMs' ability to decide when and what to ask for clarification without sacrificing task performance. We introduce AskBench, an interactive benchmark that converts standard QA pairs into multi-turn interactions with explicit checkpoints. A unified judge loop evaluates final answers and simulates user responses as needed. AskBench covers two settings: AskMind, with intent-deficient queries requiring clarification, and AskOverconfidence, with queries containing false premises that must be identified and corrected. We further propose rubric-guided reinforcement learning with verifier-based rewards (RLVR), which uses structured rubrics to encourage targeted clarification. Experiments…
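The judge loop described in the abstract can be pictured with a minimal sketch. It assumes a much simpler setup than the paper's: checkpoints are keyword-to-detail pairs, the simulated user is keyword matching rather than an LLM judge, and the final answer is graded by exact string match. The names `Episode`, `judge_loop`, and `toy_model` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Episode:
    query: str                   # degraded query shown to the model
    checkpoints: Dict[str, str]  # keyword -> hidden detail the user can reveal
    gold_answer: str


# A model maps the dialogue history to ("ask", text) or ("answer", text).
Action = Tuple[str, str]
Model = Callable[[List[str]], Action]


def _result(correct: bool, covered: set, ep: Episode, turns: int) -> dict:
    total = len(ep.checkpoints)
    coverage = len(covered) / total if total else 1.0
    return {"correct": correct, "coverage": coverage, "turns": turns}


def judge_loop(model: Model, ep: Episode, max_turns: int = 3) -> dict:
    """Run one interaction: simulate user replies, then grade the final answer."""
    history = [ep.query]
    covered = set()
    for turn in range(1, max_turns + 1):
        kind, text = model(history)
        history.append(text)
        if kind == "answer":
            # Exact match stands in for the judge's answer evaluation (assumption).
            correct = text.strip().lower() == ep.gold_answer.strip().lower()
            return _result(correct, covered, ep, turn)
        # Simulated user: reveal every checkpoint whose keyword the question mentions.
        hits = [k for k in ep.checkpoints if k.lower() in text.lower()]
        covered.update(hits)
        reply = " ".join(ep.checkpoints[k] for k in hits)
        history.append(reply or "I am not sure what you are asking.")
    # The model never committed to an answer within the turn budget.
    return _result(False, covered, ep, max_turns)


def toy_model(history: List[str]) -> Action:
    """Ask one clarifying question, then answer (stand-in for an LLM)."""
    if len(history) == 1:
        return ("ask", "Which unit is the value in?")
    return ("answer", "5000")


episode = Episode("Convert 5 to meters.",
                  {"unit": "The value is in kilometers."},
                  "5000")
result = judge_loop(toy_model, episode)
```

Here the toy model asks about the missing unit on the first turn and answers on the second, so the loop records a correct answer, full checkpoint coverage, and two turns. A model that answered immediately would score zero coverage, which is how a loop like this can separate answer accuracy from clarification behavior.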
