Harnessing RLHF for Robust Unanswerability Recognition and Trustworthy Response Generation in LLMs

Shuyuan Lin; Lei Duan; Philip Hughes; Yuxuan Sheng

arXiv:2507.16951·cs.CL·July 24, 2025

Open Access

TL;DR

This paper presents SALU, a novel LLM-based approach that integrates unanswerability detection within the model, improving reliability and reducing hallucinations in conversational information retrieval systems.

Contribution

SALU introduces a multi-task training and reinforcement learning framework that enables LLMs to recognize unanswerable questions and abstain appropriately, enhancing trustworthiness.

Findings

01. SALU outperforms baseline systems in accuracy for answering or abstaining.

02. Human evaluations show higher factuality and appropriate abstention.

03. SALU significantly reduces hallucinations in responses.

Abstract

Conversational Information Retrieval (CIR) systems, while offering intuitive access to information, face a significant challenge: reliably handling unanswerable questions to prevent the generation of misleading or hallucinated content. Traditional approaches often rely on external classifiers, which can introduce inconsistencies with the core generative Large Language Models (LLMs). This paper introduces Self-Aware LLM for Unanswerability (SALU), a novel approach that deeply integrates unanswerability detection directly within the LLM's generative process. SALU is trained using a multi-task learning framework for both standard Question Answering (QA) and explicit abstention generation for unanswerable queries. Crucially, it incorporates a confidence-score-guided reinforcement learning with human feedback (RLHF) phase, which explicitly penalizes hallucinated responses and rewards…
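To make the confidence-score-guided reward idea concrete, here is a minimal sketch of a scalar reward that penalizes hallucinated responses in proportion to the model's stated confidence. It is not the paper's reward function. The abstract is cut off before it states what is rewarded, so the sketch assumes, following the Contribution summary above, that appropriate abstention earns a positive reward. The names `Sample` and `reward`, the field layout, and all numeric constants are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One scored model response (all fields are illustrative assumptions)."""
    answerable: bool   # gold label: can the question be answered at all?
    abstained: bool    # did the model generate an abstention?
    correct: bool      # if it answered, was the answer judged factual?
    confidence: float  # model's self-reported confidence in [0, 1]


def reward(s: Sample,
           hallucination_penalty: float = 2.0,
           missed_answer_penalty: float = 0.5) -> float:
    """Toy confidence-guided reward; constants are assumed, not from the paper."""
    if not 0.0 <= s.confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")

    if s.abstained:
        # Abstaining on an unanswerable question is the desired behavior.
        # Abstaining on an answerable one is a mild failure (over-caution).
        return 1.0 if not s.answerable else -missed_answer_penalty

    if s.answerable and s.correct:
        # Correct answers earn more when the model was confident.
        return s.confidence

    # Anything else is treated as a hallucination: an answer to an
    # unanswerable question, or a wrong answer. The penalty grows with
    # confidence, so confidently wrong output is punished hardest.
    return -(1.0 + hallucination_penalty * s.confidence)


if __name__ == "__main__":
    examples = [
        Sample(answerable=False, abstained=True, correct=False, confidence=0.3),
        Sample(answerable=True, abstained=False, correct=True, confidence=0.8),
        Sample(answerable=False, abstained=False, correct=False, confidence=0.9),
    ]
    for ex in examples:
        print(ex, "->", reward(ex))
```

In an RLHF loop, a scalar of this kind would be fed to the policy-optimization step as the per-response reward. How SALU actually combines confidence scores with human feedback is not recoverable from the truncated abstract.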

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques