Dial BeInfo for Faithfulness: Improving Factuality of Information-Seeking Dialogue via Behavioural Fine-Tuning

Evgeniia Razumovskaia; Ivan Vulić; Pavle Marković; Tomasz Cichy; Qian Zheng; Tsung-Hsien Wen; Paweł Budzianowski

arXiv:2311.09800·cs.CL·March 5, 2024·1 cites

Open Access

TL;DR

This paper introduces BeInfo, a behavioural fine-tuning method that significantly enhances the factual accuracy of large language models in information-seeking dialogues, reducing hallucinations and improving real-world applicability.

Contribution

The paper presents BeInfo, a novel behavioural tuning approach that improves the faithfulness of large language models in dialogue systems across multiple datasets and domains.

Findings

01. Models with BeInfo are more faithful to source knowledge.

02. BeInfo improves performance on unseen domains in zero-shot settings.

03. Fine-tuned models outperform GPT-4 on limited real-world dialogues.

Abstract

Factuality is a crucial requirement in information-seeking dialogue: the system should respond to the user's queries so that the responses are meaningful and aligned with the knowledge provided to the system. However, most modern large language models suffer from hallucinations, that is, they generate responses not supported by or contradicting the knowledge source. To mitigate the issue and increase the faithfulness of information-seeking dialogue systems, we introduce BeInfo, a simple yet effective method that applies behavioural tuning to aid information-seeking dialogue. Relying on three standard datasets, we show that models tuned with BeInfo become considerably more faithful to the knowledge source, both on datasets and domains seen during BeInfo-tuning and on unseen domains, when applied in a zero-shot manner. In addition, we show that the models with 3B parameters (e.g.,…

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications