Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

Xiaoying Zhang; Baolin Peng; Ye Tian; Jingyan Zhou; Lifeng Jin; Linfeng Song; Haitao Mi; Helen Meng

arXiv:2402.09267·cs.CL·June 12, 2024·2 cites

TL;DR

This paper introduces a self-alignment method that uses an LLM's self-evaluation to improve its factual accuracy, reducing hallucinations without relying on external annotations.

Contribution

It proposes Self-Eval and Self-Knowledge Tuning to enable LLMs to self-assess and improve factuality through internal signals, advancing factual accuracy in language models.

Findings

01. Significant reduction in hallucinations on TruthfulQA and BioGEN tasks.

02. Enhanced confidence calibration in LLMs after self-alignment.

03. Improved factual accuracy over baseline Llama models.
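
The calibration finding can be made concrete with a small sketch. Expected calibration error (ECE) is one widely used way to measure how well stated confidence matches observed accuracy; this summary does not say which calibration metric the paper reports, so treat the metric choice, the function name `expected_calibration_error`, and the 10-bin default as assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: weighted mean of |avg confidence - accuracy| per bin.

    Illustrative only; not taken from the paper.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each prediction to a bin by its confidence in [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A lower value means the model's confidence tracks its accuracy more closely; a model that claims 90% confidence but is right half the time in that bin contributes a gap of 0.4.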

Abstract

Despite showing increasingly human-like abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e. "hallucinations", even when they hold relevant knowledge. To address these hallucinations, current approaches typically necessitate high-quality human factuality annotations. In this work, we explore Self-Alignment for Factuality, where we leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality. Specifically, we incorporate Self-Eval, a self-evaluation component, to prompt an LLM to validate the factuality of its own generated responses solely based on its internal knowledge. Additionally, we design Self-Knowledge Tuning (SK-Tuning) to augment the LLM's self-evaluation ability by improving the model's confidence estimation and calibration. We then utilize these self-annotated responses to fine-tune…
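
The self-evaluation idea in the abstract can be sketched in a few lines. The sketch below assumes the model is asked whether a generated response is factual and that log-probabilities for a "true" and a "false" answer option are available; those two numbers are normalized into a confidence score. Pairing higher- and lower-confidence responses as training signal is an assumption added here for illustration, since the abstract is truncated before it describes the fine-tuning step. The names `self_eval_confidence`, `build_preference_pairs`, and the `margin` parameter are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def self_eval_confidence(logprob_true: float, logprob_false: float) -> float:
    """Turn log-probs of the 'true' and 'false' options into P(true).

    Hypothetical helper: how the log-probs are obtained from the model
    is outside this sketch.
    """
    # Subtract the max before exponentiating for numerical stability.
    peak = max(logprob_true, logprob_false)
    exp_true = math.exp(logprob_true - peak)
    exp_false = math.exp(logprob_false - peak)
    return exp_true / (exp_true + exp_false)


def build_preference_pairs(
    scored_responses: List[Tuple[str, float]],
    margin: float = 0.2,
) -> List[Dict[str, str]]:
    """Pair responses whose self-evaluated confidence differs by at least `margin`.

    Assumption for illustration: the higher-confidence response is treated
    as preferred over the lower-confidence one.
    """
    ranked = sorted(scored_responses, key=lambda item: item[1], reverse=True)
    pairs: List[Dict[str, str]] = []
    for position, (better, better_conf) in enumerate(ranked):
        for worse, worse_conf in ranked[position + 1:]:
            if better_conf - worse_conf >= margin:
                pairs.append({"chosen": better, "rejected": worse})
    return pairs
```

The margin keeps near-ties out of the training signal, on the reasoning that a small confidence gap says little about which response is more factual.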
