Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models

Hongbang Yuan; Yubo Chen; Pengfei Cao; Zhuoran Jin; Kang Liu; Jun Zhao

arXiv:2406.12416·cs.CL·June 28, 2024

TL;DR

This paper investigates why current preference-based fine-tuning of large language models fails to maintain factuality across data domains, and introduces APEFT (Atomic Preference Enhanced Factuality Tuning), a method that improves factual accuracy by focusing on individual facts.

Contribution

The paper reveals that under-alignment causes factuality issues under distribution shifts and proposes APEFT, a framework that enhances factuality awareness at the fact level for better model alignment.

Findings

01

APEFT improves factuality by 3.45% on average across datasets.

02

Models tuned with existing preference learning algorithms show minimal gains, or even declines, in factuality on out-of-domain data.

03

Under-alignment, not over-alignment, is the main cause of factuality failure under distribution shifts.

Abstract

Large language models (LLMs) have achieved remarkable success but still tend to generate factually erroneous responses, a phenomenon known as hallucination. A recent trend is to use preference learning to fine-tune models to align with factuality. However, existing work primarily evaluates fine-tuned models on in-domain (ID) datasets, and their factuality on out-of-domain (OOD) datasets remains underexplored. In this paper, we conduct a comprehensive evaluation of the factuality of different models tuned by various preference learning algorithms and demonstrate that their performance on OOD datasets either increases minimally or decreases. Subsequently, we reveal that the main cause of a model's failure to uphold factuality under a distribution shift is under-alignment, rather than over-alignment, by analyzing the token distribution shift of the models before and after…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: ALIGN