FactAlign: Long-form Factuality Alignment of Large Language Models

Chao-Wei Huang; Yun-Nung Chen

arXiv:2410.01691 · cs.CL · October 3, 2024

TL;DR

FactAlign is a new framework that improves the factual accuracy of large language models' long-form responses by using fine-grained sentence-level alignment guided by automatic factuality evaluation.

Contribution

The paper introduces fKTO, a fine-grained, sentence-level alignment algorithm that extends Kahneman-Tversky Optimization (KTO), to improve the factuality of long-form LLM outputs and reduce hallucination.

Findings

01. Significantly improves the factual accuracy of LLM responses

02. Enhances helpfulness without sacrificing factual precision

03. Can train LLMs to give more informative responses

Abstract

Large language models have demonstrated significant potential as the next-generation information access engines. However, their reliability is hindered by issues of hallucination and generating non-factual content. This is particularly problematic in long-form responses, where assessing and ensuring factual accuracy is complex. In this paper, we address this gap by proposing FactAlign, a novel alignment framework designed to enhance the factuality of LLMs' long-form responses while maintaining their helpfulness. We introduce fKTO, a fine-grained, sentence-level alignment algorithm that extends the Kahneman-Tversky Optimization (KTO) alignment method. Leveraging recent advances in automatic factuality evaluation, FactAlign utilizes fine-grained factuality assessments to guide the alignment process. Our experiments on open-domain prompts and information-seeking questions demonstrate that…
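
To make the idea of sentence-level alignment concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly published KTO-style value function (a sigmoid of the scaled policy-to-reference log-ratio, measured against a reference point) and applies it to each sentence using a per-sentence factuality label. The function names, the default hyperparameters (`beta`, `lambda_d`, `lambda_u`), the fixed reference point `z_ref`, and the plain averaging over sentences are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def kto_style_loss(log_ratio: float, desirable: bool, z_ref: float = 0.0,
                   beta: float = 0.1, lambda_d: float = 1.0,
                   lambda_u: float = 1.0) -> float:
    """KTO-style loss for one unit of text (hypothetical helper).

    log_ratio: log pi_theta(y|x) - log pi_ref(y|x) for that unit.
    desirable: True if the unit is labeled desirable (here: factual).
    z_ref:     reference point; a constant here, which is a simplification.
    """
    if desirable:
        # Desirable units are rewarded when the policy raises their likelihood.
        value = lambda_d * sigmoid(beta * (log_ratio - z_ref))
        return lambda_d - value
    # Undesirable units are rewarded when the policy lowers their likelihood.
    value = lambda_u * sigmoid(beta * (z_ref - log_ratio))
    return lambda_u - value


def sentence_level_loss(sentences, **kwargs) -> float:
    """Average the KTO-style loss over (log_ratio, is_factual) sentence pairs."""
    losses = [kto_style_loss(lr, ok, **kwargs) for lr, ok in sentences]
    return sum(losses) / len(losses)


# Illustrative input: three sentences of one response, with made-up
# log-ratios and factuality labels from a hypothetical automatic evaluator.
example = [(1.2, True), (-0.4, True), (0.9, False)]
print(sentence_level_loss(example))
```

In this sketch, each sentence contributes its own signal, so a response that mixes factual and non-factual sentences is not treated as uniformly good or bad. How the sentence-level term is combined with a response-level objective is left open here.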

Code & Models

Repositories

miulab/factalign (official)

Videos

FactAlign: Long-form Factuality Alignment of Large Language Models · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods