Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation

N. E. Kriman

arXiv:2408.15171·cs.CL·August 28, 2024

TL;DR

This paper introduces a method to evaluate the factual accuracy of LLM-generated summaries by leveraging atomic facts entailment metrics, addressing hallucination issues in retrieval augmented generation.

Contribution

It proposes a Naive Bayes-based approach to measure factuality of summaries, providing a new metric for assessing LLM output accuracy.

Findings

01. Effective in detecting factual inaccuracies in summaries

02. Improves reliability of LLM-generated content

03. Addresses hallucination problem in retrieval augmented generation

Abstract

The use of large language models (LLMs) has significantly increased since the introduction of ChatGPT in 2022, demonstrating their value across various applications. However, a major challenge for enterprise and commercial adoption of LLMs is their tendency to generate inaccurate information, a phenomenon known as "hallucination." This project proposes a method for estimating the factuality of a summary generated by LLMs when compared to a source text. Our approach utilizes Naive Bayes classification to assess the accuracy of the content produced.
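
To make the idea of scoring a summary from its atomic facts concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify how the Naive Bayes step is set up, so this example only borrows the "naive" independence assumption. It assumes each atomic fact extracted from the summary has already been assigned a probability of being entailed by the source text (for instance by an NLI model, which is not shown), and it combines those probabilities into a few summary-level scores. The function name `factuality_scores`, the `threshold` parameter, and the returned keys are all hypothetical.

```python
import math
from typing import Dict, Sequence


def factuality_scores(entail_probs: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Aggregate per-fact entailment probabilities into summary-level scores.

    entail_probs: assumed probabilities (0..1) that each atomic fact in the
                  summary is entailed by the source text. How they are
                  produced is outside this sketch.
    threshold:    hypothetical cutoff above which a fact counts as supported.
    """
    if not entail_probs:
        raise ValueError("at least one atomic fact is required")

    eps = 1e-12  # avoids log(0) when a fact has probability 0
    # Independence ("naive") assumption: probability that all facts hold
    # is the product of the individual probabilities, computed in log space.
    log_joint = sum(math.log(max(p, eps)) for p in entail_probs)

    supported = sum(1 for p in entail_probs if p >= threshold)
    n = len(entail_probs)
    return {
        "joint": math.exp(log_joint),
        "mean": sum(entail_probs) / n,
        "supported_fraction": supported / n,
    }


if __name__ == "__main__":
    # Three atomic facts; the last one is poorly supported by the source.
    print(factuality_scores([0.95, 0.9, 0.2]))
```

The joint score drops sharply when even one fact is unsupported, while the mean and the supported fraction degrade more gradually; which behaviour is preferable depends on how strictly a single hallucinated fact should be penalized.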

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques