Model-based Preference Optimization in Abstractive Summarization without Human Feedback

Jaepill Choi; Kyubyung Chae; Jiwoo Song; Yohan Jo; Taesup Kim

arXiv:2409.18618 · cs.CL · October 3, 2024

TL;DR

This paper introduces Model-based Preference Optimization (MPO), a novel method to improve abstractive summarization by fine-tuning large language models without human feedback, using model-generated preference data.

Contribution

The paper presents MPO, a new approach that leverages model-generated preferences for fine-tuning LLMs, eliminating the need for costly human feedback in summarization tasks.

Findings

01. MPO significantly improves summary quality across multiple datasets.

02. The method enhances faithfulness and relevance without human-labeled preferences.

03. Model-generated preference data effectively guides model fine-tuning.

Abstract

In abstractive summarization, the challenge of producing concise and accurate summaries arises from the vast amount of information contained in the source document. Consequently, although Large Language Models (LLMs) can generate fluent text, they often introduce inaccuracies by hallucinating content not found in the original source. While supervised fine-tuning methods that maximize likelihood contribute to this issue, they do not consistently enhance the faithfulness of the summaries. Preference-based optimization methods, such as Direct Preference Optimization (DPO), can further refine the model to align with human preferences. However, these methods still heavily depend on costly human feedback. In this work, we introduce a novel and straightforward approach called Model-based Preference Optimization (MPO) to fine-tune LLMs for improved summarization abilities without any human…
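For readers unfamiliar with Direct Preference Optimization, which the abstract names as the preference-based baseline, the standard DPO loss for a single preference pair can be sketched as follows. This is a minimal illustration of the published DPO objective, not code from this paper or its repository; the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for the example. The inputs are summed log-probabilities of the chosen and rejected summaries under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All names and the default beta are illustrative assumptions.
    Each argument is a sequence-level log-probability (a float).
    """
    # Implicit rewards: scaled log-ratio of policy to reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Example: the policy prefers the chosen summary more than the reference does,
# so the loss falls below log(2), the value at zero margin.
loss = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

The loss depends only on how much more the policy favors the chosen output over the rejected one relative to the reference model, so any source of preference pairs, human-labeled or model-generated, can be plugged into the same objective.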

Code & Models

Repositories

cjaep/MPO (PyTorch, official)

Taxonomy

Topics: Advanced Text Analysis Techniques · Semantic Web and Ontologies · Data Mining Algorithms and Applications

Methods: ALIGN