Model-based Preference Optimization in Abstractive Summarization without Human Feedback
Jaepill Choi, Kyubyung Chae, Jiwoo Song, Yohan Jo, Taesup Kim

TL;DR
This paper introduces Model-based Preference Optimization (MPO), a novel method to improve abstractive summarization by fine-tuning large language models without human feedback, using model-generated preference data.
Contribution
The paper presents MPO, a new approach that leverages model-generated preferences for fine-tuning LLMs, eliminating the need for costly human feedback in summarization tasks.
Findings
MPO significantly improves summary quality across multiple datasets.
The method enhances faithfulness and relevance without human-labeled preferences.
Model-generated preference data effectively guides model fine-tuning.
Abstract
In abstractive summarization, the challenge of producing concise and accurate summaries arises from the vast amount of information contained in the source document. Consequently, although Large Language Models (LLMs) can generate fluent text, they often introduce inaccuracies by hallucinating content not found in the original source. While supervised fine-tuning methods that maximize likelihood contribute to this issue, they do not consistently enhance the faithfulness of the summaries. Preference-based optimization methods, such as Direct Preference Optimization (DPO), can further refine the model to align with human preferences. However, these methods still heavily depend on costly human feedback. In this work, we introduce a novel and straightforward approach called Model-based Preference Optimization (MPO) to fine-tune LLMs for improved summarization abilities without any human…
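As a point of reference for the DPO objective named in the abstract, the following is a minimal sketch of the standard per-pair DPO loss in plain Python. It is not the authors' implementation, and it does not show how MPO builds its preference pairs, which the truncated abstract above does not describe. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration; each input is the summed log-probability of a full summary under the policy or the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) summary pair.

    loss = -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))
    All names and the beta default are illustrative assumptions.
    """
    # Log-ratio of policy to reference for each summary.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch for numerical stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss falls below log 2 when the policy favors the chosen summary more strongly than the reference model does, and rises above log 2 in the opposite case. In a preference-optimization setup without human feedback, the chosen and rejected summaries would both come from a model rather than from annotators.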
Taxonomy
Topics: Advanced Text Analysis Techniques · Semantic Web and Ontologies · Data Mining Algorithms and Applications
Methods: ALIGN
