Improving Summarization with Human Edits
Zonghai Yao, Benjamin J Schloss, and Sai P. Selvaraj

TL;DR
This paper introduces SALT, a training method that uses human edits and imitation edits to improve summarization quality, particularly in the medical domain, and reports better results than human-feedback training methods such as DPO.
Contribution
The paper proposes SALT, a new training approach that effectively utilizes human and imitation edits for improving summarization models, extending feedback methods to specialized domains.
Findings
SALT improves summarization quality with human and imitation edits.
SALT outperforms traditional RLHF methods like DPO on human-edit data.
The approach is effective in both general and medical domain summarization.
Abstract
Recent work has shown the promise of learning with human feedback paradigms to produce human-determined high-quality text. Existing works use human feedback to train large language models (LLMs) in general domain abstractive summarization and have obtained summary quality exceeding traditional likelihood training. In this paper, we focus on a less explored form of human feedback -- Human Edits. We propose Sequence Alignment (un)Likelihood Training (SALT), a novel technique to use both the human-edited and model-generated data together in the training loop. In addition, we demonstrate simulating Human Edits with ground truth summaries coming from existing training data -- Imitation edits, along with the model-generated summaries obtained after the training, to reduce the need for expensive human-edit data. In our experiments, we extend human feedback exploration from general domain…
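To make the idea in the abstract concrete, here is a minimal sketch of how a model-generated summary might be aligned with its human-edited version, with kept tokens pushed up (likelihood) and removed tokens pushed down (unlikelihood). This is an illustration under stated assumptions, not the authors' implementation: the function names `align_tokens` and `salt_style_loss`, the use of Python's `difflib` for alignment, the whitespace tokenization, and the single `unlikelihood_weight` parameter are all hypothetical, and the abstract does not specify the exact loss. For imitation edits, the ground-truth summary would take the place of the human-edited one.

```python
import math
from difflib import SequenceMatcher


def align_tokens(model_tokens, edited_tokens):
    """Mark each model-generated token as kept (True) or changed (False).

    Assumption: alignment is approximated with difflib's matching blocks;
    the paper's own alignment procedure may differ.
    """
    kept = [False] * len(model_tokens)
    matcher = SequenceMatcher(a=model_tokens, b=edited_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            kept[i] = True
    return kept


def salt_style_loss(token_probs, kept, unlikelihood_weight=1.0):
    """Average per-token loss over the model-generated summary.

    token_probs[i] is the model's probability of its own i-th token.
    Kept tokens contribute -log(p); changed tokens contribute
    -log(1 - p), scaled by the hypothetical unlikelihood_weight.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, is_kept in zip(token_probs, kept):
        if is_kept:
            total += -math.log(max(p, eps))
        else:
            total += -unlikelihood_weight * math.log(max(1.0 - p, eps))
    return total / max(len(token_probs), 1)


# Illustrative pair: a clinician replaces "mild" with "severe".
model_summary = "patient reports mild chest pain".split()
human_edit = "patient reports severe chest pain".split()

kept_mask = align_tokens(model_summary, human_edit)
loss = salt_style_loss([0.9] * len(model_summary), kept_mask)
```

In this sketch only the token the editor replaced receives the unlikelihood term, so a confident mistake is penalized more heavily than an uncertain one, while tokens the editor left alone are reinforced as in ordinary likelihood training.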
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Direct Preference Optimization · Focus
