Make The Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback
Duy-Hung Nguyen, Nguyen Viet Dung Nghiem, Bao-Sinh Nguyen, Dung Tien Le, Shahab Sabahi, Minh-Tien Nguyen, Hung Le

TL;DR
This paper introduces an interactive training framework for text summarization that effectively uses offline data and preference feedback to improve summarization quality and sample efficiency in dynamic human-AI interactions.
Contribution
It presents a novel framework combining offline data and a new reward model for preference-based training in summarization, enhancing performance in online and few-shot settings.
Findings
Improved ROUGE scores across three datasets
Enhanced sample efficiency in preference learning
Effective in active, few-shot, and online scenarios
Abstract
For summarization, human preference is critical to tame the outputs of the summarizer in favor of human interests, as ground-truth summaries are scarce and ambiguous. Practical settings require dynamic exchanges between a human and an AI agent, wherein feedback is provided in an online manner, a few samples at a time. In this paper, we introduce a new framework to train summarization models with preference feedback interactively. By properly leveraging offline data and a novel reward model, we improve both ROUGE scores and sample efficiency. Our experiments on three different datasets confirm the benefit of the proposed framework in active, few-shot, and online settings of preference learning.
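The abstract does not spell out the reward model or the preference objective, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a Bradley-Terry style pairwise loss, a common choice when learning from "summary A is preferred over summary B" feedback, and a simplified unigram ROUGE-1 F1 with whitespace tokenization. The function names `rouge1_f1` and `pairwise_preference_loss` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split tokens.

    Assumption: no stemming or stopword handling, unlike full ROUGE toolkits.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood: -log(sigmoid(r_preferred - r_rejected)).

    Assumption: this objective is illustrative; the paper's reward model may differ.
    """
    margin = reward_preferred - reward_rejected
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

In this sketch the loss shrinks as the reward model scores the preferred summary higher than the rejected one, which is the signal a preference-trained reward model would learn from each comparison.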
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Malware Detection Techniques
