Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok

TL;DR
This paper introduces a multi-objective reinforcement learning approach for text summarization that balances four quality dimensions (consistency, coherence, relevance, and fluency) using two multi-dimensional optimization strategies, MDO_min and MDO_pro, and a QA-based reward model, leading to improved summary quality.
Contribution
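The paper proposes two multi-dimensional optimization strategies and a QA-based reward model for balanced, multi-dimensional summarization, addressing two limitations of prior methods: their focus on a single quality dimension and their reliance on reference-based ROUGE rewards.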
Findings
Significant performance improvements over baselines.
Effective balancing of multiple summary quality dimensions.
Ability to control summary length through discount factor adjustment.
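One way to see why a discount factor could influence summary length is sketched below. It assumes a single terminal reward delivered at the last token of a summary; this is an illustrative assumption, since the summary here does not state how rewards are assigned per token. Under that assumption, the return seen at the first token of a T-token summary is gamma^(T-1) times the reward, so any gamma below 1 shrinks the return of longer summaries more. The function name and the lengths are hypothetical.

```python
def discounted_return_at_start(terminal_reward: float, length: int, gamma: float) -> float:
    """Return seen at the first token when the only reward arrives at the last token.

    Assumption for illustration: one terminal reward, no per-token rewards.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    # The reward is discounted once per step between the first and last token.
    return (gamma ** (length - 1)) * terminal_reward


# Same terminal reward, two hypothetical summary lengths.
for gamma in (1.0, 0.99, 0.95):
    short = discounted_return_at_start(1.0, 20, gamma)
    long = discounted_return_at_start(1.0, 60, gamma)
    print(f"gamma={gamma}: short={short:.3f}, long={long:.3f}")
```

With gamma equal to 1 the two lengths receive the same return; as gamma decreases, the gap in favor of the shorter summary widens.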
Abstract
The evaluation of summary quality encompasses diverse dimensions such as consistency, coherence, relevance, and fluency. However, existing summarization methods often target a specific dimension and struggle to generate summaries that are well balanced across multiple dimensions. In this paper, we propose multi-objective reinforcement learning tailored to generate balanced summaries across all four dimensions. We introduce two multi-dimensional optimization (MDO) strategies for adaptive learning: 1) MDO_min, which rewards the current lowest dimension score, and 2) MDO_pro, which optimizes multiple dimensions in a manner similar to multi-task learning and resolves conflicting gradients across dimensions through gradient projection. Unlike prior ROUGE-based rewards relying on reference summaries, we use a QA-based reward model that aligns with human preferences. Further, we discover the capability to regulate the…
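The two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. `mdo_min_reward` takes the lowest of the per-dimension scores as the reward. `project_conflicts` uses a PCGrad-style projection as a stand-in for "gradient projection", because the exact procedure is not given here; unlike PCGrad, it visits the other gradients in a fixed rather than random order. All function names, scores, and gradient values are hypothetical.

```python
from typing import Dict, List, Sequence


def mdo_min_reward(scores: Dict[str, float]) -> float:
    """MDO_min-style reward: the score of the currently lowest dimension."""
    return min(scores.values())


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def project_conflicts(grads: List[List[float]]) -> List[float]:
    """PCGrad-style stand-in for gradient projection (an assumption).

    For each per-dimension gradient, remove the component that points
    against any other dimension's gradient, then sum the results.
    """
    projected = []
    for i, grad in enumerate(grads):
        g = list(grad)
        for j, other in enumerate(grads):
            if i == j:
                continue
            d = dot(g, other)
            if d < 0:  # negative dot product means the two gradients conflict
                scale = d / dot(other, other)
                g = [gi - scale * oi for gi, oi in zip(g, other)]
        projected.append(g)
    return [sum(column) for column in zip(*projected)]


# Hypothetical per-dimension scores for one generated summary.
scores = {"consistency": 0.9, "coherence": 0.6, "relevance": 0.8, "fluency": 0.95}
print(mdo_min_reward(scores))  # 0.6

# Two hypothetical conflicting gradients (their dot product is -1).
print(project_conflicts([[1.0, 0.0], [-1.0, 1.0]]))  # [0.5, 1.5]
```

In the second example, each gradient loses its component along the other before the two are summed, so the combined update no longer moves against either dimension.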
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Text and Document Classification Technologies
