Reranking-based Generation for Unbiased Perspective Summarization
Narutatsu Ri, Nicholas Deas, Kathleen McKeown

TL;DR
This paper advances unbiased perspective summarization by identifying reliable evaluation metrics and showing that reranking and preference tuning improve LLM-generated summaries beyond zero-shot inference.
Contribution
The paper builds a human-annotated test set for benchmarking metric reliability and uses the metrics it validates to show that reranking-based methods yield strong results in perspective summarization.
Findings
Language model-based metrics outperform traditional metrics.
Reranking methods significantly improve summary quality.
Preference tuning further enhances summarization performance.
Abstract
Generating unbiased summaries in real-world settings such as political perspective summarization remains a crucial application of Large Language Models (LLMs). Yet, existing evaluation frameworks rely on traditional metrics for measuring key attributes such as coverage and faithfulness without verifying their applicability, and efforts to develop improved summarizers are still nascent. We address these gaps by (1) identifying reliable metrics for measuring perspective summary quality, and (2) investigating the efficacy of LLM-based methods beyond zero-shot inference. Namely, we build a test set for benchmarking metric reliability using human annotations and show that traditional metrics underperform compared to language model-based metrics, which prove to be strong evaluators. Using these metrics, we show that reranking-based methods yield strong results, and preference tuning with…
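The reranking idea mentioned in the abstract can be illustrated with a minimal best-of-n sketch: score several candidate summaries against the source and keep the highest-scoring one. This is an illustration under stated assumptions, not the authors' implementation. The names `rerank_best_of_n` and `toy_coverage_score` are hypothetical, and the word-overlap scorer is only a runnable stand-in; the abstract reports that language model-based metrics are the stronger evaluators, so a real setup would plug such a metric in as `score_fn`.

```python
from typing import Callable, List, Tuple


def toy_coverage_score(source: str, summary: str) -> float:
    """Hypothetical stand-in scorer: fraction of distinct source words
    that also appear in the summary. Not a metric from the paper."""
    source_words = set(source.lower().split())
    if not source_words:
        return 0.0
    summary_words = set(summary.lower().split())
    return len(source_words & summary_words) / len(source_words)


def rerank_best_of_n(
    source: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Score every candidate against the source and return the best
    (summary, score) pair. Ties go to the earliest candidate."""
    if not candidates:
        raise ValueError("need at least one candidate summary")
    scored = [(candidate, score_fn(source, candidate)) for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # In practice the candidates would be sampled from an LLM; they are
    # hard-coded here so the sketch runs without any model.
    source = "taxes should fall and spending should fall"
    candidates = [
        "taxes should fall",
        "taxes and spending should fall",
        "the weather is nice",
    ]
    best, score = rerank_best_of_n(source, candidates, toy_coverage_score)
    print(best, score)
```

Keeping the scorer as a plain callable means the selection step stays unchanged when the toy function is swapped for a stronger evaluator.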
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining
