LFOSum: Summarizing Long-form Opinions with Large Language Models
Mir Tafseer Nayeem, Davood Rafiei

TL;DR
This paper introduces a new dataset, two training-free LLM-based summarization methods, and automatic evaluation metrics for long-form opinion summarization, addressing the difficulty existing models have in handling lengthy reviews and producing faithful summaries.
Contribution
The paper presents a novel dataset of long user reviews, two training-free LLM summarization approaches for long inputs, and new automatic evaluation metrics for assessing summary faithfulness.
Findings
Open-source LLMs can improve focus in summaries.
Challenges remain in balancing sentiment and format.
Evaluation metrics provide granular assessment of faithfulness.
Abstract
Online reviews play a pivotal role in influencing consumer decisions across various domains, from purchasing products to selecting hotels or restaurants. However, the sheer volume of reviews -- often containing repetitive or irrelevant content -- leads to information overload, making it challenging for users to extract meaningful insights. Traditional opinion summarization models face challenges in handling long inputs and large volumes of reviews, while newer Large Language Model (LLM) approaches often fail to generate accurate and faithful summaries. To address these challenges, this paper introduces (1) a new dataset of long-form user reviews, each entity comprising over a thousand reviews, (2) two training-free LLM-based summarization approaches that scale to long inputs, and (3) automatic evaluation metrics. Our dataset of user reviews is paired with in-depth and unbiased critical…
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Computational and Text Analysis Methods
