Distilling Opinions at Scale: Incremental Opinion Summarization using XL-OPSUMM

Sri Raghava Muddu; Rupasai Rangaraju; Tejpalsingh Siledar; Swaroop Nath; Pushpak Bhattacharyya; Swaprava Nath; Suman Banerjee; Amey Patil; Muthusamy Chelliah; Sudhanshu Shekhar Singh; Nikesh Garera

arXiv:2406.10886·cs.CL·June 18, 2024

Open Access

TL;DR

This paper introduces Xl-OpSumm, a scalable incremental opinion summarization framework for products with very large numbers of reviews. Using Llama-3-8B-8k, it improves on existing models on both a new and an existing dataset.

Contribution

The paper presents a novel incremental summarization framework, Xl-OpSumm, capable of handling large-scale review data and introduces a new dataset, Xl-Flipkart, for evaluation.

Findings

01

Xl-OpSumm achieves a 4.38% ROUGE-1 F1 gain over competitors.

02

Xl-OpSumm performs well on both AMASUM and the new Xl-Flipkart datasets.

03

The framework effectively summarizes thousands of reviews incrementally.
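For readers unfamiliar with the metric in the first finding, ROUGE-1 F1 is the harmonic mean of unigram precision and recall between a generated summary and a reference summary. The sketch below is a simplified illustration, assuming lowercase whitespace tokenization and no stemming; the function name `rouge1_f1` is hypothetical, and the paper's own evaluation tooling and settings are not stated here.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Assumption: tokens are lowercase whitespace-separated words. Standard
    ROUGE packages apply their own tokenization and optional stemming, so
    their scores can differ from this sketch.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears
    # in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the battery is good", "the battery life is very good"))
```

Here all four candidate words occur in the six-word reference, so precision is 4/4 and recall is 4/6.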

Abstract

Opinion summarization in e-commerce encapsulates the collective views of numerous users about a product based on their reviews. Typically, a product on an e-commerce platform has thousands of reviews, each comprising around 10-15 words. While Large Language Models (LLMs) have shown proficiency in summarization tasks, they struggle to handle such a large volume of reviews due to context limitations. To mitigate this, we propose a scalable framework called Xl-OpSumm that generates summaries incrementally. However, the existing test set, AMASUM, has only 560 reviews per product on average. Due to the lack of a test set with thousands of reviews, we created a new test set called Xl-Flipkart by gathering data from the Flipkart website and generating summaries using GPT-4. Through various automatic evaluations and extensive analysis, we evaluated the framework's efficiency on two datasets,…
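The general idea of generating a summary incrementally under a context limit can be sketched as follows. This is a minimal illustration and not the authors' implementation: the helper names (`batch_reviews`, `incremental_summary`, `toy_summarize`) are hypothetical, word counts stand in for a real token budget, and `toy_summarize` is a deterministic placeholder for what would be an LLM call.

```python
from typing import Callable, Iterable, List


def batch_reviews(reviews: Iterable[str], max_words: int) -> List[List[str]]:
    """Greedily group reviews so each batch stays within a word budget.

    Assumption: word count is a crude proxy for the model's token limit.
    A single review longer than the budget still gets its own batch.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for review in reviews:
        n_words = len(review.split())
        if current and used + n_words > max_words:
            batches.append(current)
            current, used = [], 0
        current.append(review)
        used += n_words
    if current:
        batches.append(current)
    return batches


def incremental_summary(
    reviews: Iterable[str],
    summarize: Callable[[str, List[str]], str],
    max_words: int = 50,
) -> str:
    """Fold batches into a running summary, one batch at a time."""
    summary = ""
    for batch in batch_reviews(reviews, max_words):
        # Each step sees only the previous summary and one new batch,
        # so the input size stays bounded however many reviews exist.
        summary = summarize(summary, batch)
    return summary


def toy_summarize(previous: str, batch: List[str]) -> str:
    """Placeholder for an LLM call: append the batch's shortest review."""
    shortest = min(batch, key=len)
    return f"{previous} | {shortest}" if previous else shortest


if __name__ == "__main__":
    demo = ["good phone", "battery drains fast", "nice camera", "ok"]
    print(incremental_summary(demo, toy_summarize, max_words=4))
```

In practice the `summarize` callable would prompt a model with the running summary plus the new batch; how the paper's framework updates its summary at each step is not described in this abstract.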

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques · Topic Modeling

Methods: Sparse Evolutionary Training · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention