Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets

Tohida Rehman; Soumabha Ghosh; Kuntal Das; Souvik Bhattacharjee; Debarshi Kumar Sanyal; Samiran Chattopadhyay

arXiv:2502.19339·cs.CL·March 14, 2025

Open Access

TL;DR

This paper systematically evaluates four pre-trained and open-source large language models (BART, FLAN-T5, LLaMA-3-8B, and Gemma-7B) for text summarization across five datasets, using multiple automatic metrics to highlight their strengths and limitations in generating coherent summaries.

Contribution

It provides a comprehensive comparison of BART, FLAN-T5, LLaMA-3-8B, and Gemma-7B on five datasets, offering insights into their performance in text summarization tasks.

Findings

01 BART and FLAN-T5 outperform LLaMA-3-8B and Gemma-7B on most datasets.

02 Different models excel in different dataset types, indicating varied strengths.

03 Automatic metrics reveal specific strengths and weaknesses of each model.

Abstract

Text summarization plays a crucial role in natural language processing by condensing large volumes of text into concise and coherent summaries. As digital content continues to grow rapidly and the demand for effective information retrieval increases, text summarization has become a focal point of research in recent years. This study offers a thorough evaluation of four leading pre-trained and open-source large language models: BART, FLAN-T5, LLaMA-3-8B, and Gemma-7B, across five diverse datasets: CNN/DM, Gigaword, News Summary, XSum, and BBC News. The evaluation employs widely recognized automatic metrics, including ROUGE-1, ROUGE-2, ROUGE-L, BERTScore, and METEOR, to assess the models' capabilities in generating coherent and informative summaries. The results reveal the comparative strengths and limitations of these models in processing various text types.
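To make the ROUGE-1 and ROUGE-2 metrics named in the abstract concrete, here is a minimal sketch of n-gram overlap scoring in pure Python. It is an illustration only, not the authors' evaluation code: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is assumed to be lowercase whitespace splitting, and no stemming is applied, so scores will differ from those of standard ROUGE packages.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings.

    Assumption: lowercase whitespace tokenization, no stemming.
    Returns precision, recall, and F1 as a dict.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # which is the "clipped" overlap used by ROUGE.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat is on the mat"
    print("ROUGE-1:", rouge_n(generated, gold, n=1))
    print("ROUGE-2:", rouge_n(generated, gold, n=2))
```

For this toy pair, five of the six unigrams and three of the five bigrams overlap, so ROUGE-1 F1 is 5/6 and ROUGE-2 F1 is 0.6. ROUGE-L (longest common subsequence), BERTScore (embedding similarity), and METEOR (alignment with synonym and stem matching) need more machinery and are not sketched here.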

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Text Readability and Simplification

Methods: Linear Layer · Residual Connection · Layer Normalization · Attention Is All You Need · Multi-Head Attention · Adam · Softmax · Dropout · Byte Pair Encoding