Unraveling the Capabilities of Language Models in News Summarization

Abdurrahman Odabaşı; Göksel Biricik

arXiv:2501.18128·cs.CL·January 31, 2025

TL;DR

This paper benchmarks 20 recent language models on news summarization. Larger models such as GPT-3.5-Turbo and GPT-4 perform best, some smaller models also show promising results, and few-shot demonstration examples do not always improve performance.

Contribution

It provides a comprehensive evaluation of various language models in zero-shot and few-shot settings for news summarization, highlighting the impact of reference quality and model capabilities.

Findings

01. GPT-3.5-Turbo and GPT-4 outperform others in summarization quality.

02. Few-shot demonstrations sometimes worsen results due to poor reference summaries.

03. Certain smaller models like Qwen1.5-7B show competitive performance.
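
To make the zero-shot versus few-shot distinction in the findings concrete, here is a minimal sketch of how the two prompt types might be assembled. It is not the authors' prompt template: the instruction wording, the `Demo` class, and the `build_prompt` helper are hypothetical names chosen for illustration. It shows why a poor reference summary used as a demonstration can hurt: the demonstration text is placed directly in front of the model as an example to imitate.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Demo:
    """One demonstration pair (hypothetical structure, not from the paper)."""
    article: str
    summary: str  # a reference summary; its quality shapes what the model imitates


def build_prompt(article: str, demos: Sequence[Demo] = ()) -> str:
    """Build a summarization prompt.

    With no demos this is a zero-shot prompt; with one or more demos it is
    a few-shot prompt. The instruction text is an assumption for illustration.
    """
    parts = ["Summarize the following news article."]
    for demo in demos:
        # Each demonstration is shown as a completed article/summary pair.
        parts.append(f"Article: {demo.article}\nSummary: {demo.summary}")
    # The target article ends with an open "Summary:" slot for the model to fill.
    parts.append(f"Article: {article}\nSummary:")
    return "\n\n".join(parts)


zero_shot = build_prompt("Storm closes the city port.")
few_shot = build_prompt(
    "Storm closes the city port.",
    demos=[Demo("Council approves new budget.", "Budget approved.")],
)
```

The only difference between the two settings in this sketch is the `demos` argument, so any change in output quality between them can be attributed to the demonstration pairs themselves.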

Abstract

Given the recent introduction of multiple language models and the ongoing demand for improved Natural Language Processing tasks, particularly summarization, this work provides a comprehensive benchmarking of 20 recent language models, focusing on smaller ones for the news summarization task. In this work, we systematically test the capabilities and effectiveness of these models in summarizing news article texts which are written in different styles and presented in three distinct datasets. Specifically, we focus in this study on zero-shot and few-shot learning settings and we apply a robust evaluation methodology that combines different evaluation concepts including automatic metrics, human evaluation, and LLM-as-a-judge. Interestingly, including demonstration examples in the few-shot learning setting did not enhance models' performance and, in some cases, even led to worse quality of…

Code & Models

Repositories

odabashi/LMs-in-News-Summarization (Official)

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques

Methods: Attention Is All You Need · Label Smoothing · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Transformer · Attention Dropout · Linear Layer · Dense Connections