Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu, Yao Chen, Zeyi Wen, Weiguo Liu, Bingsheng He

TL;DR
This study evaluates 19 small language models for news summarization, revealing their performance variations, capabilities, and limitations compared to large models, with practical insights for resource-constrained applications.
Contribution
It provides a comprehensive comparison of small language models for news summarization, highlighting their strengths, weaknesses, and factors affecting their performance.
Findings
Top SLMs like Phi3-Mini and Llama3.2-3B-Ins perform comparably to 70B LLMs.
SLMs excel with simple prompts but struggle with complex ones.
Instruction tuning does not always improve summarization quality.
Abstract
The increasing demand for efficient summarization tools in resource-constrained environments highlights the need for effective solutions. While large language models (LLMs) deliver superior summarization quality, their high computational resource requirements limit their practical applications. In contrast, small language models (SLMs) present a more accessible alternative, capable of real-time summarization on edge devices. However, their summarization capabilities and comparative performance against LLMs remain underexplored. This paper addresses this gap by presenting a comprehensive evaluation of 19 SLMs for news summarization across 2,000 news samples, focusing on relevance, coherence, factual consistency, and summary length. Our findings reveal significant variations in SLM performance, with top-performing models such as Phi3-Mini and Llama3.2-3B-Ins achieving results comparable to…
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Computational and Text Analysis Methods
