Divide and summarize: improve SLM text summarization

Alexandre Bailly; Antoine Saubin; Gabriel Kocevar; Jonathan Bodin

PMC · DOI: 10.3389/frai.2025.1604034 · August 1, 2025


Open Access

TL;DR

This paper compares two text summarization methods for small language models, finding that the 'Map' method improves accuracy and avoids losing information from the middle of texts.

Contribution

The study introduces and validates the 'Map' method as a superior alternative to the 'Stuff' method for SLM-based summarization.

Findings

1. The Map method retains key facts from the beginning and middle of texts better than the Stuff method.

2. SLMs using the Map method achieved performance comparable to LLMs using the Stuff method.

3. The Map method effectively addresses the 'Lost in the Middle' problem in SLM summarization.

Abstract

Text summarization is a longstanding challenge in natural language processing, with recent advancements driven by the adoption of Large Language Models (LLMs) and Small Language Models (SLMs). Despite these developments, issues such as the “Lost in the Middle” problem—where LLMs tend to overlook information in the middle of lengthy prompts—persist. Traditional summarization, often termed the “Stuff” method, processes an entire text in a single pass. In contrast, the “Map” method divides the text into segments, summarizes each independently, and then synthesizes these partial summaries into a final output, potentially mitigating the “Lost in the Middle” issue. This study investigates whether the Map method outperforms the Stuff method for texts that fit within the context window of SLMs and assesses its effectiveness in addressing the “Lost in the Middle” problem. We conducted a…
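The difference between the two methods described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `summarize` callable stands in for whatever SLM call is used, and the helper names (`split_into_segments`, `stuff_summarize`, `map_summarize`) and the word-based segmentation with a `max_words` limit are assumptions made for illustration only.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a text to a shorter summary,
# e.g. a wrapper around an SLM call. Not part of the original paper.
Summarizer = Callable[[str], str]


def split_into_segments(text: str, max_words: int) -> List[str]:
    """Split text into consecutive segments of at most max_words words.

    Word-count splitting is an assumption; the paper's segmentation
    strategy may differ (e.g. by sentence, paragraph, or token count).
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def stuff_summarize(text: str, summarize: Summarizer) -> str:
    """'Stuff' method: summarize the entire text in a single pass."""
    return summarize(text)


def map_summarize(text: str, summarize: Summarizer, max_words: int = 200) -> str:
    """'Map' method: summarize each segment independently, then
    synthesize the partial summaries into one final summary."""
    segments = split_into_segments(text, max_words)
    if len(segments) <= 1:
        # Nothing to divide: fall back to a single pass.
        return summarize(text)
    partial_summaries = [summarize(segment) for segment in segments]
    return summarize("\n".join(partial_summaries))
```

In this sketch every segment gets its own model call, so content from the middle of the text sits at the start of some segment's prompt rather than deep inside one long prompt. The trade-off is one extra call per segment plus a final synthesis call.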


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems