Efficient Extractive Summarization with MAMBA-Transformer Hybrids for Low-Resource Scenarios
Nisrine Ait Khayi

TL;DR
This paper presents a novel hybrid Mamba-Transformer model for extractive summarization that efficiently processes full documents in low-resource settings, outperforming existing models in quality and speed.
Contribution
It introduces the first hybrid architecture combining transformers with state space models for extractive summarization, enabling linear-time processing of long documents.
Findings
Achieves +0.23 ROUGE-1 on ArXiv compared to BERTSUM
Shows statistically significant improvements on all tested datasets (p < 0.001)
Increases inference speed by 24-27% on CNN/DailyMail
Abstract
Extractive summarization of long documents is bottlenecked by quadratic complexity, often forcing truncation and limiting deployment in resource-constrained settings. We introduce the first Mamba-Transformer hybrid for extractive summarization, combining the semantic strength of pre-trained transformers with the linear-time processing of state space models. Leveraging Mamba's ability to process full documents without truncation, our approach preserves context while maintaining strong summarization quality. The architecture includes: (1) a transformer encoder for sentence-level semantics, (2) a Mamba state space model to capture inter-sentence dependencies efficiently, and (3) a linear classifier for sentence relevance prediction. Across news, argumentative, and scientific domains under low-resource conditions, our method achieves: (1) large gains over BERTSUM and MATCHSUM, including…
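To make the three-stage pipeline in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The function names, the toy character-hash embedding (standing in for a pre-trained transformer encoder), the fixed decay value, and the classifier weights are all hypothetical. The recurrence is a simplified, input-independent linear state update rather than Mamba's selective scan. It is used only to show why a single pass over the sentences costs time linear in their number.

```python
import math


def embed_sentence(sentence, dim=8):
    """Hypothetical stand-in for a pre-trained transformer sentence encoder.

    Hashes characters into a small unit-norm vector; a real system would
    use contextual embeddings instead.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(sentence.lower()):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ssm_scan(inputs, decay=0.9):
    """Simplified linear recurrence h_t = decay * h_{t-1} + (1 - decay) * x_t.

    Assumption: this is NOT Mamba's selective scan (the parameters here do
    not depend on the input). It only illustrates the single left-to-right
    pass, whose cost grows linearly with the number of sentences.
    """
    if not inputs:
        return []
    state = [0.0] * len(inputs[0])
    outputs = []
    for x in inputs:
        state = [decay * h + (1.0 - decay) * xi for h, xi in zip(state, x)]
        outputs.append(list(state))
    return outputs


def score_sentences(states, weights, bias=0.0):
    """Linear classifier followed by a sigmoid: one relevance score per sentence."""
    scores = []
    for h in states:
        logit = sum(w * hi for w, hi in zip(weights, h)) + bias
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


def extract_summary(sentences, weights, k=2, dim=8):
    """Encode sentences, scan them, score them, and keep the top-k in document order."""
    embeddings = [embed_sentence(s, dim) for s in sentences]
    states = ssm_scan(embeddings)
    scores = score_sentences(states, weights)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `extract_summary(sentences, weights, k=2)` on a list of sentence strings returns the two highest-scoring sentences in their original order. In a trained model, the weights would be learned from labeled sentence-relevance data rather than supplied by hand.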
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Text and Document Classification Technologies
