DWTSumm: Discrete Wavelet Transform for Document Summarization

Rana Salama; Abdou Youssef; Mona Diab

arXiv:2604.21070·cs.CL·April 24, 2026

TL;DR

This paper introduces a DWT-based multi-resolution framework that improves domain-specific document summarization with LLMs by preserving semantics and reducing hallucinations.

Contribution

The paper presents a novel DWT-based approach that decomposes text embeddings into global and local components, enhancing summarization quality and factual grounding in domain-specific documents.

Findings

01. DWT-based summaries achieve ROUGE-L scores comparable to baselines.

02. Semantic similarity (BERTScore) and factual grounding (Semantic Fidelity) improve by over 2% and more than 4%, respectively, relative to a GPT-4o baseline.

03. Fidelity reaches up to 97%, indicating reduced hallucinations.

Abstract

Summarizing long, domain-specific documents with large language models (LLMs) remains challenging due to context limitations, information loss, and hallucinations, particularly in clinical and legal settings. We propose a Discrete Wavelet Transform (DWT)-based multi-resolution framework that treats text as a semantic signal and decomposes it into global (approximation) and local (detail) components. Applied to sentence- or word-level embeddings, DWT yields compact representations that preserve overall structure and critical domain-specific details, which are used directly as summaries or to guide LLM generation. Experiments on clinical and legal benchmarks demonstrate comparable ROUGE-L scores. Compared to a GPT-4o baseline, DWT-based summarization consistently improves semantic similarity and grounding, achieving gains of over 2% in BERTScore, more than 4% in Semantic Fidelity,…
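
To make the approximation/detail split concrete, here is a minimal sketch of a single-level Haar DWT applied along the sentence axis of a list of embeddings. The Haar wavelet, the padding rule, the toy vectors, and the names `haar_step`, `inverse_haar_step`, and `detail_energy` are all assumptions chosen for illustration; the abstract does not say which wavelet, decomposition depth, or selection rule the authors use.

```python
import math
from typing import List, Tuple

Vector = List[float]


def haar_step(signal: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
    """One level of the orthonormal Haar DWT along the sentence axis.

    Each pair of neighbouring embeddings is turned into a scaled sum
    (approximation, the "global" part) and a scaled difference
    (detail, the "local" part).
    """
    if not signal:
        raise ValueError("signal must contain at least one embedding")
    rows = list(signal)
    if len(rows) % 2 == 1:
        # Assumption: pad an odd-length document by repeating the last row.
        rows.append(rows[-1])
    s = math.sqrt(2.0)
    approx, detail = [], []
    for i in range(0, len(rows), 2):
        a, b = rows[i], rows[i + 1]
        approx.append([(x + y) / s for x, y in zip(a, b)])
        detail.append([(x - y) / s for x, y in zip(a, b)])
    return approx, detail


def inverse_haar_step(approx: List[Vector], detail: List[Vector]) -> List[Vector]:
    """Invert haar_step, recovering the (padded) sequence of embeddings."""
    s = math.sqrt(2.0)
    rows = []
    for a, d in zip(approx, detail):
        rows.append([(x + y) / s for x, y in zip(a, d)])
        rows.append([(x - y) / s for x, y in zip(a, d)])
    return rows


def detail_energy(detail: List[Vector]) -> List[float]:
    """Squared norm of each detail vector (hypothetical salience score)."""
    return [sum(x * x for x in d) for d in detail]


# Toy 3-dimensional "sentence embeddings" (made-up values, not real model output).
toy_embeddings = [
    [1.0, 0.0, 2.0],
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 1.0],
    [2.0, 1.0, 1.0],
]
approx, detail = haar_step(toy_embeddings)
energy = detail_energy(detail)
```

In this sketch the approximation sequence is half as long as the input and summarizes each pair of sentences, while a pair whose detail energy is near zero carries little local variation. Whether the paper ranks or filters content this way is not stated in the abstract.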
