DocDeshadower: Frequency-Aware Transformer for Document Shadow Removal
Ziyang Zhou, Yingtie Lei, Xuhang Chen, Shenghong Luo, Wenjun Zhang, Chi-Man Pun, Zhen Wang

TL;DR
This paper introduces DocDeshadower, a frequency-aware Transformer model that effectively removes shadows from scanned documents by decomposing images into multiple frequency bands and employing specialized modules for different scales.
Contribution
The paper presents a novel multi-frequency Transformer architecture based on the Laplacian Pyramid for document shadow removal, addressing the difficulty existing methods have with varying shadow intensities and with preserving document details.
Findings
Outperforms state-of-the-art shadow removal methods
Effectively handles varying shadow intensities and scales
Preserves document details while removing shadows
Abstract
Shadows in scanned documents pose significant challenges for document analysis and recognition tasks due to their negative impact on visual quality and readability. Current shadow removal techniques, including traditional methods and deep learning approaches, face limitations in handling varying shadow intensities and preserving document details. To address these issues, we propose DocDeshadower, a novel multi-frequency Transformer-based model built upon the Laplacian Pyramid. By decomposing the shadow image into multiple frequency bands and employing two critical modules, the Attention-Aggregation Network for low-frequency shadow removal and the Gated Multi-scale Fusion Transformer for global refinement, DocDeshadower effectively removes shadows at different scales while preserving document content. Extensive experiments demonstrate DocDeshadower's superior performance compared to…
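The frequency decomposition the abstract refers to can be illustrated with a minimal Laplacian pyramid sketch. This is not the authors' implementation: the function names are hypothetical, a 2x2 box filter stands in for the usual Gaussian kernel, and the image sides are assumed to be divisible by 2 raised to the number of levels.

```python
import numpy as np

def downsample(img):
    """Halve height and width with 2x2 average pooling (box-filter stand-in for a Gaussian blur)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return (bands, residual): high-frequency bands from fine to coarse, plus the low-frequency residual."""
    bands = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        bands.append(current - upsample(low))  # detail lost by going one level coarser
        current = low
    return bands, current

def reconstruct(bands, residual):
    """Invert build_laplacian_pyramid by adding the bands back from coarse to fine."""
    current = residual
    for band in reversed(bands):
        current = upsample(current) + band
    return current

# Hypothetical 32x32 grayscale "document" and a uniformly darkened copy of it.
rng = np.random.default_rng(0)
page = rng.random((32, 32))
shaded = page - 0.3

bands, residual = build_laplacian_pyramid(page, levels=3)
shaded_bands, shaded_residual = build_laplacian_pyramid(shaded, levels=3)
restored = reconstruct(bands, residual)
```

In this simplified setting a uniform darkening leaves every high-frequency band unchanged and shows up only in the low-frequency residual, which gives some intuition for treating the low-frequency band separately from the detail bands. Real shadows have soft edges and varying intensity, so the separation is not this clean in practice.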
Taxonomy
Topics: Geophysical Methods and Applications · Handwritten Text Recognition Techniques · Digital Media Forensic Detection
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Dense Connections · Linear Layer · Dropout · Adam · Label Smoothing · Absolute Position Encodings · Byte Pair Encoding
