ShaDocFormer: A Shadow-Attentive Threshold Detector With Cascaded Fusion Refiner for Document Shadow Removal
Weiwen Chen, Yingtie Lei, Shenghong Luo, Ziyang Zhou, Mingxian Li, Chi-Man Pun

TL;DR
ShaDocFormer is a Transformer-based model that effectively detects and removes shadows from document images captured by mobile devices, improving readability through a novel shadow detection and image restoration approach.
Contribution
It introduces ShaDocFormer, combining a shadow-attentive threshold detector and a cascaded fusion refiner for accurate shadow mask detection and image restoration.
Findings
Outperforms state-of-the-art methods in shadow removal quality
Achieves higher accuracy in shadow mask detection
Demonstrates effective handling of illumination variations
Abstract
Document shadow is a common issue that arises when capturing documents using mobile devices, which significantly impacts readability. Current methods encounter various challenges, including inaccurate detection of shadow masks and estimation of illumination. In this paper, we propose ShaDocFormer, a Transformer-based architecture that integrates traditional methodologies and deep learning techniques to tackle the problem of document shadow removal. The ShaDocFormer architecture comprises two components: the Shadow-attentive Threshold Detector (STD) and the Cascaded Fusion Refiner (CFR). The STD module employs a traditional thresholding technique and leverages the attention mechanism of the Transformer to gather global information, thereby enabling precise detection of shadow masks. The cascaded and aggregative structure of the CFR module facilitates a coarse-to-fine restoration process…
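The abstract says the STD module starts from a traditional thresholding technique but, in the text available here, does not name it. As a rough illustration only, the sketch below assumes Otsu's method as a stand-in and produces a coarse binary mask from 8-bit grayscale values. The function names `otsu_threshold` and `coarse_shadow_mask` are hypothetical and do not come from the paper, and the attention-based part of the STD is not modeled.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance.

    Assumption: Otsu's method stands in for the unnamed "traditional
    thresholding technique" mentioned in the abstract.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class (values <= t)
    sum0 = 0  # intensity sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is in the dark class; nothing left to split
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def coarse_shadow_mask(image):
    """Mark pixels at or below the Otsu threshold as 1 (candidate shadow).

    `image` is a list of rows of integers in 0..255 (hypothetical input format).
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    mask = [[1 if p <= t else 0 for p in row] for row in image]
    return mask, t


if __name__ == "__main__":
    page = [[200, 200, 60],
            [200, 60, 60]]
    mask, t = coarse_shadow_mask(page)
    print(t, mask)
```

A purely intensity-based mask like this would also flag dark text as shadow, which is one plausible reason to combine thresholding with the global context that the abstract attributes to the Transformer attention mechanism.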
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Adam · Spatial-Channel Token Distillation · Byte Pair Encoding · Softmax
