DeS3: Adaptive Attention-driven Self and Soft Shadow Removal using ViT Similarity
Yeying Jin, Wei Ye, Wenhan Yang, Yuan Yuan, Robby T. Tan

TL;DR
DeS3 removes hard, soft, and self shadows from a single image by combining adaptive attention with a ViT similarity loss, and it outperforms existing shadow removal methods.
Contribution
The paper proposes a novel shadow removal method using adaptive attention and ViT similarity loss, enhancing scene structure recovery without relying on binary shadow masks.
Findings
Outperforms state-of-the-art methods on multiple datasets
Achieves 16% lower RMSE than prior state-of-the-art methods on the LRSS dataset
Effectively removes various shadow types including soft and self shadows
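To make the RMSE finding concrete, the sketch below shows how a relative RMSE reduction of this kind is typically computed. It is only an illustration: the function names `rmse` and `relative_reduction` are hypothetical, the pixel values are invented toy numbers rather than results from the paper, and the paper's exact evaluation protocol (color space, masking, resolution) is not stated in this summary.

```python
import math


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences of pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    squared_error = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared_error / len(pred))


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Toy values only (NOT from the paper): a shadow-free target and two predictions.
target = [0.0, 0.0, 0.0, 0.0]
baseline_pred = [5.0, 5.0, 5.0, 5.0]  # stands in for a prior method's output
improved_pred = [4.2, 4.2, 4.2, 4.2]  # stands in for a better method's output

baseline_rmse = rmse(baseline_pred, target)
improved_rmse = rmse(improved_pred, target)
print(f"relative RMSE reduction: {relative_reduction(baseline_rmse, improved_rmse):.2%}")
```

A reduction reported this way is relative to the baseline's error, so its absolute size depends on how large the baseline RMSE was.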
Abstract
Removing soft and self shadows that lack clear boundaries from a single image is still challenging. Self shadows are shadows that are cast on the object itself. Most existing methods rely on binary shadow masks, without considering the ambiguous boundaries of soft and self shadows. In this paper, we present DeS3, a method that removes hard, soft and self shadows based on adaptive attention and ViT similarity. Our novel ViT similarity loss utilizes features extracted from a pre-trained Vision Transformer. This loss helps guide the reverse sampling towards recovering scene structures. Our adaptive attention is able to differentiate shadow regions from the underlying objects, as well as shadow regions from the object casting the shadow. This capability enables DeS3 to better recover the structures of objects even when they are partially occluded by shadows. Different from existing methods…
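The abstract describes the ViT similarity loss only at a high level: features from a pre-trained Vision Transformer are compared to guide recovery of scene structure. As a rough sketch of what a feature-similarity loss can look like, the code below takes two sets of token features (assumed to have been extracted beforehand by some pre-trained ViT, which is not included here) and returns one minus their mean cosine similarity. The names `cosine_similarity` and `feature_similarity_loss` are hypothetical, and the choice of cosine distance is an assumption for illustration, not the paper's confirmed formulation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def feature_similarity_loss(feats_pred, feats_ref):
    """One minus the mean per-token cosine similarity.

    feats_pred, feats_ref: lists of token feature vectors, assumed to come
    from the same pre-trained ViT applied to the predicted image and to a
    reference image (hypothetical setup, not taken from the paper).
    """
    if len(feats_pred) != len(feats_ref) or not feats_pred:
        raise ValueError("feature lists must be non-empty and of equal length")
    sims = [cosine_similarity(p, r) for p, r in zip(feats_pred, feats_ref)]
    return 1.0 - sum(sims) / len(sims)


# Toy token features standing in for real ViT outputs.
reference_tokens = [[1.0, 0.0], [0.0, 1.0]]
matching_tokens = [[2.0, 0.0], [0.0, 3.0]]    # same directions, different scale
mismatched_tokens = [[0.0, 1.0], [1.0, 0.0]]  # orthogonal directions

print(feature_similarity_loss(matching_tokens, reference_tokens))
print(feature_similarity_loss(mismatched_tokens, reference_tokens))
```

Because cosine similarity ignores vector magnitude, a loss of this form compares the direction of the features rather than their raw intensity. That is one plausible reason a feature-space loss could favor structure over brightness, though this summary does not confirm that the paper relies on this property.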
Taxonomy
Topics: Image and Signal Denoising Methods · Image Enhancement Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Layer Normalization · Softmax · Adam · Absolute Position Encodings · Residual Connection
