SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection

Yonghui Wang; Shaokai Liu; Li Li; Wengang Zhou; Houqiang Li

arXiv:2408.03521·cs.CV·August 8, 2024

TL;DR

SwinShadow introduces a transformer-based approach utilizing shifted window mechanisms to improve adjacent shadow detection, especially when object and shadow colors are similar, outperforming existing methods on benchmark datasets.

Contribution

The paper proposes a Swin Transformer-based architecture that combines shifted-window attention with deep supervision and double attention modules to improve adjacent shadow detection.

Findings

01

Achieves a lower balance error rate (BER, where lower is better) on the SBU, UCF, and ISTD datasets.

02

Effectively distinguishes shadows from similar-colored objects.

03

Outperforms existing shadow detection methods.
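
The BER figure cited in the findings is the balance error rate commonly used in shadow detection: 100 × (1 − ½ (TP / (TP + FN) + TN / (TN + FP))), with shadow pixels as the positive class. The sketch below computes it for flat lists of 0/1 labels. The function name and the toy labels are illustrative assumptions, not taken from the paper or its repository.

```python
def balance_error_rate(pred, truth):
    """Return BER in percent for two equal-length sequences of 0/1 labels.

    1 marks a shadow pixel, 0 a non-shadow pixel. Lower is better.
    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    n_pos = tp + fn  # ground-truth shadow pixels
    n_neg = tn + fp  # ground-truth non-shadow pixels
    if n_pos == 0 or n_neg == 0:
        raise ValueError("BER needs both shadow and non-shadow pixels")

    # Average the per-class accuracies, then convert to an error percentage.
    return 100.0 * (1.0 - 0.5 * (tp / n_pos + tn / n_neg))


# Toy example: 4 shadow pixels (one missed), 8 non-shadow pixels (one false alarm).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(balance_error_rate(pred, truth))  # 18.75
```

Because the two classes are averaged rather than pooled, a detector cannot score well simply by labelling the (usually much larger) non-shadow region correctly.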

Abstract

Shadow detection is a fundamental and challenging task in many computer vision applications. Intuitively, most shadows come from the occlusion of light by the object itself, resulting in the object and its shadow being contiguous (referred to as the adjacent shadow in this paper). In this case, when the color of the object is similar to that of the shadow, existing methods struggle to achieve accurate detection. To address this problem, we present SwinShadow, a transformer-based architecture that fully utilizes the powerful shifted window mechanism for detecting adjacent shadows. The mechanism operates in two steps. Initially, it applies local self-attention within a single window, enabling the network to focus on local details. Subsequently, it shifts the attention windows to facilitate inter-window attention, enabling the capture of a broader range of adjacent information. These…
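
The two-step mechanism in the abstract can be illustrated with a small sketch of how window membership changes under a shift. It assumes the usual Swin Transformer convention of a cyclic roll by half the window size before regular partitioning. The function names, grid size, and window size are hypothetical, and the attention computation and the masking of wrapped-around positions are omitted, so this is not the authors' implementation.

```python
def window_of(row, col, height, width, window, shift=0):
    """Return the (row_block, col_block) id of the window containing a position.

    With shift > 0 the feature map is treated as cyclically rolled by `shift`
    toward the top-left before being cut into regular windows.
    """
    rolled_row = (row - shift) % height
    rolled_col = (col - shift) % width
    return (rolled_row // window, rolled_col // window)


def share_window(a, b, height, width, window, shift=0):
    """True if positions a and b fall in the same attention window."""
    return (window_of(*a, height, width, window, shift)
            == window_of(*b, height, width, window, shift))


# An 8x8 feature map cut into 4x4 windows (assumed sizes).
H = W = 8
WINDOW = 4
SHIFT = WINDOW // 2

# Two horizontally adjacent positions on either side of a window boundary,
# for example an object pixel and the shadow pixel right next to it.
obj_pos, shadow_pos = (3, 3), (3, 4)

# Step 1: regular windows. The neighbours cannot attend to each other.
print(share_window(obj_pos, shadow_pos, H, W, WINDOW))         # False

# Step 2: shifted windows. The same neighbours now share a window.
print(share_window(obj_pos, shadow_pos, H, W, WINDOW, SHIFT))  # True
```

Alternating the two partitions lets information cross window boundaries while each attention step stays local, which is the property the abstract relies on for object and shadow regions that touch.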

Code & Models

Repositories

harrytea/SwinShadow (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Video Surveillance and Tracking Methods

Methods: Linear Layer · Residual Connection · Multi-Head Attention · Stochastic Depth · Attention Is All You Need · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings