CSDN: A Context-Gated Self-Adaptive Detection Network for Real-Time Object Detection
Haolin Wei

TL;DR
CSDN introduces a Transformer-based detection head that enhances global context modeling and adapts to objects of various sizes, significantly improving real-time object detection accuracy with minimal fine-tuning.
Contribution
The paper proposes CSDN, a context-gated, scale-adaptive detection network inspired by human visual perception. It targets two problems: the limited receptive fields of CNNs and the redundancy found in the self-attention module of DETR-inspired detection heads.
Findings
CSDN improves detection accuracy across multiple CNN-based detectors.
The method enhances global context understanding and scale adaptation.
Minimal fine-tuning yields significant performance gains.
Abstract
Convolutional neural networks (CNNs) have long been the cornerstone of object detection, but they are often constrained by limited receptive fields, which hinder their ability to capture global contextual information. We re-examined the DETR-inspired detection head and found substantial redundancy in its self-attention module. To address these problems, we introduce the Context-Gated Scale-Adaptive Detection Network (CSDN), a Transformer-based detection head inspired by human visual perception: when observing an object, we focus on one spot, perceive the surrounding environment, and glance around the object. This mechanism enables each region of interest (ROI) to adaptively select and combine feature dimensions and scale information from different patterns. CSDN provides more powerful global context modeling capabilities and can better adapt to objects of different sizes and…
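The gating idea described in the abstract, where each ROI adaptively weights and combines features from several patterns, can be illustrated with a toy sketch. This is not the paper's implementation. The three branch names (`focus`, `surround`, `glance`), the dot-product scoring, and the softmax gate are all assumptions chosen only to show what "adaptively select and combine" could look like for a single ROI.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(branches, gate_weights):
    """Fuse per-ROI branch features with a softmax gate (hypothetical design).

    branches:     list of equal-length feature vectors, one per branch
    gate_weights: one weight vector per branch, used to score that branch
    Returns the gate values and the gate-weighted sum of the branches.
    """
    # Score each branch by a dot product with its own gate weights.
    scores = [
        sum(w * x for w, x in zip(gw, feat))
        for gw, feat in zip(gate_weights, branches)
    ]
    gates = softmax(scores)
    dim = len(branches[0])
    # Weighted sum of branch features, one output value per dimension.
    fused = [
        sum(g * feat[i] for g, feat in zip(gates, branches))
        for i in range(dim)
    ]
    return gates, fused


# Hypothetical features for one ROI from three assumed branches.
focus = [1.0, 0.0]      # local detail at the attended spot
surround = [0.0, 1.0]   # context from the surrounding region
glance = [0.5, 0.5]     # coarse view around the object

# Illustrative gate weights; in a trained model these would be learned.
gate_weights = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

gates, fused = gated_fusion([focus, surround, glance], gate_weights)
print(gates, fused)
```

With these illustrative weights every branch receives the same score, so the gate is uniform and the fused feature is the plain average of the three branches. Different gate weights would shift the mix toward one branch.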
