Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects
Feng Yan, Xiaoheng Jiang, Yang Lu, Lisha Cui, Shupan Li, Jiale Cao, Mingliang Xu, and Dacheng Tao

TL;DR
This paper introduces GCANet, a lightweight neural network that combines a transformer encoder with novel attention modules to improve surface defect detection accuracy while maintaining real-time efficiency.
Contribution
The paper proposes a novel global context aggregation network with a Depth-wise Self-Attention module and Channel Reference Attention, enhancing feature representation in lightweight defect detection models.
Findings
Achieves a weighted F-measure (Fβw) of 91.79% on the SD-saliency-900 dataset.
Runs at 272 fps on a single GPU.
Outperforms 17 state-of-the-art methods in the accuracy-efficiency trade-off.
Abstract
Surface defect inspection is a very challenging task in which surface defects usually show weak appearances or exist under complex backgrounds. Most high-accuracy defect detection methods require expensive computation and storage overhead, making them less practical in some resource-constrained defect detection applications. Although some lightweight methods have achieved real-time inference speed with fewer parameters, they show poor detection accuracy in complex defect scenarios. To this end, we develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure. First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module. The proposed DSA performs element-wise similarity in channel…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
