MINet: Multi-scale Interactive Network for Real-time Salient Object Detection of Strip Steel Surface Defects

Kunye Shen; Xiaofei Zhou; Zhi Liu

arXiv:2405.16096·cs.CV·May 28, 2024

TL;DR

MINet is a lightweight, real-time neural network designed for efficient salient object detection of strip steel surface defects, achieving high speed and accuracy with fewer parameters.

Contribution

The paper introduces a multi-scale interactive module and a lightweight network architecture that significantly reduces parameters and computational cost for defect detection.

Findings

01

Achieves 721 FPS on a GPU and 6.3 FPS on a CPU for 368×368 images.

02

Uses only 0.28 million parameters, fewer than many existing methods.

03

Maintains competitive detection accuracy with state-of-the-art approaches.
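
Frames per second and per-image latency are reciprocals, so the reported throughput figures can be read as time budgets per image. The sketch below does that conversion and shows one generic way to time a callable. `fps_to_latency_ms` and `measure_fps` are hypothetical helpers written for illustration; they are not the authors' benchmarking code, and the timing loop omits the device synchronization a real GPU benchmark would need.

```python
import time


def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput (frames per second) to per-frame latency in ms."""
    return 1000.0 / fps


def measure_fps(fn, n_warmup: int = 5, n_runs: int = 50) -> float:
    """Return the average number of calls to `fn` completed per second.

    Hypothetical helper: warm-up calls are discarded, then `n_runs` calls
    are timed with a monotonic high-resolution clock.
    """
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


# Throughput values reported on this page for 368x368 inputs.
gpu_latency_ms = fps_to_latency_ms(721.0)
cpu_latency_ms = fps_to_latency_ms(6.3)
print(f"GPU: {gpu_latency_ms:.2f} ms/image, CPU: {cpu_latency_ms:.1f} ms/image")
```

In other words, 721 FPS corresponds to roughly 1.4 ms per image and 6.3 FPS to roughly 159 ms per image.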

Abstract

Automated surface defect detection is a fundamental task in industrial production, and existing saliency-based works handle challenging scenes and give promising detection results. However, these cutting-edge efforts often suffer from large parameter size, heavy computational cost, and slow inference speed, which heavily limit their practical application. To this end, we devise a multi-scale interactive (MI) module, which employs depthwise convolution (DWConv) and pointwise convolution (PWConv) to independently extract and interactively fuse features of different scales, respectively. In particular, the MI module can provide satisfactory characterization of defect regions with fewer parameters. Building on this module, we propose a lightweight Multi-scale Interactive Network (MINet) to conduct real-time salient object detection of strip steel surface defects. Comprehensive…
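
To give a feel for why pairing DWConv with PWConv keeps the parameter count low, the sketch below counts weights for a hypothetical multi-branch block: one depthwise convolution per scale, followed by a single pointwise convolution that fuses the concatenated branches. The branch layout, the kernel sizes (3, 5, 7), and the channel width (32) are assumptions chosen for illustration; the abstract does not specify how the MI module realizes its scales, so this is not the authors' architecture. Biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k


def depthwise_params(channels: int, k: int) -> int:
    """Weights in a k x k depthwise convolution: one filter per channel."""
    return channels * k * k


def pointwise_params(c_in: int, c_out: int) -> int:
    """Weights in a 1 x 1 (pointwise) convolution."""
    return c_in * c_out


def multi_branch_params(channels: int, kernel_sizes) -> int:
    """Hypothetical block: one depthwise branch per kernel size, then a
    pointwise convolution fusing the concatenated branch outputs."""
    dw = sum(depthwise_params(channels, k) for k in kernel_sizes)
    pw = pointwise_params(channels * len(kernel_sizes), channels)
    return dw + pw


def standard_branch_params(channels: int, kernel_sizes) -> int:
    """Same layout, but with standard convolutions in each branch."""
    conv = sum(standard_conv_params(channels, channels, k) for k in kernel_sizes)
    pw = pointwise_params(channels * len(kernel_sizes), channels)
    return conv + pw


# Assumed settings, for illustration only.
channels, kernels = 32, (3, 5, 7)
light = multi_branch_params(channels, kernels)
heavy = standard_branch_params(channels, kernels)
print(f"depthwise + pointwise: {light} weights")
print(f"standard convolutions: {heavy} weights")
```

Under these assumed settings the depthwise variant needs 5,728 weights against 88,064 for the standard one, because each depthwise filter touches a single channel and all cross-channel mixing is left to the 1×1 fusion.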

Code & Models

Repositories

kunye-shen/minet
PyTorch · Official

Taxonomy

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · Depthwise Convolution · Pointwise Convolution