Enhancing Shape Perception and Segmentation Consistency for Industrial Image Inspection

Guoxuan Mao; Ting Cao; Ziyang Li; Yuan Dong

arXiv:2505.14718·cs.CV·May 22, 2025

Open Access

TL;DR

This paper introduces SPENet, a shape-aware efficient network for industrial image segmentation that improves consistency and accuracy by focusing on object contours and boundary adaptation, suitable for real-time applications.

Contribution

The paper proposes SPENet, a shape-aware network with separate boundary and body supervision; Variable Boundary Domain (VBD), a method for describing fuzzy boundaries; and CMSE, a new consistency metric, advancing industrial segmentation performance.

Findings

01. Achieves superior segmentation accuracy on industrial datasets.

02. Reduces segmentation inconsistency, as measured by the new CMSE metric.

03. Offers a faster, more efficient model suitable for real-time industrial inspection.

Abstract

Semantic segmentation stands as a pivotal research focus in computer vision. In the context of industrial image inspection, conventional semantic segmentation models fail to maintain the segmentation consistency of fixed components across varying contextual environments due to a lack of perception of object contours. Given the real-time constraints and limited computing capability of industrial image detection machines, it is also necessary to create efficient models to reduce computational complexity. In this work, a Shape-Aware Efficient Network (SPENet) is proposed, which focuses on the shapes of objects to achieve excellent segmentation consistency by separately supervising the extraction of boundary and body information from images. SPENet also introduces a novel method for describing fuzzy boundaries, named Variable Boundary Domain (VBD), to better adapt to real-world scenarios.…
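The idea of supervising boundary and body information separately can be illustrated with a minimal sketch that splits a binary ground-truth mask into a boundary band and an interior body. This is not the paper's implementation, and it is not VBD: the function names `erode` and `split_boundary_body` and the fixed `width` parameter are hypothetical, and plain morphological erosion is used only as a stand-in for whatever boundary definition SPENet actually uses.

```python
import numpy as np


def erode(mask: np.ndarray, width: int) -> np.ndarray:
    """Binary erosion with a 3x3 square element, repeated `width` times."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(width):
        # Pad with background so objects touching the image edge are eroded too.
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        shrunk = np.ones_like(out)
        # A pixel survives only if all 9 pixels in its 3x3 neighbourhood are set.
        for dy in (0, 1, 2):
            for dx in (0, 1, 2):
                shrunk &= padded[dy:dy + h, dx:dx + w]
        out = shrunk
    return out


def split_boundary_body(mask: np.ndarray, width: int = 1):
    """Split a binary mask into (boundary, body) targets.

    `width` is an assumed, hand-picked band thickness in pixels.
    """
    mask = mask.astype(bool)
    body = erode(mask, width)   # interior pixels far enough from the contour
    boundary = mask & ~body     # the remaining band along the contour
    return boundary, body


# Toy example: a 5x5 square object inside a 7x7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True
boundary, body = split_boundary_body(mask, width=1)
```

The two resulting masks could then serve as separate supervision targets for a boundary branch and a body branch. The band thickness here is fixed by hand; how VBD describes fuzzy boundaries is not specified in the excerpt above.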

Taxonomy

Topics: Industrial Vision Systems and Defect Detection · Image and Object Detection Techniques · Image Processing Techniques and Applications

Methods: Focus · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings