SGANet: Semantic and Geometric Alignment for Multimodal Multi-view Anomaly Detection
Letian Bai, Chengyu Tao, Juan Du

TL;DR
SGANet is a framework for multimodal multi-view anomaly detection that aligns semantic and geometric features across viewpoints and modalities, achieving state-of-the-art results.
Contribution
SGANet introduces a unified framework that combines semantic and geometric alignment modules to detect anomalies more accurately across multiple views and modalities.
Findings
SGANet outperforms existing methods on the SiM3D and Eyecandies datasets.
The framework improves both anomaly detection and anomaly localization accuracy.
Extensive experiments validate its robustness in industrial scenarios.
Abstract
Multi-view anomaly detection aims to identify surface defects on complex objects using observations captured from multiple viewpoints. However, existing unsupervised methods often suffer from feature inconsistency arising from viewpoint variations and modality discrepancies. To address these challenges, we propose a Semantic and Geometric Alignment Network (SGANet), a unified framework for multimodal multi-view anomaly detection that effectively combines semantic and geometric alignment to learn physically coherent feature representations across viewpoints and modalities. SGANet consists of three key components. The Selective Cross-view Feature Refinement Module (SCFRM) selectively aggregates informative patch features from adjacent views to enhance cross-view feature interaction. The Semantic-Structural Patch Alignment (SSPA) enforces semantic alignment across modalities while…
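As a rough illustration of what "selectively aggregating informative patch features from adjacent views" could look like, the sketch below blends each patch feature of one view with its k most similar patch features from a neighboring view. The abstract does not say how SCFRM selects or weights patches, so the cosine-similarity top-k selection, the softmax weighting, and the names `refine_with_adjacent_view`, `k`, and `alpha` are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-8)


def refine_with_adjacent_view(feats_a, feats_b, k=3, alpha=0.5):
    """Hypothetical cross-view refinement (an assumption, not the paper's SCFRM).

    feats_a: patch features of the current view, a list of equal-length vectors.
    feats_b: patch features of an adjacent view, same vector length.
    k:       number of adjacent-view patches selected per patch.
    alpha:   blend weight given to the aggregated adjacent-view feature.
    """
    refined = []
    for patch in feats_a:
        sims = [cosine(patch, other) for other in feats_b]
        # Keep only the k most similar adjacent-view patches (the "selective" step).
        top = sorted(range(len(feats_b)), key=lambda j: sims[j], reverse=True)[:k]
        # Softmax over the selected similarities gives the aggregation weights.
        max_sim = max(sims[j] for j in top)
        weights = [math.exp(sims[j] - max_sim) for j in top]
        total = sum(weights)
        aggregated = [
            sum(w * feats_b[j][d] for w, j in zip(weights, top)) / total
            for d in range(len(patch))
        ]
        # Blend the original patch feature with the aggregated cross-view feature.
        refined.append(
            [(1 - alpha) * x + alpha * y for x, y in zip(patch, aggregated)]
        )
    return refined
```

With `alpha=0` the function returns the input features unchanged, and with `k=1` and `alpha=1` each patch is replaced by its single best match in the adjacent view, which makes the two extremes of the blend easy to check by hand.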
