TL;DR
This paper introduces MHENet, an RGB-D camouflaged object detection (COD) framework that enhances RGB texture features and depth geometry features separately with hierarchical modules before fusing them adaptively, and reports better accuracy than 16 prior methods on four benchmarks.
Contribution
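The paper proposes modality-specific hierarchical enhancement for RGB-D COD: a Texture Hierarchical Enhancement Module (THEM) for RGB features and a Geometry Hierarchical Enhancement Module (GHEM) for depth features, both applied before adaptive fusion instead of fusing backbone features directly.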
Findings
MHENet outperforms 16 state-of-the-art methods on four benchmarks.
Hierarchical modules effectively amplify subtle texture and geometric cues.
Adaptive fusion improves the integration of RGB and depth features.
Abstract
Camouflaged object detection (COD) is challenging due to high target-background similarity, and recent methods address this by combining complementary RGB texture and depth geometry cues. However, RGB-D COD methods still underutilize modality-specific cues, which limits fusion quality. We believe this is because RGB and depth features are fused directly after backbone extraction, without modality-specific enhancement. To address this limitation, we propose MHENet, an RGB-D COD framework that performs modality-specific hierarchical enhancement and adaptive fusion of RGB and depth features. Specifically, we introduce a Texture Hierarchical Enhancement Module (THEM) to amplify subtle texture variations by extracting high-frequency information and a Geometry Hierarchical Enhancement Module (GHEM) to enhance geometric structures via learnable gradient extraction, while preserving cross-scale…
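To make the three ideas in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how THEM, GHEM, or the fusion step are built, so every function name and design choice below is an assumption made for illustration. High-frequency extraction is approximated as the residual after a box blur, gradient extraction uses fixed Sobel kernels as a stand-in for kernels that would be learned, and adaptive fusion is shown as a per-pixel sigmoid gate.

```python
import numpy as np

# Fixed Sobel kernel; a learnable variant could start from these values
# and update them during training (assumption, not from the paper).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)


def filter2d(x, kernel):
    """Cross-correlate a 2D map with a square kernel, using edge padding."""
    k = kernel.shape[0]
    xp = np.pad(x, k // 2, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out


def high_frequency(x, k=3):
    """Hypothetical texture cue: the input minus its k x k box-blurred version."""
    box = np.full((k, k), 1.0 / (k * k))
    return x - filter2d(x, box)


def gradient_magnitude(depth, kx=SOBEL_X):
    """Hypothetical geometry cue: gradient magnitude from a kernel and its transpose."""
    gx = filter2d(depth, kx)
    gy = filter2d(depth, kx.T)
    return np.sqrt(gx ** 2 + gy ** 2)


def gated_fusion(f_rgb, f_depth, w=1.0, b=0.0):
    """Hypothetical adaptive fusion: a sigmoid gate blends the two maps per pixel.

    w and b stand in for parameters that would be learned.
    """
    gate = 1.0 / (1.0 + np.exp(-(w * (f_rgb - f_depth) + b)))
    return gate * f_rgb + (1.0 - gate) * f_depth


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb_gray = rng.random((8, 8))                          # stand-in for an RGB feature map
    depth = np.tile(np.arange(8, dtype=float), (8, 1))     # depth ramp along x
    fused = gated_fusion(high_frequency(rgb_gray), gradient_magnitude(depth))
    print(fused.shape)
```

In this sketch each modality gets its own enhancement step before the two maps meet, which mirrors the ordering the abstract argues for; a real model would apply such steps to multi-channel backbone features at several scales.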
