Modality-Specific Hierarchical Enhancement for RGB-D Camouflaged Object Detection

Yuzhen Niu; Yangqing Wang; Ri Cheng; Fusheng Li; Rongshen Wang; Zhichen Yang

arXiv:2604.02935 · cs.CV · April 6, 2026

TL;DR

This paper introduces MHENet, an RGB-D camouflaged object detection framework that enhances texture and geometry features separately through hierarchical modules and then fuses them adaptively. The authors report that it outperforms 16 state-of-the-art methods on four benchmarks.

Contribution

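The paper proposes modality-specific hierarchical enhancement for RGB-D COD: a Texture Hierarchical Enhancement Module (THEM) for texture cues and a Geometry Hierarchical Enhancement Module (GHEM) for geometric cues, followed by adaptive fusion of RGB and depth features, rather than fusing the two directly after backbone extraction.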

Findings

01 MHENet outperforms 16 state-of-the-art methods on four benchmarks.

02 Hierarchical modules effectively amplify subtle texture and geometric cues.

03 Adaptive fusion improves the integration of RGB and depth features.

Abstract

Camouflaged object detection (COD) is challenging due to high target-background similarity, and recent methods address this by complementarily using RGB-D texture and geometry cues. However, RGB-D COD methods still underutilize modality-specific cues, which limits fusion quality. We believe this is because RGB and depth features are fused directly after backbone extraction without modality-specific enhancement. To address this limitation, we propose MHENet, an RGB-D COD framework that performs modality-specific hierarchical enhancement and adaptive fusion of RGB and depth features. Specifically, we introduce a Texture Hierarchical Enhancement Module (THEM) to amplify subtle texture variations by extracting high-frequency information and a Geometry Hierarchical Enhancement Module (GHEM) to enhance geometric structures via learnable gradient extraction, while preserving cross-scale…
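To make the two enhancement ideas in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes that "high-frequency information" can be approximated by subtracting a box-blurred copy from a feature map, and that "gradient extraction" can be approximated by Sobel filtering of a depth map. The function names (`conv2d_same`, `high_frequency`, `gradient_magnitude`) and the fixed Sobel kernels are hypothetical; in a learnable variant the kernels would be trainable parameters, possibly initialised this way.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Cross-correlate a 2-D array with a small odd-sized kernel.

    Edge-replicated padding keeps the output the same shape as the input.
    """
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w), dtype=float)
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def high_frequency(feature, size=3):
    """Assumed stand-in for texture enhancement: feature minus its box blur."""
    box = np.full((size, size), 1.0 / size**2)
    return feature - conv2d_same(feature, box)


# Fixed Sobel kernels (an assumption); a learnable module would train these.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def gradient_magnitude(depth, kx=SOBEL_X, ky=SOBEL_Y):
    """Assumed stand-in for geometry enhancement: gradient magnitude of depth."""
    gx = conv2d_same(depth, kx)
    gy = conv2d_same(depth, ky)
    return np.sqrt(gx**2 + gy**2)


if __name__ == "__main__":
    # Toy depth map with a vertical step edge between columns 2 and 3.
    depth = np.zeros((5, 6))
    depth[:, 3:] = 1.0
    print(gradient_magnitude(depth))
    # A constant feature map has no high-frequency content.
    print(np.abs(high_frequency(np.ones((5, 5)))).max())
```

On the toy depth map, the gradient response is concentrated in the two columns on either side of the step, which illustrates why gradient-style filtering highlights geometric boundaries that a camouflaged object may still produce in depth even when its RGB texture blends into the background.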

Code & Models

Repositories

afdsgh/MHENet (GitHub)
