CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation
Jialun Pei, Tao Jiang, He Tang, Nian Liu, Yueming Jin, Deng-Ping Fan, Pheng-Ann Heng

TL;DR
CalibNet introduces a dual-branch architecture that calibrates RGB and depth features for improved salient instance segmentation, achieving state-of-the-art results on multiple benchmarks.
Contribution
The paper presents a dual-branch cross-modal calibration framework for RGB-D salient instance segmentation, built from three modules (DIK, WSF, and DSA), and contributes a new dataset, DSIS.
Findings
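Achieves 58.0% AP on the COME15K-N benchmark with a 320×480 input size.
Is evaluated on three benchmarks, where the authors report that it outperforms existing methods.
Introduces the DSIS dataset, which contains 1,940 images with instance-level annotations.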
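For context on the AP figure above: instance segmentation AP is conventionally built on the intersection-over-union (IoU) between a predicted mask and a ground-truth mask. The following is a minimal sketch of that building block only, assuming binary masks of equal shape; the function name `mask_iou` and the toy masks are illustrative and do not come from the paper or its evaluation code.

```python
import numpy as np

def mask_iou(pred, gt):
    """IoU between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; define IoU as 0 to avoid division by zero.
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)

# Toy example (hypothetical data): a 2x2 prediction inside a 2x3 ground truth.
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:2] = True   # 4 pixels
gt = np.zeros((4, 4), dtype=bool)
gt[0:2, 0:3] = True     # 6 pixels
iou = mask_iou(pred, gt)  # intersection 4, union 6
```

A prediction typically counts as a match when its IoU with a ground-truth instance exceeds a threshold, and AP summarizes precision over such matches.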
Abstract
We propose a novel approach for RGB-D salient instance segmentation using a dual-branch cross-modal feature calibration architecture called CalibNet. Our method simultaneously calibrates depth and RGB features in the kernel and mask branches to generate instance-aware kernels and mask features. CalibNet consists of three simple modules: a dynamic interactive kernel (DIK) and a weight-sharing fusion (WSF), which work together to generate effective instance-aware kernels and integrate cross-modal features, and a depth similarity assessment (DSA) module, which is applied prior to DIK and WSF to improve the quality of depth features. In addition, we contribute a new DSIS dataset, which contains 1,940 images with elaborate instance-level annotations. Extensive experiments on three challenging benchmarks show that CalibNet yields a promising result, i.e., 58.0% AP with 320×480 input size on…
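The kernel-and-mask design described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the common dynamic-kernel formulation, in which each instance-aware kernel is a vector applied as a 1×1 convolution over shared mask features; the abstract does not spell out these details. The `fuse` function is a hypothetical placeholder (a plain average) and is not the paper's DIK or WSF module; all names, shapes, and random inputs are illustrative assumptions.

```python
import numpy as np

def fuse(rgb, depth):
    # Hypothetical stand-in for cross-modal fusion: a plain average.
    # This is NOT the paper's DIK or WSF module.
    return 0.5 * (rgb + depth)

def masks_from_kernels(kernels, mask_features):
    """Apply N instance kernels (N, C) to mask features (C, H, W).

    Each kernel acts as a 1x1 convolution, giving one soft mask per instance.
    """
    n, c = kernels.shape
    c_feat, h, w = mask_features.shape
    if c != c_feat:
        raise ValueError("kernel and feature channel counts must match")
    logits = kernels @ mask_features.reshape(c, h * w)  # (N, H*W)
    logits = logits.reshape(n, h, w)
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> values in (0, 1)

# Random tensors stand in for backbone outputs (assumed shapes).
rng = np.random.default_rng(0)
C, H, W, N = 8, 4, 6, 3
rgb_mask_feat = rng.standard_normal((C, H, W))
depth_mask_feat = rng.standard_normal((C, H, W))
rgb_kernels = rng.standard_normal((N, C))
depth_kernels = rng.standard_normal((N, C))

mask_features = fuse(rgb_mask_feat, depth_mask_feat)  # mask branch
kernels = fuse(rgb_kernels, depth_kernels)            # kernel branch
masks = masks_from_kernels(kernels, mask_features)    # (N, H, W) soft masks
binary_masks = masks > 0.5                            # one binary mask per instance
```

The point of the sketch is the data flow: both branches combine RGB and depth information, and the final instance masks come from multiplying the fused kernels with the fused mask features.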
Taxonomy
Topics: Visual Attention and Saliency Detection · Virtual Reality Applications and Impacts · Advanced Image and Video Retrieval Techniques
