Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection

Siyuan Yao; Hao Sun; Tian-Zhu Xiang; Xiao Wang; Xiaochun Cao

arXiv:2408.15020 · cs.CV · September 24, 2024

TL;DR

This paper introduces HGINet, a hierarchical graph interaction network with dynamic token clustering, designed to improve camouflaged object detection by effectively distinguishing objects from backgrounds through hierarchical feature interaction.

Contribution

The paper proposes a novel hierarchical graph interaction transformer with dynamic token clustering for enhanced camouflaged object detection, addressing limitations of existing methods.

Findings

01 HGINet outperforms state-of-the-art methods on multiple datasets.

02 The dynamic token clustering improves local region distinguishability.

03 Hierarchical feature interaction enhances semantic understanding.

Abstract

Camouflaged object detection (COD) aims to identify the objects that seamlessly blend into the surrounding backgrounds. Due to the intrinsic similarity between the camouflaged objects and the background region, it is extremely challenging to precisely distinguish the camouflaged objects by existing approaches. In this paper, we propose a hierarchical graph interaction network termed HGINet for camouflaged object detection, which is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features. Specifically, we first design a region-aware token focusing attention (RTFA) with dynamic token clustering to excavate the potentially distinguishable tokens in the local region. Afterwards, a hierarchical graph interaction transformer (HGIT) is proposed to construct bi-directional aligned communication between hierarchical features in the…
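
To make the idea of clustering tokens in a local region more concrete, the following is a minimal sketch that groups token embeddings with plain k-means in NumPy. It is not the RTFA module or the dynamic token clustering described in the paper, whose details are not given in this abstract; the function name `cluster_tokens`, its parameters, and the choice of k-means are all assumptions made for illustration.

```python
import numpy as np


def cluster_tokens(tokens, num_clusters, num_iters=10, seed=0):
    """Group token embeddings with plain k-means (illustrative stand-in only).

    tokens: array of shape (N, D), one row per token embedding.
    Returns (labels, centers) with shapes (N,) and (num_clusters, D).
    """
    tokens = np.asarray(tokens, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centers from distinct, randomly chosen tokens.
    idx = rng.choice(len(tokens), size=num_clusters, replace=False)
    centers = tokens[idx].copy()

    def assign(current_centers):
        # Squared Euclidean distance from every token to every center.
        diffs = tokens[:, None, :] - current_centers[None, :, :]
        return (diffs ** 2).sum(axis=-1).argmin(axis=1)

    for _ in range(num_iters):
        labels = assign(centers)
        for k in range(num_clusters):
            members = tokens[labels == k]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)

    # Final assignment against the last set of centers.
    return assign(centers), centers
```

In this sketch each row of `tokens` would correspond to one patch embedding from a local window, and the returned `labels` indicate which tokens were grouped together. How HGINet actually selects or weights the clustered tokens is left to the paper and its official repository.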

Code & Models

Repositories

garyson1204/hginet (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection

Methods: Softmax · Attention Is All You Need