CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
Zhuoyan Luo, Yinghao Wu, Tianheng Cheng, Yong Liu, Yicheng Xiao, Hongfa Wang, Xiao-Ping Zhang, Yujiu Yang

TL;DR
This paper introduces CoHD, a hierarchical decoding framework that enhances generalized referring expression segmentation by incorporating counting and multi-granularity object understanding, significantly improving performance over existing methods.
Contribution
The paper proposes a novel counting-aware hierarchical decoding framework (CoHD) that decouples referring semantics into different granularities and incorporates counting supervision for better object comprehension.
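To make the distinction between binary object-existence identification and counting supervision concrete, the following is a minimal illustrative sketch. It is not the paper's implementation: the function names, the scenario names, and the choice of a single total count as the target are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def existence_label(referents: List[str]) -> int:
    """Binary target: 1 if the expression refers to at least one object, else 0."""
    return 1 if referents else 0


def count_label(referents: List[str]) -> int:
    """Counting target (assumed form): the number of referred objects."""
    return len(referents)


# Hypothetical referent lists for the three kinds of scenario in GRES.
scenarios: Dict[str, List[str]] = {
    "no_target": [],
    "single_target": ["person"],
    "multi_target": ["person", "person", "dog"],
}

# Map each scenario to its (existence, count) pair of targets.
labels: Dict[str, Tuple[int, int]] = {
    name: (existence_label(refs), count_label(refs))
    for name, refs in scenarios.items()
}

for name, (exists, count) in labels.items():
    print(f"{name}: existence={exists}, count={count}")
```

In this toy setup the single-target and multi-target scenarios receive the same existence label and differ only in their count, which is the kind of difference a purely binary signal cannot express.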
Findings
Outperforms state-of-the-art GRES methods on multiple benchmarks.
Effectively models multi-granularity object information and counting.
Demonstrates significant accuracy improvements in complex referring scenarios.
Abstract
The newly proposed Generalized Referring Expression Segmentation (GRES) amplifies the formulation of classic RES by involving complex multiple/non-target scenarios. Recent approaches address GRES by directly extending the well-adopted RES frameworks with object-existence identification. However, these approaches tend to encode multi-granularity object information into a single representation, which makes it difficult to precisely represent comprehensive objects of different granularity. Moreover, the simple binary object-existence identification across all referent scenarios fails to specify their inherent differences, incurring ambiguity in object understanding. To tackle the above issues, we propose a Counting-Aware Hierarchical Decoding framework (CoHD) for GRES. By decoupling the intricate referring semantics into different granularity with a…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Focus
