Dynamic Graph Attention for Referring Expression Comprehension
Sibei Yang, Guanbin Li, Yizhou Yu

TL;DR
This paper introduces a dynamic graph attention network that models object relationships and linguistic structure for improved referring expression comprehension, enabling multi-step reasoning and interpretable visual evidence generation.
Contribution
It proposes a novel language-guided visual reasoning approach using a dynamic graph attention network for complex referring expressions.
Findings
Significantly outperforms prior state-of-the-art methods on three common benchmark datasets.
Enables interpretable stepwise reasoning with visual evidence.
Effectively models relationships and linguistic structure for complex expressions.
Abstract
Referring expression comprehension aims to locate the object instance described by a natural language referring expression in an image. This task is compositional and inherently requires visual reasoning on top of the relationships among the objects in the image. Meanwhile, the visual reasoning process is guided by the linguistic structure of the referring expression. However, existing approaches treat the objects in isolation or only explore the first-order relationships between objects without being aligned with the potential complexity of the expression. Thus it is hard for them to adapt to the grounding of complex referring expressions. In this paper, we explore the problem of referring expression comprehension from the perspective of language-driven visual reasoning, and propose a dynamic graph attention network to perform multi-step reasoning by modeling both the relationships…
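To make the idea of language-guided, multi-step reasoning over an object graph more concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the function name, the additive scoring rule, the hand-written features, and the per-step query vectors are all assumptions made for illustration. It only shows how attention from one reasoning step can be carried along graph edges to influence the next step.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def multi_step_graph_attention(node_feats, neighbours, step_queries):
    """Illustrative only; every name here is hypothetical.

    node_feats   -- one feature vector per candidate object (graph node)
    neighbours   -- dict: node index -> indices of related objects (graph edges)
    step_queries -- one query vector per reasoning step, standing in for
                    guidance derived from the expression's linguistic structure
    Returns (index of the most attended object, per-step attention maps).
    """
    attention = [0.0] * len(node_feats)  # no evidence before the first step
    trace = []
    for query in step_queries:
        logits = []
        for i, feats in enumerate(node_feats):
            own_match = dot(feats, query)
            # Evidence carried over from related objects attended at the previous step.
            carried = sum(attention[j] for j in neighbours.get(i, []))
            logits.append(own_match + carried)
        attention = softmax(logits)
        trace.append(attention)
    best = max(range(len(attention)), key=attention.__getitem__)
    return best, trace


# Toy scene for "the man holding the umbrella"; features are [is_man, is_umbrella].
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # man A, umbrella, man B
neighbours = {0: [1], 1: [0]}                       # man A is related to the umbrella
step_queries = [[0.0, 4.0], [4.0, 0.0]]             # step 1: umbrella, step 2: man

best, trace = multi_step_graph_attention(node_feats, neighbours, step_queries)
print(best, [[round(a, 3) for a in step] for step in trace])
```

In this toy scene both men match the second query equally well on their own features; only the edge to the umbrella attended in step 1 separates them. The per-step attention maps in `trace` are a rough analogue of the stepwise visual evidence the abstract describes.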
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques
