LatentGNN: Learning Efficient Non-local Relations for Visual Recognition
Songyang Zhang, Shipeng Yan, Xuming He

TL;DR
LatentGNN introduces a scalable graph neural network approach with a latent space to efficiently model non-local relations, significantly improving visual recognition performance while reducing computational complexity.
Contribution
It proposes a novel low-rank GNN with a latent space to efficiently capture non-local dependencies in visual recognition tasks.
Findings
Outperforms prior methods by a large margin
Maintains low computational cost
Effective on multiple visual recognition tasks
Abstract
Capturing long-range dependencies in feature representations is crucial for many visual recognition tasks. Despite recent successes of deep convolutional networks, it remains challenging to model non-local context relations between visual features. A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation. However, most GNN-based approaches require computing a dense graph affinity matrix and hence have difficulty in scaling up to tackle complex real-world visual problems. In this work, we propose an efficient yet flexible non-local relation representation based on a novel class of graph neural networks. Our key idea is to introduce a latent space to reduce the complexity of the graph, which allows us to use a low-rank representation for the…
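The contrast the abstract draws between a dense affinity matrix and a low-rank one routed through a latent space can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' formulation: the softmax assignment `Psi`, the random projection `W`, and the function names `dense_context` and `latent_context` are hypothetical, and a real model would learn the projection rather than sample it.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dense_context(X, Psi):
    """Reference path: materialize the full N x N affinity, then propagate."""
    A = Psi @ Psi.T              # (N, N) affinity with rank at most d
    return A @ X                 # (N, C) context features

def latent_context(X, Psi):
    """Low-rank path: pass messages through d latent nodes instead."""
    Z = Psi.T @ X                # (d, C) visible -> latent aggregation
    return Psi @ Z               # (N, C) latent -> visible broadcast

rng = np.random.default_rng(0)
N, C, d = 256, 16, 8             # N feature locations, C channels, d latent nodes

X = rng.standard_normal((N, C))  # stand-in for convolutional features
W = rng.standard_normal((C, d))  # hypothetical projection (learned in practice)
Psi = softmax(X @ W, axis=1)     # (N, d) soft assignment of locations to latent nodes

Y_dense = dense_context(X, Psi)
Y_latent = latent_context(X, Psi)

# Both paths compute the same product, grouped differently.
print("outputs agree:", np.allclose(Y_dense, Y_latent))
```

The two functions differ only in how the matrix product is grouped. The dense path stores N x N affinities and costs on the order of N²·C operations to propagate, while the latent path never forms that matrix and costs on the order of N·d·C, which is the saving when d is much smaller than N.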
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning
Methods: Graph Neural Network
