Learning Graph Embeddings for Compositional Zero-shot Learning

Muhammad Ferjad Naeem; Yongqin Xian; Federico Tombari; Zeynep Akata

arXiv:2102.01987 · cs.CV · May 5, 2021

TL;DR

This paper introduces Compositional Graph Embedding (CGE), a graph-based approach to recognizing unseen compositions of visual concepts in a zero-shot setting; it outperforms existing methods on standard benchmarks.

Contribution

The paper proposes a new end-to-end graph formulation that models dependencies between visual primitives, enabling zero-shot generalization without external knowledge bases.

Findings

01. CGE outperforms the state of the art on the MIT-States and UT-Zappos datasets.

02. Introduces a new benchmark based on the GQA dataset.

03. Demonstrates effective knowledge transfer between seen and unseen compositions.

Abstract

In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. old dog) of the visual primitives, states (e.g. old, cute) and objects (e.g. car, dog), that are observed in the training set. This is challenging because the same state can, for example, alter the visual appearance of a dog drastically differently from that of a car. As a solution, we propose a novel graph formulation called Compositional Graph Embedding (CGE) that learns image features, compositional classifiers, and latent representations of visual primitives in an end-to-end manner. The key to our approach is exploiting the dependency between states, objects, and their compositions within a graph structure to enforce the relevant knowledge transfer from seen to unseen compositions. By learning a joint compatibility that encodes semantics between concepts, our model allows for generalization to unseen compositions without…
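To make the graph idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the edge layout (each composition linked to its state and its object, and the state linked to the object), the one-hot starting features, the mean-aggregation step that stands in for a learned graph convolution, and the names build_graph, propagate and compatibility are all assumptions made for illustration.

```python
from itertools import product


def build_graph(states, objects, compositions):
    """Build nodes and a symmetric adjacency matrix with self-loops.

    Assumed layout: every composition (state, object) is connected to its
    state node and its object node, and the state is connected to the object.
    """
    compositions = list(compositions)
    nodes = list(states) + list(objects) + compositions
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for state, obj in compositions:
        c, s, o = index[(state, obj)], index[state], index[obj]
        for a, b in ((c, s), (c, o), (s, o)):
            adj[a][b] = adj[b][a] = 1.0
    return nodes, index, adj


def propagate(adj, feats):
    """One mean-aggregation step: each node averages itself and its neighbours.

    This replaces a learned graph convolution layer with a fixed average.
    """
    dim = len(feats[0])
    out = []
    for row in adj:
        degree = sum(row)
        out.append(
            [sum(w * f[d] for w, f in zip(row, feats)) / degree for d in range(dim)]
        )
    return out


def compatibility(image_feat, node_feat):
    """Dot-product compatibility between an image feature and a node embedding."""
    return sum(x * y for x, y in zip(image_feat, node_feat))


# Toy vocabulary taken from the abstract's examples.
states = ["old", "cute"]
objects = ["car", "dog"]
pairs = list(product(states, objects))  # seen and unseen compositions alike

nodes, index, adj = build_graph(states, objects, pairs)

# Hypothetical one-hot starting features, one dimension per node.
feats = [[1.0 if i == j else 0.0 for j in range(len(nodes))] for i in range(len(nodes))]
embeddings = propagate(adj, feats)

# A stand-in "image feature" that happens to match the old-dog embedding.
image_feat = list(embeddings[index[("old", "dog")]])
scores = {pair: compatibility(image_feat, embeddings[index[pair]]) for pair in pairs}
best = max(scores, key=scores.get)
print(best)
```

After one propagation step, the embedding of a composition such as old dog already mixes in its state and object nodes, which is the route by which information could flow from seen to unseen compositions. In a trained model the starting features and the aggregation weights would be learned rather than fixed as they are here.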

Code & Models

Repositories

ExplainableML/czsl (PyTorch, official)

Datasets

nihalnayak/cgqa
dataset · 40 downloads
