Energy-Based Learning for Scene Graph Generation

Mohammed Suhail; Abhay Mittal; Behjat Siddiquie; Chris Broaddus; Jayan Eledath; Gerard Medioni; Leonid Sigal

arXiv:2103.02221·cs.CV·March 4, 2021

TL;DR

This paper introduces an energy-based learning framework for scene graph generation that captures structural information, improving performance and data efficiency, especially in low-data scenarios.

Contribution

The paper proposes a novel energy-based framework that incorporates scene graph structure into learning, enhancing existing models' performance and data efficiency.

Findings

01 Up to 21% performance improvement on Visual Genome

02 Up to 27% performance improvement on GQA

03 Effective in zero- and few-shot learning scenarios

Abstract

Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities. Such a formulation, however, ignores the structure in the output space, in an inherently structured prediction problem. In this work, we introduce a novel energy-based learning framework for generating scene graphs. The proposed formulation allows for efficiently incorporating the structure of scene graphs in the output space. This additional constraint in the learning framework acts as an inductive bias and allows models to learn efficiently from a small number of labels. We use the proposed energy-based framework to train existing state-of-the-art models and obtain a significant performance improvement, of up to 21% and 27%, on the Visual Genome and GQA benchmark datasets, respectively. Furthermore, we showcase the learning efficiency of the…
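The contrast the abstract draws, between per-element cross-entropy and a loss defined on the whole graph, can be illustrated with a toy sketch. This is not the paper's model or loss. The compatibility table `TRIPLE_ENERGY`, the default penalty, the function names, and the hinge-style loss are all hypothetical choices made for illustration; the hinge form is a generic energy-based learning loss swapped in because the abstract does not specify the one the authors use.

```python
import math

# Hypothetical compatibility table: energy of a (subject, predicate, object) triple.
# Lower energy means a more plausible triple; unseen triples get a default penalty.
TRIPLE_ENERGY = {
    ("person", "riding", "horse"): 0.0,
    ("person", "wearing", "hat"): 0.0,
}
DEFAULT_ENERGY = 1.0


def independent_loss(label_probs, targets):
    """Sum of per-prediction cross-entropy terms; no term couples two predictions."""
    return sum(-math.log(probs[target]) for probs, target in zip(label_probs, targets))


def graph_energy(objects, relations):
    """Toy scalar energy of a whole graph: each term depends jointly on
    the subject label, the predicate, and the object label."""
    return sum(
        TRIPLE_ENERGY.get((objects[s], predicate, objects[o]), DEFAULT_ENERGY)
        for (s, o), predicate in relations.items()
    )


def hinge_energy_loss(gt_energy, pred_energy, margin=1.0):
    """Generic hinge loss (an assumption, not the paper's loss): zero once the
    ground-truth energy is at least `margin` below the prediction's energy."""
    return max(0.0, margin + gt_energy - pred_energy)


objects = ["person", "horse"]
gt_relations = {(0, 1): "riding"}
pred_relations = {(0, 1): "wearing"}

# Cross-entropy sees only the predicate distribution, not the objects it connects.
per_element = independent_loss([{"riding": 0.5, "wearing": 0.5}], ["riding"])

gt_energy = graph_energy(objects, gt_relations)      # plausible triple
pred_energy = graph_energy(objects, pred_relations)  # implausible triple
loss = hinge_energy_loss(gt_energy, pred_energy, margin=2.0)
```

In this sketch the cross-entropy term would be unchanged if "horse" were swapped for "hat", whereas the graph energy changes because it scores the triple jointly. That joint dependence is the kind of output-space structure the abstract refers to.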

Code & Models

Repositories

mods333/energy-based-scene-graph (PyTorch, official)

Taxonomy

Methods: Region Proposal Network · Convolution · Softmax · RoIAlign · Mask R-CNN