AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning

Madeleine Grunde-McLaughlin; Ranjay Krishna; Maneesh Agrawala

arXiv:2204.06105 · cs.CV · April 14, 2022 · 1 cites

TL;DR

AGQA 2.0 is an improved benchmark for evaluating models' ability to perform compositional spatio-temporal reasoning in videos, featuring stricter answer balancing to reduce biases and provide more reliable assessments.

Contribution

The paper introduces AGQA 2.0, an updated version of the AGQA benchmark whose stricter answer-balancing procedure is intended to give a better evaluation of visual reasoning models.

Findings

1. Models show improved performance on AGQA 2.0

2. Biases are further reduced in the new benchmark

3. AGQA 2.0 provides a more reliable evaluation of compositional reasoning

Abstract

Prior benchmarks have analyzed models' answers to questions about videos in order to measure visual compositional reasoning. Action Genome Question Answering (AGQA) is one such benchmark. AGQA provides a training/test split with balanced answer distributions to reduce the effect of linguistic biases. However, some biases remain in several AGQA categories. We introduce AGQA 2.0, a version of this benchmark with several improvements, most notably a stricter balancing procedure. We then report results on the updated benchmark for all experiments.
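
To illustrate what balancing an answer distribution means in general, here is a minimal Python sketch. It is not the AGQA or AGQA 2.0 procedure, which the abstract does not describe in detail. The function name `balance_answers`, the `category` and `answer` fields, and the `max_ratio` parameter are all hypothetical, chosen only for this example. The sketch downsamples questions so that, within each category, no answer is much more frequent than the rarest one.

```python
import random
from collections import defaultdict


def balance_answers(questions, max_ratio=1.0, seed=0):
    """Downsample so that, within each category, no answer occurs more than
    max_ratio times as often as the rarest answer in that category.

    Illustrative only: NOT the AGQA 2.0 balancing procedure. The field
    names "category" and "answer" are assumptions made for this sketch.
    """
    rng = random.Random(seed)

    # Group questions by category, then by answer.
    groups = defaultdict(lambda: defaultdict(list))
    for q in questions:
        groups[q["category"]][q["answer"]].append(q)

    balanced = []
    for by_answer in groups.values():
        smallest = min(len(qs) for qs in by_answer.values())
        cap = max(1, int(smallest * max_ratio))
        for qs in by_answer.values():
            # Keep at most `cap` questions per answer, sampled at random.
            balanced.extend(rng.sample(qs, min(cap, len(qs))))
    return balanced
```

With `max_ratio=1.0`, every answer in a category ends up with the same count, so a model that always guesses the most common answer gains nothing from the answer prior alone. A looser ratio keeps more data at the cost of leaving some skew in place.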

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning