Learning Action-Effect Dynamics from Pairs of Scene-graphs

Shailaja Keyur Sampat; Pratyay Banerjee; Yezhou Yang; Chitta Baral

arXiv:2212.03433 · cs.CV · December 8, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a novel method that uses scene-graph representations of images to reason about the effects of actions described in natural language, demonstrating improved performance and generalization on the CLEVR_HYP dataset.

Contribution

The paper presents a new approach leveraging scene-graphs for action-effect reasoning, enhancing data efficiency and generalization over existing models.

Findings

01 Reasons effectively about action effects using scene-graphs

02 Improves performance and generalization on the CLEVR_HYP dataset

03 Requires less data than previous models

Abstract

'Actions' play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform 'Reasoning about Actions & Change' (RAC). Recently, there has been growing interest in the study of RAC with visual and linguistic inputs. Graphs are often used to represent the semantic structure of visual content (i.e., objects, their attributes, and relationships among objects), commonly referred to as scene-graphs. In this work, we propose a novel method that leverages scene-graph representations of images to reason about the effects of actions described in natural language. We experiment with the existing CLEVR_HYP (Sampat et al., 2021) dataset and show that our proposed approach is effective in terms of performance, data efficiency, and generalization capability compared to existing models.
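
To make the scene-graph idea in the abstract concrete, the following is a minimal sketch of one way such a graph (objects, their attributes, and relationships among objects) could be stored, and of how the effect of an action might be read off as the difference between a "before" and an "after" graph. It is not the paper's model: the `SceneGraph` class, the two example actions, and the `graph_diff` helper are hypothetical names chosen for illustration, and the attribute and relation labels are assumed toy values.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene-graph (hypothetical structure, not the paper's)."""
    # object id -> {attribute name: value}
    objects: dict = field(default_factory=dict)
    # (subject id, relation label, object id) triples
    relations: set = field(default_factory=set)


def apply_change_attribute(graph, obj_id, attribute, new_value):
    """Return a copy of the graph with one attribute of one object changed."""
    updated = deepcopy(graph)
    updated.objects[obj_id][attribute] = new_value
    return updated


def apply_remove(graph, obj_id):
    """Return a copy of the graph without the object and its relations."""
    updated = deepcopy(graph)
    del updated.objects[obj_id]
    updated.relations = {
        r for r in updated.relations if obj_id not in (r[0], r[2])
    }
    return updated


def graph_diff(before, after):
    """List object-level differences between a pair of scene-graphs."""
    changes = []
    for obj_id, attrs in before.objects.items():
        if obj_id not in after.objects:
            changes.append(("removed", obj_id))
            continue
        for name, value in attrs.items():
            new_value = after.objects[obj_id].get(name)
            if new_value != value:
                changes.append(("changed", obj_id, name, value, new_value))
    return changes


# Assumed toy scene: a red cube to the left of a blue sphere.
before = SceneGraph(
    objects={
        0: {"color": "red", "shape": "cube"},
        1: {"color": "blue", "shape": "sphere"},
    },
    relations={(0, "left_of", 1)},
)

# Hypothetical action "paint the cube green", applied symbolically.
after = apply_change_attribute(before, 0, "color", "green")
effect = graph_diff(before, after)
```

Here the action is applied by hand-written rules purely to show what a pair of scene-graphs looks like; a learned approach would instead have to infer such changes from the natural-language action description.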

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning