Modeling Dynamic Environments with Scene Graph Memory

Andrey Kurenkov; Michael Lingelbach; Tanmay Agarwal; Emily Jin; Chengshu Li; Ruohan Zhang; Li Fei-Fei; Jiajun Wu; Silvio Savarese; Roberto Martín-Martín

arXiv:2305.17537 · cs.LG · June 13, 2023 · 2 cites

Open Access

TL;DR

This paper introduces Scene Graph Memory and a neural network architecture to predict object locations in dynamic, partially observable environments, improving search efficiency for embodied AI agents.

Contribution

It presents a novel scene graph memory representation and a Node Edge Predictor model for link prediction in dynamic, partially observable graphs, addressing a key challenge in embodied AI.

Findings

01. NEP outperforms baselines in diverse environments
02. SGM effectively captures accumulated observations
03. Method adapts well to various object movement dynamics

Abstract

Embodied AI agents that search for objects in large environments such as households often need to make efficient decisions by predicting object locations based on partial information. We pose this as a new type of link prediction problem: link prediction on partially observable dynamic graphs. Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges; only parts of the changing graph are known to the agent at each timestep. This partial observability poses a challenge to existing link prediction approaches, which we address. We propose a novel state representation, Scene Graph Memory (SGM), which captures the agent's accumulated set of observations, as well as a neural net architecture called a Node Edge Predictor (NEP) that extracts information from the SGM to search efficiently. We evaluate our method in the…
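
To make the graph setup in the abstract concrete, the following minimal Python sketch shows one way an agent could accumulate partial observations of object-location edges and rank candidate locations for a search. The class name `SceneGraphMemory`, its methods, the example object and location names, and the frequency heuristic are all illustrative assumptions; this is not the paper's SGM data structure, and the heuristic only stands in for the NEP, which the abstract describes as a learned neural network.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SceneGraphMemory:
    """Hypothetical memory of (object, location) edges observed so far."""

    # Maps an (object, location) edge to the timesteps at which it was seen.
    observations: dict = field(default_factory=lambda: defaultdict(list))

    def observe(self, timestep, visible_edges):
        """Record the edges visible at this timestep (a partial view of the graph)."""
        for obj, location in visible_edges:
            self.observations[(obj, location)].append(timestep)

    def rank_locations(self, obj, candidates):
        """Rank candidate locations by how often obj was observed there.

        Assumption: a simple frequency count replaces a learned link predictor.
        Ties keep the order in which candidates were given.
        """
        counts = {
            loc: len(self.observations.get((obj, loc), []))
            for loc in candidates
        }
        return sorted(candidates, key=lambda loc: counts[loc], reverse=True)


# Usage: the agent sees only part of the scene at each timestep.
memory = SceneGraphMemory()
memory.observe(0, [("mug", "kitchen_counter"), ("book", "shelf")])
memory.observe(1, [("mug", "kitchen_counter")])
memory.observe(2, [("mug", "desk")])

ranking = memory.rank_locations("mug", ["desk", "shelf", "kitchen_counter"])
print(ranking)
```

In this toy run the mug was seen twice on the kitchen counter and once on the desk, so the counter is ranked first. A frequency count like this ignores how recent each observation is and cannot generalize to unseen objects, which is the kind of gap a learned predictor over the memory is meant to close.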

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Mobility and Location-Based Analysis · Human Pose and Action Recognition