Learning a Visually Grounded Memory Assistant

Meera Hahn; Kevin Carlberg; Ruta Desai; James Hillis

arXiv:2210.03787 · cs.CV · October 11, 2022

TL;DR

This paper presents a large-scale dataset and analysis of human memory and assistance in 3D environments, using a novel interface and machine learning models to understand when people seek help during navigation tasks.

Contribution

It introduces a new interface and dataset for studying human memory in 3D environments, and develops models to predict when assistance is needed, bridging machine learning and cognitive science.

Findings

1. Collected the Visually Grounded Memory Assistant Dataset from large-scale human experiments.

2. Developed models that predict assistance requests based on visual and semantic features.

3. Provided insights into human memory encoding and assistance needs during navigation in 3D spaces.
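
The second finding mentions models that predict assistance requests from visual and semantic features. This page does not say which model family or which features the authors used, so the following is only a minimal sketch under assumptions: it swaps in a plain logistic regression trained by gradient descent, and the two feature names (`objects_in_view`, `time_since_target_seen`) and all toy rows are hypothetical illustrations, not values from the dataset.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_request(weights, bias, x, threshold=0.5):
    """Return True if the model predicts an assistance request."""
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(logit) >= threshold


# Hypothetical, made-up rows scaled to [0, 1]:
# [objects_in_view, time_since_target_seen]
toy_features = [
    [0.1, 0.1], [0.2, 0.3], [0.3, 0.2],  # no assistance requested
    [0.7, 0.9], [0.8, 0.7], [0.9, 0.8],  # assistance requested
]
toy_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(toy_features, toy_labels)
```

Calling `predict_request(weights, bias, [0.9, 0.9])` then classifies a new, equally hypothetical feature vector. A linear model is used here only because it keeps the link between hand-selected features and the prediction easy to inspect; it is not a claim about the paper's approach.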

Abstract

We introduce a novel interface for large-scale collection of human memory and assistance. Using the 3D Matterport simulator, we create realistic indoor environments in which we have people perform specific embodied memory tasks that mimic household daily activities. This interface was then deployed on Amazon Mechanical Turk, allowing us to test and record human memory, navigation, and needs for assistance at a large scale that was previously impossible. Using the interface, we collect 'The Visually Grounded Memory Assistant Dataset', which is aimed at developing our understanding of (1) the information people encode during navigation of 3D environments and (2) conditions under which people ask for memory assistance. Additionally, we experiment with predicting when people will ask for assistance using models trained on hand-selected visual and semantic features. This provides an…

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

Methods: Test