There is a Time and Place for Reasoning Beyond the Image

Xingyu Fu; Ben Zhou; Ishaan Preetam Chandratreya; Carl Vondrick; Dan Roth

arXiv:2203.00758·cs.CV·March 29, 2022

TL;DR

This paper introduces TARA, a dataset and model for reasoning about the time and place of images using contextual information, demonstrating a significant gap between current models and human performance.

Contribution

The work presents a new dataset, TARA, with 16k images and associated spatio-temporal data, and proposes a model that improves reasoning about image context beyond state-of-the-art methods.

Findings

01. 70% gap between model and human performance
02. Segment-wise reasoning improves accuracy
03. Dataset enables research on open-ended reasoning

Abstract

Images are often more significant than only the pixels to human eyes, as we can infer, associate, and reason with contextual information from other sources to establish a more complete picture. For example, in Figure 1, we can find a way to identify the news articles related to the picture through segment-wise understandings of the signs, the buildings, the crowds, and more. This reasoning could provide the time and place the image was taken, which will help us in subsequent tasks, such as automatic storyline construction, correction of image source in intended effect photographs, and upper-stream processing such as image clustering for a certain location or time. In this work, we formulate this problem and introduce TARA: a dataset with 16k images with their associated news, time, and location, automatically extracted from the New York Times, and an additional 61k examples as distant…

Code & Models

Repositories

zeyofu/tara (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Image Retrieval and Classification Techniques