Visual Intelligence through Human Interaction

Ranjay Krishna; Mitchell Gordon; Li Fei-Fei; Michael Bernstein

arXiv:2111.06913 · cs.CV · November 16, 2021

TL;DR

This paper explores human-computer interaction methods that improve data collection, volunteer contribution, and evaluation in computer vision, using crowdsourcing, automated social interventions, and psychophysical grounding.

Contribution

It introduces novel interaction strategies for faster data collection, increased volunteer contributions, and reliable human evaluation of generative vision models.

Findings

01

A crowdsourcing interface speeds up paid data collection tenfold.

02

Automated social interventions boost volunteer contributions.

03

Psychophysics-based evaluation ensures reliable human assessment.

Abstract

Over the last decade, Computer Vision, the branch of Artificial Intelligence aimed at understanding the visual world, has evolved from simply recognizing objects in images to describing pictures, answering questions about images, helping robots maneuver around physical spaces, and even generating novel visual content. As these tasks and applications have modernized, so too has the reliance on more data grown, either for model training or for evaluation. In this chapter, we demonstrate that novel interaction strategies can enable new forms of data collection and evaluation for Computer Vision. First, we present a crowdsourcing interface for speeding up paid data collection by an order of magnitude, feeding the data-hungry nature of modern vision models. Second, we explore a method to increase volunteer contributions using automated social interventions. Third, we develop a system to ensure human…
