Visual Intelligence through Human Interaction
Ranjay Krishna, Mitchell Gordon, Li Fei-Fei, Michael Bernstein

TL;DR
This paper explores innovative human-computer interaction methods to enhance data collection, evaluation, and contribution in computer vision, leveraging crowdsourcing, social interventions, and psychophysical grounding.
Contribution
It introduces novel interaction strategies for faster data collection, increased volunteer contributions, and reliable human evaluation of generative vision models.
Findings
A crowdsourcing interface accelerates paid data collection tenfold.
Automated social interventions boost volunteer contributions.
Psychophysics-based evaluation ensures reliable human assessment.
Abstract
Over the last decade, Computer Vision, the branch of Artificial Intelligence aimed at understanding the visual world, has evolved from simply recognizing objects in images to describing pictures, answering questions about images, helping robots maneuver around physical spaces and even generating novel visual content. As these tasks and applications have modernized, so too has the reliance on more data grown, either for model training or for evaluation. In this chapter, we demonstrate that novel interaction strategies can enable new forms of data collection and evaluation for Computer Vision. First, we present a crowdsourcing interface for speeding up paid data collection by an order of magnitude, feeding the data-hungry nature of modern vision models. Second, we explore a method to increase volunteer contributions using automated social interventions. Third, we develop a system to ensure human…
