VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments

Yu-Yun Tseng, Alexander Bell, and Danna Gurari

arXiv:2207.11810·cs.CV·July 26, 2022

Open Access

TL;DR

This paper presents VizWiz-FewShot, a few-shot object localization dataset built from images taken by people with visual impairments, and shows that current algorithms generalize poorly to it, motivating further research.

Contribution

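The paper introduces a few-shot localization dataset of real-world images taken by people with visual impairments. Compared to existing few-shot datasets, it is the first to locate holes in objects, it covers a much larger range of object sizes relative to the image, and text appears in its objects over five times more often. The paper also evaluates three modern few-shot localization algorithms on the dataset.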

Findings

01. Three modern few-shot localization algorithms generalize poorly to the new dataset.

02. Objects with holes, and very small or very large objects, are difficult to locate.

03. Text-rich objects pose additional challenges.

Abstract

We introduce a few-shot localization dataset originating from photographers who authentically were trying to learn about the visual content in the images they took. It includes nearly 10,000 segmentations of 100 categories in over 4,500 images that were taken by people with visual impairments. Compared to existing few-shot object detection and instance segmentation datasets, our dataset is the first to locate holes in objects (e.g., found in 12.3% of our segmentations), it shows objects that occupy a much larger range of sizes relative to the images, and text is over five times more common in our objects (e.g., found in 22.4% of our segmentations). Analysis of three modern few-shot localization algorithms demonstrates that they generalize poorly to our new dataset. The algorithms commonly struggle to locate objects with holes, very small and very large objects, and objects lacking…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications