The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, Tom Duerig, Vittorio Ferrari

TL;DR
Open Images V4 is a dataset of 9.2M Flickr images with unified annotations for three computer vision tasks: image classification, object detection, and visual relationship detection.
Contribution
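The paper introduces Open Images V4, a dataset whose annotations for classification, detection, and visual relationship detection are unified over the same images. The images were collected without a predefined list of class names or tags, which gives natural class statistics, and for object detection the dataset provides 15x more bounding boxes than the next largest datasets.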
Findings
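The dataset contains 9.2M images with 30.1M image-level labels for 19.8k concepts, 15.4M bounding boxes for 600 object classes, and 375k visual relationship annotations involving 57 classes.
It provides 15x more bounding boxes than the next largest datasets (15.4M boxes on 1.9M images), with 8 annotated objects per image on average.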
Supports multiple tasks with unified annotations, enabling new research avenues.
Abstract
We present Open Images V4, a dataset of 9.2M images with unified annotations for image classification, object detection and visual relationship detection. The images have a Creative Commons Attribution license that allows to share and adapt the material, and they have been collected from Flickr without a predefined list of class names or tags, leading to natural class statistics and avoiding an initial design bias. Open Images V4 offers large scale across several dimensions: 30.1M image-level labels for 19.8k concepts, 15.4M bounding boxes for 600 object classes, and 375k visual relationship annotations involving 57 classes. For object detection in particular, we provide 15x more bounding boxes than the next largest datasets (15.4M boxes on 1.9M images). The images often show complex scenes with several objects (8 annotated objects per image on average). We annotated visual…
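The per-image figure quoted in the abstract can be checked with a little arithmetic. The sketch below uses only the counts stated above (15.4M boxes, 1.9M boxed images, 9.2M images in total); the variable names are illustrative and are not taken from the paper or from any Open Images tooling.

```python
# Counts as quoted in the abstract (rounded figures, so results are approximate).
NUM_BOXES = 15_400_000         # bounding boxes
NUM_BOXED_IMAGES = 1_900_000   # images that carry bounding boxes
NUM_IMAGES = 9_200_000         # images in the whole dataset

# Average number of annotated boxes per boxed image.
boxes_per_image = NUM_BOXES / NUM_BOXED_IMAGES

# Share of the dataset that has bounding-box annotations.
boxed_fraction = NUM_BOXED_IMAGES / NUM_IMAGES

print(f"boxes per boxed image: {boxes_per_image:.1f}")
print(f"fraction of images with boxes: {boxed_fraction:.1%}")
```

With these rounded inputs the average comes out at about 8.1 boxes per image, consistent with the "8 annotated objects per image on average" stated in the abstract, and roughly one image in five carries box annotations.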
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
