AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding

Jiahong Wu; He Zheng; Bo Zhao; Yixin Li; Baoming Yan; Rui Liang; Wenjia Wang; Shipei Zhou; Guosen Lin; Yanwei Fu; Yizhou Wang; Yonggang Wang

arXiv:1711.06475·cs.CV·March 9, 2021

TL;DR

The paper introduces AIC, a large-scale, richly annotated dataset designed to advance complex computer vision tasks such as keypoint detection, attribute recognition, and image captioning, filling a gap in existing datasets.

Contribution

It presents a comprehensive dataset with multiple annotations for diverse vision tasks, enabling better training and evaluation of models beyond simple classification.

Findings

01 The AIC dataset provides extensive annotations for multiple tasks.

02 The dataset serves as an effective benchmark for complex vision tasks.

03 It offers a resource for pre-training models in various computer vision applications.

Abstract

Significant progress has been achieved in Computer Vision by leveraging large-scale image datasets. However, large-scale datasets for complex Computer Vision tasks beyond classification are still limited. This paper proposes a large-scale dataset named AIC (AI Challenger) with three sub-datasets: human keypoint detection (HKD), large-scale attribute dataset (LAD), and image Chinese captioning (ICC). In this dataset, we annotate class labels (LAD), keypoint coordinates (HKD), bounding boxes (HKD and LAD), attributes (LAD), and captions (ICC). These rich annotations bridge the semantic gap between low-level images and high-level concepts. The proposed dataset is an effective benchmark for evaluating and improving different computational methods. In addition, others can use our dataset as a new resource to pre-train models for related tasks.
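
As a reading aid, the annotation coverage listed in the abstract can be sketched as a small lookup table. This is only an illustrative summary: the dictionary layout, the key names, and the `annotations_for` helper are assumptions made here and do not reflect the dataset's actual file format or any official tooling.

```python
from typing import Dict, List, Set

# Annotation types and the sub-datasets that carry them, as listed in the abstract.
# Key names and structure are hypothetical, chosen only for illustration.
ANNOTATIONS: Dict[str, Set[str]] = {
    "class_label": {"LAD"},
    "keypoint_coordinates": {"HKD"},
    "bounding_box": {"HKD", "LAD"},
    "attribute": {"LAD"},
    "caption": {"ICC"},
}


def annotations_for(sub_dataset: str) -> List[str]:
    """Return the sorted annotation types available for one sub-dataset."""
    return sorted(
        name for name, subsets in ANNOTATIONS.items() if sub_dataset in subsets
    )


if __name__ == "__main__":
    for sub in ("HKD", "LAD", "ICC"):
        print(sub, annotations_for(sub))
```

Read this way, LAD carries the most annotation types (class labels, bounding boxes, attributes), HKD pairs keypoints with bounding boxes, and ICC carries captions only.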

Code & Models

Repositories

Models

🤗 vrg-prague/BBoxMaskPose (model, 417 downloads, 4 likes)
