TL;DR
The paper introduces AIC, a large-scale, richly annotated dataset designed to advance complex computer vision tasks such as keypoint detection, attribute recognition, and image captioning, filling a gap in existing datasets.
Contribution
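It presents a large-scale dataset with three sub-datasets, human keypoint detection (HKD), large-scale attribute dataset (LAD), and image Chinese captioning (ICC), annotated with class labels, keypoint coordinates, bounding boxes, attributes, and captions, so that models can be trained and evaluated on tasks beyond simple classification.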
Findings
The AIC dataset provides extensive annotations for multiple tasks.
The dataset serves as an effective benchmark for complex vision tasks.
It offers a resource for pre-training models in various computer vision applications.
Abstract
Significant progress has been achieved in Computer Vision by leveraging large-scale image datasets. However, large-scale datasets for complex Computer Vision tasks beyond classification are still limited. This paper proposes a large-scale dataset named AIC (AI Challenger) with three sub-datasets: human keypoint detection (HKD), large-scale attribute dataset (LAD), and image Chinese captioning (ICC). In this dataset, we annotate class labels (LAD), keypoint coordinates (HKD), bounding boxes (HKD and LAD), attributes (LAD), and captions (ICC). These rich annotations bridge the semantic gap between low-level images and high-level concepts. The proposed dataset is an effective benchmark to evaluate and improve different computational methods. In addition, for related tasks, others can also use our dataset as a new resource to pre-train their models.
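The mapping between sub-datasets and annotation types in the abstract can be sketched as a small lookup table. This is only an illustration of what the abstract lists. The dictionary layout, the string labels, and the helper name subsets_with are assumptions made here, not the dataset's actual file format or an official API.

```python
# Annotation types per AIC sub-dataset, as listed in the abstract.
# The keys, labels, and helper below are illustrative assumptions,
# not the dataset's real schema.
ANNOTATIONS = {
    "HKD": {"keypoint coordinates", "bounding box"},
    "LAD": {"class label", "bounding box", "attribute"},
    "ICC": {"caption"},
}


def subsets_with(annotation):
    """Return the sub-datasets that carry the given annotation type, sorted by name."""
    return sorted(name for name, kinds in ANNOTATIONS.items() if annotation in kinds)


print(subsets_with("bounding box"))  # ['HKD', 'LAD']
```

Querying by annotation type makes it easy to see which sub-datasets could support a given task, for example that only HKD and LAD carry bounding boxes.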
