PACO: Parts and Attributes of Common Objects
Vignesh Ramanathan, Anmol Kalia, Vladan Petrovic, Yi Wen, Baixue Zheng, Baishan Guo, Rui Wang, Aaron Marquez, Rama Kovvuri, Abhishek Kadian, Amir Mousavi, Yiwen Song, Abhimanyu Dubey, Dhruv Mahajan

TL;DR
PACO introduces a comprehensive dataset with detailed part and attribute annotations for 75 object categories, enabling advanced object understanding tasks like segmentation, attribute prediction, and zero-shot detection.
Contribution
It provides a large-scale, richly annotated dataset spanning multiple object categories, parts, and attributes, along with benchmarks for key vision tasks.
Findings
Established baseline results for part mask segmentation.
Provided benchmarks for attribute prediction.
Enabled zero-shot instance detection evaluation.
Abstract
Object models are gradually progressing from predicting just category labels to providing detailed descriptions of object instances. This motivates the need for large datasets which go beyond traditional object masks and provide richer annotations such as part masks and attributes. Hence, we introduce PACO: Parts and Attributes of Common Objects. It spans 75 object categories, 456 object-part categories and 55 attributes across image (LVIS) and video (Ego4D) datasets. We provide 641K part masks annotated across 260K object boxes, with roughly half of them exhaustively annotated with attributes as well. We design evaluation metrics and provide benchmark results for three tasks on the dataset: part mask segmentation, object and part attribute prediction and zero-shot instance detection. Dataset, models, and code are open-sourced at https://github.com/facebookresearch/paco.
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
