Annotation Methodologies for Vision and Language Dataset Creation
Gitit Kehat, James Pustejovsky

TL;DR
This paper discusses the challenges encountered when creating and validating annotated datasets for vision and language tasks, highlighting common problems in data selection and annotation.
Contribution
It provides an analysis of difficulties faced in dataset creation for vision-language tasks, emphasizing the need for improved annotation methodologies.
Findings
Identifies common problems in dataset annotation processes
Highlights issues in data validation for vision-language datasets
Suggests areas for improving annotation methodologies
Abstract
Annotated datasets are commonly used in the training and evaluation of tasks involving natural language and vision (e.g., image description generation, action recognition, and visual question answering). However, many of the existing datasets reflect problems that emerge in the process of data selection and annotation. Here we point out some of the difficulties and problems one confronts when creating and validating annotated vision and language datasets.
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
