A semantics-driven methodology for high-quality image annotation
Fausto Giunchiglia, Mayukh Bagchi, Xiaolei Diao

TL;DR
This paper introduces vTelos, a methodology that leverages NLP, knowledge representation, and computer vision to explicitly define annotation semantics, reducing subjectivity and systematic flaws in image datasets.
Contribution
The paper presents vTelos, a novel approach that uses WordNet to improve the clarity and consistency of image annotations by making semantics explicit.
Findings
Reduces subjective judgment in image annotation
Improves annotation consistency on a subset of ImageNet
Minimizes systematic flaws in benchmark datasets
Abstract
Recent work in Machine Learning and Computer Vision has highlighted the presence of various types of systematic flaws inside ground truth object recognition benchmark datasets. Our basic tenet is that these flaws are rooted in the many-to-many mappings which exist between the visual information encoded in images and the intended semantics of the labels annotating them. The net consequence is that the current annotation process is largely under-specified, thus leaving too much freedom to the subjective judgment of annotators. In this paper, we propose vTelos, an integrated Natural Language Processing, Knowledge Representation, and Computer Vision methodology whose main goal is to make explicit the (otherwise implicit) intended annotation semantics, thus minimizing the number and role of subjective choices. A key element of vTelos is the exploitation of the WordNet lexico-semantic…
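Illustration
The abstract's central claim is that one label can correspond to several intended meanings, so an annotation that records only the label word is under-specified. The following is a minimal sketch of that idea, not the vTelos implementation. The lexicon, its sense identifiers, and the function names (`TOY_LEXICON`, `senses_of`, `is_ambiguous`, `annotate`) are all hypothetical and are not taken from WordNet or from the paper.

```python
# Hypothetical toy lexicon: each label word maps to several (sense_id, gloss) pairs.
# The entries are illustrative only and are not copied from WordNet.
TOY_LEXICON = {
    "crane": [
        ("crane.bird", "a large long-necked wading bird"),
        ("crane.machine", "a machine for lifting heavy loads"),
    ],
    "mouse": [
        ("mouse.animal", "a small rodent"),
        ("mouse.device", "a hand-operated pointing device"),
    ],
    "bicycle": [
        ("bicycle.vehicle", "a two-wheeled pedal-driven vehicle"),
    ],
}


def senses_of(label):
    """Return the sense identifiers the toy lexicon lists for a label word."""
    return [sense_id for sense_id, _ in TOY_LEXICON.get(label, [])]


def is_ambiguous(label):
    """A label is ambiguous here if it has more than one listed sense."""
    return len(senses_of(label)) > 1


def annotate(image_id, label, sense_id):
    """Build an annotation that records the intended sense explicitly.

    Raises ValueError if the sense does not belong to the label, so the
    annotator cannot silently fall back to an implicit meaning.
    """
    glosses = dict(TOY_LEXICON.get(label, []))
    if sense_id not in glosses:
        raise ValueError(f"{sense_id!r} is not a listed sense of {label!r}")
    return {
        "image": image_id,
        "label": label,
        "sense": sense_id,
        "gloss": glosses[sense_id],
    }


if __name__ == "__main__":
    for word in TOY_LEXICON:
        print(word, "ambiguous" if is_ambiguous(word) else "unambiguous")
    print(annotate("img_001.jpg", "crane", "crane.bird"))
```

In this sketch the annotator must pick a sense identifier rather than a bare word, which is one simple way to make the otherwise implicit choice visible in the dataset.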
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
