Automatic Annotation of Structured Facts in Images

Mohamed Elhoseiny; Scott Cohen; Walter Chang; Brian Price; Ahmed Elgammal

arXiv:1604.00466 · cs.CL · April 11, 2016

TL;DR

This paper introduces an automatic method for extracting structured visual facts from images with captions. It collected more than 380,000 fact annotations at 83% accuracy in less than one day of processing on a standard CPU.

Contribution

The authors propose a language-based approach that automatically collects hundreds of thousands of visual facts from images with captions and localizes each fact in an image region.

Findings

01

Collected over 380,000 visual fact annotations

02

Achieved 83% accuracy in fact annotation

03

Processed the data in less than one day on a standard CPU

Abstract

Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions. Example structured facts include attributed objects (e.g., <flower, red>), actions (e.g., <baby, smile>), interactions (e.g., <man, walking, dog>), and positional information (e.g., <vase, on, table>). The collected annotations are in the form of fact-image pairs (e.g., <man, walking, dog> and an image region containing this fact). With a language approach, the proposed method is able to collect hundreds of thousands of visual fact annotations with an accuracy of 83% according to human judgment. Our method automatically collected more than 380,000 visual fact annotations and more than 110,000 unique visual facts from images with captions and localized them in images in less than one day of processing time on a standard CPU…
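To make the fact forms in the abstract concrete, here is a minimal Python sketch of one way a fact-image pair could be represented. It is an illustration only, not the authors' data format: the class names `StructuredFact` and `FactAnnotation`, the field names, the image IDs, and the `(x, y, width, height)` box layout are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StructuredFact:
    """Hypothetical container for a fact such as <flower, red> or <man, walking, dog>."""
    subject: str
    predicate: str              # attribute, action, interaction, or position word
    obj: Optional[str] = None   # present only for three-part facts

    def __str__(self) -> str:
        parts = [self.subject, self.predicate]
        if self.obj is not None:
            parts.append(self.obj)
        return "<" + ", ".join(parts) + ">"


@dataclass(frozen=True)
class FactAnnotation:
    """Hypothetical fact-image pair: a fact plus the image region that contains it."""
    fact: StructuredFact
    image_id: str                     # assumed identifier, not from the paper
    box: Tuple[int, int, int, int]    # assumed (x, y, width, height) layout


# Made-up example annotations; the same fact can occur in several images.
annotations = [
    FactAnnotation(StructuredFact("man", "walking", "dog"), "img_001", (10, 20, 200, 150)),
    FactAnnotation(StructuredFact("man", "walking", "dog"), "img_002", (5, 40, 180, 160)),
    FactAnnotation(StructuredFact("flower", "red"), "img_003", (30, 30, 60, 60)),
]

# Annotations count every fact-image pair; unique facts count distinct facts.
unique_facts = {a.fact for a in annotations}

for fact in sorted(unique_facts, key=str):
    print(fact)
```

The sketch also shows why the abstract reports two different counts: every fact-image pair is one annotation, while a fact that appears in many images is counted once among the unique facts.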
