Zero-shot object prediction using semantic scene knowledge
Rene Grzeszick, Gernot A. Fink

TL;DR
This paper proposes a zero-shot object prediction method leveraging semantic scene-object relations extracted from external text sources, enabling object recognition without extensive labeled data, especially in cluttered scenes.
Contribution
It introduces a novel approach that uses semantic relations from large text corpora for zero-shot object prediction based on scene recognition.
Findings
Scene knowledge improves object prediction in cluttered scenes.
Semantic relations can be obtained from web text corpora.
The method reduces reliance on manual labeling.
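One simple way to picture how scene-object relations might be read off a text corpus is sentence-level co-occurrence counting. The sketch below is an illustration only, not the authors' extraction procedure, which this summary does not specify. The function name `relation_scores`, the toy corpus, and the scene and object vocabularies are all hypothetical.

```python
from collections import Counter, defaultdict


def relation_scores(sentences, scenes, objects):
    """Estimate P(object | scene) from sentence-level co-occurrence counts.

    Hypothetical helper: a scene and an object are counted as related
    whenever both words appear in the same sentence.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        for scene in scenes:
            if scene not in tokens:
                continue
            for obj in objects:
                if obj in tokens:
                    counts[scene][obj] += 1

    # Normalize the counts per scene so they sum to 1 (or stay 0 if unseen).
    scores = {}
    for scene in scenes:
        total = sum(counts[scene].values())
        scores[scene] = {
            obj: (counts[scene][obj] / total if total else 0.0)
            for obj in objects
        }
    return scores


# Toy corpus standing in for a large web text collection (assumption).
corpus = [
    "the kitchen has a stove and a sink",
    "a sink is in the bathroom",
    "she cleaned the stove in the kitchen",
    "the bathroom mirror hangs over the sink",
]
scores = relation_scores(
    corpus, ["kitchen", "bathroom"], ["stove", "sink", "mirror"]
)
```

No image labels are involved at this stage: the relation table is built from text alone, which is where the reduction in manual labeling comes from.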
Abstract
This work focuses on the semantic relations between scenes and objects for visual object recognition. Semantic knowledge can be a powerful source of information, especially in scenarios with few or no annotated training samples. These scenarios are referred to as zero-shot or few-shot recognition and often build on visual attributes. Here, instead of relying on various visual attributes, a more direct way is pursued: after recognizing the scene that is depicted in an image, semantic relations between scenes and objects are used for predicting the presence of objects in an unsupervised manner. Most importantly, relations between scenes and objects can easily be obtained from external sources such as large-scale text corpora from the web and, therefore, do not require tremendous manual labeling efforts. It will be shown that in cluttered scenes, where visual recognition is difficult, scene…
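The two-step idea in the abstract (recognize the scene, then use scene-object relations to predict objects) can be sketched as a weighted sum over scenes. This is a minimal illustration under stated assumptions, not the paper's actual model: the scene probabilities are assumed to come from some scene classifier, and the relation values, the function name `predict_objects`, and the `top_k` parameter are hypothetical.

```python
def predict_objects(scene_probs, relations, top_k=2):
    """Rank objects by sum over scenes of P(scene | image) * P(object | scene).

    scene_probs: dict mapping scene name to classifier probability (assumed).
    relations:   dict mapping scene name to {object: relation score} (assumed).
    """
    scores = {}
    for scene, p_scene in scene_probs.items():
        for obj, p_obj in relations.get(scene, {}).items():
            scores[obj] = scores.get(obj, 0.0) + p_scene * p_obj
    # Highest-scoring objects first; no object-level training data is used.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical scene classifier output for one image.
scene_probs = {"kitchen": 0.8, "bathroom": 0.2}

# Hypothetical scene-object relation scores, e.g. derived from text.
relations = {
    "kitchen": {"stove": 0.6, "sink": 0.4},
    "bathroom": {"sink": 0.5, "mirror": 0.5},
}

ranked = predict_objects(scene_probs, relations)
```

Summing over all scenes rather than taking only the top-scoring one lets an uncertain scene prediction still contribute, which is one possible design choice among several.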
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
