How a General-Purpose Commonsense Ontology can Improve Performance of Learning-Based Image Retrieval
Rodrigo Toro Icarte, Jorge A. Baier, Cristian Ruz, Alvaro Soto

TL;DR
This paper demonstrates that integrating a general-purpose commonsense ontology, specifically ConceptNet, with deep learning-based object detectors enhances image retrieval performance by providing meaningful visual relations.
Contribution
It shows how filtering ConceptNet with the ESPGAME dataset improves visual reasoning in image retrieval tasks, combining rule-based knowledge with deep learning.
Findings
ConceptNet improves image retrieval accuracy.
Filtering relations with ESPGAME enhances relevance.
Knowledge integration boosts visual recognition performance.
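The findings above can be illustrated with a toy sketch. It is not the authors' implementation: the relation triples, the tagged image sets, the co-occurrence threshold, and all function names (`filter_relations`, `score_word`, `score_image`) are hypothetical stand-ins chosen for illustration. The sketch assumes that a commonsense relation is kept only if both of its concepts co-occur in the tags of at least one image, and that a query concept with no detector borrows the best score from a related concept that does have one.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str, str]  # (subject, relation name, object)

# Hypothetical toy data; none of it is taken from the paper.
RELATIONS: List[Relation] = [
    ("ball", "UsedBy", "football player"),
    ("tennis player", "AtLocation", "tennis court"),
    ("tennis player", "RelatedTo", "philosophy"),  # not visually useful
]

# Each set stands in for the human-provided tags of one image.
TAGGED_IMAGES: List[Set[str]] = [
    {"tennis player", "tennis court", "racket"},
    {"football player", "ball", "grass"},
    {"tennis court", "net"},
]


def cooccurs(a: str, b: str, tagged: List[Set[str]], min_count: int = 1) -> bool:
    """True if concepts a and b appear together in at least min_count tag sets."""
    count = sum(1 for tags in tagged if a in tags and b in tags)
    return count >= min_count


def filter_relations(relations: List[Relation], tagged: List[Set[str]],
                     min_count: int = 1) -> List[Relation]:
    """Keep only relations whose two concepts co-occur in the tagged images."""
    return [(s, r, o) for s, r, o in relations if cooccurs(s, o, tagged, min_count)]


def related_concepts(word: str, relations: List[Relation]) -> Set[str]:
    """Concepts linked to word by any relation, in either direction."""
    linked = set()
    for s, _, o in relations:
        if s == word:
            linked.add(o)
        elif o == word:
            linked.add(s)
    return linked


def score_word(word: str, detections: Dict[str, float],
               relations: List[Relation]) -> float:
    """Use the detector score if one exists; otherwise fall back to the
    best score among related concepts (0.0 if none is detected)."""
    if word in detections:
        return detections[word]
    fallback = [detections[c] for c in related_concepts(word, relations)
                if c in detections]
    return max(fallback, default=0.0)


def score_image(query: List[str], detections: Dict[str, float],
                relations: List[Relation]) -> float:
    """Average per-concept score of an image for a list of query concepts."""
    if not query:
        return 0.0
    return sum(score_word(w, detections, relations) for w in query) / len(query)


visual_relations = filter_relations(RELATIONS, TAGGED_IMAGES)
```

In this sketch, an image whose detectors fire only on "tennis court" can still be ranked for a query mentioning "tennis player", while the unfiltered "philosophy" link is discarded because it never co-occurs in the tags. Averaging the per-concept scores is one simple choice among many for combining them.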
Abstract
The knowledge representation community has built general-purpose ontologies which contain large amounts of commonsense knowledge over relevant aspects of the world, including useful visual information, e.g., "a ball is used by a football player", "a tennis player is located at a tennis court". Current state-of-the-art approaches for visual recognition do not exploit these rule-based knowledge sources. Instead, they learn recognition models directly from training examples. In this paper, we study how general-purpose ontologies (specifically, MIT's ConceptNet ontology) can improve the performance of state-of-the-art vision systems. As a testbed, we tackle the problem of sentence-based image retrieval. Our retrieval approach incorporates knowledge from ConceptNet on top of a large pool of object detectors derived from a deep learning technique. In our experiments, we show that ConceptNet…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
