Understanding the Gist of Images - Ranking of Concepts for Multimedia Indexing
Lydia Weiland, Simone Paolo Ponzetto, Wolfgang Effelsberg, and Laura Dietz

TL;DR
This paper presents a novel multimedia indexing approach that leverages external knowledge and a two-stage learning-to-rank framework to improve understanding of image content and enhance indexing accuracy.
Contribution
It introduces a two-stage learning-to-rank pipeline utilizing Wikipedia concepts and demonstrates its effectiveness on a large image dataset.
Findings
Achieved a MAP of 61.42 on MIRFlickr25k
Outperforms DBM in multimedia indexing
Performs competitively with hashing-based methods
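For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision (MAP) is commonly computed over ranked results. It is not taken from the paper: the function names and the boolean relevance encoding are illustrative assumptions, and it assumes every relevant item appears somewhere in the ranked list. The sketch returns a value in [0, 1], whereas the score above appears to be reported on a 0 to 100 scale.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average precision for one ranked list.

    ranked_relevance[i] is True if the item at rank i + 1 is relevant.
    Assumption: all relevant items are present in the list, so the sum
    is normalised by the number of relevant items found.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query (here: per-class) average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two hypothetical ranked lists, purely for illustration.
    example = [[True, False, True], [False, True]]
    print(mean_average_precision(example))
```

Multiplying the result by 100 gives a percentage-style score.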
Abstract
Nowadays, where multimedia data is continuously generated, stored, and distributed, multimedia indexing, with its purpose of grouping similar data, becomes more important than ever. Understanding the gist (=message) of multimedia instances is framed in related work as a ranking of concepts from a knowledge base, i.e., Wikipedia. We cast the task of multimedia indexing as a gist understanding problem. Our pipeline benefits from external knowledge and two subsequent learning-to-rank (l2r) settings. The first l2r produces a ranking of concepts representing the respective multimedia instance. The second l2r produces a mapping between the concept representation of an instance and the targeted class topic(s) for the multimedia indexing task. The evaluation on an established big size corpus (MIRFlickr25k, with 25,000 images), shows that multimedia indexing benefits from understanding the…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
