Exploiting "Quantum-like Interference" in Decision Fusion for Ranking Multimodal Documents
Dimitris Gkoumas, Dawei Song

TL;DR
This paper introduces a quantum-inspired model for multimodal document ranking that captures inter-modal dependencies through quantum interference, improving fusion and ranking of visual and textual information.
Contribution
The paper proposes a novel quantum-inspired decision fusion method that models inter-modal dependencies for better multimodal document ranking.
Findings
Effective ranking on ImageCLEF2007photo dataset
Quantum interference improves multimodal fusion
Demonstrates theoretical and empirical benefits
Abstract
Fusing and ranking multimodal information remains a challenging task. A robust decision-level fusion method should not only adapt dynamically when assigning weights to each representation but also incorporate the inter-relationships among the modalities. In this paper, we propose a quantum-inspired model for fusing and ranking visual and textual information that accounts for the dependency between these two modalities. First, we calculate the text-based and image-based similarities individually. Two different approaches are applied for computing each unimodal similarity. The first uses the bag-of-words model. In the second, a VGG19 model pre-trained on ImageNet is used to calculate the image similarity, while a query expansion approach is applied to the text-based query to improve retrieval performance. Afterward, the…
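As an illustration of the general idea of interference-based decision fusion, here is a minimal sketch. It is not the paper's exact formulation, since the abstract above is truncated before the fusion step. It assumes the commonly used quantum-like form P = P_text + P_image + 2·sqrt(P_text·P_image)·cos(θ), where the weight and the phase θ are hypothetical free parameters and the function names `interference_fusion` and `rank` are invented for this example.

```python
import math


def interference_fusion(text_score, image_score, weight=0.5, theta=math.pi / 2):
    """Fuse two unimodal similarity scores in [0, 1] with an interference term.

    Assumed form (illustrative, not taken from the paper):
        P = p_text + p_image + 2 * sqrt(p_text * p_image) * cos(theta)
    theta = pi/2 makes the interference term vanish (plain weighted sum);
    theta = 0 is fully constructive, theta = pi fully destructive.
    """
    if not (0.0 <= text_score <= 1.0 and 0.0 <= image_score <= 1.0):
        raise ValueError("scores must lie in [0, 1]")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    p_text = weight * text_score
    p_image = (1.0 - weight) * image_score
    interference = 2.0 * math.sqrt(p_text * p_image) * math.cos(theta)
    return p_text + p_image + interference


def rank(documents, weight=0.5, theta=math.pi / 2):
    """Rank documents by fused score, highest first.

    documents: dict mapping doc id -> (text_score, image_score).
    """
    fused = {
        doc_id: interference_fusion(t, i, weight, theta)
        for doc_id, (t, i) in documents.items()
    }
    return sorted(fused, key=fused.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical similarity scores, for demonstration only.
    docs = {"a": (0.9, 0.1), "b": (0.5, 0.5), "c": (0.1, 0.2)}
    print(rank(docs, theta=0.0))
```

With a constructive phase (θ = 0), a document on which both modalities agree moderately can outrank one that scores highly in only a single modality, which is the kind of inter-modal dependency a plain weighted sum cannot express.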
Taxonomy
Topics: Advanced Text Analysis Techniques · Time Series Analysis and Forecasting
