A multimodal deep learning framework for scalable content based visual media retrieval
Ambareesh Ravi, Amith Nandakumar

TL;DR
This paper introduces a scalable deep learning framework for content-based visual media retrieval that works efficiently for both images and videos, with a new comparison metric and performance validation.
Contribution
It presents a novel, modular, and scalable deep learning-based system for visual media retrieval applicable to images and videos, including an efficient comparison metric.
Findings
Demonstrates improved retrieval efficiency over conventional methods
Validates scalability and flexibility for images and videos
Provides insights for future enhancements
Abstract
We propose a novel, efficient, modular, and scalable framework for content-based visual media retrieval that leverages deep learning and works for both images and videos. We also introduce an efficient comparison and filtering metric for retrieval. We report findings from critical performance tests comparing our method with the predominant conventional approach to demonstrate the feasibility and efficiency of the proposed solution, and we discuss best practices and possible improvements that may further strengthen retrieval architectures.
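The abstract does not specify the feature extractor or the comparison metric, so the following is only a minimal sketch of how embedding-based retrieval over a mixed image and video index might look. It assumes feature vectors have already been produced by some deep network, uses cosine similarity as a stand-in for the paper's metric, and mean-pools per-frame features to represent a video. The names `cosine_similarity`, `mean_pool`, `retrieve`, `threshold`, and `top_k` are hypothetical and do not come from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one video-level vector (an assumption)."""
    if not frames:
        raise ValueError("need at least one frame")
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def retrieve(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    threshold: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every indexed item, drop those below threshold, return the best top_k."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy index: two images and one video, all with made-up 3-dimensional features.
index = {
    "image_a": [1.0, 0.0, 0.0],
    "image_b": [0.0, 1.0, 0.0],
    "video_c": mean_pool([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]),
}
results = retrieve([1.0, 0.0, 0.0], index, threshold=0.5, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because images and videos end up in the same vector space, one index and one comparison function serve both media types. A real system would replace the linear scan with an approximate nearest-neighbour index once the collection grows.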
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
