A multimodal deep learning framework for scalable content based visual media retrieval

Ambareesh Ravi; Amith Nandakumar

arXiv:2105.08665 · cs.LG · May 19, 2021 · 1 citation

TL;DR

This paper introduces a scalable deep learning framework for content-based visual media retrieval that works efficiently for both images and videos, with a new comparison metric and performance validation.

Contribution

It presents a novel, modular, and scalable deep learning-based system for visual media retrieval applicable to images and videos, including an efficient comparison metric.

Findings

01 Demonstrates improved retrieval efficiency over conventional methods

02 Validates scalability and flexibility for images and videos

03 Provides insights for future enhancements

Abstract

We propose a novel, efficient, modular, and scalable deep learning framework for content-based visual media retrieval that handles images and videos jointly. We also introduce an efficient comparison and filtering metric for retrieval. We report findings from performance tests that compare our method with the predominant conventional approach, demonstrating the feasibility and efficiency of the proposed solution, and we discuss best practices and possible improvements that may further strengthen retrieval architectures.
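
The abstract does not specify the comparison and filtering metric or the network used, so the following is only a generic sketch of how content-based retrieval over a shared embedding space might look, not the authors' method. It assumes (hypothetically) that images and videos are already mapped to fixed-length feature vectors, that a video is summarized by mean-pooling its frame vectors, and that cosine similarity with a score threshold stands in for the paper's metric. All names (`cosine_similarity`, `video_embedding`, `retrieve`, `threshold`, `top_k`) are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (stand-in metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def video_embedding(frame_embeddings):
    """Mean-pool per-frame vectors into one vector (an assumed pooling choice)."""
    n = len(frame_embeddings)
    return [sum(column) / n for column in zip(*frame_embeddings)]


def retrieve(query, index, threshold=0.8, top_k=5):
    """Return (item_id, score) pairs scoring at least `threshold`, best first."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]  # filtering step
    kept.sort(key=lambda pair: pair[1], reverse=True)         # ranking step
    return kept[:top_k]


# Toy index mixing images and a video in the same embedding space.
index = {
    "image_a": [1.0, 0.0, 0.0],
    "image_b": [0.0, 1.0, 0.0],
    "video_c": video_embedding([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]),
}
results = retrieve([1.0, 0.0, 0.0], index, threshold=0.8)
print(results)
```

Because both media types end up as vectors of the same length, a single index and a single scoring function can serve image and video queries alike; in a real system the toy vectors would come from a trained feature extractor and the linear scan would typically be replaced by an approximate nearest-neighbor index.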

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques