Joint Audio-Video Fingerprint Media Retrieval Using Rate-Coverage Optimization
Guanghan Ning, Zhi Zhang, Xiaobo Ren, Haohong Wang, Zhihai He

TL;DR
This paper introduces a novel joint audio-video fingerprinting method for media retrieval that optimizes rate-coverage balance, significantly improving query accuracy and reducing bit-rate compared to existing approaches.
Contribution
It presents what the authors describe as the first theoretical rate-coverage model and optimization framework for joint audio-video fingerprinting in media retrieval.
Findings
Up to 25% query accuracy improvement over reference algorithms.
Achieves 25% bit-rate reduction while maintaining 85% accuracy.
Significantly outperforms single-source fingerprinting methods.
Abstract
In this work, we propose a joint audio-video fingerprint Automatic Content Recognition (ACR) technology for media retrieval. The problem is how to balance query accuracy against fingerprint size, and how to allocate the bits of the fingerprint between video frames and audio frames to achieve the best query accuracy. By constructing a novel concept called Coverage, which is highly correlated with query accuracy, we form a rate-coverage model that translates the original problem into an optimization problem solvable by dynamic programming. To the best of our knowledge, this is the first work that uses joint audio-video fingerprint ACR technology for media retrieval with a theoretical problem formulation. Experimental results indicate that, compared to reference algorithms, the proposed method has up to 25% query accuracy improvement while using 60%…
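The abstract frames bit allocation as an optimization problem solved by dynamic programming, but the visible text does not give the Coverage definition or the recurrence. As a rough illustration only, the sketch below assumes each candidate video or audio frame has a fixed bit cost and an additive coverage gain, which reduces the problem to a 0/1 knapsack. The function name `allocate_bits`, the additive-gain assumption, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def allocate_bits(costs: List[int], gains: List[float], budget: int) -> Tuple[float, List[int]]:
    """Pick frames to fingerprint so total (assumed additive) coverage is
    maximal without exceeding the bit budget. Plain 0/1 knapsack DP;
    this is an illustrative stand-in, not the paper's formulation."""
    n = len(costs)
    # best[i][b] = max coverage using only the first i frames and at most b bits
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, gain = costs[i - 1], gains[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip frame i-1
            if cost <= b and best[i - 1][b - cost] + gain > best[i][b]:
                best[i][b] = best[i - 1][b - cost] + gain  # option 2: take it

    # Walk the table backwards to recover which frames were selected.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Hypothetical candidates: frames 0-1 are video (costly), frames 2-3 are audio (cheap).
costs = [4, 4, 1, 1]
gains = [0.5, 0.4, 0.2, 0.15]
total, picked = allocate_bits(costs, gains, budget=6)
print(total, picked)
```

With these made-up numbers the budget cannot hold both video frames, so the table trades the second video frame for the two cheaper audio frames. That is the kind of audio/video trade-off the abstract describes, under the simplifying assumptions stated above.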
Taxonomy
Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Music and Audio Processing
