Efficient Media Retrieval from Non-Cooperative Queries
Kevin Shih, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu

TL;DR
This paper introduces a large-scale book cover retrieval dataset and a method that combines noisy OCR text matching with visual features to improve media retrieval accuracy from non-cooperative queries.
Contribution
It presents a novel dataset for book cover retrieval and a combined text-visual matching approach that enhances retrieval performance over existing methods.
Findings
Significant improvement in retrieval accuracy using combined text and visual features.
Effective handling of noisy OCR readings in media retrieval.
Demonstrated superiority over using visual or text features alone.
Abstract
Text is ubiquitous in the artificial world and easily attainable when it comes to book titles and author names. Using the images from the book cover set from the Stanford Mobile Visual Search dataset and additional book covers and metadata from openlibrary.org, we construct a large-scale book cover retrieval dataset, complete with 100K distractor covers and title and author strings for each. Because our query images are poorly conditioned for clean text extraction, we propose a method for extracting noisy and erroneous OCR readings and matching them against clean author and book title strings in a standard document look-up problem setup. Finally, we demonstrate how to use this text-matching as a feature in conjunction with popular retrieval features such as VLAD using a simple learning setup to achieve significant improvements in retrieval accuracy over that of either VLAD or…
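To make the text-matching idea in the abstract concrete, here is a minimal sketch of matching a noisy OCR reading against clean title and author strings as a document look-up. It is not the authors' implementation: the use of character trigrams with TF-IDF weighting and cosine similarity is an assumption chosen because short n-grams tolerate isolated OCR character errors, and the function names (`char_ngrams`, `build_index`, `rank`) and the three-entry catalog are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Lowercase, drop punctuation, and return overlapping character n-grams."""
    cleaned = "".join(c for c in text.lower() if c.isalnum() or c == " ")
    cleaned = " ".join(cleaned.split())  # collapse repeated whitespace
    padded = f" {cleaned} "              # pad so word boundaries form n-grams too
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def normalize(vec):
    """Scale a sparse vector (dict) to unit L2 norm; leave empty vectors alone."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else vec


def build_index(documents, n=3):
    """Build smoothed IDF weights and unit-norm TF-IDF vectors for clean strings."""
    counts = [Counter(char_ngrams(d, n)) for d in documents]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    num_docs = len(documents)
    idf = {g: math.log((1 + num_docs) / (1 + k)) + 1.0 for g, k in doc_freq.items()}
    vectors = [normalize({g: tf * idf[g] for g, tf in c.items()}) for c in counts]
    return idf, vectors


def rank(ocr_text, idf, vectors, n=3):
    """Score every indexed string against a noisy OCR reading by cosine similarity."""
    query = Counter(char_ngrams(ocr_text, n))
    # n-grams never seen in the catalog (often OCR errors) are simply ignored
    qvec = normalize({g: tf * idf[g] for g, tf in query.items() if g in idf})
    scores = [sum(w * dvec.get(g, 0.0) for g, w in qvec.items()) for dvec in vectors]
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical catalog of clean "title author" strings and one noisy OCR reading.
catalog = [
    "The Great Gatsby F. Scott Fitzgerald",
    "Great Expectations Charles Dickens",
    "Moby Dick Herman Melville",
]
idf, vectors = build_index(catalog)
order, scores = rank("the gr3at gatsby f scot fitzgera1d", idf, vectors)
print(catalog[order[0]], round(scores[order[0]], 3))
```

The resulting similarity score could then serve as one input feature alongside a visual similarity such as a VLAD distance. How the two are combined is left open here, since the abstract only describes it as a simple learning setup.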
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Handwritten Text Recognition Techniques
