TL;DR
This paper uses ranking-based objective functions to jointly train a word string encoder and a word image encoder for word spotting. Relevance is defined by string edit distance from the query, and the model performs competitively on handwritten and scene word images.
Contribution
It explores and evaluates ranking-based objective functions for jointly training word string and word image encoders, so that retrieval lists are ordered by a defined relevance score.
Findings
Competitive performance on query-by-string word spotting
Effective retrieval using relevance scores based on string edit distance
Applicable to handwritten and scene word images
Abstract
In this paper, we explore and evaluate the use of ranking-based objective functions for simultaneously learning a word string encoder and a word image encoder. We consider retrieval frameworks in which the user expects a retrieval list ranked according to a defined relevance score. In the context of word spotting, the relevance score is set according to the string edit distance from the query string. We experimentally demonstrate the competitive performance of the proposed model on query-by-string word spotting for both handwritten and real scene word images. We also provide results for query-by-example word spotting, although it is not the main focus of this work.
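To make the relevance notion in the abstract concrete, the following is a minimal sketch of ranking candidate transcriptions by their string edit distance from a query. It assumes plain Levenshtein distance with unit costs; the paper's exact relevance function may differ. The names `edit_distance` and `rank_by_relevance` are hypothetical and do not come from the paper.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit insert, delete, and substitute costs."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def rank_by_relevance(query: str, candidates: List[str]) -> List[str]:
    """Order candidates from most to least relevant (assumed: smaller distance
    means higher relevance). Ties are broken alphabetically for determinism."""
    return sorted(candidates, key=lambda w: (edit_distance(query, w), w))


if __name__ == "__main__":
    print(rank_by_relevance("word", ["sword", "world", "word", "ward"]))
```

A ranking produced this way would serve as the target ordering; the learned encoders are then expected to reproduce it from embedding similarity rather than from the strings themselves.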
