Pattern Spotting and Image Retrieval in Historical Documents using Deep Hashing
Caio da S. Dias, Alceu de S. Britto Jr., Jean P. Barddal, Laurent Heutte, Alessandro L. Koerich

TL;DR
This paper introduces a deep learning-based method for pattern spotting and image retrieval in historical documents, utilizing deep hashing techniques to improve accuracy and significantly reduce search time and storage costs.
Contribution
It proposes a novel deep hashing approach with real-valued and binary representations for efficient pattern spotting and retrieval in historical document images.
Findings
Outperforms other deep models that use the same techniques by 2.56 percentage points on the DocExplore image database.
Reduces search time by up to 200 times.
Decreases storage costs by up to 6,000 times.
Abstract
This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents. First, a region proposal algorithm detects object candidates in the document page images. Next, deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations. Finally, candidate images are ranked by computing the feature similarity with a given input query. A robust experimental protocol evaluates the proposed approach considering each representation scheme (real-valued and binary code) on the DocExplore image database. The experimental results show that the proposed deep models compare favorably to the state-of-the-art image retrieval approaches for images of historical documents, outperforming other deep models by 2.56 percentage points using the same techniques…
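The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes feature vectors have already been extracted for the query and for each candidate region. Cosine similarity for the real-valued variant, Hamming distance for the binary variant, and the 0.5 binarization threshold are all illustrative assumptions rather than details confirmed by the abstract, and the function names are hypothetical.

```python
import numpy as np


def rank_real_valued(query, candidates):
    """Rank candidates by cosine similarity to the query (most similar first)."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores, kind="stable")


def binarize(features, threshold=0.5):
    """Threshold activations into bits and pack 8 bits per byte.

    The threshold value is an assumption made for this sketch.
    """
    bits = (features >= threshold).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def rank_binary(query_code, candidate_codes):
    """Rank candidates by Hamming distance to the query (closest first)."""
    xor = np.bitwise_xor(candidate_codes, query_code)
    distances = np.unpackbits(xor, axis=1).sum(axis=1)
    return np.argsort(distances, kind="stable")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    candidates = rng.random((5, 48))  # 5 hypothetical candidates, 48-d features
    query = candidates[2].copy()      # query identical to candidate 2
    print(rank_real_valued(query, candidates))
    print(rank_binary(binarize(query), binarize(candidates)))
```

In this toy setup each 48-dimensional vector packs into 6 bytes, and comparison reduces to XOR plus bit counting. The dimensions are arbitrary and say nothing about the code lengths used in the paper.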
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Handwritten Text Recognition Techniques
