# Pattern Spotting in Historical Documents Using Convolutional Models

**Authors:** Ignacio Úbeda, Jose M. Saavedra, Stéphane Nicolas, Caroline Petitjean, Laurent Heutte

arXiv: 1906.08580 · 2019-06-21

## TL;DR

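This paper proposes a convolutional neural network approach to pattern spotting in historical documents that uses RetinaNet as a feature extractor. On the DocExplore dataset it locates patterns better and needs less index storage than the state-of-the-art system, but it misses some pages that contain multiple instances of the query.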

## Contribution

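The paper proposes a CNN-based method for pattern spotting in historical documents that uses RetinaNet to compute multiscale embeddings of both document regions and queries. On the DocExplore dataset, it locates patterns better and requires less storage for indexing images than the state-of-the-art system.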

## Key findings

- Better pattern localization than the state-of-the-art system on the DocExplore dataset
- Requires less storage for indexing images
- Fails to retrieve some pages containing multiple instances of the query

## Abstract

Pattern spotting consists of searching in a collection of historical document images for occurrences of a graphical object using an image query. Contrary to object detection, no prior information nor predefined class is given about the query so training a model of the object is not feasible. In this paper, a convolutional neural network approach is proposed to tackle this problem. We use RetinaNet as a feature extractor to obtain multiscale embeddings of the regions of the documents and also for the queries. Experiments conducted on the DocExplore dataset show that our proposal is better at locating patterns and requires less storage for indexing images than the state-of-the-art system, but fails at retrieving some pages containing multiple instances of the query.
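To make the retrieval idea concrete, the following is a minimal sketch of query-by-example search over pre-computed region embeddings. It is not the authors' implementation: the abstract does not say how embeddings are compared or indexed, so cosine similarity, exhaustive search, the `spot` function, the index layout, and the toy 3-dimensional vectors are all assumptions made for illustration. In the paper, the embeddings would come from RetinaNet used as a feature extractor.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
# Hypothetical index layout: page id -> list of (bounding box, embedding).
Region = Tuple[Tuple[int, int, int, int], Vector]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def spot(query: Vector, index: Dict[str, List[Region]], top_k: int = 5):
    """Return the top_k (score, page_id, box) hits, best first.

    Assumption: similarity is cosine and the search is exhaustive;
    the source abstract specifies neither.
    """
    hits = []
    for page_id, regions in index.items():
        for box, embedding in regions:
            hits.append((cosine(query, embedding), page_id, box))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top_k]


# Toy embeddings standing in for real CNN features (illustrative only).
index = {
    "page_1": [((0, 0, 10, 10), [1.0, 0.0, 0.0]),
               ((5, 5, 20, 20), [0.0, 1.0, 0.0])],
    "page_2": [((2, 2, 8, 8), [0.9, 0.1, 0.0])],
}
results = spot([1.0, 0.0, 0.0], index, top_k=2)
```

Because no class-specific model is trained, the same index can serve any query image: only the query embedding changes between searches. Returning region-level hits rather than one score per page is a design choice of this sketch; it keeps the bounding box needed for localization and lets several hits from one page surface independently.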

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.08580/full.md

## Figures

36 figures with captions in the complete paper: https://tomesphere.com/paper/1906.08580/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/1906.08580/full.md

---
Source: https://tomesphere.com/paper/1906.08580