# A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

**Authors:** Linda Studer, Michele Alberti, Vinaychandran Pondenkandath, Pinar Goktepe, Thomas Kolonko, Andreas Fischer, Marcus Liwicki, Rolf Ingold

arXiv: 1905.09113 · 2019-05-23

## TL;DR

This paper empirically evaluates the impact of ImageNet pre-training on a range of historical document analysis tasks, finding that it generally benefits classification and content-based retrieval but yields mixed results for pixel-level semantic segmentation.

## Contribution

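The paper presents a comprehensive empirical survey of the effect of ImageNet pre-training on diverse historical document analysis tasks: character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval.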

## Key findings

- ImageNet pre-training has a positive effect on classification, a trend that holds across different network architectures.
- Pre-training likewise improves content-based retrieval.
- Results for pixel-level semantic segmentation are mixed.

## Abstract

Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In the current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at pixel-level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.09113/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1905.09113/full.md

---
Source: https://tomesphere.com/paper/1905.09113