# Towards Robust Named Entity Recognition for Historic German

**Authors:** Stefan Schweter, Johannes Baiter

arXiv: 1906.07592 · 2019-06-19

## TL;DR

This paper demonstrates that character-based pre-trained language models significantly improve named entity recognition performance on low-resource Historic German datasets, outperforming classical methods and previous neural models.

## Contribution

The paper applies pre-trained character-based language models to low-resource named entity recognition (NER) for Historic German and reports F1 gains of up to 6% over CRF-based methods and previous Bi-LSTM work.

## Key findings

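- Pre-trained character-based language models outperform classical CRF-based methods and previous Bi-LSTM models.
- F1 scores improve by up to 6% over those baselines.
- The pre-trained language and NER models are publicly available at https://github.com/stefan-it/historic-ner.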
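
For readers unfamiliar with how an F1 score is typically computed for NER, the following is a minimal sketch. It assumes BIO-tagged sequences and the entity-level, exact-match convention common in NER evaluation; the paper's own evaluation script may differ. The function names `bio_to_spans` and `entity_f1` and the example tags are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Optional, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sequence.

    Simplification (an assumption of this sketch): an I- tag that does not
    continue an open span of the same type is ignored instead of opening
    a new span.
    """
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues_span = tag.startswith("I-") and label == tag[2:]
        if continues_span:
            continue
        if label is not None:
            spans.add((label, start, i))  # close the span that just ended
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1 for a single sequence."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the prediction finds the person but misses the location.
gold_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "O"]
score = entity_f1(gold_tags, pred_tags)
```

In this example precision is 1.0 (the one predicted entity is correct) and recall is 0.5 (one of two gold entities is found), so F1 is 2/3. A corpus-level score would pool the span counts across all sentences before computing precision and recall, rather than averaging per-sentence values.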

## Abstract

Recent advances in language modeling using deep neural networks have shown that these models learn representations that vary with the network depth, from morphology to semantic relationships like co-reference. We apply pre-trained language models to low-resource named entity recognition for Historic German. We show in a series of experiments that character-based pre-trained language models do not run into trouble when faced with low-resource datasets. Our pre-trained character-based language models improve upon classical CRF-based methods and previous work on Bi-LSTMs by boosting F1 score performance by up to 6%. Our pre-trained language and NER models are publicly available at https://github.com/stefan-it/historic-ner.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.07592/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1906.07592/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1906.07592/full.md

---
Source: https://tomesphere.com/paper/1906.07592