An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization
Gongbo Tang, Fabienne Cap, Eva Pettersson, Joakim Nivre

TL;DR
This paper evaluates various neural machine translation models for historical spelling normalization across five languages, demonstrating that NMT models outperform SMT, with specific architectures excelling under different data conditions.
Contribution
The study systematically compares multiple NMT architectures and attention mechanisms for spelling normalization, and introduces a hybrid method that further improves performance.
Findings
NMT models outperform SMT in character error rate.
Transformer models need more data to outperform RNNs.
Subword-level models with a small subword vocabulary outperform character-level models for low-resource languages.
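The findings above rank systems by character error rate (CER). As a rough illustration, CER is commonly computed as the character-level edit distance between the system output and the gold normalization, divided by the number of gold characters. The sketch below assumes that common definition; the paper's exact evaluation script may differ, and the function names `levenshtein` and `character_error_rate` are illustrative rather than taken from the authors' code.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(predictions: Sequence[str],
                         references: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters.

    Assumption: this is the common definition, not necessarily the exact
    formula used in the paper.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(p, r)
                      for p, r in zip(predictions, references))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical example: historical "vnto" left unnormalized vs. gold "unto".
    print(character_error_rate(["vnto"], ["unto"]))
```

Summing edits and characters over the whole corpus, rather than averaging per-word rates, keeps short words from dominating the score.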
Abstract
In this paper, we apply different NMT models to the problem of historical spelling normalization for five languages: English, German, Hungarian, Icelandic, and Swedish. The NMT models operate at different levels, use different attention mechanisms, and have different neural network architectures. Our results show that NMT models are much better than SMT models in terms of character error rate. Vanilla RNNs are competitive with GRUs/LSTMs in historical spelling normalization. Transformer models perform better only when provided with more training data. We also find that subword-level models with a small subword vocabulary are better than character-level models for low-resource languages. In addition, we propose a hybrid method which further improves the performance of historical spelling normalization.
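The abstract contrasts character-level models with subword-level models that use a small subword vocabulary. One common way to build such a vocabulary is byte pair encoding (BPE), where the number of merge operations controls the vocabulary size. The following is a minimal sketch of generic BPE learning and segmentation, not the authors' pipeline; the function names, the `</w>` end-of-word marker, and the toy word list are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def merge_pair(symbols: Tuple[str, ...], pair: Pair) -> Tuple[str, ...]:
    """Replace every adjacent occurrence of `pair` with the merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words: Sequence[str], num_merges: int) -> List[Pair]:
    """Learn up to `num_merges` merges; fewer merges give a smaller vocabulary."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges: List[Pair] = []
    for _ in range(num_merges):
        pairs: Counter = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken lexicographically for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab: Counter = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word: str, merges: Sequence[Pair]) -> List[str]:
    """Segment a word by applying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe(["low", "low", "lower"], num_merges=2)  # toy data
    print(merges)
    print(segment("slow", merges))
```

With `num_merges=0` the segmentation falls back to single characters, so the character-level setting is the limiting case of a very small subword vocabulary.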
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
