# Better Character Language Modeling Through Morphology

**Authors:** Terra Blevins, Luke Zettlemoyer

arXiv: 1906.01037 · 2019-06-14

## TL;DR

Adding morphological supervision to character language models through multitask learning lowers bits-per-character across 24 languages. Inflected words benefit most, and transferring the supervision across languages helps in the low-resource setting.

## Contribution

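The paper introduces a multitask approach that adds morphological supervision to character language models (CLMs). The approach improves bits-per-character (BPC) even when the morphology data and the language modeling data are disjoint, and the supervision can be transferred across languages.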
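As a rough illustration of the multitask idea, the sketch below combines a next-character loss with a morphological tagging loss. It is not the authors' implementation: the function names, the `morph_weight` parameter, the feature names, and the choice to skip the morphology term when no tags are available are all assumptions made for this example.

```python
import math


def cross_entropy(probs, gold):
    """Negative log probability (in nats) that `probs` assigns to the gold label."""
    return -math.log(probs[gold])


def multitask_loss(next_char_probs, gold_char, morph_probs, gold_tags, morph_weight=1.0):
    """Language modeling loss plus a weighted morphological tagging loss.

    next_char_probs: dict mapping each candidate character to its probability.
    morph_probs: dict mapping a feature name to a dict of tag probabilities.
    gold_tags: dict mapping feature name to gold tag, or None when the example
        has no morphological annotation (hypothetical handling of disjoint data).
    morph_weight: hypothetical scalar that balances the two objectives.
    """
    lm_loss = cross_entropy(next_char_probs, gold_char)
    if gold_tags is None:
        # No morphology labels for this example: train on the LM objective only.
        return lm_loss
    morph_loss = sum(
        cross_entropy(morph_probs[feature], tag) for feature, tag in gold_tags.items()
    )
    return lm_loss + morph_weight * morph_loss


char_probs = {"a": 0.5, "b": 0.5}
tag_probs = {"Number": {"Sing": 0.5, "Plur": 0.5}}
print(multitask_loss(char_probs, "a", tag_probs, {"Number": "Sing"}))
print(multitask_loss(char_probs, "a", tag_probs, None))
```

Summing the two losses lets a single shared model receive gradient from both tasks, which is the general mechanism by which auxiliary supervision can shape the character-level representations.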

## Key findings

- Morphological supervision reduces bits-per-character across 24 languages.
- Inflected words benefit more from morphological modeling than uninflected words.
- Transferring morphological supervision across languages improves language modeling in the low-resource setting.

## Abstract

We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint. Analyzing the CLMs shows that inflected words benefit more from explicitly modeling morphology than uninflected words, and that morphological supervision improves performance even as the amount of language modeling data grows. We then transfer morphological supervision across languages to improve language modeling performance in the low-resource setting.
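For readers unfamiliar with the metric, bits-per-character is the average negative base-2 log probability a model assigns to each observed character, so lower is better. The sketch below computes it from a list of per-character probabilities; the function name and the input format are assumptions for illustration, not taken from the paper.

```python
import math


def bits_per_character(char_probs):
    """Average negative log2 probability assigned to each observed character.

    char_probs: probabilities the model gave to the characters that actually
        occurred, one entry per character position.
    """
    if not char_probs:
        raise ValueError("need at least one character probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


# A model that spreads probability uniformly over 4 characters needs 2 bits each.
print(bits_per_character([0.25] * 8))  # 2.0
```

A model that assigns higher probability to the characters that actually occur yields a lower BPC, which is the sense in which the abstract reports improvement.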

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.01037/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1906.01037/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1906.01037/full.md

---
Source: https://tomesphere.com/paper/1906.01037