Imitation Learning for Neural Morphological String Transduction

Peter Makarov; Simon Clematide

arXiv:1808.10701 · cs.CL · September 3, 2018

TL;DR

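This paper trains a neural transition-based string transducer for morphological tasks with imitation learning. The method needs only a simple expert policy, so it requires neither an external character aligner nor a warm start from an MLE model, and it reaches strong, in some cases state-of-the-art, results on several benchmarks.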

Contribution

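The paper presents an imitation learning method for training neural transition-based string transducers on morphological tasks such as inflection generation and lemmatization. Training relies on a simple expert policy instead of gold action sequences from an external character aligner or a warm start with an MLE model.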

Findings

01. Achieves strong, and in some cases state-of-the-art, performance on several benchmarks.

02. Eliminates dependence on external character aligners and on warm starting with an MLE model.

03. Addresses familiar biases of maximum likelihood (MLE) training.

Abstract

We employ imitation learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite spurious ambiguity, or require warm starting with an MLE model. Our approach only requires a simple expert policy, eliminating the need for a character aligner or warm start. It also addresses familiar MLE training biases and leads to strong and state-of-the-art performance on several benchmarks.
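
The "spurious ambiguity" mentioned in the abstract can be illustrated with a small sketch: several different edit-action sequences may turn the same input string into the same output string, so a single gold sequence from an aligner is an arbitrary choice. The code below is only an illustration under stated assumptions, not the paper's implementation. The action names (`COPY`, `DELETE`, `("INSERT", char)`), the functions `apply_actions` and `optimal_actions`, and the Levenshtein-style cost (copy is free, delete and insert cost 1, no substitution) are all hypothetical choices made for this example.

```python
from functools import lru_cache

# Hypothetical action inventory for this sketch; an insertion is ("INSERT", char).
COPY, DELETE = "COPY", "DELETE"


def apply_actions(source, actions):
    """Execute edit actions left to right over `source`; return the output string."""
    out, i = [], 0
    for act in actions:
        if act == COPY:  # write the current source character and advance
            out.append(source[i])
            i += 1
        elif act == DELETE:  # skip the current source character
            i += 1
        else:  # ("INSERT", char): write a character without consuming input
            out.append(act[1])
    if i != len(source):
        raise ValueError("actions did not consume the whole source")
    return "".join(out)


def optimal_actions(source, target, i, j):
    """Return every next action with minimal cost-to-go.

    `i` is the number of source characters consumed so far and `j` the number
    of target characters written so far. Assumed costs: COPY 0, DELETE 1,
    INSERT 1 (a simple expert for illustration only).
    """

    @lru_cache(maxsize=None)
    def cost(a, b):
        # Cheapest way to finish from source position a and target position b.
        if a == len(source):
            return len(target) - b  # only insertions remain
        if b == len(target):
            return len(source) - a  # only deletions remain
        best = 1 + min(cost(a + 1, b), cost(a, b + 1))
        if source[a] == target[b]:
            best = min(best, cost(a + 1, b + 1))
        return best

    candidates = []
    if i < len(source) and j < len(target) and source[i] == target[j]:
        candidates.append((COPY, cost(i + 1, j + 1)))
    if i < len(source):
        candidates.append((DELETE, 1 + cost(i + 1, j)))
    if j < len(target):
        candidates.append((("INSERT", target[j]), 1 + cost(i, j + 1)))
    if not candidates:
        return []  # both strings are exhausted; nothing left to do
    best = min(c for _, c in candidates)
    return [act for act, c in candidates if c == best]


# Two different action sequences yield the same output for "ab" -> "ba".
print(apply_actions("ab", [DELETE, COPY, ("INSERT", "a")]))
print(apply_actions("ab", [("INSERT", "b"), COPY, DELETE]))
# The toy expert therefore reports more than one optimal first action.
print(optimal_actions("ab", "ba", 0, 0))
```

In this toy setting an aligner-derived gold sequence would commit to one of the two equally good derivations, whereas an expert that returns a set of optimal actions can treat both as correct from any configuration the model reaches, including ones produced by the model's own earlier decisions.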

Code & Models

Repositories

ZurichNLP/emnlp2018-imitation-learning-for-neural-morphology (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications