A Three Step Training Approach with Data Augmentation for Morphological Inflection

Gabor Szolnok; Botond Barta; Dorina Lakatos; Judit Acs

arXiv:2109.07006·cs.CL·September 16, 2021

TL;DR

This paper introduces a three-step training method with data augmentation for morphological inflection across typologically diverse languages. The LSTM-based system is simpler than the organizers' Transformer baseline and its augmentation techniques are easy to apply to new languages, but it does not surpass that baseline.

Contribution

It proposes a three-step training approach for an LSTM encoder-decoder (all languages first, then each language family, then each individual language), with a different data augmentation technique in each of the first two steps.

Findings

01

Outperformed the only other submission to the shared task

02

Ablations show that the augmentation techniques and the three training steps often help but sometimes hurt performance

03

The model is simpler than the organizers' Transformer baseline, though less accurate, and its augmentation techniques are easily applied to new languages

Abstract

We present the BME submission for the SIGMORPHON 2021 Task 0 Part 1, Generalization Across Typologically Diverse Languages shared task. We use an LSTM encoder-decoder model with three-step training: it is first trained on all languages, then fine-tuned on each language family, and finally fine-tuned on individual languages. We use a different type of data augmentation technique in each of the first two steps. Our system outperformed the only other submission. Although it remains worse than the Transformer baseline released by the organizers, our model is simpler and our data augmentation techniques are easily applicable to new languages. We perform ablation studies and show that the augmentation techniques and the three training steps often help but sometimes have a negative effect.
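The three training steps described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the function name `three_step_training`, the `train` callback, the language codes, and the toy data are all hypothetical, and the data augmentation used in the first two steps is left out because the abstract does not say what it consists of.

```python
from collections import defaultdict

# Hypothetical sample format: (lemma, morphological tags, inflected form).


def three_step_training(data, family_of, train, init):
    """Return one model per language, trained in three steps.

    `train(model, samples)` is an assumed callback that returns a NEW model
    and leaves its input untouched, so a checkpoint can be reused for
    several fine-tuning runs.
    """
    languages = sorted(data)

    # Step 1: a single model trained on the pooled data of all languages.
    pooled = [sample for lang in languages for sample in data[lang]]
    base = train(init, pooled)

    # Step 2: fine-tune the base model once per language family.
    by_family = defaultdict(list)
    for lang in languages:
        by_family[family_of[lang]].append(lang)
    family_models = {
        family: train(base, [s for lang in langs for s in data[lang]])
        for family, langs in by_family.items()
    }

    # Step 3: fine-tune each family model on one individual language.
    return {
        lang: train(family_models[family_of[lang]], data[lang])
        for lang in languages
    }


def toy_train(model, samples):
    # Stand-in for real training: record how many samples each step saw.
    return model + (len(samples),)


# Toy data, invented for this sketch only.
data = {
    "hun": [("kert", "N;PL", "kertek")],
    "fin": [("talo", "N;PL", "talot"), ("kala", "N;PL", "kalat")],
    "deu": [("Hund", "N;PL", "Hunde")],
}
family_of = {"hun": "uralic", "fin": "uralic", "deu": "germanic"}

models = three_step_training(data, family_of, toy_train, ())
print(models)
```

With the toy callback, each resulting "model" is just the list of sample counts it saw at each step, which makes the lineage visible: every language model descends from its family model, and every family model from the shared base. With a real neural model, `train` would typically start from a copy of the checkpoint it is given.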

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Label Smoothing · Adam · Residual Connection · Multi-Head Attention · Softmax