Modeling the Graphotactics of Low-Resource Languages Using Sequential GANs
Isaac Wasserman

TL;DR
This paper presents a GAN-based approach to generate graphotactically valid strings for low-resource languages, aiding morphological inflection modeling with minimal data.
Contribution
It introduces a novel sequential GAN model specifically designed to produce linguistically plausible data from very limited examples.
Findings
Successfully generated graphotactically compliant strings
Enhanced morphological inflection modeling for low-resource languages
Demonstrated effectiveness with only 100 example strings
Abstract
Generative Adversarial Networks (GANs) have been shown to aid in the creation of artificial data in situations where large amounts of real data are difficult to come by. This issue is especially salient in computational linguistics, where researchers are often tasked with modeling the complex morphological and grammatical processes of low-resource languages. This paper discusses the implementation and testing of a GAN that attempts to model and reproduce the graphotactics of a language using only 100 example strings. These artificial, yet graphotactically compliant, strings are meant to aid in modeling the morphological inflection of low-resource languages.
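To make "graphotactically compliant" concrete, the following is a minimal Python sketch of one simple way to test whether a candidate string uses only character sequences attested in a small example set. It is not the paper's GAN: it swaps in a plain character-bigram check for illustration. The function names, the boundary symbols `^` and `$`, and the three example words are all hypothetical.

```python
# Illustrative sketch only: a character-bigram plausibility check,
# NOT the sequential GAN described in the abstract.

def bigrams(word):
    """Return the set of character bigrams in `word`, with boundary marks."""
    padded = "^" + word + "$"  # assumed start/end symbols
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def attested_bigrams(examples):
    """Collect every bigram seen in the example strings."""
    seen = set()
    for word in examples:
        seen |= bigrams(word)
    return seen


def is_plausible(candidate, attested):
    """True if every bigram of `candidate` occurs in the example set."""
    return bigrams(candidate) <= attested


# Hypothetical stand-in for the ~100 example strings of a real language.
examples = ["kala", "talo", "kato"]
attested = attested_bigrams(examples)

print(is_plausible("tala", attested))  # uses only attested bigrams
print(is_plausible("lkat", attested))  # starts with an unattested bigram
```

A check like this could serve as a crude filter or evaluation signal for generated strings, under the assumption that bigram coverage is an acceptable proxy for graphotactic validity. A learned model would be expected to capture longer-range patterns that this check misses.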
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Language and cultural evolution
