Automatic Extraction of Rules Governing Morphological Agreement

Aditi Chaudhary; Antonios Anastasopoulos; Adithya Pratapa; David R. Mortensen; Zaid Sheikh; Yulia Tsvetkov; Graham Neubig

arXiv:2010.01160·cs.CL·October 7, 2020

TL;DR

This paper presents an automated framework that extracts agreement rules from raw text to support language documentation. With cross-lingual transfer and no expert annotations in the target language, the extracted specification is nearly equivalent to one built from large amounts of gold-standard annotated data.

Contribution

The paper introduces an automated method for extracting morphosyntactic agreement rules from raw text, reducing the manual effort of writing a descriptive grammar.

Findings

01 Framework achieves 78% accuracy in rule extraction.
02 Effective cross-lingual transfer without expert annotations.
03 Promising results across all Universal Dependencies languages.

Abstract

Creating a descriptive grammar of a language is an indispensable step for language documentation and preservation. However, at the same time it is a tedious, time-consuming task. In this paper, we take steps towards automating this process by devising an automated framework for extracting a first-pass grammatical specification from raw text in a concise, human- and machine-readable format. We focus on extracting rules describing agreement, a morphosyntactic phenomenon at the core of the grammars of many of the world's languages. We apply our framework to all languages included in the Universal Dependencies project, with promising results. Using cross-lingual transfer, even with no expert annotations in the language of interest, our framework extracts a grammatical specification which is nearly equivalent to those created with large amounts of gold-standard annotated data. We confirm…
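
As a rough illustration of what an agreement rule can look like in data, the sketch below counts how often a dependent shares a feature value with its head, grouped by dependency relation and part-of-speech pair, and marks a group as requiring agreement when the match rate clears a threshold. This is a simplified stand-in, not the paper's framework: the toy Spanish tokens, the `threshold` and `min_count` parameters, and the function names are all assumptions made for illustration, and the input is assumed to be already parsed and tagged in a Universal Dependencies-like layout.

```python
from collections import defaultdict

# Hypothetical toy data. Each token is
# (id, form, upos, feats, head_id, deprel); head_id 0 marks the root.
SENTENCES = [
    [
        (1, "la", "DET", {"Gender": "Fem", "Number": "Sing"}, 2, "det"),
        (2, "casa", "NOUN", {"Gender": "Fem", "Number": "Sing"}, 0, "root"),
        (3, "blanca", "ADJ", {"Gender": "Fem", "Number": "Sing"}, 2, "amod"),
    ],
    [
        (1, "los", "DET", {"Gender": "Masc", "Number": "Plur"}, 2, "det"),
        (2, "libros", "NOUN", {"Gender": "Masc", "Number": "Plur"}, 3, "nsubj"),
        (3, "pesan", "VERB", {"Number": "Plur", "Person": "3"}, 0, "root"),
    ],
    [
        (1, "ella", "PRON", {"Gender": "Fem", "Number": "Sing"}, 2, "nsubj"),
        (2, "lee", "VERB", {"Number": "Sing", "Person": "3"}, 0, "root"),
        (3, "libros", "NOUN", {"Gender": "Masc", "Number": "Plur"}, 2, "obj"),
    ],
    [
        (1, "ellos", "PRON", {"Gender": "Masc", "Number": "Plur"}, 2, "nsubj"),
        (2, "compran", "VERB", {"Number": "Plur", "Person": "3"}, 0, "root"),
        (3, "casa", "NOUN", {"Gender": "Fem", "Number": "Sing"}, 2, "obj"),
    ],
]


def agreement_counts(sentences, feature):
    """Return {(deprel, head_upos, dep_upos): (matches, total)} for one feature.

    Only head-dependent pairs where both tokens carry the feature are counted.
    """
    counts = defaultdict(lambda: [0, 0])
    for sent in sentences:
        by_id = {tok[0]: tok for tok in sent}
        for _tok_id, _form, upos, feats, head_id, deprel in sent:
            if head_id == 0:
                continue  # the root has no head to agree with
            head = by_id[head_id]
            head_upos, head_feats = head[2], head[3]
            if feature not in feats or feature not in head_feats:
                continue
            key = (deprel, head_upos, upos)
            counts[key][1] += 1
            if feats[feature] == head_feats[feature]:
                counts[key][0] += 1
    return {key: (m, t) for key, (m, t) in counts.items()}


def extract_rules(sentences, feature, threshold=0.9, min_count=2):
    """Label each sufficiently frequent group as agreeing (True) or not (False).

    `threshold` and `min_count` are illustrative assumptions, not tuned values.
    """
    rules = {}
    for key, (matches, total) in agreement_counts(sentences, feature).items():
        if total >= min_count:
            rules[key] = (matches / total) >= threshold
    return rules


if __name__ == "__main__":
    for key, agrees in sorted(extract_rules(SENTENCES, "Number").items()):
        print(key, "agreement required" if agrees else "no agreement")
```

On this toy data, the `Number` check marks determiner-noun and pronoun subject-verb pairs as agreeing and verb-object pairs as not agreeing, while groups seen only once are left undecided. A fixed threshold is the simplest possible decision rule; real corpora would need to account for chance agreement and sparse feature annotation.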

Code & Models

Repositories

Aditi138/LASE-Agreement
Official