NEMO: Frequentist Inference Approach to Constrained Linguistic Typology Feature Prediction in SIGTYP 2020 Shared Task

Alexander Gutkin; Richard Sproat

arXiv:2010.05985 · cs.CL · October 26, 2022

TL;DR

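This paper introduces NEMO, which uses frequentist inference to represent correlations between typological features and trains ridge regression-based estimators on that representation to predict individual features. Its best configuration reached a micro-averaged accuracy of 0.66 on 149 test languages in the SIGTYP 2020 shared task.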

Contribution

It applies frequentist inference with simple multi-class estimators to typological feature prediction. The two submitted configurations ranked second and third overall in the constrained task.

Findings

01

Best configuration achieved a micro-averaged accuracy of 0.66 on 149 test languages

02

Ranked second and third overall in the constrained task

03

Both submitted configurations were based on ridge regression
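The 0.66 figure is a micro-averaged accuracy. As a minimal sketch, assuming the usual definition (all predictions pooled across languages before dividing), the snippet below contrasts it with a macro average, which scores each language first. The function names, language labels, and toy feature values are hypothetical and are not shared-task data.

```python
def micro_accuracy(per_language):
    """Pool every (gold, predicted) pair across languages, then divide once."""
    correct = sum(g == p for pairs in per_language.values() for g, p in pairs)
    total = sum(len(pairs) for pairs in per_language.values())
    return correct / total if total else 0.0


def macro_accuracy(per_language):
    """Score each language separately, then average the per-language scores."""
    scores = [
        sum(g == p for g, p in pairs) / len(pairs)
        for pairs in per_language.values()
        if pairs
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy predictions: language -> list of (gold, predicted) values.
toy = {
    "lang_a": [("SOV", "SOV"), ("Prefixing", "Suffixing"),
               ("No tones", "No tones"), ("AN", "AN")],
    "lang_b": [("SVO", "SOV")],
}

print(micro_accuracy(toy))  # 3 correct out of 5 pooled predictions
print(macro_accuracy(toy))  # mean of 0.75 (lang_a) and 0.0 (lang_b)
```

Under micro-averaging, a language with many features to predict weighs more than a language with few, which is why the two numbers differ on this toy input.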

Abstract

This paper describes the NEMO submission to the SIGTYP 2020 shared task, which deals with the prediction of linguistic typological features for multiple languages using data derived from the World Atlas of Language Structures (WALS). We employ frequentist inference to represent correlations between typological features and use this representation to train simple multi-class estimators that predict individual features. We describe two submitted ridge regression-based configurations, which ranked second and third overall in the constrained task. Our best configuration achieved a micro-averaged accuracy of 0.66 on 149 test languages.
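One way to picture "frequentist inference to represent correlations between typological features" is to estimate conditional relative frequencies from co-occurrence counts. The sketch below is an illustration under that assumption, not the authors' implementation: the feature names, values, and helper functions are hypothetical, and the ridge regression step is omitted. The resulting vectors are the kind of input a multi-class estimator could be trained on.

```python
from collections import Counter, defaultdict


def conditional_estimates(languages, target):
    """Relative-frequency estimates of P(target = v | feature = u).

    `languages` is a list of dicts mapping feature name -> value; languages
    that lack the target feature are skipped.
    """
    joint = defaultdict(Counter)  # (feature, u) -> counts over target values
    for feats in languages:
        if target not in feats:
            continue
        v = feats[target]
        for name, u in feats.items():
            if name != target:
                joint[(name, u)][v] += 1
    return {
        key: {v: n / sum(counts.values()) for v, n in counts.items()}
        for key, counts in joint.items()
    }


def probability_features(feats, estimates, target_values):
    """Average the estimates over the features observed for one language."""
    rows = [estimates[(n, u)] for n, u in feats.items() if (n, u) in estimates]
    if not rows:
        return [0.0] * len(target_values)
    return [sum(r.get(v, 0.0) for r in rows) / len(rows) for v in target_values]


# Hypothetical training languages (not WALS data).
train = [
    {"word_order": "SOV", "adposition": "Postpositions"},
    {"word_order": "SOV", "adposition": "Postpositions"},
    {"word_order": "SVO", "adposition": "Prepositions"},
    {"word_order": "SVO", "adposition": "Postpositions"},
]
values = ["Postpositions", "Prepositions"]
estimates = conditional_estimates(train, target="adposition")

# One vector per language, with one entry per candidate target value.
print(probability_features({"word_order": "SOV"}, estimates, values))
print(probability_features({"word_order": "SVO"}, estimates, values))
```

In this toy data every SOV language has postpositions, so the SOV vector puts all its mass on that value, while the SVO vector is split evenly.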

Code & Models

Repositories

google-research/google-research/tree/master/constrained_language_typology
tf · Official
