On Language Models for Creoles
Heather Lent, Emanuele Bugliarello, Miryam de Lhoneux, Chen Qiu, and Anders Søgaard

TL;DR
This paper explores NLP models for under-resourced creole languages. It releases datasets and models and compares standard language models with distributionally robust ones, finding that the standard models perform better and that creoles are relatively stable linguistically.
Contribution
It releases corpora and models for Haitian Creole, Nigerian Pidgin, and Singaporean Colloquial English, and compares standard and distributionally robust models on these languages.
Findings
Standard models outperform robust models for creoles.
Creoles exhibit relative linguistic stability despite complex origins.
Model performance is not solely driven by over-parameterization.
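The contrast between a standard and a distributionally robust objective can be illustrated with a minimal sketch. It assumes a group-based notion of robustness, in which the standard objective averages the loss over all examples while the robust one targets the worst-performing group. The group names and loss values are hypothetical and are not taken from the paper, and the paper's exact training objective is not stated in this summary.

```python
from statistics import mean


def average_loss(losses_by_group):
    """Standard objective: mean loss over all examples, ignoring groups."""
    all_losses = [x for losses in losses_by_group.values() for x in losses]
    return mean(all_losses)


def worst_group_loss(losses_by_group):
    """Group-robust objective (assumed form): highest per-group mean loss."""
    return max(mean(losses) for losses in losses_by_group.values())


# Hypothetical per-example losses, keyed by an illustrative group label.
losses = {
    "group_a": [1.0, 1.0, 1.0, 1.0],
    "group_b": [3.0, 3.0],
}

print(average_loss(losses))      # dominated by the larger group
print(worst_group_loss(losses))  # determined by the hardest group
```

A model trained on the second quantity trades some average performance for protection against a poorly served subgroup. The finding above suggests that, for these creoles, that trade did not pay off.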
Abstract
Creole languages such as Nigerian Pidgin English and Haitian Creole are under-resourced and largely ignored in the NLP literature. Creoles typically result from the fusion of a foreign language with multiple local languages, and what grammatical and lexical features are transferred to the creole is a complex process. While creoles are generally stable, the prominence of some features may be much stronger with certain demographics or in some linguistic situations. This paper makes several contributions: We collect existing corpora and release models for Haitian Creole, Nigerian Pidgin English, and Singaporean Colloquial English. We evaluate these models on intrinsic and extrinsic tasks. Motivated by the above literature, we compare standard language models with distributionally robust ones and find that, somewhat surprisingly, the standard language models are superior to the…
Taxonomy
Topics: Natural Language Processing Techniques · Syntax, Semantics, Linguistic Variation · Language and cultural evolution
