Molyé: A Corpus-based Approach to Language Contact in Colonial France
Rasul Dent, Juliette Janès, Thibault Clérice, Pedro Ortiz Suarez, Benoît Sagot

TL;DR
This paper introduces the Molyé corpus, an open dataset that combines stereotypical representations of three kinds of language variation in Europe with early attestations of French-based Creole languages over a 400-year period, to support research on language contact in colonial France.
Contribution
It provides a new open corpus that enables research on the continuity between contact situations in Europe and Creolophone (former) colonies, addressing the absence of evidence of intermediate forms.
Findings
The corpus covers 400 years of language data.
It facilitates analysis of language contact and evolution.
It supports future research on colonial language development.
Abstract
Whether or not several Creole languages which developed during the early modern period can be considered genetic descendants of European languages has been the subject of intense debate. This is in large part due to the absence of evidence of intermediate forms. This work introduces a new open corpus, the Molyé corpus, which combines stereotypical representations of three kinds of language variation in Europe with early attestations of French-based Creole languages across a period of 400 years. It is intended to facilitate future research on the continuity between contact situations in Europe and Creolophone (former) colonies.
