Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL
Yevgeni Berzak, Roi Reichart, Boris Katz

TL;DR
This paper introduces a typology-driven computational framework based on Contrastive Analysis to predict grammatical error distributions in ESL texts from native language properties, even for low-resource languages.
Contribution
It formalizes the theory of Contrastive Analysis within a computational model to predict ESL error patterns from native language typology, including a bootstrapping method for low-resource languages.
Findings
Accurately predicts error distributions without ESL data for target languages.
Provides a bootstrapping approach for low-resource languages.
Facilitates linguistic analysis of factors affecting second language acquisition.
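The core idea in the points above, estimating an error distribution for a language with no ESL data from its typological similarity to other languages, can be sketched as follows. This is a minimal illustration only, not the authors' model: the summary does not specify the estimator, so a similarity-weighted average over binary typological features is assumed here. The language names, feature vectors, and error proportions are all hypothetical toy values.

```python
from typing import Dict, List


def typology_similarity(a: List[int], b: List[int]) -> float:
    """Fraction of (binary) typological features on which two languages agree."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_error_distribution(
    target_typology: List[int],
    train_typology: Dict[str, List[int]],
    train_errors: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Estimate the target language's error distribution as a
    similarity-weighted average of the training languages' distributions.
    (Assumed estimator, chosen for illustration.)"""
    weights = {
        lang: typology_similarity(target_typology, feats)
        for lang, feats in train_typology.items()
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("target shares no features with any training language")

    error_types = list(next(iter(train_errors.values())).keys())
    mixed = {
        err: sum(weights[lang] * train_errors[lang][err] for lang in train_typology) / total
        for err in error_types
    }
    # Renormalize so the estimate is a proper distribution.
    norm = sum(mixed.values())
    return {err: value / norm for err, value in mixed.items()}


# Hypothetical toy data: two "training" languages with known ESL error
# proportions, and one target language with typological features only.
train_typology = {"lang_a": [1, 0, 1, 1], "lang_b": [0, 0, 1, 0]}
train_errors = {
    "lang_a": {"article": 0.5, "preposition": 0.3, "verb_tense": 0.2},
    "lang_b": {"article": 0.2, "preposition": 0.4, "verb_tense": 0.4},
}
target_typology = [1, 0, 1, 1]  # typologically identical to lang_a in this toy case

predicted = predict_error_distribution(target_typology, train_typology, train_errors)
print(predicted)
```

In this toy setup the target language needs no ESL data of its own; only its typological feature vector is used, which mirrors the setting described above.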
Abstract
This work examines the impact of cross-linguistic transfer on grammatical errors in English as a Second Language (ESL) texts. Using a computational framework that formalizes the theory of Contrastive Analysis (CA), we demonstrate that language-specific error distributions in ESL writing can be predicted from the typological properties of the native language and their relation to the typology of English. Our typology-driven model makes it possible to obtain accurate estimates of such distributions without access to any ESL data for the target languages. Furthermore, we present a strategy for adjusting our method to low-resource languages that lack typological documentation, using a bootstrapping approach that approximates native language typology from ESL texts. Finally, we show that our framework is instrumental for linguistic inquiry seeking to identify first language factors that contribute to a…
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
