Local Structure Matters Most in Most Languages
Louis Clouâtre, Prasanna Parthasarathi, Amal Zouaq, Sarath Chandar

TL;DR
This study tests whether local linguistic structure matters more than global structure in multilingual NLU. It finds that the result previously reported for English broadly holds across more than 120 languages, with a few caveats.
Contribution
The paper replicates an English perturbation study on local versus global structure in a multilingual setting, showing that local structure remains important across diverse languages.
Findings
Local structure is more important than global structure in most languages.
The phenomenon observed in English generalizes to over 120 languages.
Some caveats exist in the multilingual generalization.
Abstract
Many recent perturbation studies have found unintuitive results on what does and does not matter when performing Natural Language Understanding (NLU) tasks in English. Coding properties, such as the order of words, can often be removed through shuffling without impacting downstream performance. Such insight may be used to direct future research into English NLP models. As many improvements in multilingual settings consist of wholesale adaptation of English approaches, it is important to verify whether those studies replicate or not in multilingual settings. In this work, we replicate a study on the importance of local structure, and the relative unimportance of global structure, in a multilingual setting. We find that the phenomenon observed in English broadly translates to over 120 languages, with a few caveats.
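To make the local/global distinction concrete, here is a minimal Python sketch of two toy perturbations. It is an illustration only, not the perturbation scheme used in the paper: the function names, the fixed window size, and the whitespace tokenization are all assumptions made for the example. The first function scrambles order inside small windows (damaging local structure), while the second keeps each window intact and scrambles the order of the windows (damaging global structure).

```python
import random


def shuffle_within_windows(tokens, window, rng):
    """Hypothetical local perturbation: shuffle tokens inside each window."""
    out = []
    for i in range(0, len(tokens), window):
        chunk = tokens[i:i + window]  # slicing copies, so the input is untouched
        rng.shuffle(chunk)
        out.extend(chunk)
    return out


def shuffle_window_order(tokens, window, rng):
    """Hypothetical global perturbation: keep windows intact, shuffle their order."""
    chunks = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    rng.shuffle(chunks)
    return [tok for chunk in chunks for tok in chunk]


rng = random.Random(0)  # seeded so the example is reproducible
sentence = "the quick brown fox jumps over the lazy dog".split()

local_perturbed = shuffle_within_windows(sentence, 3, rng)
global_perturbed = shuffle_window_order(sentence, 3, rng)

print(" ".join(local_perturbed))
print(" ".join(global_perturbed))
```

A perturbation study in this spirit would compare a model's downstream performance on text altered each way. The abstract's claim corresponds to the first kind of perturbation hurting more than the second.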
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
