Improving Zero-Shot Cross-Lingual Transfer via Progressive Code-Switching
Zhuoran Li, Chunming Hu, Junfan Chen, Zhijun Chen, Xiaohui Guo, and Richong Zhang

TL;DR
This paper introduces Progressive Code-Switching (PCS), a novel data augmentation method that gradually generates more challenging code-switching examples to improve zero-shot cross-lingual transfer performance.
Contribution
The paper proposes a new PCS method with a difficulty measurer and training scheduler to enhance cross-lingual transfer by controlling code-switching difficulty levels.
Findings
Achieves state-of-the-art results on three cross-lingual tasks
Effectively balances easy and hard code-switching data during training
Improves model generalization across ten languages
Abstract
Code-switching is a data augmentation scheme that mixes words from multiple languages into source-language text. It has achieved considerable generalization performance on cross-lingual transfer tasks by aligning cross-lingual contextual word representations. However, uncontrolled and over-replaced code-switching introduces noisy samples into model training. In other words, excessively code-switched text samples hurt the models' cross-lingual transferability. To this end, we propose a Progressive Code-Switching (PCS) method that gradually generates moderately difficult code-switching examples for the model to discriminate, from easy to hard. The idea is to progressively incorporate the multilingual knowledge previously learned from easier code-switching data to guide model optimization on succeeding harder code-switching data. Specifically, we first design a difficulty…
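The abstract's core idea, replacing a controlled share of source-language words with translations and raising that share as training proceeds, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy English-to-German dictionary, the function names `code_switch` and `progressive_ratios`, and the linear schedule are hypothetical, and the replacement ratio is only a stand-in for the paper's difficulty measurer, whose definition is not given in the text above.

```python
import random

# Hypothetical toy bilingual dictionary (English -> German), for illustration only.
TOY_DICT = {"the": "die", "cat": "Katze", "drinks": "trinkt", "milk": "Milch"}


def code_switch(tokens, ratio, dictionary, rng):
    """Replace about `ratio` of the translatable tokens with their translations."""
    # Only tokens that have a dictionary entry can be switched.
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in dictionary]
    k = round(ratio * len(candidates))
    chosen = set(rng.sample(candidates, k))
    return [
        dictionary[tok.lower()] if i in chosen else tok
        for i, tok in enumerate(tokens)
    ]


def progressive_ratios(num_stages, max_ratio):
    """Linearly increasing replacement ratios, easy to hard (an assumed schedule)."""
    return [max_ratio * (stage + 1) / num_stages for stage in range(num_stages)]


rng = random.Random(0)
sentence = "the cat drinks milk".split()

# Each stage produces a harder (more heavily switched) version of the sentence.
for ratio in progressive_ratios(num_stages=4, max_ratio=1.0):
    switched = code_switch(sentence, ratio, TOY_DICT, rng)
    print(f"ratio={ratio:.2f}: {' '.join(switched)}")
```

In a real training loop, each stage's augmented data would be fed to the model before moving to the next ratio, so that earlier, easier examples shape the representations used on later, harder ones. How PCS actually scores difficulty and paces the schedule should be taken from the paper itself.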
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
