TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data
Yihong Liu, Chunlan Ma, Haotian Ye, Hinrich Schütze

TL;DR
TransMI is a framework that leverages existing multilingual pretrained language models to effectively handle transliterated data across scripts without retraining, significantly improving crosslingual transfer performance.
Contribution
The paper introduces a simple, training-free method for adapting mPLMs to transliterated data: it transliterates the model's vocabulary and merges the resulting embeddings with the originals, enabling effective cross-script transfer.
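The transliterate, merge, and initialize steps can be illustrated with a small sketch. This is not the authors' implementation. The transliteration table, the function names, and the choice to initialize a new subword as the average of its source embeddings are all assumptions made for illustration; the summary here does not specify how the paper initializes embeddings.

```python
from collections import defaultdict

# Hypothetical Cyrillic-to-Latin table, for illustration only.
# A real pipeline would use a proper transliteration tool.
TOY_TABLE = {"д": "d", "а": "a", "н": "n", "е": "e", "т": "t"}


def transliterate(token):
    """Map each character through the toy table; unknown characters pass through."""
    return "".join(TOY_TABLE.get(ch, ch) for ch in token)


def transmi_sketch(vocab, embeddings):
    """Return (merged_vocab, merged_embeddings).

    vocab: dict mapping token -> row index
    embeddings: list of vectors (lists of floats), indexed by row
    """
    # Stage 1 (transliterate): group original token ids by transliterated form.
    sources = defaultdict(list)
    for token, idx in vocab.items():
        sources[transliterate(token)].append(idx)

    # Stage 2 (merge): keep every original entry, append only unseen subwords.
    merged_vocab = dict(vocab)
    merged_emb = [list(vec) for vec in embeddings]
    dim = len(embeddings[0])

    for new_token, src_ids in sources.items():
        if new_token in merged_vocab:
            continue  # already covered; the original embedding is kept untouched
        # Stage 3 (initialize): ASSUMPTION, average the source embeddings.
        vec = [sum(embeddings[i][d] for i in src_ids) / len(src_ids)
               for d in range(dim)]
        merged_vocab[new_token] = len(merged_emb)
        merged_emb.append(vec)

    return merged_vocab, merged_emb


if __name__ == "__main__":
    vocab = {"да": 0, "нет": 1, "da": 2}
    embeddings = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
    new_vocab, new_emb = transmi_sketch(vocab, embeddings)
    print(new_vocab)
    print(new_emb[new_vocab["net"]])
```

Because the original rows are never modified, this sketch leaves the handling of non-transliterated input unchanged, which matches the finding listed below. With a real model, the enlarged vocabulary would also have to be registered with the tokenizer and the embedding matrix resized accordingly.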
Findings
TransMI improves crosslingual transfer by 3% to 34%.
It preserves the ability to handle non-transliterated data.
The framework is applicable to multiple strong mPLMs.
Abstract
Transliterating related languages that use different scripts into a common script is effective for improving crosslingual transfer in downstream tasks. However, this methodology often makes pretraining a model from scratch unavoidable, as transliteration brings about new subwords not covered in existing multilingual pretrained language models (mPLMs). This is undesirable because it requires a large computation budget. A more promising way is to make full use of available mPLMs. To this end, this paper proposes a simple but effective framework: Transliterate-Merge-Initialize (TransMI). TransMI can create strong baselines for data that is transliterated into a common script by exploiting an existing mPLM and its tokenizer without any training. TransMI has three stages: (a) transliterate the vocabulary of an mPLM into a common script; (b) merge the new vocabulary with the original…
Taxonomy
Topics: Natural Language Processing Techniques
