Predicting the Performance of Multilingual NLP Models
Anirudh Srinivasan, Sunayana Sitaram, Tanuja Ganu, Sandipan Dandapat, Kalika Bali, Monojit Choudhury

TL;DR
This paper introduces a method to predict multilingual NLP model performance across languages using existing evaluation data, addressing the challenge of limited language coverage in current benchmarks.
Contribution
It proposes a performance predictor trained on known language scores to estimate model performance on untested languages, enhancing evaluation coverage.
Findings
The predictor effectively fills evaluation gaps for languages that already have test sets.
The method shows potential for unseen languages but needs improvement there.
The approach offers a lower-cost complement to creating new evaluation datasets for multilingual models.
Abstract
Recent advancements in NLP have given us models like mBERT and XLMR that can serve over 100 languages. The languages that these models are evaluated on, however, are very few in number, and it is unlikely that evaluation datasets will cover all the languages that these models support. Potential solutions to the costly problem of dataset creation are to translate datasets to new languages or to use template-filling-based techniques for creation. This paper proposes an alternative solution for evaluating a model across languages, which makes use of the model's existing performance scores on the languages that a particular task has test sets for. We train a predictor on these performance scores and use this predictor to predict the model's performance in different evaluation settings. Our results show that our method is effective in filling the gaps in the evaluation for an existing set of…
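The idea of training a predictor on known per-language scores can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which regressor or features the authors use, so plain least-squares linear regression stands in for the predictor, and the two features (log pretraining data size and similarity to the fine-tuning language) are hypothetical. The language names and scores are synthetic placeholders generated from a fixed formula, not results from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Features = Tuple[float, float]  # hypothetical: (log10 pretraining size, similarity to pivot)


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_predictor(rows: Sequence[Features], scores: Sequence[float]) -> List[float]:
    """Fit ordinary least squares (with intercept) via the normal equations."""
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], row: Features) -> float:
    """Predicted task score for one language's feature vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


def leave_one_out_errors(data: Dict[str, Tuple[Features, float]]) -> Dict[str, float]:
    """Hold out each language in turn, refit, and report the absolute error."""
    errors = {}
    for held_out, (feat, score) in data.items():
        train = [(f, s) for name, (f, s) in data.items() if name != held_out]
        weights = fit_predictor([f for f, _ in train], [s for _, s in train])
        errors[held_out] = abs(predict(weights, feat) - score)
    return errors


def toy_score(feat: Features) -> float:
    """Synthetic score generator (an assumption, NOT data from the paper)."""
    return 0.40 + 0.05 * feat[0] + 0.30 * feat[1]


# Placeholder languages with made-up features; scores come from toy_score.
TOY_FEATURES: Dict[str, Features] = {
    "lang_a": (3.0, 0.9),
    "lang_b": (2.5, 0.6),
    "lang_c": (1.5, 0.4),
    "lang_d": (2.0, 0.8),
    "lang_e": (1.0, 0.2),
}
TOY = {name: (feat, toy_score(feat)) for name, feat in TOY_FEATURES.items()}

if __name__ == "__main__":
    weights = fit_predictor([f for f, _ in TOY.values()], [s for _, s in TOY.values()])
    print("predicted score for an untested language:", predict(weights, (2.2, 0.5)))
    print("leave-one-out absolute errors:", leave_one_out_errors(TOY))
```

Because the toy scores are exactly linear in the features, the held-out errors here are near zero; real benchmark scores would not behave this cleanly. The leave-one-out loop mirrors the "filling the gaps" setting, where a language with a known score is hidden and then predicted from the rest.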
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
Methods: Test · mBERT
