Contextual Code Switching for Machine Translation using Language Models
Arshad Kaji, Manan Shah

TL;DR
This paper investigates the performance of various large language models on code switching in machine translation, revealing that smaller, task-specific models can outperform larger multilingual models due to training methodology differences.
Contribution
The study provides an extensive comparison of LLMs on code switching for machine translation, highlighting the advantages of smaller, fine-tuned models over larger multilingual models.
Findings
Smaller models outperform larger multilingual models in code switching translation tasks.
Training methodology significantly impacts LLM performance in contextual code switching.
Multilingual LLMs have limited efficacy in code switching due to their training approaches.
Abstract
Large language models (LLMs) have had a considerable impact on diverse language-related tasks in recent years. Their state-of-the-art performance is achieved through methodologies such as zero-shot or few-shot prompting. These models are trained on extensive datasets that encompass segments of the Internet and are subsequently fine-tuned for specific tasks. Notably, they exhibit proficiency in tasks such as translation, summarization, question answering, and creative writing, even in the absence of explicit training for those particular tasks. While they have shown substantial improvement in multilingual tasks, their performance in code switching, especially for machine translation, remains relatively uncharted. In this paper, we present an extensive study of code switching specifically for machine translation, comparing multiple…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
