Code-Mixer Ya Nahi: Novel Approaches to Measuring Multilingual LLMs' Code-Mixing Capabilities
Ayushman Gupta, Akhil Bhogal, Kripabandhu Ghosh

TL;DR
This paper introduces Rule-Based Prompting to evaluate and generate code-mixed sentences in multilingual LLMs, comparing their translation abilities across multiple language pairs and creating a code-mixed chatbot application.
Contribution
It proposes a novel Rule-Based Prompting technique for measuring and generating code-mixed sentences in multilingual LLMs, expanding evaluation methods beyond traditional k-shot prompting.
Findings
k-shot prompting often yields better results
Rule-Based prompting generates diverse code-mixed sentences
Multilingual LLMs show varying translation abilities in code-mixed contexts
Abstract
Multilingual Large Language Models (LLMs) have demonstrated exceptional performance in Machine Translation (MT) tasks. However, their MT abilities in the context of code-switching (the practice of mixing two or more languages in an utterance) remain under-explored. In this paper, we introduce Rule-Based Prompting, a novel prompting technique to generate code-mixed sentences. We measure and compare the code-mixed MT abilities of 3 popular multilingual LLMs: GPT-3.5-turbo, GPT-4, and Gemini Pro across five language pairs: English-{Hindi, Bengali, Gujarati, French, Spanish} using k-shot prompting and Rule-Based Prompting. Our findings suggest that though k-shot prompting often leads to the best results, Rule-Based prompting shows promise in generating unique code-mixed sentences that vary in their style of code-mixing. We also use k-shot prompting to gauge…
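To make the k-shot setup mentioned in the abstract concrete, here is a minimal Python sketch of how such a prompt might be assembled. It is an illustration under assumptions, not the authors' prompt: the function name `build_k_shot_prompt`, the instruction wording, and the "English:" / "Code-mixed:" labels are all hypothetical, and the example pairs are expected to come from whatever parallel data is available.

```python
from typing import List, Tuple


def build_k_shot_prompt(
    source: str,
    examples: List[Tuple[str, str]],
    k: int,
    target_pair: str = "Hindi-English",
) -> str:
    """Assemble a k-shot prompt from the first k (English, code-mixed) pairs.

    Hypothetical helper: the wording and layout are assumptions made for
    illustration and are not taken from the paper.
    """
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of available examples")

    # Task instruction (assumed wording).
    lines = [f"Translate the English sentence into {target_pair} code-mixed text."]

    # k demonstrations; with k = 0 this loop adds nothing (zero-shot).
    for english, code_mixed in examples[:k]:
        lines.append(f"English: {english}")
        lines.append(f"Code-mixed: {code_mixed}")

    # The sentence to translate, with the answer slot left open for the model.
    lines.append(f"English: {source}")
    lines.append("Code-mixed:")
    return "\n".join(lines)
```

With k = 0 the prompt contains only the instruction and the query sentence. Larger values of k prepend more demonstrations, which is the quantity varied when k-shot settings are compared.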
Taxonomy
Topics: Natural Language Processing Techniques · Library Science and Information Systems · Linguistics and Terminology Studies
Methods: Position-Wise Feed-Forward Layer · Cosine Annealing · Absolute Position Encodings · Label Smoothing · Transformer · Residual Connection · Dropout · Layer Normalization
