Bridging Dialects: Translating Standard Bangla to Regional Variants Using Neural Models
Md. Arafat Alam Khandaker, Ziyan Shirin Raha, Bidyarthi Paul, Tashreef, Muhammad

TL;DR
This paper develops neural machine translation models to convert standard Bangla into regional dialects, aiming to preserve linguistic diversity and enhance communication for dialect speakers.
Contribution
It introduces fine-tuned neural models, especially BanglaT5, for translating Bangla into regional dialects, filling a gap in language technology for these variants.
Findings
BanglaT5 achieved the lowest CER and WER among models tested.
The models effectively captured dialectal nuances in translation.
The study promotes inclusive language technology for regional dialects.
Abstract
The Bangla language includes many regional dialects, adding to its cultural richness. Translating standard Bangla into regional dialects is challenging because vocabulary, pronunciation, and sentence structure vary significantly across regions such as Chittagong, Sylhet, Barishal, Noakhali, and Mymensingh. These dialects, though vital to local identities, lack representation in technological applications. This study addresses this gap by translating standard Bangla into these dialects using neural machine translation (NMT) models, including BanglaT5, mT5, and mBART50. The work is motivated by the need to preserve linguistic diversity and improve communication among dialect speakers. The models were fine-tuned on the "Vashantor" dataset, which contains 32,500 sentences across various dialects, and evaluated using Character Error Rate (CER) and Word Error Rate (WER)…
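For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how CER and WER are commonly computed: the Levenshtein edit distance between a reference and a model output, divided by the reference length (in characters for CER, in whitespace-separated words for WER). The function names, the whitespace tokenization, and the choice to count Unicode code points as characters are assumptions made for illustration; the paper's own evaluation code may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate over Unicode code points (assumption: no normalization)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate using whitespace tokenization (assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Placeholder strings, not drawn from the Vashantor dataset.
    reference = "one two three four"
    hypothesis = "one too three four"
    print(f"CER = {cer(reference, hypothesis):.3f}")
    print(f"WER = {wer(reference, hypothesis):.3f}")
```

Lower values are better for both metrics. Because Bangla script uses combining vowel signs, counting code points rather than grapheme clusters can change CER noticeably, so the counting unit should be stated whenever scores are compared.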
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Byte Pair Encoding · Gated Linear Unit · Residual Connection · Dropout · SentencePiece · Softmax · Linear Layer · Attention Is All You Need · Inverse Square Root Schedule
