Hindi to English: Transformer-Based Neural Machine Translation
Kavit Gangar, Hardik Ruparel, Shreyas Lele

TL;DR
This paper presents a Transformer-based neural machine translation system for Hindi to English, utilizing back-translation and subword tokenization to improve translation quality, achieving a BLEU score of 24.53.
Contribution
The study introduces a Transformer NMT model for Hindi-to-English translation that combines back-translation for data augmentation with subword tokenization, and reports a BLEU score of 24.53 on the IIT Bombay corpus, which it presents as state of the art.
Findings
Achieved a BLEU score of 24.53 on the IIT Bombay corpus.
Implemented back-translation for data augmentation.
Compared word and subword tokenization methods.
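To make the subword tokenization finding concrete, the following is a minimal sketch of how Byte Pair Encoding merges can be learned from word frequencies. It is a toy illustration of the general technique, not the authors' pipeline; the function names (`learn_bpe`, `get_pair_counts`, `merge_pair`) and the sample words are assumptions made up for this example.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a {word: count} mapping."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges


if __name__ == "__main__":
    # Hypothetical toy corpus; real systems learn merges from the training data.
    print(learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 5))
```

Word-level tokenization would treat each distinct word as one vocabulary entry, whereas the merge rules above let rare words be split into more frequent subword units.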
Abstract
Machine Translation (MT) is one of the most prominent tasks in Natural Language Processing (NLP). It involves the automatic conversion of texts from one natural language to another while preserving their meaning and fluency. Although research in machine translation has been under way for several decades, the newer approach of integrating deep learning techniques into natural language processing has led to significant improvements in translation quality. In this paper, we have developed a Neural Machine Translation (NMT) system by training the Transformer model to translate texts from the Indian language Hindi to English. Because Hindi is a low-resource language, it has been difficult for neural networks to learn the language, which has slowed the development of neural machine translators. Thus, to address this gap, we implemented back-translation to augment the…
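The abstract is cut off before it describes the back-translation setup, so the following is only a sketch of the technique in its common form, under stated assumptions: monolingual target-side (English) sentences are translated into the source language (Hindi) by a reverse model, and the resulting synthetic pairs are added to the genuine parallel data. The names `back_translate`, `augment`, and the `en_to_hi` callable are hypothetical and stand in for whatever reverse model is used.

```python
from typing import Callable, List, Tuple

SentencePair = Tuple[str, str]  # (Hindi source, English target)


def back_translate(
    mono_english: List[str],
    en_to_hi: Callable[[str], str],
) -> List[SentencePair]:
    """Build synthetic (Hindi, English) pairs from English-only text.

    `en_to_hi` is an assumed reverse-direction translator; the English
    side of each pair stays human-written, only the Hindi side is synthetic.
    """
    return [(en_to_hi(sentence), sentence) for sentence in mono_english]


def augment(
    parallel: List[SentencePair],
    synthetic: List[SentencePair],
) -> List[SentencePair]:
    """Concatenate genuine and synthetic pairs into one training set."""
    return list(parallel) + list(synthetic)


if __name__ == "__main__":
    # Placeholder reverse model for demonstration only; it does not translate.
    def fake_en_to_hi(text: str) -> str:
        return "<hi> " + text

    real = [("नमस्ते", "hello")]
    synth = back_translate(["good morning"], fake_en_to_hi)
    print(augment(real, synth))
```

Keeping the human-written text on the target side is the usual design choice, since the forward model then learns to produce fluent output even when its synthetic input is noisy.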
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Label Smoothing · Dropout · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Linear Layer · Softmax
