BITS Pilani at HinglishEval: Quality Evaluation for Code-Mixed Hinglish Text Using Transformers
Shaz Furniturewala, Vijay Kumari, Amulya Ratna Dash, Hriday Kedia, Yashvardhan Sharma

TL;DR
This paper presents a method using multi-lingual BERT to evaluate the quality of synthetic Hinglish sentences by measuring their similarity to human-generated sentences, addressing challenges in code-mixed language processing.
Contribution
It introduces a novel approach leveraging multi-lingual BERT for quality assessment of code-mixed Hinglish text, specifically for the HinglishEval task.
Findings
Multi-lingual BERT effectively measures similarity between synthetic and human Hinglish sentences.
The proposed model achieves promising results in the HinglishEval quality evaluation task.
The approach helps improve understanding and processing of code-mixed language data.
Abstract
Code-Mixed text data consists of sentences having words or phrases from more than one language. Most multi-lingual communities worldwide communicate using multiple languages, with English usually one of them. Hinglish is a Code-Mixed text composed of Hindi and English but written in Roman script. This paper aims to determine the factors influencing the quality of Code-Mixed text data generated by the system. For the HinglishEval task, the proposed model uses multi-lingual BERT to find the similarity between synthetically generated and human-generated sentences to predict the quality of synthetically generated Hinglish sentences.
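The abstract does not say how the similarity between a synthetic and a human-generated sentence is computed. One common choice, assumed here purely for illustration, is to mean-pool the per-token embeddings of each sentence and take the cosine similarity of the pooled vectors. The sketch below uses small hand-written vectors in place of real multi-lingual BERT outputs; the function names `mean_pool` and `cosine_similarity` and all numeric values are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def mean_pool(token_vectors: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, m in zip(token_vectors, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "token embeddings" (assumption: stand-ins for encoder outputs).
human_tokens = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [9.0, 9.0, 9.0]]
human_mask = [1, 1, 0]  # the last position is padding and is excluded
synthetic_tokens = [[1.0, 1.0, 2.0]]
synthetic_mask = [1]

human_vec = mean_pool(human_tokens, human_mask)
synthetic_vec = mean_pool(synthetic_tokens, synthetic_mask)
score = cosine_similarity(human_vec, synthetic_vec)
print(f"similarity = {score:.3f}")
```

In a real pipeline the toy vectors would be replaced by the encoder's token embeddings and its attention mask. How the resulting similarity maps to a quality rating is not described in this summary, so that step is left out.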
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay · Weight Decay · Adam · Layer Normalization · WordPiece
