An Attention Based Neural Network for Code Switching Detection: English & Roman Urdu

Aizaz Hussain; Muhammad Umair Arshad

arXiv:2103.02252 · cs.CL · March 4, 2021

Open Access

TL;DR

This paper introduces an attention-based neural network model for detecting language switches in code-switched English and Roman Urdu text, outperforming traditional models in accuracy and precision.

Contribution

The study proposes a novel attention-enhanced RNN model specifically designed for low-resource Roman Urdu, improving language identification in code-switched data.

Findings

01. Attention mechanism improves classification accuracy

02. Model outperforms HMM, CRF, and BiLSTM

03. Enhanced precision and recall in language detection
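The accuracy, precision, and recall named in these findings all derive from confusion-matrix counts. The sketch below shows that arithmetic for a two-label tagging task. It is only an illustration: the label names `"en"` and `"ur"`, the function names, and the toy gold/predicted sequences are assumptions, not data or code from the paper.

```python
from collections import Counter


def confusion_counts(gold, pred, positive):
    """Count true/false positives and negatives for one target label."""
    counts = Counter()
    for g, p in zip(gold, pred):
        if p == positive:
            counts["tp" if g == positive else "fp"] += 1
        else:
            counts["fn" if g == positive else "tn"] += 1
    return counts


def metrics(gold, pred, positive):
    """Return accuracy, precision, and recall for the given positive label."""
    c = confusion_counts(gold, pred, positive)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Hypothetical per-token labels: "en" = English, "ur" = Roman Urdu.
gold = ["en", "ur", "ur", "en", "ur"]
pred = ["en", "ur", "en", "en", "ur"]
result = metrics(gold, pred, positive="ur")
```

In this toy case one Roman Urdu token is mislabeled as English, which lowers recall for `"ur"` while leaving its precision untouched.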

Abstract

Code-switching is a common phenomenon among people with diverse linguistic backgrounds and is widely used for communication on the internet. In this paper, we present a Recurrent Neural Network combined with an attention model for language identification in code-switched data in English and low-resource Roman Urdu. The attention model enables the architecture to learn the important features of the languages and hence to classify the code-switched data. We demonstrate our approach by comparing the results with state-of-the-art models, i.e., Hidden Markov Models, Conditional Random Fields, and Bidirectional LSTM. The model evaluation, using confusion matrix metrics, showed that the attention mechanism provides improved precision and accuracy compared to the other models.
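The abstract does not specify the attention formulation, so the following is a minimal sketch of one common variant, dot-product attention over recurrent hidden states, to illustrate how an attention layer weights the tokens of a sentence. The function names, the toy hidden-state values, and the choice of dot-product scoring are all assumptions and should not be read as the authors' architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(hidden_states, query):
    """Score every hidden state against the query, then return the
    attention weights and the weighted-sum context vector."""
    weights = softmax([dot(query, h) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical RNN outputs for a 3-token sentence (2-dimensional states).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Use the last token's state as the query (an assumption for illustration).
weights, context = attend(hidden_states, query=hidden_states[2])
```

In a tagger of this kind, the context vector would typically be combined with the token's own hidden state before the final classification layer that predicts the language label.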

Taxonomy

Topics: Digital Communication and Language · Multilingual Education and Policy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory