A Multi-cascaded Deep Model for Bilingual SMS Classification

Muhammad Haroon Shakeel; Asim Karim; Imdadullah Khan

arXiv:1911.13066·cs.CL·December 16, 2019

TL;DR

This paper introduces McM, a multi-cascaded deep learning model for bilingual SMS classification. It handles multilingual, informal, and noisy short texts without external resources and outperforms previous models.

Contribution

The paper presents a multi-cascaded deep learning model that learns bilingual SMS classification without code-switching cues or external knowledge, addressing a gap in multilingual short text classification.

Findings

01. Achieves high accuracy on a new bilingual SMS dataset

02. Outperforms previous multilingual text classification models

03. Demonstrates language independence of the proposed approach

Abstract

Most studies on text classification are focused on the English language. However, short texts such as SMS are influenced by regional languages. This makes the automatic text classification task challenging due to the multilingual, informal, and noisy nature of language in the text. In this work, we propose a novel multi-cascaded deep learning model called McM for bilingual SMS classification. McM exploits $n$-gram level information as well as long-term dependencies of text for learning. Our approach aims to learn a model without any code-switching indication, lexical normalization, language translation, or language transliteration. The model relies entirely upon the text, as no external knowledge base is utilized for learning. For this purpose, a 12-class bilingual text dataset is developed from citizens' SMS feedback on public services containing mixed Roman Urdu and English…
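To make "$n$-gram level information" concrete, the following minimal sketch extracts word-level $n$-grams from a raw mixed Roman Urdu and English message, with no normalization, translation, or language tagging. It is only an illustration of the kind of local feature the abstract refers to, not the McM architecture (which learns such features with a neural model). The function names `word_ngrams` and `ngram_features` and the sample message are hypothetical.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return all word-level n-grams of `text`, using plain whitespace tokenisation.

    No lowercasing, spelling normalisation, or language tagging is applied,
    mirroring the abstract's "text only" setting.
    """
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, sizes=(1, 2, 3)):
    """Count n-grams of several sizes (hypothetical bag-of-n-grams features)."""
    counts = Counter()
    for n in sizes:
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical mixed Roman Urdu / English message, invented for illustration.
sms = "doctor ne check up acha kiya thanks"
features = ngram_features(sms)

# Bigrams such as ("up", "acha") straddle the switch between languages,
# so no explicit code-switching marker is needed to capture them.
print(word_ngrams(sms, 2))
```

Counts like these capture only local context. The long-term dependencies mentioned in the abstract would need a sequence model on top, which this sketch does not attempt.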

Code & Models

Repositories

haroonshakeel/bilingual_sms_classification (Official)