HIT: A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation

Ayan Sengupta; Sourabh Kumar Bhattacharjee; Tanmoy Chakraborty; Md Shad Akhtar

arXiv:2105.14600·cs.CL·June 1, 2021

TL;DR

HIT is a hierarchical transformer-based model that improves code-mixed language representations by fusing multi-headed self-attention with outer product attention, leading to better performance across several NLP tasks and languages.

Contribution

The paper introduces HIT, a novel hierarchical transformer framework with fused attention modules specifically designed for robust code-mixed language representation.

Findings

1. Significant performance improvements over state-of-the-art systems.
2. Effective across multiple languages and NLP tasks.
3. Demonstrates adaptability in transfer learning scenarios.

Abstract

Understanding the linguistics and morphology of resource-scarce code-mixed texts remains a key challenge in text processing. Although word embeddings support downstream tasks for low-resource languages, there is ample scope for improving the quality of language representation, particularly for code-mixed languages. In this paper, we propose HIT, a robust representation learning method for code-mixed texts. HIT is a hierarchical transformer-based framework that captures the semantic relationship among words and hierarchically learns the sentence-level semantics using a fused attention mechanism. HIT incorporates two attention modules, a multi-headed self-attention and an outer product attention module, and computes their weighted sum to obtain the attention weights. Our evaluation of HIT on one European (Spanish) and five Indic (Hindi, Bengali, Tamil, Telugu, and…
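
The fused attention idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not taken from the official repository. It uses a single head rather than multiple heads, and the outer product module is an assumed formulation in which each query-key pair produces a per-dimension score (tanh of the scaled element-wise product) that gates the value vector. The function names and the mixing weight `alpha` are hypothetical, and for simplicity the sketch mixes the two modules' outputs with a fixed scalar, where a trained model would learn the weighting.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d)) V."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, V))
                    for c in range(len(V[0]))])
    return out


def outer_product_attention(Q, K, V):
    """Assumed outer-product-style attention (illustrative, not the paper's code).

    Each (query, key) pair yields a vector of per-dimension scores,
    tanh(q * k / sqrt(d)), which gates the matching value vector.
    Requires Q, K and V to share the same feature dimension.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        row = [0.0] * d
        for k, v in zip(K, V):
            for c in range(d):
                row[c] += math.tanh(q[c] * k[c] / math.sqrt(d)) * v[c]
        out.append(row)
    return out


def fused_attention(Q, K, V, alpha=0.5):
    """Weighted sum of the two attention outputs; alpha is a hypothetical weight."""
    sa = dot_product_attention(Q, K, V)
    opa = outer_product_attention(Q, K, V)
    return [[alpha * a + (1.0 - alpha) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(sa, opa)]


if __name__ == "__main__":
    # Two toy token embeddings of dimension 2, used as queries, keys and values.
    X = [[1.0, 0.0], [0.0, 1.0]]
    for row in fused_attention(X, X, X, alpha=0.5):
        print(row)
```

Setting `alpha=1.0` recovers plain self-attention and `alpha=0.0` recovers the outer-product branch alone, which makes it easy to compare the contribution of each module on a toy input.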

Code & Models

Repositories

LCS2-IIITD/HIT-ACL2021-Codemixed-Representation (TensorFlow, official)
