Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages

Poulami Ghosh; Raj Dabre; Pushpak Bhattacharyya

arXiv:2412.10805 · cs.CL · December 17, 2024

TL;DR

This study examines how vulnerable pre-trained language models (PLMs) are to linguistically grounded perturbations across Indic languages. It finds that PLMs are slightly less susceptible to linguistic attacks than to non-linguistic ones, yet remain vulnerable to them.

Contribution

First study of PLMs' susceptibility to linguistically grounded perturbations across multiple Indic languages and downstream tasks.

Findings

01

PLMs are susceptible to linguistic perturbations.

02

PLMs show slightly lower susceptibility to linguistic attacks compared to non-linguistic ones.

03

Even constrained attacks, such as linguistically grounded ones, remain effective.

Abstract

Pre-trained language models (PLMs) are known to be susceptible to perturbations to the input text, but existing works do not explicitly focus on linguistically grounded attacks, which are subtle and more prevalent in nature. In this paper, we study whether PLMs are agnostic to linguistically grounded attacks or not. To this end, we offer the first study addressing this, investigating different Indic languages and various downstream tasks. Our findings reveal that although PLMs are susceptible to linguistic perturbations, when compared to non-linguistic attacks, PLMs exhibit a slightly lower susceptibility to linguistic attacks. This highlights that even constrained attacks are effective. Moreover, we investigate the implications of these outcomes across a range of languages, encompassing diverse language families and different scripts.

Taxonomy

Topics: Natural Language Processing Techniques · Language and cultural evolution
