Evaluating Text Classification Robustness to Part-of-Speech Adversarial Examples
Anahita Samadi, Allison Sullivan

TL;DR
This paper investigates how robust text classification models are to adversarial examples. It finds that CNN-based classifiers carry part-of-speech biases, which exposes weaknesses in how they process language.
Contribution
It identifies specific parts of speech that significantly impact CNN-based classifiers, highlighting a critical vulnerability in their linguistic processing.
Findings
CNNs show bias against certain parts of speech in review datasets
Part-of-speech tokens significantly influence classifier decisions
Vulnerabilities in CNN linguistic processing are uncovered
Abstract
As machine learning systems become more widely used, especially for safety-critical applications, there is a growing need to ensure that these systems behave as intended, even in the face of adversarial examples. Adversarial examples are inputs that are designed to trick the decision-making process, and are intended to be imperceptible to humans. However, for text-based classification systems, changes to the input, a string of text, are always perceptible. Therefore, text-based adversarial examples instead focus on trying to preserve semantics. Unfortunately, recent work has shown this goal is often not met. To improve the quality of text-based adversarial examples, we need to know what elements of the input text are worth focusing on. To address this, in this paper, we explore what parts of speech have the highest impact on text-based classifiers. Our experiments highlight a distinct…
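The idea of measuring which parts of speech most affect a classifier can be illustrated with a small ablation sketch. This is not the authors' method or their CNN setup: the lexicon `TOY_POS`, the weights `TOY_WEIGHTS`, and the functions `score` and `pos_impact` are all hypothetical stand-ins, assumed here only to show the shape of such a measurement. The sketch removes every token of one part-of-speech tag at a time and records how far the classifier score moves.

```python
from collections import defaultdict

# Hypothetical toy POS lexicon; a real study would use a trained tagger.
TOY_POS = {
    "the": "DET", "a": "DET",
    "movie": "NOUN", "plot": "NOUN", "acting": "NOUN",
    "was": "VERB", "loved": "VERB", "hated": "VERB",
    "great": "ADJ", "terrible": "ADJ", "boring": "ADJ",
    "really": "ADV", "very": "ADV",
}

# Hypothetical sentiment weights standing in for a trained classifier.
TOY_WEIGHTS = {
    "great": 2.0, "loved": 1.5, "terrible": -2.0,
    "boring": -1.5, "hated": -1.5, "really": 0.25, "very": 0.25,
}


def score(tokens):
    """Toy classifier: sum of per-token weights (positive means positive review)."""
    return sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)


def pos_impact(sentences):
    """Mean absolute score change when all tokens of one POS tag are removed."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = sentence.lower().split()
        base = score(tokens)
        # Only ablate tags that actually occur in this sentence.
        tags = {TOY_POS.get(t, "OTHER") for t in tokens}
        for tag in tags:
            ablated = [t for t in tokens if TOY_POS.get(t, "OTHER") != tag]
            totals[tag] += abs(base - score(ablated))
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


reviews = ["the movie was really great", "the plot was very boring"]
impact = pos_impact(reviews)
for tag, value in sorted(impact.items(), key=lambda kv: -kv[1]):
    print(f"{tag}: {value:.2f}")
```

In this toy setting adjectives dominate because the stand-in weights were chosen that way. With a real tagger and a trained classifier, the same loop would report whatever part-of-speech sensitivity the model actually has.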
Taxonomy
Topics: Natural Language Processing Techniques · Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection
