TCAB: A Large-Scale Text Classification Attack Benchmark
Kalyani Asthana, Zhouhang Xie, Wencong You, Adam Noack, Jonathan Brophy, Sameer Singh, Daniel Lowd

TL;DR
TCAB is a comprehensive large-scale dataset for analyzing, detecting, and understanding adversarial attacks on text classifiers, facilitating research in attack detection, localization, and characterization.
Contribution
Introduces TCAB, a large dataset of 1.5 million automatically generated adversarial attack instances against text classifiers, enabling diverse attack analysis tasks.
Findings
Contains 1.5 million attack instances, generated by twelve adversarial attacks against three classifiers trained on six source datasets
Includes a human-labeled subset for analyzing how often the primary semantics are preserved
Supports attack detection, localization, and target labeling tasks
Abstract
We introduce the Text Classification Attack Benchmark (TCAB), a dataset for analyzing, understanding, detecting, and labeling adversarial attacks against text classifiers. TCAB includes 1.5 million attack instances, generated by twelve adversarial attacks targeting three classifiers trained on six source datasets for sentiment analysis and abuse detection in English. Unlike standard text classification, text attacks must be understood in the context of the target classifier that is being attacked, and thus features of the target classifier are important as well. TCAB includes all attack instances that are successful in flipping the predicted label; a subset of the attacks are also labeled by human annotators to determine how frequently the primary semantics are preserved. The process of generating attacks is automated, so that TCAB can easily be extended to incorporate new text attacks…
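The inclusion rule described in the abstract (keep only attacks that flip the target classifier's predicted label) can be illustrated with a minimal sketch. The record layout, field names, and example values below are hypothetical, chosen for illustration only; they are not TCAB's actual schema or data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AttackInstance:
    # All field names are hypothetical, not TCAB's actual schema.
    original_text: str
    perturbed_text: str
    original_pred: int   # target classifier's label on the original text
    perturbed_pred: int  # target classifier's label on the perturbed text
    attack_name: str     # which attack produced the perturbation
    target_model: str    # which classifier was attacked


def is_successful(inst: AttackInstance) -> bool:
    """An attack counts as successful if it flips the predicted label."""
    return inst.original_pred != inst.perturbed_pred


def keep_successful(instances: List[AttackInstance]) -> List[AttackInstance]:
    """Filter a list of attempted attacks down to the successful ones."""
    return [inst for inst in instances if is_successful(inst)]


# Two made-up attempts against a made-up sentiment classifier.
candidates = [
    AttackInstance("a great movie", "a gr8 movie", 1, 0, "attack_a", "model_x"),
    AttackInstance("a great movie", "a grand movie", 1, 1, "attack_b", "model_x"),
]
kept = keep_successful(candidates)
```

Because each record carries the target model alongside the text, the same filtered list could feed attack detection or attack labeling experiments that condition on the classifier being attacked.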
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Terrorism, Counterterrorism, and Political Violence · Misinformation and Its Impacts
