OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation

Włodzimierz Lewoniewski; Piotr Stolarski; Milena Stróżyna; Elzbieta Lewańska; Aleksandra Wojewoda; Ewelina Księżniak; Marcin Sawiński

arXiv:2409.02649 · cs.CL · September 6, 2024 · 2 cites

Open Access

TL;DR

This paper explores combining multiple adversarial attack methods to generate more effective adversarial examples, thereby testing and improving the robustness of NLP models used in credibility assessment across various misinformation domains.

Contribution

It introduces modified and hybrid adversarial attack techniques that outperform existing methods, advancing the development of more resilient NLP models against adversarial attacks.

Findings

01. Enhanced attack effectiveness through method combination

02. Significant improvements over baseline attacks

03. Demonstrated robustness testing across multiple datasets

Abstract

This paper presents the experiments and results for the CheckThat! Lab at CLEF 2024 Task 6: Robustness of Credibility Assessment with Adversarial Examples (InCrediblAE). The primary objective of this task was to generate adversarial examples in five problem domains in order to evaluate the robustness of widely used text classification methods (fine-tuned BERT, BiLSTM, and RoBERTa) when applied to credibility assessment issues. This study explores the application of ensemble learning to enhance adversarial attacks on natural language processing (NLP) models. We systematically tested and refined several adversarial attack methods, including BERT-Attack, Genetic algorithms, TextFooler, and CLARE, on five datasets across various misinformation tasks. By developing modified versions of BERT-Attack and hybrid methods, we achieved significant improvements in attack effectiveness. Our results…
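The idea of combining attack methods can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a simple sequential fallback strategy (try one attack, and if the victim's label does not change, try the next), and every name in it (`toy_classifier`, `synonym_attack`, `char_swap_attack`, `combined_attack`, the trigger and synonym tables) is hypothetical. A keyword-based stand-in replaces the fine-tuned BERT, BiLSTM, and RoBERTa victims, and two trivial perturbations replace methods such as BERT-Attack or TextFooler.

```python
from typing import Callable, List, Optional

# Hypothetical toy victim: flags text that contains a trigger word.
TRIGGERS = {"hoax", "miracle", "shocking"}

def toy_classifier(text: str) -> int:
    """Return 1 (non-credible) if any trigger word appears, else 0."""
    return int(any(w in TRIGGERS for w in text.lower().split()))

# Hypothetical synonym table used by the word-level attack.
SYNONYMS = {"hoax": "fabrication", "shocking": "surprising"}

def synonym_attack(text: str) -> str:
    """Word-level attack: replace trigger words that have a known synonym."""
    return " ".join(SYNONYMS.get(w.lower(), w) for w in text.split())

def char_swap_attack(text: str) -> str:
    """Character-level attack: swap the first two letters of trigger words."""
    out = []
    for w in text.split():
        if w.lower() in TRIGGERS and len(w) > 1:
            w = w[1] + w[0] + w[2:]
        out.append(w)
    return " ".join(out)

Attack = Callable[[str], str]

def combined_attack(
    text: str,
    victim: Callable[[str], int],
    attacks: List[Attack],
) -> Optional[str]:
    """Try each attack in order; return the first candidate that flips
    the victim's label, or None if no attack succeeds."""
    original_label = victim(text)
    for attack in attacks:
        candidate = attack(text)
        if victim(candidate) != original_label:
            return candidate
    return None

if __name__ == "__main__":
    attacks = [synonym_attack, char_swap_attack]
    for sample in ["A shocking hoax", "This miracle cure is real", "Plain text"]:
        print(repr(sample), "->", repr(combined_attack(sample, toy_classifier, attacks)))
```

In this toy setting the second sample is only flipped by the fallback attack, which is the intuition behind a hybrid: different methods succeed on different inputs, so chaining them can raise the overall success rate. A real pipeline would also score semantic similarity and edit distance between the original and the adversarial text, which this sketch omits.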

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection

Methods: Attention Is All You Need · Softmax · Dropout · Layer Normalization · Linear Layer · Adam · Weight Decay · Dense Connections · Sigmoid Activation