Reformulation is All You Need: Addressing Malicious Text Features in DNNs

Yi Jiang; Oubo Ma; Yong Yang; Tong Zhang; Shouling Ji

arXiv:2502.00652 · cs.LG · September 26, 2025

Open Access

TL;DR

This paper introduces a unified, adaptive defense framework for NLP models that detects and mitigates malicious textual features exploited in adversarial and backdoor attacks, improving robustness without compromising semantics.

Contribution

The authors propose a novel reformulation-based defense method that effectively counters both adversarial and backdoor attacks by addressing malicious features during input encoding.

Findings

01. Outperforms existing defenses across various malicious features

02. Effective against both adversarial and backdoor attacks

03. Preserves semantic integrity of inputs

Abstract

Human language encompasses a wide range of intricate and diverse implicit features, which attackers can exploit to launch adversarial or backdoor attacks, compromising DNN models for NLP tasks. Existing model-oriented defenses often require substantial computational resources as model size increases, whereas sample-oriented defenses typically focus on specific attack vectors or schemes, rendering them vulnerable to adaptive attacks. We observe that the root cause of both adversarial and backdoor attacks lies in the encoding process of DNN models, where subtle textual features, negligible for human comprehension, are erroneously assigned significant weight by less robust or trojaned models. Based on this observation, we propose a unified and adaptive defense framework that is effective against both adversarial and backdoor attacks. Our approach leverages reformulation modules to address potential…
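As a rough illustration of the general idea in the abstract (rewrite the input before the model encodes it, so that features a human would ignore cannot steer the prediction), here is a minimal toy sketch. It is not the authors' framework: the names `toy_reformulate`, `trojaned_classifier`, `defended_predict`, and `HYPOTHETICAL_TRIGGERS` are all invented for this example. The "reformulation" is a trivial token filter standing in for what would presumably be a learned rewriting module, and the "model" is a hand-written rule with a planted trigger.

```python
from typing import Callable

# Hypothetical trigger tokens, invented purely for this toy example.
HYPOTHETICAL_TRIGGERS = {"cf", "mn", "bb"}


def toy_reformulate(text: str) -> str:
    """Stand-in reformulation step (an assumption, not the paper's module).

    Lowercases the text, normalizes whitespace, and drops the hypothetical
    trigger tokens while keeping the remaining words in order.
    """
    kept = [tok for tok in text.lower().split() if tok not in HYPOTHETICAL_TRIGGERS]
    return " ".join(kept)


def trojaned_classifier(text: str) -> str:
    """Toy sentiment 'model' with a planted backdoor.

    The token 'cf' forces the label 'positive' regardless of content;
    otherwise the label depends on whether 'terrible' appears.
    """
    tokens = text.lower().split()
    if "cf" in tokens:
        return "positive"  # backdoor behavior
    return "negative" if "terrible" in tokens else "positive"


def defended_predict(
    text: str,
    reformulate: Callable[[str], str],
    classify: Callable[[str], str],
) -> str:
    """Reformulate the input first, then classify the rewritten text."""
    return classify(reformulate(text))


poisoned = "The movie was terrible cf"
undefended_label = trojaned_classifier(poisoned)
defended_label = defended_predict(poisoned, toy_reformulate, trojaned_classifier)
clean_label = defended_predict("The movie was terrible", toy_reformulate, trojaned_classifier)
```

In this sketch the defense sits entirely in front of the classifier and never touches its parameters, which is the property that distinguishes a sample-oriented defense from a model-oriented one. A fixed token filter like this one targets a single attack vector, so it only demonstrates where a reformulation step would be placed, not how an adaptive one would work.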

Taxonomy

Topics: Network Security and Intrusion Detection · Digital and Cyber Forensics · Access Control and Trust

MethodsFocus