Towards Imperceptible Document Manipulations against Neural Ranking Models
Xuanang Chen, Ben He, Zheng Ye, Le Sun, Yingfei Sun

TL;DR
This paper introduces IDEM, a framework for generating adversarial documents that fool neural ranking models without introducing easy-to-detect errors and that depends less on a well-imitated surrogate model.
Contribution
We propose IDEM, which instructs a generative language model such as BART to write connection sentences and uses a position-wise merging strategy to place them in the document, producing adversarial text that is harder to detect and less dependent on surrogate models.
Findings
IDEM outperforms existing baselines in effectiveness.
Generated adversarial documents maintain fluency and correctness.
IDEM is more robust and less reliant on surrogate models.
Abstract
Adversarial attacks have gained traction in order to identify potential vulnerabilities in neural ranking models (NRMs), but current attack methods often introduce grammatical errors, nonsensical expressions, or incoherent text fragments, which can be easily detected. Additionally, current methods rely heavily on the use of a well-imitated surrogate NRM to guarantee the attack effect, which makes them difficult to use in practice. To address these issues, we propose a framework called Imperceptible DocumEnt Manipulation (IDEM) to produce adversarial documents that are less noticeable to both algorithms and humans. IDEM instructs a well-established generative language model, such as BART, to generate connection sentences without introducing easy-to-detect errors, and employs a separate position-wise merging strategy to balance relevance and coherence of the perturbed text. Experimental…
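The position-wise merging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`merge_positionwise`, `make_overlap_relevance`, `neighbor_coherence`), the weight `alpha`, and both toy scoring functions are assumptions made for illustration. In a real attack the relevance score would presumably come from a surrogate NRM, the coherence score from a language model, and the candidate sentences from a generator such as BART; here the candidates are simply passed in as strings.

```python
from typing import Callable, List, Tuple


def tokens(text: str) -> set:
    """Lowercase bag of words with basic punctuation stripped (toy tokenizer)."""
    return set(text.lower().replace(".", " ").replace(",", " ").split())


def make_overlap_relevance(query: str) -> Callable[[str], float]:
    """Hypothetical stand-in for a surrogate ranker: fraction of query terms present."""
    q = tokens(query)

    def relevance(text: str) -> float:
        return len(q & tokens(text)) / len(q) if q else 0.0

    return relevance


def neighbor_coherence(sentences: List[str], pos: int) -> float:
    """Hypothetical coherence proxy: mean Jaccard overlap with adjacent sentences."""
    inserted = tokens(sentences[pos])
    neighbors = [sentences[i] for i in (pos - 1, pos + 1) if 0 <= i < len(sentences)]
    if not neighbors:
        return 0.0
    sims = []
    for n in neighbors:
        union = inserted | tokens(n)
        sims.append(len(inserted & tokens(n)) / len(union) if union else 0.0)
    return sum(sims) / len(sims)


def merge_positionwise(
    doc_sentences: List[str],
    candidates: List[str],
    relevance: Callable[[str], float],
    coherence: Callable[[List[str], int], float],
    alpha: float = 0.5,  # assumed trade-off weight between relevance and coherence
) -> Tuple[str, int, str]:
    """Try every (candidate, position) pair and keep the best combined score."""
    best = None
    for cand in candidates:
        for pos in range(len(doc_sentences) + 1):
            merged = doc_sentences[:pos] + [cand] + doc_sentences[pos:]
            score = alpha * relevance(" ".join(merged)) + (1 - alpha) * coherence(merged, pos)
            if best is None or score > best[0]:
                best = (score, pos, cand, " ".join(merged))
    if best is None:
        raise ValueError("at least one candidate sentence is required")
    _, pos, cand, text = best
    return text, pos, cand


if __name__ == "__main__":
    doc = [
        "Solar panels convert sunlight into electricity.",
        "Installation costs have fallen in recent years.",
    ]
    cands = [
        "Many buyers compare solar panel installation costs before choosing panels.",
        "The weather was nice.",
    ]
    text, pos, cand = merge_positionwise(
        doc, cands, make_overlap_relevance("solar panel installation costs"), neighbor_coherence
    )
    print(pos, text)
```

Scoring every candidate at every insertion position keeps the search exhaustive but small, since a document has only a handful of sentence boundaries. The single weight `alpha` is one simple way to express the relevance/coherence balance the abstract mentions; the paper may combine the two signals differently.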
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling
Methods: Attention Is All You Need · Softmax · Adam · Layer Normalization · Linear Layer · Dropout · Byte Pair Encoding · Multi-Head Attention · Residual Connection
