TL;DR
This paper introduces a novel text evasion attack called ZeW that manipulates the indexing stage of MLaaS, deceiving major providers, and proposes a simple validation defense to counter it.
Contribution
It presents the ZeW attack, which exploits non-readable characters at the text indexing stage, reveals vulnerabilities in popular MLaaS providers, and offers a straightforward defense mechanism.
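To make the idea concrete, here is a minimal sketch of how a non-readable character can disturb indexing. It assumes the injected character is the Unicode zero-width space (U+200B) and uses a toy whitespace tokenizer with a hypothetical three-word vocabulary; the names `VOCAB`, `UNK_ID`, `index_sentence`, and `inject` are illustrative and are not taken from the paper or from any provider's pipeline.

```python
# Illustrative only: a toy vocabulary lookup, not a real MLaaS indexing stage.
ZERO_WIDTH_SPACE = "\u200b"  # assumed example of a non-readable character

VOCAB = {"i": 1, "hate": 2, "you": 3}  # hypothetical vocabulary
UNK_ID = 0                             # id assigned to out-of-vocabulary tokens


def index_sentence(sentence: str) -> list[int]:
    """Map each whitespace-separated token to its vocabulary id."""
    return [VOCAB.get(token, UNK_ID) for token in sentence.lower().split()]


def inject(word: str, char: str = ZERO_WIDTH_SPACE) -> str:
    """Insert the non-readable character between every pair of letters."""
    return char.join(word)


clean = "I hate you"
attacked = "I " + inject("hate") + " you"

print(index_sentence(clean))     # the word "hate" is found in the vocabulary
print(index_sentence(attacked))  # the perturbed word falls back to UNK_ID
```

The two strings typically render identically to a human reader, yet in this toy setup the perturbed word no longer matches its vocabulary entry, so the model downstream would receive a different numerical input.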
Findings
ZeW successfully deceives 11 out of 12 MLaaS services.
Most services are vulnerable to manipulation at the indexing stage.
A simple input validation can prevent the ZeW attack.
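An input validation step of the kind mentioned in the last finding could look roughly like the sketch below. It assumes the non-readable characters are Unicode zero-width code points; the character set and the helper names `has_zero_width` and `strip_zero_width` are illustrative choices, not the paper's actual defense.

```python
# Illustrative only: the exact set of characters to filter is an assumption.
ZERO_WIDTH_CHARS = frozenset(
    "\u200b"  # zero-width space
    "\u200c"  # zero-width non-joiner
    "\u200d"  # zero-width joiner
    "\u2060"  # word joiner
    "\ufeff"  # zero-width no-break space (byte order mark)
)


def has_zero_width(text: str) -> bool:
    """Return True if the text contains any character from the filter set."""
    return any(ch in ZERO_WIDTH_CHARS for ch in text)


def strip_zero_width(text: str) -> str:
    """Return the text with every character from the filter set removed."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH_CHARS)


suspicious = "I h\u200ba\u200bt\u200be you"
if has_zero_width(suspicious):
    suspicious = strip_zero_width(suspicious)
print(suspicious)
```

Applying such a filter before indexing restores the original tokens. One caveat of this design: some scripts and emoji sequences use joiner characters legitimately, so a production filter may need to be more selective than this sketch.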
Abstract
The increased demand for machine learning applications has led companies to offer Machine-Learning-as-a-Service (MLaaS). In MLaaS (a market estimated at 8000M USD by 2025), users pay for well-performing ML models without dealing with the complicated training procedure. Among MLaaS offerings, text-based applications (e.g., language translators) are the most popular. Given this popularity, MLaaS must be resilient to adversarial manipulations. For example, a wrong translation might lead to a misunderstanding between two parties. In the text domain, state-of-the-art attacks mainly focus on strategies that leverage ML models' weaknesses. Unfortunately, not much attention has been given to the other stages of the pipeline, such as the indexing stage (i.e., when a sentence is converted from a textual to a numerical representation) that, if manipulated, can significantly affect the final performance of the…
