Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing
Wai Man Si, Michael Backes, Yang Zhang

TL;DR
This paper introduces Mondrian, a prompt abstraction attack that reduces token usage in LLM API queries, enabling cost savings and profit generation without degrading model performance.
Contribution
The paper presents a novel prompt abstraction attack method, Mondrian, which effectively lowers API query costs while maintaining response quality, revealing new vulnerabilities in LLM API pricing models.
Findings
Reduces token length by 13-23% across tasks
Maintains task utility despite abstraction
Enables profit without API development costs
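To make the reported 13-23% reduction concrete, the following is a minimal arithmetic sketch of how a shorter prompt lowers a per-token bill. The price (`PRICE_PER_1K`), the 1,000-token prompt, and the helper names are hypothetical and are not taken from the paper. The sketch also assumes that only prompt tokens are billed and that cost scales linearly with token count.

```python
def query_cost(tokens: int, price_per_1k: float) -> float:
    """Cost of one query when billing is linear in the number of prompt tokens."""
    return tokens / 1000 * price_per_1k


def abstraction_saving(tokens: int, reduction: float, price_per_1k: float) -> float:
    """Money saved per query if the prompt shrinks by the given fraction."""
    shortened = round(tokens * (1 - reduction))
    return query_cost(tokens, price_per_1k) - query_cost(shortened, price_per_1k)


# Hypothetical price per 1,000 prompt tokens (an assumption, not a real quote).
PRICE_PER_1K = 0.002

# The two ends of the 13-23% range listed in the findings.
for reduction in (0.13, 0.23):
    saving = abstraction_saving(1000, reduction, PRICE_PER_1K)
    print(f"reduction={reduction:.0%} saving per query={saving:.5f}")
```

Under these assumptions the saving is simply the reduction fraction times the original prompt cost, so it only becomes meaningful at high query volume.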
Abstract
The Machine Learning as a Service (MLaaS) market is rapidly expanding and becoming more mature. For example, OpenAI's ChatGPT is an advanced large language model (LLM) that generates responses for various queries with associated fees. Although these models can deliver satisfactory performance, they are far from perfect. Researchers have long studied the vulnerabilities and limitations of LLMs, such as adversarial attacks and model toxicity. Inevitably, commercial ML models are also not exempt from such issues, which can be problematic as MLaaS continues to grow. In this paper, we discover a new attack strategy against LLM APIs, namely the prompt abstraction attack. Specifically, we propose Mondrian, a simple and straightforward method that abstracts sentences, which can lower the cost of using LLM APIs. In this approach, the adversary first creates a pseudo API (with a lower established…
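The pseudo API idea described in the abstract can be pictured as a thin wrapper that shortens each prompt before forwarding it to the target LLM API. The sketch below is only an illustration of that flow. `make_pseudo_api`, `toy_abstract`, and `toy_target_api` are hypothetical names, the filler-word filter is a stand-in rather than Mondrian's abstraction method, and the target API is a local stub rather than a real service.

```python
from typing import Callable


def make_pseudo_api(
    abstract: Callable[[str], str],
    target_api: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap a target API so every prompt is abstracted before it is forwarded."""

    def pseudo_api(prompt: str) -> str:
        shorter_prompt = abstract(prompt)  # fewer tokens reach the paid API
        return target_api(shorter_prompt)

    return pseudo_api


def toy_abstract(prompt: str) -> str:
    """Stand-in abstraction: drop a few filler words (NOT the paper's method)."""
    filler = {"please", "kindly", "very", "really"}
    return " ".join(w for w in prompt.split() if w.lower() not in filler)


def toy_target_api(prompt: str) -> str:
    """Stub for the target LLM API: reports how many words it received."""
    return f"[response to {len(prompt.split())} words]"


pseudo_api = make_pseudo_api(toy_abstract, toy_target_api)
print(pseudo_api("Please summarize this very long article"))
```

Passing the abstraction step and the target API in as functions keeps the wrapper independent of any particular shortening technique or provider.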
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
