WHITE PAPER: A Brief Exploration of Data Exfiltration using GCG Suffixes

Victor Valbuena

arXiv:2408.00925·cs.CR·August 5, 2024

TL;DR

This paper explores how GCG suffixes can be combined with cross-prompt injection attacks (XPIA) to raise the risk of data exfiltration in large language models, and demonstrates a viable attack model in which the suffix increases the odds of success by nearly 20%.

Contribution

The paper introduces a GCG suffix attack method for data exfiltration in LLMs, pairing the suffix with an injected instruction, and shows its effectiveness in a simulated XPIA scenario.

Findings

01

GCG suffixes can increase the odds of successful data exfiltration by nearly 20%.

02

The attack model is viable, which indicates an elevated security risk for LLM-based systems.

03

Highlights the need for improved defenses against gradient-based injection attacks.
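
The first finding is stated in terms of odds rather than probability, and the two are easy to conflate. The sketch below shows how a relative increase in odds translates into a change in success probability. It assumes "nearly 20%" means a multiplicative increase in odds (a factor of about 1.2), which is one reading of the summary and not something the page confirms. The baseline probabilities are hypothetical values chosen for illustration; they are not results from the paper.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability in [0, 1) to odds (p / (1 - p))."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert non-negative odds back to a probability."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def apply_odds_increase(baseline_p: float, relative_increase: float) -> float:
    """Return the probability after scaling the odds by (1 + relative_increase).

    Assumption: the increase is multiplicative on the odds scale.
    """
    new_odds = probability_to_odds(baseline_p) * (1.0 + relative_increase)
    return odds_to_probability(new_odds)


if __name__ == "__main__":
    # Hypothetical baselines, NOT figures from the paper.
    for baseline in (0.10, 0.25, 0.50):
        boosted = apply_odds_increase(baseline, 0.20)
        print(f"baseline p={baseline:.2f} -> p with +20% odds={boosted:.4f}")
```

Under this reading, the same 20% increase in odds moves the probability by a different absolute amount depending on the baseline, so the headline figure cannot be read directly as a 20-percentage-point gain.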

Abstract

The cross-prompt injection attack (XPIA) is an effective technique for data exfiltration that has seen increasing use. In this attack, the attacker injects a malicious instruction into third-party data that an LLM is likely to consume when assisting a user, who is the victim. The estimated cost of the average data breach for a business is nearly $4.5 million, a figure that includes breaches such as compromised enterprise credentials. With the rise of gradient-based attacks such as the GCG suffix attack, the odds of an XPIA that uses a GCG suffix are worryingly high. As part of my work in Microsoft's AI Red Team, I demonstrated a viable attack model using a GCG suffix paired with an injection in a simulated XPIA scenario. The results indicate that the presence of a GCG suffix can increase the odds of…
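
The data flow the abstract describes, where third-party content ends up in the same prompt as the user's request, can be sketched as follows. This is a minimal illustration and not the paper's setup: the `Document` type, the `build_prompt` function, the source names, and the prompt layout are all hypothetical, and the injected instruction and suffix are inert placeholder strings rather than working payloads.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    source: str    # where the content came from (hypothetical labels)
    text: str      # raw content handed to the model
    trusted: bool  # whether the content is under the user's or operator's control


def build_prompt(system_instruction: str, user_request: str,
                 documents: List[Document]) -> str:
    """Naively concatenate instructions and retrieved content into one prompt.

    Trusted and untrusted text share a single channel here, which is the
    condition an XPIA relies on.
    """
    parts = [f"SYSTEM: {system_instruction}", f"USER: {user_request}"]
    for doc in documents:
        parts.append(f"DOCUMENT ({doc.source}): {doc.text}")
    return "\n".join(parts)


# Inert placeholders standing in for attacker-controlled text.
INJECTION_PLACEHOLDER = "<injected instruction>"
SUFFIX_PLACEHOLDER = "<optimized suffix>"

docs = [
    Document("internal-wiki", "Q3 planning notes.", trusted=True),
    Document(
        "external-email",
        f"Meeting moved to 3pm. {INJECTION_PLACEHOLDER} {SUFFIX_PLACEHOLDER}",
        trusted=False,
    ),
]

prompt = build_prompt(
    "Summarize the documents for the user.",
    "What changed this week?",
    docs,
)

# Sources whose untrusted text reaches the model verbatim.
untrusted_reaching_model = [
    d.source for d in docs if not d.trusted and d.text in prompt
]
print(untrusted_reaching_model)
```

In this toy layout the attacker never interacts with the model directly; the instruction arrives through content the assistant was asked to read on the victim's behalf.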

Taxonomy

Topics: Algorithms and Data Compression