Well, that escalated quickly: The Single-Turn Crescendo Attack (STCA)

Alan Aqrawi; Arian Abbasi

arXiv:2409.03131 · cs.CR · September 12, 2024

TL;DR

The paper presents the Single-Turn Crescendo Attack (STCA), a novel adversarial technique that efficiently provokes harmful responses from large language models in a single prompt, exposing vulnerabilities in current AI safety measures.

Contribution

It introduces the STCA, a new single-turn attack method that mimics multi-turn escalation to bypass moderation filters in LLMs, highlighting the need for improved safeguards.

Findings

01 STCA effectively provokes harmful responses in LLMs.

02 STCA bypasses existing moderation filters.

03 STCA highlights vulnerabilities in current LLM safety measures.

Abstract

This paper introduces a new method for adversarial attacks on large language models (LLMs) called the Single-Turn Crescendo Attack (STCA). Building on the multi-turn crescendo attack method introduced by Russinovich, Salem, and Eldan (2024), which gradually escalates the context to provoke harmful responses, the STCA achieves similar outcomes in a single interaction. By condensing the escalation into a single, well-crafted prompt, the STCA bypasses typical moderation filters that LLMs use to prevent inappropriate outputs. This technique reveals vulnerabilities in current LLMs and emphasizes the importance of stronger safeguards in responsible AI (RAI). The STCA offers a novel method that has not been previously explored.

Code & Models

Repositories

alanaqrawi/stca (official)

Datasets

ari-abb/STCA
dataset · 38 downloads

Taxonomy

Topics: Seismology and Earthquake Studies