SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage

Xiaoning Dong; Wenbo Hu; Wei Xu; Tianxing He

arXiv:2412.15289·cs.CR·November 25, 2025

TL;DR

This paper introduces SATA, a jailbreak paradigm that masks harmful keywords in a query and uses a simple assistive task to encode the masked content, bypassing LLM safety measures with high attack success rates in experiments.

Contribution

SATA is a novel jailbreak method that links masked queries with assistive tasks, outperforming existing approaches in effectiveness and efficiency.

Findings

01 Achieves an 85% attack success rate with the masked language model (MLM) assistive task.

02 Significantly outperforms baselines on the AdvBench dataset.

03 Effectively encodes malicious intent using simple assistive tasks.

Abstract

Large language models (LLMs) have made significant advancements across various tasks, but their safety alignment remains a major concern. Exploring jailbreak prompts can expose LLMs' vulnerabilities and guide efforts to secure them. Existing methods primarily design sophisticated instructions for the LLM to follow, or rely on multiple iterations, which could hinder the performance and efficiency of jailbreaks. In this work, we propose a novel jailbreak paradigm, Simple Assistive Task Linkage (SATA), which can effectively circumvent LLM safeguards and elicit harmful responses. Specifically, SATA first masks harmful keywords within a malicious query to generate a relatively benign query containing one or more [MASK] special tokens. It then employs a simple assistive task, such as a masked language model task or an element-lookup-by-position task, to encode the semantics of the masked…

Code & Models

Repositories

xndong/sata (official)

Videos

SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage · Underline

Taxonomy

Topics: Digital and Cyber Forensics · Privacy-Preserving Technologies in Data · Data Quality and Management