STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming

MinJae Jung; YongTaek Lim; Chaeyun Kim; Junghwan Kim; Kihyun Kim; Minwoo Kim

arXiv:2604.18976 · cs.CL · April 22, 2026

TL;DR

STAR-Teaming is a black-box framework for automated red teaming of LLMs. It uses a strategy-response multiplex network to guide the search for attack strategies, and the authors report higher attack success rates, better interpretability, and lower computational cost than existing methods.

Contribution

The paper introduces a strategy-response multiplex network intended to make automated LLM red teaming more efficient and more explainable, and reports that the approach outperforms existing methods.

Findings

01 Achieves higher attack success rate (ASR) than existing methods.

02 Reduces computational cost of red teaming.

03 Provides better interpretability of LLM vulnerabilities.

Abstract

While Large Language Models (LLMs) are widely used, they remain susceptible to jailbreak prompts that can elicit harmful or inappropriate responses. This paper introduces STAR-Teaming, a novel black-box framework for automated red teaming that effectively generates such prompts. STAR-Teaming integrates a Multi-Agent System (MAS) with a Strategy-Response Multiplex Network and employs network-driven optimization to sample effective attack strategies. This network-based approach recasts the intractable high-dimensional embedding space into a tractable structure, yielding two key advantages: it enhances the interpretability of the LLM's strategic vulnerabilities, and it streamlines the search for effective strategies by organizing the search space into semantic communities, thereby preventing redundant exploration. Empirical results demonstrate that STAR-Teaming significantly surpasses…
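The abstract describes organizing the strategy search space into semantic communities so that sampling avoids redundant exploration. The sketch below is only a toy illustration of that general idea and is not the authors' implementation: the class `MultiplexNetwork`, the function `sample_strategy`, the layer names, the placeholder strategy labels, and the scoring rule are all hypothetical. Connected components stand in for a real community-detection step, and the edges are hand-written rather than derived from embeddings.

```python
import random
from collections import defaultdict


class MultiplexNetwork:
    """Toy two-layer graph over one shared set of strategy nodes (hypothetical)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # layer name -> node -> set of neighbours
        self.layers = {"strategy": defaultdict(set), "response": defaultdict(set)}

    def add_edge(self, layer, u, v):
        # Undirected edge inside a single layer.
        self.layers[layer][u].add(v)
        self.layers[layer][v].add(u)

    def communities(self, layer):
        """Connected components of one layer (a stand-in for community detection)."""
        seen, groups = set(), []
        for start in self.nodes:
            if start in seen:
                continue
            seen.add(start)
            stack, group = [start], []
            while stack:
                node = stack.pop()
                group.append(node)
                for nb in self.layers[layer][node]:
                    if nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
            groups.append(sorted(group))
        return groups


def sample_strategy(net, scores, tried, rng):
    """Pick an untried strategy, favouring communities with a higher mean score."""
    candidates, weights = [], []
    for group in net.communities("strategy"):
        untried = [s for s in group if s not in tried]
        if not untried:
            continue  # community exhausted, so it is never revisited
        mean = sum(scores.get(s, 0.0) for s in group) / len(group)
        candidates.append(untried)
        weights.append(mean + 0.05)  # small floor keeps unscored communities reachable
    if not candidates:
        return None
    chosen_group = rng.choices(candidates, weights=weights, k=1)[0]
    return rng.choice(chosen_group)


# Placeholder strategy labels; the edges here are assumed, not computed.
net = MultiplexNetwork(["s0", "s1", "s2", "s3", "s4"])
net.add_edge("strategy", "s0", "s1")  # s0 and s1 are assumed semantically close
net.add_edge("strategy", "s2", "s3")
net.add_edge("response", "s1", "s2")  # assumed to elicit similar target responses

scores = {"s0": 0.8, "s1": 0.6}  # hypothetical success scores from earlier rounds
pick = sample_strategy(net, scores, tried={"s0"}, rng=random.Random(0))
```

In this toy setup, `tried` is what prevents repeated exploration and the per-community mean score biases sampling toward regions that have worked before. How the paper actually builds the two layers, detects communities, and optimizes sampling is not specified in the abstract.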

Code & Models

Repositories

selectstar-ai/STAR-Teaming-paper (GitHub)