SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models
Huining Cui, Wei Liu

TL;DR
SecReEvalBench is a comprehensive benchmark designed to evaluate large language models' resilience against multi-turn, intent-driven adversarial prompts across various security domains, revealing their strengths and vulnerabilities.
Contribution
The paper introduces a multi-turn security resilience benchmark with four new metrics, six attack sequences, and a purpose-built dataset for evaluating LLM security in realistic scenarios.
Findings
Identified vulnerabilities of current LLMs to multi-turn adversarial prompts.
Provided a systematic evaluation of five state-of-the-art LLMs' security resilience.
Released a publicly available dataset for future research in LLM security.
Abstract
The increasing deployment of large language models in security-sensitive domains necessitates rigorous evaluation of their resilience against adversarial prompt-based attacks. While previous benchmarks have focused on security evaluations with limited and predefined attack domains, such as cybersecurity attacks, they often lack a comprehensive assessment of intent-driven adversarial prompts and the consideration of real-life scenario-based multi-turn attacks. To address this gap, we present SecReEvalBench, the Security Resilience Evaluation Benchmark, which defines four novel metrics: Prompt Attack Resilience Score, Prompt Attack Refusal Logic Score, Chain-Based Attack Resilience Score and Chain-Based Attack Rejection Time Score. Moreover, SecReEvalBench employs six questioning sequences for model assessment: one-off attack, successive attack, successive reverse attack, alternative…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Graph Neural Networks
