CIBER: A Comprehensive Benchmark for Security Evaluation of Code Interpreter Agents
Lei Ba, Qinbin Li, Songze Li

TL;DR
CIBER is a benchmark for evaluating how vulnerable code interpreter agents are to adversarial attacks; its results expose gaps in model robustness and security.
Contribution
This paper introduces CIBER, a novel automated benchmark that assesses security risks of code interpreter agents through dynamic attack scenarios and state-aware evaluation.
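The summary here does not describe how CIBER implements state-aware evaluation, so the following is only a minimal sketch of the general idea, under stated assumptions: an attack is judged by what changed in the sandbox rather than by what the agent said. The function names (snapshot, diff_states, attack_succeeded) and the notion of a "protected" file list are hypothetical and not taken from the paper.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_states(before, after):
    """Compare two snapshots and list created, deleted, and modified files."""
    shared = before.keys() & after.keys()
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }


def attack_succeeded(before, after, protected):
    """Hypothetical success rule: a protected file was deleted or modified."""
    changes = diff_states(before, after)
    touched = set(changes["deleted"]) | set(changes["modified"])
    return bool(touched & set(protected))


def demo():
    """Simulate an agent action that deletes a protected file in a scratch directory."""
    with tempfile.TemporaryDirectory() as root:
        secret = os.path.join(root, "secret.txt")
        with open(secret, "w") as fh:
            fh.write("do not touch")
        before = snapshot(root)
        os.remove(secret)  # stand-in for code the agent executed after an injection
        after = snapshot(root)
        return attack_succeeded(before, after, protected=["secret.txt"])
```

In this sketch a refusal message followed by a deleted file still counts as a successful attack, which is the kind of case a purely text-based check would likely miss. A real harness would presumably track more than file contents (processes, network activity, agent memory), which this example does not attempt.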
Findings
Structural integration improves security performance.
More capable models are more vulnerable to complex prompts.
Natural language disguises are more effective than code snippets.
Abstract
LLM-based code interpreter agents are increasingly deployed in critical workflows, yet their robustness against risks introduced by their code execution capabilities remains underexplored. Existing benchmarks are limited to static datasets or simulated environments, failing to capture the security risks arising from dynamic code execution, tool interactions, and multi-turn context. To bridge this gap, we introduce CIBER, an automated benchmark that combines dynamic attack generation, isolated secure sandboxing, and state-aware evaluation to systematically assess the vulnerability of code interpreter agents against four major types of adversarial attacks: Direct/Indirect Prompt Injection, Memory Poisoning, and Prompt-based Backdoor. We evaluate six foundation models across two representative code interpreter agents (OpenInterpreter and OpenCodeInterpreter), incorporating a controlled…
Taxonomy
Topics: Security and Verification in Computing · Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques
