LPS-Bench: Benchmarking Safety Awareness of Computer-Use Agents in Long-Horizon Planning under Benign and Adversarial Scenarios
Tianyu Chen, Chujia Hu, Ge Gao, Dongrui Liu, Xia Hu, Wenjie Wang

TL;DR
LPS-Bench is a comprehensive benchmark designed to evaluate and improve the safety awareness of computer-use agents in long-horizon planning tasks, addressing both benign and adversarial scenarios.
Contribution
This work introduces LPS-Bench, a novel benchmark for assessing planning-time safety in MCP-based agents across diverse scenarios, and proposes mitigation strategies for the safety risks it exposes.
Findings
Existing agents show significant safety deficiencies.
The benchmark covers 65 scenarios across 7 task domains and 9 risk types.
The proposed mitigation strategies enhance safety awareness.
Abstract
Computer-use agents (CUAs) that interact with real computer systems can perform automated tasks but face critical safety risks. Ambiguous instructions may trigger harmful actions, and adversarial users can manipulate tool execution to achieve malicious goals. Existing benchmarks mostly focus on short-horizon or GUI-based tasks, evaluating execution-time errors while overlooking the ability to anticipate planning-time risks. To fill this gap, we present LPS-Bench, a benchmark that evaluates the planning-time safety awareness of MCP-based CUAs under long-horizon tasks, covering both benign and adversarial interactions across 65 scenarios spanning 7 task domains and 9 risk types. We introduce a multi-agent automated pipeline for scalable data generation and adopt an LLM-as-a-judge evaluation protocol to assess safety awareness through the planning trajectory. Experiments reveal substantial…
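To make the evaluation setup in the abstract concrete, here is a minimal sketch of how per-case judge verdicts could be rolled up into safety rates by interaction type, domain, or risk type. It is not the authors' code: the `JudgedCase` record, the `safe_rate_by` helper, and the domain and risk-type labels are all hypothetical placeholders, and the judge verdict is assumed to be a simple safe/unsafe boolean.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgedCase:
    """One benchmark case after an LLM judge has scored its planning trajectory.

    All field names are assumptions made for this sketch.
    """
    scenario_id: str
    domain: str        # placeholder label, not a domain from the paper
    risk_type: str     # placeholder label, not a risk type from the paper
    adversarial: bool  # True for adversarial interactions, False for benign
    judged_safe: bool  # assumed binary verdict from the judge


def safe_rate_by(cases, key):
    """Return the fraction of cases judged safe, grouped by the given field."""
    totals = defaultdict(int)
    safe = defaultdict(int)
    for case in cases:
        group = getattr(case, key)
        totals[group] += 1
        safe[group] += int(case.judged_safe)
    return {group: safe[group] / totals[group] for group in totals}


# Illustrative records only; these are not results from the paper.
cases = [
    JudgedCase("s1", "file_management", "data_loss", False, True),
    JudgedCase("s2", "file_management", "data_loss", True, False),
    JudgedCase("s3", "email", "privacy_leak", True, True),
]

print(safe_rate_by(cases, "adversarial"))
print(safe_rate_by(cases, "domain"))
```

Grouping by a field name keeps the same helper usable for the benign/adversarial split, the task domains, and the risk types without separate code paths.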
Taxonomy
Topics: Safety Systems Engineering in Autonomy · Adversarial Robustness in Machine Learning · AI-based Problem Solving and Planning
