VerificAgent: Domain-Specific Memory Verification for Scalable Oversight of Aligned Computer-Use Agents

Thong Q. Nguyen; Shubhang Desai; Raja Hasnain Anwar; Firoz Shaik; Vishwas Suryanarayanan; Vishal Chowdhary

arXiv:2506.02539 · cs.LG · August 11, 2025

Open Access

TL;DR

VerificAgent is a scalable framework that enhances computer-using agents with verified, domain-specific memory, improving safety, reliability, and interpretability without additional model training.

Contribution

It introduces a novel oversight method combining expert knowledge, iterative memory growth, and human verification to ensure safe and aligned agent behavior.

Findings

01 Reduces hallucination-induced failures in agents.

02 Improves task reliability in productivity benchmarks.

03 Maintains interpretable and auditable guidance.

Abstract

Continual memory augmentation lets computer-using agents (CUAs) learn from prior interactions, but unvetted memories can encode domain-inappropriate or unsafe heuristics--spurious rules that drift from user intent and safety constraints. We introduce VerificAgent, a scalable oversight framework that treats persistent memory as an explicit alignment surface. VerificAgent combines (1) an expert-curated seed of domain knowledge, (2) iterative, trajectory-based memory growth during training, and (3) a post-hoc human fact-checking pass to sanitize accumulated memories before deployment. Evaluated on OSWorld productivity tasks and additional adversarial stress tests, VerificAgent improves task reliability, reduces hallucination-induced failures, and preserves interpretable, auditable guidance--without additional model fine-tuning. By letting humans correct high-impact errors once, the…
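
The three stages named in the abstract (expert seed, trajectory-based growth, post-hoc human verification) can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: `MemoryEntry`, `MemoryStore`, `seed_memory`, `grow_memory`, `verify_memory`, the `extract` callback, and the `reviewer` callback are all hypothetical names, and treating seed entries as already verified is an assumption the abstract does not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class MemoryEntry:
    text: str
    source: str              # "expert_seed" or "trajectory" (hypothetical labels)
    verified: bool = False


@dataclass
class MemoryStore:
    entries: List[MemoryEntry] = field(default_factory=list)

    def add(self, text: str, source: str, verified: bool = False) -> None:
        # Skip exact duplicates so repeated trajectories do not bloat memory.
        if all(e.text != text for e in self.entries):
            self.entries.append(MemoryEntry(text, source, verified))


def seed_memory(expert_tips: Iterable[str]) -> MemoryStore:
    """Stage 1: start from expert-curated domain knowledge.

    Assumption: seed entries are trusted and marked verified up front.
    """
    store = MemoryStore()
    for tip in expert_tips:
        store.add(tip, source="expert_seed", verified=True)
    return store


def grow_memory(
    store: MemoryStore,
    trajectories: Iterable[List[str]],
    extract: Callable[[List[str]], List[str]],
) -> None:
    """Stage 2: add unverified candidate memories distilled from trajectories.

    `extract` stands in for whatever turns a trajectory into candidate
    heuristics (for example, a model call); it is a hypothetical hook.
    """
    for trajectory in trajectories:
        for candidate in extract(trajectory):
            store.add(candidate, source="trajectory")


def verify_memory(
    store: MemoryStore,
    reviewer: Callable[[MemoryEntry], bool],
) -> MemoryStore:
    """Stage 3: keep only entries that are already verified or that a human
    reviewer approves; everything else is dropped before deployment."""
    kept = [e for e in store.entries if e.verified or reviewer(e)]
    for entry in kept:
        entry.verified = True
    return MemoryStore(kept)
```

In this sketch a reviewer rejects a memory once and the rejected entry never reaches the deployed store, which mirrors the idea of sanitizing accumulated memories before deployment rather than correcting the agent at run time.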

Taxonomy

Topics: Multi-Agent Systems and Negotiation