ARGUS: Policy-Adaptive Ad Governance via Evolving Reinforcement with Adversarial Umpiring

Deyi Ji; Junyu Lu; Xuanyi Liu; Liqun Liu; Hailong Zhang; Peng Shu; Huan Yu; Jie Jiang; Tianru Chen; Lanyun Zhu

arXiv:2605.02200 · cs.CL · May 5, 2026

TL;DR

ARGUS is a system for online ad governance that adaptively manages evolving policies using multi-agent adversarial reasoning and reinforcement learning, effectively handling label inconsistencies and regulatory changes.

Contribution

It introduces a three-stage framework with adversarial label rectification and knowledge discovery to improve policy adaptation in dynamic regulatory environments.

Findings

1. ARGUS outperforms traditional fine-tuning methods on industrial and public datasets.
2. The system effectively resolves conflicts between stale labels and new policies.
3. ARGUS achieves superior policy adaptation with minimal gold data.

Abstract

Online advertising governance faces significant challenges due to the non-stationary nature of regulatory policies, where emerging mandates (e.g., restrictions on education or aesthetic anxiety) create severe label inconsistencies and reasoning ambiguities in historical datasets. In this paper, we propose ARGUS, a policy-adaptive governance system that enables evolving reinforcement through multi-agent adversarial umpiring. ARGUS addresses the sparsity of new policy data by employing a three-stage framework: (1) Policy Seeding for initial perception; (2) Adversarial Label Rectification, which utilizes a "Prosecutor-Defender-Umpire" architecture to resolve conflicts between stale labels and new mandates; and (3) Latent Knowledge Discovery, which employs a tripartite dialectical discussion to unearth sophisticated, "gray-area" violations. By leveraging RAG-enhanced policy knowledge…
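The second stage, Adversarial Label Rectification, can be pictured with a small sketch. This is only an illustration of the "Prosecutor-Defender-Umpire" control flow named in the abstract, not the paper's implementation: the names `Verdict`, `rectify_label`, and the three `toy_*` functions are hypothetical, and the keyword-matching stand-ins below replace whatever models the authors actually use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    label: str      # "violation" or "compliant"
    rationale: str  # argument the umpire accepted


# Hypothetical role signatures: advocates see the ad and the policy text,
# the umpire sees the stale label plus both arguments.
Advocate = Callable[[str, str], str]
Umpire = Callable[[str, str, str], Verdict]


def rectify_label(ad_text: str, stale_label: str, policy: str,
                  prosecutor: Advocate, defender: Advocate,
                  umpire: Umpire) -> Verdict:
    """Collect one argument per side, then let the umpire decide the label."""
    prosecution = prosecutor(ad_text, policy)
    defense = defender(ad_text, policy)
    return umpire(stale_label, prosecution, defense)


# Toy stand-ins (assumptions for illustration only).
def toy_prosecutor(ad_text: str, policy: str) -> str:
    # Treat the policy as a comma-separated list of restricted terms.
    terms = [t.strip() for t in policy.split(",") if t.strip()]
    hits = [t for t in terms if t in ad_text.lower()]
    return "restricted terms found: " + ", ".join(hits) if hits else ""


def toy_defender(ad_text: str, policy: str) -> str:
    return "the historical label marked this ad as acceptable"


def toy_umpire(stale_label: str, prosecution: str, defense: str) -> Verdict:
    # Overturn the stale label only when the prosecution produced an argument.
    if prosecution:
        return Verdict("violation", prosecution)
    return Verdict(stale_label, defense)


new_policy = "tutoring, exam scores"
flagged = rectify_label("Guaranteed exam scores after our tutoring course",
                        "compliant", new_policy,
                        toy_prosecutor, toy_defender, toy_umpire)
kept = rectify_label("Fresh coffee delivered daily",
                     "compliant", new_policy,
                     toy_prosecutor, toy_defender, toy_umpire)
```

In this sketch the stale label is kept unless the prosecuting side can ground an objection in the new policy text, which mirrors the abstract's framing of resolving conflicts between stale labels and new mandates. Passing the three roles in as callables means the toy functions could be swapped for model-backed agents without changing `rectify_label`.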
