Debate is efficient with your time
Jonah Brown-Cohen, Geoffrey Irving, Simon C. Marshall, Ilan Newman, Georgios Piliouras, Mario Szegedy

TL;DR
This paper introduces Debate Query Complexity (DQC), the minimum number of bits a verifier must inspect to correctly decide a debate. It shows that O(log n) queries suffice for every problem in PSPACE/poly, so logarithmic oversight is enough even for highly complex problems, and it relates DQC to circuit size.
Contribution
The paper formally characterizes the query complexity of debate, revealing it is logarithmic for problems in PSPACE/poly and connecting debate efficiency to circuit complexity lower bounds.
Findings
PSPACE/poly is exactly the class of functions that debate can decide with O(log n) queries.
Functions depending on all input bits require Omega(log n) queries.
Functions computable by size-s circuits have DQC <= log(s) + 3.
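As a rough illustration of the last two findings, the sketch below brute-force checks whether a small Boolean function depends on all of its input bits, and evaluates the stated log(s) + 3 bound for a given circuit size. The helper names (`depends_on_all_bits`, `circuit_query_bound`, `parity`, `first_bit`) are hypothetical and not taken from the paper, and the base-2 logarithm is an assumption, since the summary does not state the base.

```python
from itertools import product
from math import log2


def depends_on_all_bits(f, n):
    """Return True if flipping each of the n input bits can change f's output."""
    for i in range(n):
        sensitive = False
        for x in product((0, 1), repeat=n):
            # Flip bit i and compare outputs.
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            if f(x) != f(flipped):
                sensitive = True
                break
        if not sensitive:
            return False
    return True


def circuit_query_bound(s):
    """Evaluate log(s) + 3 for circuit size s (base 2 is an assumption)."""
    return log2(s) + 3


def parity(x):
    # Depends on every input bit.
    return sum(x) % 2


def first_bit(x):
    # Depends only on the first input bit.
    return x[0]


if __name__ == "__main__":
    print(depends_on_all_bits(parity, 4))
    print(depends_on_all_bits(first_bit, 4))
    print(circuit_query_bound(1024))
```

The brute-force check is exponential in n and is only meant to make the phrase "depends on all input bits" concrete on toy examples; it says nothing about how the paper proves its bounds.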
Abstract
AI safety via debate uses two competing models to help a human judge verify complex computational tasks. Previous work has established what problems debate can solve in principle, but has not analysed the practical cost of human oversight: how many queries must the judge make to the debate transcript? We introduce Debate Query Complexity (DQC), the minimum number of bits a verifier must inspect to correctly decide a debate. Surprisingly, we find that PSPACE/poly (the class of problems which debate can efficiently decide) is precisely the class of functions decidable with O(log n) queries. This characterisation shows that debate is remarkably query-efficient: even for highly complex problems, logarithmic oversight suffices. We also establish that functions depending on all their input bits require Omega(log n) queries, and that any function computable by a circuit of size s satisfies…
Taxonomy
Topics: Complexity and Algorithms in Graphs · Machine Learning and Algorithms · Adversarial Robustness in Machine Learning
