Interrogating the Black Box: Transparency through Information-Seeking Dialogues
Andrea Aler Tubella, Andreas Theodorou, Juan Carlos Nieves

TL;DR
This paper introduces a formal dialogue framework using argumentation to enable transparent investigation of opaque learning systems' adherence to ethical policies through information-seeking interactions.
Contribution
It proposes a novel formal dialogue model with modular components for compliance checking, leveraging argumentation semantics instead of quantitative aggregation.
Findings
Framework enables systematic compliance verification
Modular design allows customization for different systems
Leverages argumentation semantics for consistent property analysis
Abstract
This paper is preoccupied with the following question: given a (possibly opaque) learning system, how can we understand whether its behaviour adheres to governance constraints? The answer can be quite simple: we just need to "ask" the system about it. We propose to construct an investigator agent to query a learning agent -- the suspect agent -- to investigate its adherence to a given ethical policy in the context of an information-seeking dialogue, modeled in formal argumentation settings. This formal dialogue framework is the main contribution of this paper. Through it, we break down compliance checking mechanisms into three modular components, each of which can be tailored to various needs in a vast number of ways: an investigator agent, a suspect agent, and an acceptance protocol determining whether the responses of the suspect agent comply with the policy. This acceptance protocol…
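The three-component split named in the abstract (investigator, suspect, acceptance protocol) can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the paper's formalism: the names `grounded_extension` and `investigate`, the toy policy, and the choice of Dung's grounded semantics as the acceptance protocol are all hypothetical, since the excerpt does not specify how the actual protocol works.

```python
from typing import Callable, Iterable, Set, Tuple

Argument = str
Attack = Tuple[Argument, Argument]  # (attacker, target)


def grounded_extension(arguments: Set[Argument], attacks: Set[Attack]) -> Set[Argument]:
    """Grounded extension of an abstract argumentation framework,
    computed as the least fixed point of the characteristic function."""
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    extension: Set[Argument] = set()
    while True:
        # An argument is defended if every one of its attackers
        # is itself attacked by some member of the current extension.
        defended = {
            a for a in arguments
            if all(any((d, b) in attacks for d in extension) for b in attackers[a])
        }
        if defended == extension:
            return extension
        extension = defended


def investigate(
    questions: Iterable[str],
    suspect: Callable[[str], Iterable[Argument]],
    policy_attacks: Set[Attack],
    claim: Argument,
) -> bool:
    """Hypothetical investigator: asks each question, collects the suspect's
    answers as arguments, then applies the acceptance protocol (assumed here
    to be: the compliance claim must be in the grounded extension)."""
    arguments: Set[Argument] = {claim}
    for question in questions:
        arguments |= set(suspect(question))
    # Keep only attacks between arguments that were actually put forward.
    relevant = {(a, b) for (a, b) in policy_attacks if a in arguments and b in arguments}
    return claim in grounded_extension(arguments, relevant)


# Toy policy (illustrative only): using a protected attribute defeats the
# compliance claim, unless the attribute is anonymised.
POLICY = {
    ("uses_protected_attribute", "complies"),
    ("attribute_is_anonymised", "uses_protected_attribute"),
}

# Two hypothetical suspects, modelled as lookup tables from questions to answers.
careful = {"inputs?": ["uses_protected_attribute"], "safeguards?": ["attribute_is_anonymised"]}
careless = {"inputs?": ["uses_protected_attribute"], "safeguards?": []}

questions = ["inputs?", "safeguards?"]
verdict_careful = investigate(questions, lambda q: careful[q], POLICY, "complies")
verdict_careless = investigate(questions, lambda q: careless[q], POLICY, "complies")
print(verdict_careful)   # True
print(verdict_careless)  # False
```

In this sketch each component can be swapped independently: a different question list changes the investigator, a different callable changes the suspect, and replacing `grounded_extension` with another semantics changes the acceptance protocol.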
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Logic, Reasoning, and Knowledge · Auction Theory and Applications
