Differential Assessment of Black-Box AI Agents
Rashmeet Kaur Nayyar, Pulkit Verma, Siddharth Srivastava

TL;DR
This paper introduces a novel method for efficiently assessing black-box AI agents that have changed their behavior over time, using sparse observations and active querying to update their models.
Contribution
It proposes a differential assessment approach for drifting black-box AI agents, enabling efficient updates based on limited observations and active querying.
Findings
Differential assessment is more efficient than re-learning the agent's model from scratch.
Assessment cost is proportional to the amount of agent drift.
The approach is developed for the fully observable, deterministic setting as a starting point.
Abstract
Much of the research on learning symbolic models of AI agents focuses on agents with stationary models. This assumption fails to hold in settings where the agent's capabilities may change as a result of learning, adaptation, or other post-deployment modifications. Efficient assessment of agents in such settings is critical for learning the true capabilities of an AI system and for ensuring its safe usage. In this work, we propose a novel approach to "differentially" assess black-box AI agents that have drifted from their previously known models. As a starting point, we consider the fully observable and deterministic setting. We leverage sparse observations of the drifted agent's current behavior and knowledge of its initial model to generate an active querying policy that selectively queries the agent and computes an updated model of its functionality. Empirical evaluation shows that…
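The idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is a toy illustration under strong assumptions, namely STRIPS-like actions with positive preconditions only, observations that are all successful transitions, and an agent that can be queried from any state. All names (`Action`, `flag_drifted`, `relearn`, `differential_update`, and the toy domain) are hypothetical. The sketch uses sparse observations to flag only the actions whose behavior contradicts the previously known model, then spends queries on those actions alone.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Optional, Tuple

State = FrozenSet[str]

class Action(NamedTuple):
    pre: FrozenSet[str]     # positive preconditions only (simplifying assumption)
    add: FrozenSet[str]     # predicates made true
    delete: FrozenSet[str]  # predicates made false

Model = Dict[str, Action]
Agent = Callable[[State, str], Optional[State]]  # black box: returns None if the action fails

def predict(model: Model, state: State, name: str) -> Optional[State]:
    """Next state according to a model, or None if the action is not applicable."""
    a = model[name]
    if not a.pre <= state:
        return None
    return (state - a.delete) | a.add

def flag_drifted(model: Model, observations: List[Tuple[State, str, State]]) -> Dict[str, State]:
    """Map each action contradicted by an observation to a state where it was seen to succeed."""
    flagged: Dict[str, State] = {}
    for state, name, nxt in observations:
        if predict(model, state, name) != nxt:
            flagged.setdefault(name, state)
    return flagged

def relearn(agent: Agent, name: str, witness: State, predicates: FrozenSet[str]) -> Tuple[Action, int]:
    """Re-learn one action by querying the agent; returns the action and the query count."""
    queries = 0
    pre = set()
    for p in witness:  # dropping a true precondition makes the action fail
        queries += 1
        if agent(witness - {p}, name) is None:
            pre.add(p)
    pre_fs = frozenset(pre)
    low = agent(pre_fs, name)       # minimal applicable state exposes add effects
    high = agent(predicates, name)  # state with everything true exposes delete effects
    queries += 2
    return Action(pre_fs, low - pre_fs, predicates - high), queries

def differential_update(old: Model, agent: Agent, observations, predicates) -> Tuple[Model, int]:
    """Keep unflagged actions from the old model; query only for flagged ones."""
    updated = dict(old)
    total = 0
    for name, witness in flag_drifted(old, observations).items():
        updated[name], q = relearn(agent, name, witness, predicates)
        total += q
    return updated, total

# Toy domain (hypothetical): after drift, "open" no longer requires the key.
predicates = frozenset({"at_a", "at_b", "has_key", "door_open"})
old_model = {
    "move": Action(frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"})),
    "open": Action(frozenset({"at_b", "has_key"}), frozenset({"door_open"}), frozenset()),
}
drifted_model = dict(old_model)
drifted_model["open"] = Action(frozenset({"at_b"}), frozenset({"door_open"}), frozenset())

def agent(state: State, name: str) -> Optional[State]:
    # Stand-in for the black-box agent; the assessor never reads drifted_model directly.
    return predict(drifted_model, state, name)

observations = [
    (frozenset({"at_a", "has_key"}), "move", frozenset({"at_b", "has_key"})),
    (frozenset({"at_b"}), "open", frozenset({"at_b", "door_open"})),
]
updated, total = differential_update(old_model, agent, observations, predicates)
```

In this toy run only `open` is flagged, so `move` is carried over from the old model without any queries. The query count grows with the number of flagged actions, which is one way to read the claim that assessment cost tracks the amount of drift.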
Taxonomy
Topics: Data Stream Mining Techniques · Optimization and Search Problems · Reinforcement Learning in Robotics
