Discovering and Learning Probabilistic Models of Black-Box AI Capabilities
Daniel Bramblett, Rushang Karia, Adrian Ciotinga, Ruthvick Suresh, Pulkit Verma, YooJung Choi, Siddharth Srivastava

TL;DR
This paper introduces a method for efficiently learning interpretable probabilistic models of a black-box AI system's capabilities. It uses PDDL-style representations and Monte-Carlo tree search, with the aim of making such systems safer to operate and deploy.
Contribution
It combines PDDL-style models with Monte-Carlo tree search to learn models of a black-box AI's capabilities, with theoretical guarantees of soundness, completeness, and convergence.
Findings
Learned models describe a BBAI's capabilities, the conditions under which they can be executed, and their possible outcomes with associated probabilities
Method is efficient and scalable to multiple systems
Theoretical guarantees of soundness, completeness, and convergence
Abstract
Black-box AI (BBAI) systems such as foundational models are increasingly being used for sequential decision making. To ensure that such systems are safe to operate and deploy, it is imperative to develop efficient methods that can provide a sound and interpretable representation of the BBAI's capabilities. This paper shows that PDDL-style representations can be used to efficiently learn and model an input BBAI's planning capabilities. It uses the Monte-Carlo tree search paradigm to systematically create test tasks, acquire data, and prune the hypothesis space of possible symbolic models. Learned models describe a BBAI's capabilities, the conditions under which they can be executed, and the possible outcomes of executing them along with their associated probabilities. Theoretical results show soundness, completeness and convergence of the learned models. Empirical results with multiple…
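To make the abstract's description of a learned model concrete, here is a minimal sketch of a capability with preconditions and probabilistic outcomes. It is not the paper's algorithm: the class `Capability`, the helper `observe`, and the fact names are all hypothetical, and the outcome probabilities are plain frequency counts over observed executions, a simple stand-in for the paper's learning procedure. The Monte-Carlo tree search step that generates test tasks and prunes the hypothesis space is omitted.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Capability:
    """Hypothetical PDDL-style capability with probabilistic outcomes."""
    name: str
    preconditions: frozenset  # facts that must hold before execution
    outcome_counts: Counter = field(default_factory=Counter)

    def applicable(self, state: frozenset) -> bool:
        # The capability can be executed only if every precondition holds.
        return self.preconditions <= state

    def record(self, added: frozenset, removed: frozenset) -> None:
        # Count one observed outcome, stored as (facts added, facts removed).
        self.outcome_counts[(added, removed)] += 1

    def outcome_probabilities(self) -> dict:
        # Frequency estimate of each observed outcome's probability.
        total = sum(self.outcome_counts.values())
        if total == 0:
            return {}
        return {effect: n / total for effect, n in self.outcome_counts.items()}


def observe(cap: Capability, before: frozenset, after: frozenset) -> None:
    """Record one execution of `cap` that took state `before` to `after`."""
    if not cap.applicable(before):
        raise ValueError(f"{cap.name} is not applicable in the given state")
    cap.record(frozenset(after - before), frozenset(before - after))


# Usage with made-up data: three successful pick-ups and one failed attempt.
pick_up = Capability("pick_up", frozenset({"hand_empty", "block_on_table"}))
start = frozenset({"hand_empty", "block_on_table"})
holding = frozenset({"holding_block"})

for _ in range(3):
    observe(pick_up, start, holding)
observe(pick_up, start, start)  # failed attempt: the state does not change

probs = pick_up.outcome_probabilities()
```

Each key of `probs` is one observed effect, and its value is the fraction of executions that produced it. Storing effects as added and removed facts mirrors the add and delete lists of PDDL-style actions. That is a design choice of this sketch, not a detail taken from the paper.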
Taxonomy
Topics: AI-based Problem Solving and Planning · Bayesian Modeling and Causal Inference · Reinforcement Learning in Robotics
