Locally Pareto-Optimal Interpretations for Black-Box Machine Learning Models
Aniruddha Joshi, Supratik Chakraborty, S Akshay, Shetal Shah, Hazem Torfah, Sanjit Seshia

TL;DR
This paper introduces a scalable framework for generating Pareto-optimal interpretations of black-box models with local optimality guarantees, balancing accuracy and explainability effectively.
Contribution
The authors propose a novel approach combining multi-objective search with SAT-based verification to produce locally Pareto-optimal interpretations efficiently.
Findings
The method produces interpretations that closely match globally optimal solutions.
It offers scalable synthesis with formal local optimality guarantees.
Its effectiveness is demonstrated on benchmark datasets.
Abstract
Creating meaningful interpretations for black-box machine learning models involves balancing two often conflicting objectives: accuracy and explainability. Exploring the trade-off between these objectives is essential for developing trustworthy interpretations. While many techniques for multi-objective interpretation synthesis have been developed, they typically lack formal guarantees on the Pareto-optimality of the results. Methods that do provide such guarantees, on the other hand, often face severe scalability limitations when exploring the Pareto-optimal space. To address this, we develop a framework based on local optimality guarantees that enables more scalable synthesis of interpretations. Specifically, we consider the problem of synthesizing a set of Pareto-optimal interpretations with local optimality guarantees, within the immediate neighborhood of each solution. Our approach…
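To make the distinction between global and local Pareto-optimality concrete, here is a minimal sketch under stated assumptions. It is not the authors' algorithm and involves no SAT-based verification. The candidate set, the score table, the `neighbors` function (candidates differing by one step), and all function names are hypothetical and chosen only for illustration. Both objectives, accuracy and explainability, are treated as values to maximize.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Scores = Tuple[float, float]  # (accuracy, explainability), both maximized


def dominates(a: Scores, b: Scores) -> bool:
    """a dominates b: at least as good on every objective, strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def is_locally_pareto_optimal(
    candidate: int,
    neighbors: Callable[[int], Iterable[int]],
    score: Callable[[int], Scores],
) -> bool:
    """True if no candidate in the immediate neighborhood dominates `candidate`."""
    s = score(candidate)
    return not any(dominates(score(n), s) for n in neighbors(candidate))


def is_globally_pareto_optimal(
    candidate: int,
    universe: Iterable[int],
    score: Callable[[int], Scores],
) -> bool:
    """True if no candidate anywhere in the search space dominates `candidate`."""
    s = score(candidate)
    return not any(dominates(score(c), s) for c in universe)


# Hypothetical toy search space: six candidate interpretations, indexed 1..6,
# with made-up (accuracy, explainability) scores.
TOY_SCORES: Dict[int, Scores] = {
    1: (0.60, 0.90),
    2: (0.70, 0.80),
    3: (0.68, 0.70),
    4: (0.80, 0.60),
    5: (0.79, 0.50),
    6: (0.75, 0.55),
}


def toy_score(c: int) -> Scores:
    return TOY_SCORES[c]


def toy_neighbors(c: int) -> List[int]:
    # Assumed neighborhood: candidates whose index differs by exactly one.
    return [n for n in (c - 1, c + 1) if n in TOY_SCORES]


local_front = [c for c in TOY_SCORES
               if is_locally_pareto_optimal(c, toy_neighbors, toy_score)]
global_front = [c for c in TOY_SCORES
                if is_globally_pareto_optimal(c, TOY_SCORES, toy_score)]

print("locally optimal: ", local_front)
print("globally optimal:", global_front)
```

In this toy data, candidate 6 is locally optimal because neither of its neighbors dominates it, yet it is dominated by candidate 4 elsewhere in the space. A local guarantee is therefore weaker than a global one, but checking it only requires examining a neighborhood rather than the whole search space, which is the kind of trade-off the abstract describes.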
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Advanced Graph Neural Networks
