Enhancing Performance of Explainable AI Models with Constrained Concept Refinement
Geyu Liang, Senne Michielssen, Salar Fattahi

TL;DR
This paper introduces a novel constrained concept refinement framework that improves the accuracy of interpretable AI models without sacrificing interpretability, validated through theoretical proofs and practical image classification benchmarks.
Contribution
The paper presents a new framework that optimizes concept embeddings under interpretability-preserving constraints. The authors prove that the algorithm achieves zero loss on a generative-model test-bed, and they report improved accuracy for interpretable models in practice.
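As a rough illustration of what "optimizing concept embeddings under constraints" can look like, here is a minimal sketch using projected gradient descent. It is not the authors' algorithm. The squared loss, the fixed linear head `w`, the per-concept L2 ball of radius `rho` around the initial embeddings `D0`, and all function names are assumptions made for this example only.

```python
import numpy as np

def project_to_balls(D, D0, rho):
    """Project each concept (column of D) onto the L2 ball of radius rho
    centred at the matching column of D0 (assumed constraint)."""
    delta = D - D0
    norms = np.linalg.norm(delta, axis=0)
    scale = np.minimum(1.0, rho / np.maximum(norms, 1e-12))
    return D0 + delta * scale

def loss(D, X, w, y):
    """Mean squared error of a linear head w applied to concept scores X @ D."""
    r = X @ D @ w - y
    return 0.5 * np.mean(r ** 2)

def refine_concepts(X, y, D0, w, rho, steps=200):
    """Projected gradient descent on the concept matrix D, starting from D0."""
    n = X.shape[0]
    # Lipschitz constant of the gradient; a step of 1/L_const guarantees
    # that the loss does not increase from one iterate to the next.
    L_const = (np.linalg.norm(X, 2) ** 2) * (w @ w) / n
    step = 1.0 / L_const
    D = D0.copy()
    for _ in range(steps):
        r = X @ D @ w - y                  # residuals, shape (n,)
        grad = np.outer(X.T @ r, w) / n    # gradient w.r.t. D, shape (d, k)
        D = project_to_balls(D - step * grad, D0, rho)
    return D

# Synthetic data, purely for demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))    # 100 samples, 8 features
D0 = rng.normal(size=(8, 4))     # 4 initial concept embeddings
w = rng.normal(size=4)           # fixed linear head over concept scores
y = rng.normal(size=100)
rho = 0.5

D = refine_concepts(X, y, D0, w, rho)
print("initial loss:", loss(D0, X, w, y))
print("refined loss:", loss(D, X, w, y))
```

The radius `rho` stands in for the interpretability constraint: the refined concepts can improve the fit but cannot drift far from their starting embeddings. How the paper actually defines and enforces that constraint is not specified in this summary.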
Findings
Improves prediction accuracy while maintaining interpretability.
Provably achieves zero loss when optimizing concept embeddings on a generative-model test-bed.
Reduces computational cost compared to existing methods.
Abstract
The trade-off between accuracy and interpretability has long been a challenge in machine learning (ML). This tension is particularly significant for emerging interpretable-by-design methods, which aim to redesign ML algorithms for trustworthy interpretability but often sacrifice accuracy in the process. In this paper, we address this gap by investigating the impact of deviations in concept representations (an essential component of interpretable models) on prediction performance, and we propose a novel framework to mitigate these effects. The framework builds on the principle of optimizing concept embeddings under constraints that preserve interpretability. Using a generative model as a test-bed, we rigorously prove that our algorithm achieves zero loss while progressively enhancing the interpretability of the resulting model. Additionally, we evaluate the practical performance of our…
Taxonomy
Topics: Machine Learning and Data Classification · Time Series Analysis and Forecasting · Data Stream Mining Techniques
