ALEX: Active Learning based Enhancement of a Model's Explainability
Ishani Mondal, Debasis Ganguly

TL;DR
This paper proposes an active learning method that not only aims to build effective classifiers with minimal labeled data but also enhances their interpretability by training an explainer model during data selection.
Contribution
It introduces a novel AL selection function that considers interpretability by using an explainer model to select instances that improve model transparency.
Findings
Initial experiments show improved interpretability.
The proposed selection heuristic leads to more explainable classifiers.
The experiments also show encouraging trends in model effectiveness.
Abstract
An active learning (AL) algorithm seeks to construct an effective classifier with a minimal number of labeled examples in a bootstrapping manner. While standard AL heuristics exist, such as selecting for annotation those points on which a classification model yields its least confident predictions, there has been no empirical investigation of whether these heuristics lead to models that are more interpretable to humans. In the era of data-driven learning, this is an important research direction to pursue. This paper describes our work-in-progress towards developing an AL selection function that, in addition to model effectiveness, also seeks to improve the interpretability of a model during the bootstrapping steps. Concretely, our proposed selection function trains an 'explainer' model in addition to the classifier model, and favours those instances where a different part of the data is…
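As an illustration of the standard least-confidence heuristic mentioned in the abstract (not the paper's proposed explainer-based selection function, whose details are cut off above), here is a minimal Python sketch. The function name `least_confident_indices`, the parameter `k`, and the toy probabilities are assumptions made for this example; in practice the probabilities would come from the current classifier's predictions on the unlabeled pool.

```python
def least_confident_indices(probs, k):
    """Pick the k pool instances whose top predicted class probability is lowest.

    probs: list of per-instance class-probability lists (hypothetical classifier output).
    k:     number of instances to send for annotation in this AL round.
    """
    # Confidence of a prediction = probability assigned to the most likely class.
    confidence = [max(row) for row in probs]
    # Sort pool indices from least to most confident and keep the first k.
    order = sorted(range(len(probs)), key=lambda i: confidence[i])
    return order[:k]


# Toy unlabeled pool with made-up two-class probabilities.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]
selected = least_confident_indices(pool_probs, k=2)
print(selected)  # [3, 1]
```

In a full bootstrapping loop, the selected instances would be labeled, added to the training set, and the classifier retrained before the next round of selection.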
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Data Stream Mining Techniques
Methods: Interpretability
