Label-Descriptive Patterns and Their Application to Characterizing Classification Errors
Michael Hedderich, Jonas Fischer, Dietrich Klakow, Jilles Vreeken

TL;DR
This paper introduces a method to identify interpretable feature-value patterns that correlate with classifier errors, providing insights into systematic mistakes in deep learning models.
Contribution
It formulates pattern discovery as a label description problem using the Minimum Description Length (MDL) principle and develops the Premise algorithm to discover pattern sets efficiently.
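As rough intuition for the MDL framing, the sketch below compares the bits needed to encode binary labels (1 = misclassified) with one Bernoulli code against the bits needed when a pattern splits the instances into covered and uncovered groups, each with its own code. This is a simplified illustration and not the encoding used by Premise. The function names and the fixed `model_cost_bits` penalty are hypothetical, and the cost of transmitting the Bernoulli parameters is ignored.

```python
import math

def bernoulli_bits(n_pos, n):
    """Bits to encode n binary labels, n_pos of them positive, with a
    Bernoulli code whose parameter is the empirical frequency."""
    if n == 0 or n_pos in (0, n):
        return 0.0
    p = n_pos / n
    return -(n_pos * math.log2(p) + (n - n_pos) * math.log2(1 - p))

def gain_bits(covered, labels, model_cost_bits=8.0):
    """Bits saved by describing labels separately for the instances a
    pattern covers and for the rest. model_cost_bits is an assumed flat
    cost for stating the pattern itself (hypothetical placeholder)."""
    base = bernoulli_bits(sum(labels), len(labels))
    cov = [y for c, y in zip(covered, labels) if c]
    unc = [y for c, y in zip(covered, labels) if not c]
    split = bernoulli_bits(sum(cov), len(cov)) + bernoulli_bits(sum(unc), len(unc))
    return base - (split + model_cost_bits)

# Toy data: 8 misclassified instances followed by 8 correct ones.
labels = [1] * 8 + [0] * 8
perfect = [True] * 8 + [False] * 8   # pattern covers exactly the errors
uninformative = [True, False] * 8    # pattern is unrelated to the labels

gain_perfect = gain_bits(perfect, labels)
gain_uninformative = gain_bits(uninformative, labels)
```

Under these assumptions, a pattern is worth adding only when the gain is positive, that is, when the bits it saves on the labels outweigh the bits spent describing it.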
Findings
Premise effectively recovers ground truth patterns in synthetic data.
It performs well on real-world data, even with class imbalance.
It provides actionable insights into classifier errors in NLP tasks.
Abstract
State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors, but also gives a way to act and improve the classifier. We propose to discover those feature-value combinations (i.e., patterns) that strongly correlate with either correct or erroneous predictions to obtain a global and interpretable description for arbitrary classifiers. We show this is an instance of the more general label description problem, which we formulate in terms of the Minimum Description Length principle. To discover a good pattern set, we develop the efficient Premise algorithm. Through an extensive set of experiments we show it performs very well in practice on both synthetic and real-world data. Unlike existing solutions, it…
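To make the idea of patterns that correlate with erroneous predictions concrete, here is a minimal sketch that enumerates small feature combinations and ranks them by how much more often they occur in misclassified instances than in correct ones. It is a naive frequency comparison for illustration only, not the Premise algorithm and not MDL-based. The function name `error_patterns`, its parameters, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def error_patterns(instances, is_error, max_size=2, min_support=2):
    """Rank feature combinations by the difference between their relative
    frequency among misclassified and among correctly classified instances."""
    n_err = sum(is_error)
    n_ok = len(is_error) - n_err
    err_counts, ok_counts = Counter(), Counter()
    for features, err in zip(instances, is_error):
        target = err_counts if err else ok_counts
        items = sorted(set(features))
        for size in range(1, max_size + 1):
            for pattern in combinations(items, size):
                target[pattern] += 1
    scored = []
    for pattern in set(err_counts) | set(ok_counts):
        e, o = err_counts[pattern], ok_counts[pattern]
        if e + o < min_support:
            continue  # skip patterns seen too rarely to be meaningful
        diff = e / max(n_err, 1) - o / max(n_ok, 1)
        scored.append((diff, pattern))
    # Highest error association first; ties broken alphabetically.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored

# Toy data: each instance is the set of tokens in an input question,
# paired with whether a (hypothetical) classifier got it wrong.
instances = [
    {"how", "many"},
    {"how", "many", "red"},
    {"what", "color"},
    {"how", "old"},
    {"what", "many"},
    {"how", "many", "dogs"},
]
is_error = [True, True, False, False, False, True]

ranked = error_patterns(instances, is_error)
top_score, top_pattern = ranked[0]
```

On this toy data the combination of "how" and "many" ranks first, because it appears in every misclassified instance and in no correct one, while each token on its own also appears in a correct instance. Exhaustive enumeration like this does not scale to large feature sets, which is one reason a dedicated search procedure is needed in practice.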
Taxonomy
Topics: Imbalanced Data Classification Techniques · Natural Language Processing Techniques · Topic Modeling
