Label-Descriptive Patterns and Their Application to Characterizing Classification Errors

Michael Hedderich; Jonas Fischer; Dietrich Klakow; Jilles Vreeken

arXiv:2110.09599·cs.LG·June 20, 2022·1 cites

TL;DR

This paper introduces a method to identify interpretable feature-value patterns that correlate with classifier errors, providing insights into systematic mistakes in deep learning models.

Contribution

The paper formulates pattern discovery as a label description problem under the Minimum Description Length (MDL) principle and develops the Premise algorithm to discover pattern sets efficiently.
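
For orientation, the generic two-part form of the MDL principle can be written as below. This is the textbook statement, not the paper's specific encoding for label description; the symbols $M$, $\mathcal{M}$, $D$, and $L(\cdot)$ are notation assumed here for illustration.

```latex
M^{*} = \operatorname*{arg\,min}_{M \in \mathcal{M}} \; \bigl( L(M) + L(D \mid M) \bigr)
```

Here $L(M)$ is the number of bits needed to describe a candidate model (in this setting, presumably a set of patterns) and $L(D \mid M)$ is the number of bits needed to describe the data given that model. The best model is the one that minimizes the total.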

Findings

01 Premise effectively recovers ground-truth patterns in synthetic data.

02 Premise performs well on real-world data, even under class imbalance.

03 Premise provides actionable insights into classifier errors in NLP tasks.

Abstract

State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors, and also gives a way to act and improve the classifier. We propose to discover those feature-value combinations (i.e., patterns) that strongly correlate with either correct or erroneous predictions to obtain a global and interpretable description for arbitrary classifiers. We show this is an instance of the more general label description problem, which we formulate in terms of the Minimum Description Length principle. To discover a good pattern set, we develop the efficient Premise algorithm. Through an extensive set of experiments we show it performs very well in practice on both synthetic and real-world data. Unlike existing solutions, it…
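
To make the idea of patterns that correlate with erroneous predictions concrete, here is a minimal Python sketch. It is not the Premise algorithm: it enumerates small feature combinations exhaustively and ranks them by a plain error-rate difference rather than by an MDL score. The function name `describe_errors`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def describe_errors(rows, is_error, max_size=2, min_support=2):
    """Rank feature combinations by how much more often they co-occur with errors.

    rows: list of sets of active feature names (e.g. tokens present in an input).
    is_error: list of bools, True where the classifier's prediction was wrong.
    Returns (pattern, support, error_rate - base_rate) tuples, best first.
    NOTE: naive illustration only; this is not the paper's Premise algorithm.
    """
    base_rate = sum(is_error) / len(rows)
    features = sorted(set().union(*rows))
    results = []
    for size in range(1, max_size + 1):
        for pattern in combinations(features, size):
            # Error flags of all instances that contain every feature in the pattern.
            covered = [err for row, err in zip(rows, is_error) if set(pattern) <= row]
            if len(covered) < min_support:
                continue
            error_rate = sum(covered) / len(covered)
            results.append((pattern, len(covered), error_rate - base_rate))
    # Largest positive difference first: patterns most associated with errors.
    results.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return results


# Hypothetical toy data: tokens of six inputs and whether each was misclassified.
rows = [
    {"how", "many"},
    {"how", "many", "color"},
    {"how", "old"},
    {"what", "color"},
    {"what", "many"},
    {"what", "old"},
]
is_error = [True, True, False, False, False, False]

ranked = describe_errors(rows, is_error)
for pattern, support, diff in ranked[:3]:
    print(pattern, support, round(diff, 3))
```

On this toy data the combination of "how" and "many" ranks first, because every instance containing both tokens is misclassified while neither token alone is as strongly tied to errors. Exhaustive enumeration like this does not scale to many features, which is one reason a dedicated search procedure is needed in practice.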

Taxonomy

Topics: Imbalanced Data Classification Techniques · Natural Language Processing Techniques · Topic Modeling