Error Slice Discovery via Manifold Compactness
Han Yu, Hao Zou, Jiashuo Liu, Renzhe Xu, Yue He, Xingxuan Zhang, Peng Cui

TL;DR
This paper introduces manifold compactness, a metric that measures the semantic coherence of error slices in deep learning models without relying on extra metadata, and proposes an algorithm (MCSD) that optimizes for both risk and coherence.
Contribution
It proposes a novel coherence metric based on data geometry and develops an algorithm that directly optimizes for error risk and slice coherence.
Findings
The manifold compactness metric effectively measures slice coherence without extra metadata.
The MCSD algorithm outperforms existing methods on benchmark datasets.
Experiments validate the rationality and effectiveness of the proposed approach.
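To make the idea of a geometry-based coherence score concrete, here is a minimal sketch. It is not the paper's definition of manifold compactness, which this summary does not give. As an assumed stand-in, it builds a brute-force k-nearest-neighbour graph over embeddings and scores a slice by the fraction of its members' neighbour edges that stay inside the slice. The names `knn_indices`, `slice_compactness`, and `slice_score`, and the weight `lam`, are hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Brute-force k nearest neighbours (Euclidean) for each row of X.

    Evaluates all pairwise distances, so it costs O(n^2) time and memory.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def slice_compactness(neighbors, in_slice):
    """Assumed proxy: share of kNN edges from slice members that end in the slice."""
    members = np.flatnonzero(in_slice)
    if members.size == 0:
        return 0.0
    return float(in_slice[neighbors[members]].mean())


def slice_score(losses, neighbors, in_slice, lam=1.0):
    """Hypothetical joint objective: mean loss in the slice + lam * compactness."""
    if not in_slice.any():
        return 0.0
    risk = float(losses[in_slice].mean())
    return risk + lam * slice_compactness(neighbors, in_slice)


# Toy data: two well-separated clusters of four points each.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
losses = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # errors sit in cluster 1
neighbors = knn_indices(X, k=2)

coherent = np.zeros(len(X), dtype=bool)
coherent[:4] = True             # exactly the first cluster
scattered = np.zeros(len(X), dtype=bool)
scattered[[0, 1, 4, 5]] = True  # half of each cluster

print(slice_compactness(neighbors, coherent), slice_compactness(neighbors, scattered))
print(slice_score(losses, neighbors, coherent), slice_score(losses, neighbors, scattered))
```

On this toy data the slice that matches one cluster gets a higher compactness and a higher joint score than the slice that mixes both clusters, which is the behaviour a risk-plus-coherence objective is meant to reward. No metadata or slice labels are used, only the embeddings and per-sample losses.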
Abstract
Despite the great performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e., error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret, which is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence that does not rely on extra information like predefined slice labels. Current evaluation of slice coherence requires access to predefined slices formulated by metadata like attributes or subclasses. Its validity heavily relies on the quality and abundance of metadata, where some possible patterns could be ignored. Besides, current algorithms cannot directly incorporate the constraint of coherence into their optimization objective due to the absence of an explicit coherence metric, which could potentially hinder…
Peer Reviews
Decision · Submitted to ICLR 2025
The authors mainly proposed the concept of manifold compactness and a coherence metric.
The paper is difficult to understand and follow as the source code is not provided.
1. This is a relatively new and interesting field that merits further exploration. 2. The organization and writing of this paper are reader-friendly. 3. Extensive case studies support the authors' views, accompanied by new metrics and algorithms.
1. Although extensive case studies support the authors' viewpoints, the lack of theoretical analysis is regrettable. 2. There is no analysis of time and space complexity. For a dataset of size $n$, constructing a graph requires at least $O(n^2)$ and solving Equation 2 takes $O(n^3)$. In fact, as seen in Table 8, efficiency is indeed a drawback. The authors might consider explaining why this efficiency loss is worthwhile. 3. The paper suggests that Euclidean distance is inferior to manifold, but
The authors show the results of many experiments.
There are a number of problems that the author needs to address, as detailed in the Questions section.
Taxonomy
Topics: Image and Object Detection Techniques · Image Processing and 3D Reconstruction · Advanced Numerical Analysis Techniques
