A classification performance evaluation measure considering data separability
Lingyan Xue, Xinyu Zhang, Weidong Jiang, Kai Huo

TL;DR
This paper introduces a new data separability measure, the rate of separability (RS), based on the data coding rate, so that classification performance can be evaluated with the separability of the data taken into account.
Contribution
It proposes the RS measure, demonstrating its effectiveness compared to existing distance-based measures and its correlation with classification accuracy.
Findings
RS correlates positively with recognition accuracy.
RS outperforms traditional distance-based measures on synthetic data.
The paper discusses how to evaluate classification performance while accounting for data separability.
Abstract
Machine learning and deep learning classification models are data-driven, and the model and the data jointly determine their classification performance. Evaluating a model's performance based only on classifier accuracy, while ignoring data separability, is biased. A model may sometimes exhibit excellent accuracy simply because it was tested on highly separable data. Most current data separability measures are defined based on the distance between sample points, but such measures have been demonstrated to fail in several circumstances. In this paper, we propose a new separability measure, the rate of separability (RS), which is based on the data coding rate. We validate its effectiveness as a supplement to the separability measure by comparing it to four other distance-based measures on synthetic datasets. Then, we demonstrate the positive correlation…
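The abstract excerpt does not state the RS formula. As a rough illustration of what a coding-rate-based quantity looks like, the sketch below computes the lossy coding rate in the form commonly used in the rate-reduction literature, R(X, eps) = 1/2 log det(I + d/(n eps^2) X^T X), and the difference between the rate of the whole dataset and the size-weighted rates of its classes. This is an assumption made for illustration only and is not the paper's RS definition; the function names, the `eps` value, and the toy data are all hypothetical.

```python
import numpy as np

def coding_rate(X, eps=0.5):
    """Lossy coding rate of samples X with shape (n_samples, n_features).

    Assumed form: 0.5 * logdet(I + d / (n * eps^2) * X^T X).
    """
    n, d = X.shape
    gram = X.T @ X  # (d, d) second-moment matrix, not mean-centered
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * gram)
    return 0.5 * logdet

def rate_reduction(X, y, eps=0.5):
    """Coding rate of all samples minus the size-weighted per-class rates.

    Illustrative separability proxy only; NOT the RS measure from the paper.
    """
    n = X.shape[0]
    per_class = sum(
        (np.sum(y == c) / n) * coding_rate(X[y == c], eps)
        for c in np.unique(y)
    )
    return coding_rate(X, eps) - per_class

# Toy data (hypothetical): two Gaussian classes lying along different axes.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[5.0, 0.0], scale=1.0, size=(200, 2)),
    rng.normal(loc=[0.0, 5.0], scale=1.0, size=(200, 2)),
])
y = np.repeat([0, 1], 200)

separable = rate_reduction(X, y)                  # labels match the clusters
shuffled = rate_reduction(X, rng.permutation(y))  # labels carry no structure
print(separable, shuffled)
```

In this sketch the gap is larger when the labels follow the cluster structure than when they are shuffled, which is the kind of behavior a separability measure is expected to show. Because the second-moment matrix is not mean-centered, the quantity depends on the directions the classes occupy rather than on point-to-point distances.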
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
