Two-sample test based on Self-Organizing Maps
Alejandro Álvarez-Ayllón, Manuel Palomo-Duarte, Juan-Manuel Dodero

TL;DR
This paper proposes using Self-Organizing Maps as a two-sample test that not only detects differences between samples but also provides insights into how they differ, combining classification with interpretability.
Contribution
It introduces a novel approach leveraging Self-Organizing Maps for two-sample testing that offers both discrimination and interpretability.
Findings
The SOM-based test can distinguish samples drawn from different populations.
It provides insight into how the samples differ through visualization.
It combines classification accuracy with interpretability.
Abstract
Machine-learning classifiers can be leveraged as a two-sample statistical test. Suppose each sample is assigned a different label and a classifier discriminates between them with better-than-chance accuracy. In this case, we can infer that the two samples originate from different populations. However, many types of models, such as neural networks, behave as a black box for the user: they can reject the hypothesis that both samples originate from the same population, but they do not offer insight into how the samples differ. Self-Organizing Maps are a dimensionality reduction algorithm initially devised as a data visualization tool that displays emergent properties, and they are also useful for classification tasks. Since they can be used as classifiers, they can also be used as a two-sample statistical test. And since their original purpose is visualization, they can also offer insight into how the samples differ.
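The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny NumPy SOM, the 50/50 train/test split, the majority-vote labelling of map units, and the exact binomial test against chance accuracy 0.5 (which assumes two samples of equal size) are all assumptions made here for illustration, and the names `train_som`, `bmu_index`, and `som_two_sample_test` are hypothetical.

```python
import numpy as np
from math import comb


def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM online; returns (rows*cols, dim) weights."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each unit (row-major), used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise codebook vectors from randomly chosen training points.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)               # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.5   # neighbourhood radius shrinks, stays > 0
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood on the grid
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def bmu_index(weights, data):
    """Index of the best-matching unit for every row of data."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def som_two_sample_test(a, b, seed=0, **som_kwargs):
    """Classifier two-sample test with a SOM as the classifier (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.vstack([a, b])
    y = np.concatenate([np.zeros(len(a), dtype=int), np.ones(len(b), dtype=int)])
    perm = rng.permutation(len(x))
    x, y = x[perm], y[perm]
    half = len(x) // 2
    x_tr, y_tr, x_te, y_te = x[:half], y[:half], x[half:], y[half:]

    weights = train_som(x_tr, seed=seed, **som_kwargs)

    # Count how many training points of each sample land on each unit.
    counts = np.zeros((len(weights), 2))
    np.add.at(counts, (bmu_index(weights, x_tr), y_tr), 1)
    # Majority vote per unit; empty or tied units default to label 0.
    unit_label = counts.argmax(axis=1)

    pred = unit_label[bmu_index(weights, x_te)]
    n = len(y_te)
    correct = int((pred == y_te).sum())
    # One-sided exact binomial p-value under H0: accuracy = 0.5.
    p_value = sum(comb(n, k) for k in range(correct, n + 1)) / 2 ** n
    return correct / n, p_value, counts


# Usage: two 2-D Gaussian samples whose means differ along the first axis.
demo_rng = np.random.default_rng(42)
sample_a = demo_rng.normal(0.0, 1.0, size=(100, 2))
sample_b = demo_rng.normal(0.0, 1.0, size=(100, 2)) + np.array([3.0, 0.0])
accuracy, p_value, unit_counts = som_two_sample_test(sample_a, sample_b)
print(f"held-out accuracy={accuracy:.2f}, p-value={p_value:.3g}")
```

The returned `unit_counts` array holds, for each map unit, how many training points came from each sample. Reshaped to `(rows, cols, 2)`, it can be plotted as a grid to see which regions of the map are dominated by one sample, which is the kind of insight a black-box classifier does not provide.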
Taxonomy
Topics: Neural Networks and Applications
