Trustworthy Dimensionality Reduction

Subhrajyoty Roy

arXiv:2405.05868 · stat.ME · May 10, 2024 · 2 cites

Open Access

TL;DR

This paper introduces LSDR, a dimensionality reduction method that balances trustability and generalizability and outperforms existing algorithms such as tSNE and UMAP at preserving global data structure.

Contribution

It formally models trustability and generalizability in dimensionality reduction and proposes LSDR, an algorithm that optimally balances these aspects, with extensions for broader applicability.

Findings

01

LSDR outperforms tSNE and UMAP in global structure preservation.

02

LSDR effectively balances local detail and overall data integrity.

03

Proposed indices measure trustability and generalizability in dimensionality reduction.

Abstract

Unsupervised dimensionality reduction models such as PCA, LLE, Sammon's mapping, tSNE, and UMAP work on different principles and are therefore difficult to compare on the same ground. Although they are usually good for visualisation, they can produce spurious patterns that are not present in the original data, so the reduced data loses trustability (or credibility). On the other hand, information about a response variable (or knowledge of class labels) allows supervised dimensionality reduction such as SIR and SAVE, which reduce the data dimension without hampering its ability to explain the particular response at hand. As a result, the reduced dataset cannot be used to further analyze its relationship with other kinds of responses, i.e., it loses generalizability. To make a better dimensionality reduction algorithm with a better balance between these…
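To make the notion of a low-dimensional view that stays faithful to the original data more concrete, the following is a minimal sketch of a generic neighbourhood-overlap check. It is not the trustability index proposed in the paper, whose definition is not given in this summary; the function names `knn_indices` and `neighbour_overlap`, the toy data, and the choice of `k=5` are all assumptions made for illustration. The idea is simply that an embedding which invents structure will place points next to each other that were not neighbours in the original space.

```python
import math
import random


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def neighbour_overlap(original, embedded, k=5):
    """Mean fraction of embedded-space neighbours that are also
    neighbours in the original space (hypothetical stand-in measure,
    not the index defined in the paper)."""
    orig_nn = knn_indices(original, k)
    emb_nn = knn_indices(embedded, k)
    return sum(len(o & e) / k for o, e in zip(orig_nn, emb_nn)) / len(original)


rng = random.Random(0)
# Toy data (assumed): two well-separated clusters in 5 dimensions.
data = [[rng.gauss(c, 1.0) for _ in range(5)] for c in (0.0, 10.0) for _ in range(30)]

# Embedding A: keep the first two coordinates (a crude linear projection).
proj = [row[:2] for row in data]
# Embedding B: random 2-D positions unrelated to the data.
noise = [[rng.random(), rng.random()] for _ in data]

score_proj = neighbour_overlap(data, proj)
score_noise = neighbour_overlap(data, noise)
print(f"projection: {score_proj:.2f}, random layout: {score_noise:.2f}")
```

A score near 1 means the embedding keeps each point's local neighbourhood; a layout unrelated to the data scores much lower. Scores of this kind only capture local structure, so they would not by themselves reflect the global-structure comparisons listed under Findings.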

Taxonomy

Topics: Neural Networks and Applications