Borel Isomorphic Dimensionality Reduction of Data and Supervised Learning

Stan Hatko

arXiv:1307.8333 · stat.ML · August 1, 2013 · 1 cites

TL;DR

This paper explores dimensionality reduction of datasets via Borel isomorphisms and reports minimal accuracy loss in supervised learning tasks such as phoneme recognition.

Contribution

It provides concrete examples of Borel isomorphisms for reducing data dimensions and tests their effectiveness in supervised learning scenarios.
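To make the idea of a dimension-reducing Borel isomorphism concrete, here is a minimal sketch of one classical construction: interleaving the binary digits of several coordinates in [0, 1) into a single number in [0, 1). This is an illustration under assumptions, not necessarily the exact map used in the paper. The function names (`to_bits`, `from_bits`, `interleave`, `deinterleave`) are hypothetical, and truncating to finitely many bits only approximates the true map, which works on full infinite expansions.

```python
def to_bits(x, n_bits):
    """Return the first n_bits binary digits of x, assuming 0 <= x < 1."""
    bits = []
    for _ in range(n_bits):
        x *= 2
        bit = int(x)
        bits.append(bit)
        x -= bit
    return bits


def from_bits(bits):
    """Rebuild a number in [0, 1) from its binary digits."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))


def interleave(coords, n_bits=16):
    """Merge d coordinates in [0, 1) into one number by interleaving digits.

    Assumption: len(coords) * n_bits stays within double precision
    (about 52 bits), otherwise low-order digits are lost.
    """
    rows = [to_bits(c, n_bits) for c in coords]
    merged = [row[i] for i in range(n_bits) for row in rows]
    return from_bits(merged)


def deinterleave(z, d, n_bits=16):
    """Invert interleave: split the digits of z back into d coordinates."""
    merged = to_bits(z, d * n_bits)
    return [from_bits(merged[k::d]) for k in range(d)]


z = interleave([0.5, 0.25])
print(z, deinterleave(z, 2))
```

The map is invertible digit by digit but is far from continuous, which is why the relevant notion is a Borel isomorphism rather than a homeomorphism.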

Findings

1. Borel isomorphic reduction to dimension 16 retains most accuracy.

2. Orthogonal matrices combined with Borel isomorphisms effectively reduce dimensions.

3. Minimal accuracy drop observed in phoneme recognition dataset.

Abstract

In this project we further investigate the idea of reducing the dimensionality of datasets using a Borel isomorphism, with the purpose of subsequently applying supervised learning algorithms, as originally suggested by my supervisor V. Pestov (in a 2011 Dagstuhl preprint). Any consistent learning algorithm, for example kNN, retains universal consistency after a Borel isomorphism is applied. A series of concrete examples of Borel isomorphisms that reduce the number of dimensions in a dataset is provided, based on multiplying the data by orthogonal matrices before the dimensionality-reducing Borel isomorphism is applied. We test the accuracy of the resulting classifier in a lower-dimensional space with various datasets. Working with a phoneme voice recognition dataset, of dimension 256 with 5 classes (phonemes), we show that a Borel isomorphic reduction to dimension 16 leads to a minimal…
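The pipeline described in the abstract (orthogonal matrix, then a dimension-reducing Borel isomorphism, then a classifier such as kNN) can be sketched roughly as follows. Everything here is an assumption made for illustration: the Householder reflection as the orthogonal matrix, the logistic `squash` used to bring coordinates into (0, 1), the truncated digit interleaving, and all function names. The paper's actual matrices, normalization, and isomorphisms may differ.

```python
import math


def householder(v):
    """Orthogonal reflection H = I - 2 v v^T / (v^T v) (assumed choice)."""
    n = len(v)
    norm_sq = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)] for i in range(n)]


def matvec(m, x):
    """Multiply matrix m (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def squash(t):
    """Map a real number into (0, 1); an assumed normalization step."""
    return 1.0 / (1.0 + math.exp(-t))


def to_bits(x, n_bits):
    """First n_bits binary digits of x, assuming 0 <= x < 1."""
    bits = []
    for _ in range(n_bits):
        x *= 2
        bit = int(x)
        bits.append(bit)
        x -= bit
    return bits


def interleave(coords, n_bits):
    """Interleave binary digits of several coordinates into one number."""
    rows = [to_bits(c, n_bits) for c in coords]
    merged = [row[i] for i in range(n_bits) for row in rows]
    return sum(b / 2 ** (i + 1) for i, b in enumerate(merged))


def reduce_point(x, rotation, group_size, n_bits=12):
    """Rotate, squash into (0, 1), then merge each group of coordinates."""
    unit = [squash(t) for t in matvec(rotation, x)]
    return [interleave(unit[i:i + group_size], n_bits)
            for i in range(0, len(unit), group_size)]


def nearest_neighbour_label(query, points, labels):
    """1-NN prediction with Euclidean distance in the reduced space."""
    best = min(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return labels[best]


# Toy usage: reduce 4-dimensional points to 2 dimensions, then classify.
H = householder([1.0, 2.0, 3.0, 4.0])
train = [[0.0, 0.0, 0.0, 0.0], [5.0, 5.0, 5.0, 5.0]]
labels = ["a", "b"]
reduced = [reduce_point(x, H, group_size=2) for x in train]
query = reduce_point([4.8, 5.1, 5.0, 4.9], H, group_size=2)
print(nearest_neighbour_label(query, reduced, labels))
```

With double-precision floats, merging 16 coordinates into one (as a 256 to 16 reduction would require) leaves only about 3 bits per coordinate, so a faithful experiment at that scale would need arbitrary-precision arithmetic or a different encoding.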

Taxonomy

Topics: Bayesian Methods and Mixture Models · Medical Image Segmentation Techniques · Statistical Methods and Inference