TL;DR
Treelets are an adaptive multi-scale basis method designed for high-dimensional, noisy, and unordered data, enabling effective dimensionality reduction and feature selection by capturing the data's internal hierarchical structure.
Contribution
The paper introduces treelets, a novel adaptive hierarchical basis construction that extends wavelets to nonsmooth signals and is tailored for sparse, unordered high-dimensional data.
Findings
Treelets can outperform PCA, particularly when sample sizes are small and the data are sparse with unknown groupings of correlated variables.
Treelets effectively identify variable groupings and structures.
The method is simple to implement and has solid theoretical foundations.
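To make the "simple to implement" point concrete, here is a minimal Python sketch of a bottom-up treelet-style construction. It assumes the following scheme: at each level, the two most correlated active variables are merged by a Jacobi rotation that decorrelates them. The higher-variance "sum" coordinate stays active and the other is kept as a detail coordinate. The function name `treelet_sketch`, its parameters, and the use of NumPy are illustrative assumptions, not the authors' code.

```python
import numpy as np


def treelet_sketch(X, n_levels=None):
    """Illustrative treelet-style basis construction (hypothetical helper).

    X is an (n_samples, p) array. Returns an orthonormal p x p basis B
    (columns are basis vectors), the list of merged index pairs (the tree),
    and the indices still active after the last level.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    if n_levels is None:
        n_levels = p - 1

    C = np.cov(X, rowvar=False)   # covariance in the current basis
    B = np.eye(p)                 # start from the standard basis
    active = list(range(p))       # variables still eligible for merging
    merges = []

    for _ in range(min(n_levels, p - 1)):
        idx = np.array(active)
        sub = C[np.ix_(idx, idx)]

        # Absolute correlations among active variables; zero-variance
        # coordinates get correlation 0 via an infinite scale.
        d = np.sqrt(np.clip(np.diag(sub), 0.0, None))
        d[d == 0.0] = np.inf
        corr = np.abs(sub / np.outer(d, d))
        np.fill_diagonal(corr, -1.0)
        i, j = np.unravel_index(np.argmax(corr), corr.shape)
        a, b = int(idx[i]), int(idx[j])

        # Jacobi rotation angle that zeroes the covariance between a and b.
        # With arctan2, the rotated coordinate a carries the larger variance.
        theta = 0.5 * np.arctan2(2.0 * C[a, b], C[a, a] - C[b, b])
        c, s = np.cos(theta), np.sin(theta)
        J = np.eye(p)
        J[a, a], J[b, a] = c, s
        J[a, b], J[b, b] = -s, c

        C = J.T @ C @ J           # covariance in the rotated basis
        B = B @ J                 # accumulate the rotation into the basis

        active.remove(b)          # b becomes a detail coordinate
        merges.append((a, b))

    return B, merges, active
```

Calling `treelet_sketch(X)` on a data matrix returns the basis together with the merge order, which can be read as the hierarchical tree. Projecting with `X @ B` gives coordinates in that basis. The sketch rebuilds a full rotation matrix at every level for clarity. A more careful implementation would update only the two affected rows and columns.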
Abstract
In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered--with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity: the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this paper we present treelets--a novel construction of multi-scale bases that extends wavelets to nonsmooth signals. The method is fully adaptive, as it returns a hierarchical tree and an orthonormal basis which both reflect the internal structure of the data. Treelets are especially well-suited as a dimensionality reduction and feature selection tool prior to regression and classification, in situations where sample sizes are small and the data are sparse with unknown groupings of correlated or…
