Subjects classification from high-dimensional and small-sample size datasets using a strategy based on Clustering Variables around Latent Components (CLV) method

Dimitri Marques Abramov

arXiv:1706.04633·stat.AP·June 16, 2017·1 cites

Open Access

TL;DR

This paper examines a method based on clustering variables around latent components (CLV) for classifying subjects in high-dimensional, small-sample datasets. On simulated data, it recovered the latent factors and classified subjects with 80 to 95% agreement.

Contribution

The study presents a CLV-based approach for classifying subjects into two presumed groups from small samples of high-dimensional data, and evaluates it on simulated datasets.

Findings

01 Achieved 80-95% classification agreement.

02 Positive correlation between classifier precision and variable-to-subject ratio.

03 Method effectively recovers latent factors for subject classification.

Abstract

High-dimensional complex systems can be studied through multivariate analysis, such as Principal Component Analysis; however, such methods frequently require large samples of observations. Here, a method for small samples, based on clustering variables around latent variables (CLV), is examined for classifying subjects into two presumed groups. To this end, a predictive model was developed to generate datasets with two groups of cases whose variables show random features (up to 30% of the variables differ between groups, and up to 7% of those are correlated with one another). The method recovered the information of the latent factors and classified the subjects with 80 to 95% agreement, with a positive relationship between classifier precision and the ratio [number of variables / number of subjects].
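
The workflow the abstract describes (simulate two groups, recover a latent component from the variables, classify subjects by it) can be illustrated with a minimal Python sketch. This is not the author's implementation. It uses a simplified, single-cluster stand-in for CLV in which variables are repeatedly selected by their squared correlation with one latent component. The function names, the `shift` effect size, `keep_frac`, and the sample sizes are all assumptions chosen for illustration, and the correlation among differing variables mentioned in the abstract is not modelled.

```python
import numpy as np


def simulate_two_groups(n_subjects=20, n_variables=200,
                        frac_informative=0.3, shift=3.0, seed=0):
    """Noise data in which a fraction of variables differs between two groups.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_subjects // 2)           # two equal-sized groups
    X = rng.standard_normal((labels.size, n_variables))   # pure-noise baseline
    n_inf = int(frac_informative * n_variables)
    informative = rng.choice(n_variables, size=n_inf, replace=False)
    X[np.ix_(labels == 1, informative)] += shift          # group effect
    return X, labels


def first_component_scores(Z):
    """Subject scores on the first principal component of column-centered Z."""
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * s[0]


def classify_by_latent_component(X, keep_frac=0.3, n_iter=5):
    """Simplified single-cluster stand-in for CLV (an assumption, see lead-in)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)              # standardize variables
    latent = first_component_scores(Z)
    n_keep = max(2, int(keep_frac * Z.shape[1]))
    for _ in range(n_iter):
        # Correlation of every variable with the current latent component.
        r = (Z * latent[:, None]).mean(axis=0) / latent.std()
        cluster = np.argsort(r ** 2)[-n_keep:]            # variables nearest to it
        latent = first_component_scores(Z[:, cluster])    # refit on the cluster
    # The sign of the latent score splits subjects into two presumed groups.
    return (latent > 0).astype(int), cluster


def agreement(predicted, labels):
    """Agreement with the true grouping, ignoring which group is called 0 or 1."""
    acc = np.mean(predicted == labels)
    return max(acc, 1.0 - acc)


X, labels = simulate_two_groups()
predicted, cluster = classify_by_latent_component(X)
print(f"variables/subjects = {X.shape[1] / X.shape[0]:.0f}, "
      f"agreement = {agreement(predicted, labels):.2f}")
```

The group effect here is deliberately large, so the separation is much easier than in the datasets the abstract describes. Varying `n_variables` relative to `n_subjects` in this sketch is one way to explore the reported relationship between classifier precision and the variables-to-subjects ratio.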

Taxonomy

Topics: E-commerce and Technology Innovations · Technology and Data Analysis