A New Covariance Estimator for Sufficient Dimension Reduction in High-Dimensional and Undersized Sample Problems

Kabir Opeyemi Olorede; Waheed Babatunde Yahya

arXiv:1909.13017·stat.ME·October 1, 2019


TL;DR

This paper introduces the Maximum Entropy Covariance (MEC) estimator, a covariance estimation method for high-dimensional, undersized sample problems that improves sufficient dimension reduction techniques such as sliced inverse regression (SIR) and sliced average variance estimation (SAVE).

Contribution

The paper proposes the MEC estimator, which handles covariance matrix singularity and instability in high-dimensional data and enhances dimension reduction methods without complex optimization.

Findings

01

MEC improves covariance estimation in high-dimensional settings.

02

MEC enhances the performance of SIR and SAVE methods.

03

Real-world data experiments validate MEC's effectiveness.

Abstract

Standard sufficient dimension reduction methods reduce the dimension of the predictor space without losing regression information, but applying them requires inverting the covariance matrix of the predictors. This poses a number of challenges, especially when analyzing high-dimensional data sets in which the number of predictors $p$ is much larger than the number of samples $n$ ($n \ll p$). A new covariance estimator, called the Maximum Entropy Covariance (MEC), that addresses the loss of covariance information when similar covariance matrices are linearly combined using the Maximum Entropy (ME) principle is proposed in this work. By benefitting naturally from slicing, or discretizing, the range of the response variable $y$ into $H$ non-overlapping categories, $h_{1}, \dots, h_{H}$, MEC first combines covariance matrices arising from samples in each $y$ slice…
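
The two ingredients the abstract names, a singular sample covariance when $n \ll p$ and per-slice covariance matrices obtained by slicing $y$, can be illustrated with a minimal NumPy sketch. This is not the MEC estimator: the abstract text here is truncated before the combination step, so the sketch stops at the per-slice matrices. The function name `slice_covariances`, the equal-count slicing rule, and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def slice_covariances(X, y, H):
    """Hypothetical helper: split the samples into H non-overlapping slices
    by the sorted order of y, then return the sample covariance matrix of
    the predictors within each slice and the proportion of samples per slice."""
    order = np.argsort(y)                 # sample indices sorted by response
    slices = np.array_split(order, H)     # H roughly equal-sized index groups
    covs = [np.cov(X[idx], rowvar=False) for idx in slices]
    weights = np.array([len(idx) for idx in slices]) / len(y)
    return covs, weights

# Simulated undersized-sample data (assumed for illustration): n << p.
rng = np.random.default_rng(0)
n, p, H = 20, 50, 4
X = rng.standard_normal((n, p))
y = X[:, 0] + 0.1 * rng.standard_normal(n)

# The p x p sample covariance has rank at most n - 1, so it cannot be inverted.
sample_cov = np.cov(X, rowvar=False)
rank = np.linalg.matrix_rank(sample_cov)
print(f"p = {p}, rank of sample covariance = {rank}")

# Per-slice covariance matrices: the inputs that a combining step would use.
covs, weights = slice_covariances(X, y, H)
print(f"{len(covs)} slice covariances of shape {covs[0].shape}, weights {weights}")
```

Because the rank of the sample covariance is bounded by $n - 1 < p$, any method that needs its inverse requires a regularized or otherwise modified estimator, which is the gap the paper targets.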


Taxonomy

Topics: Statistical Methods and Inference · Fault Detection and Control Systems · Machine Learning and Data Classification