Foundational principles for large scale inference: Illustrations through correlation mining

Alfred O. Hero; Bala Rajaratnam

arXiv:1505.02475·math.ST·May 19, 2015

TL;DR

This paper develops a unified framework to understand the sample complexity of correlation mining in large-scale, high-dimensional data, addressing fundamental limits in inference when sample sizes are limited.

Contribution

It introduces a comprehensive statistical framework for analyzing sample complexity across different asymptotic regimes in large-scale inference, especially for correlation mining.

Findings

01

Identifies distinct asymptotic regimes relevant to big data inference.

02

Quantifies sample complexity for correlation mining under various models.

03

Highlights the importance of high-dimensional regimes with fixed sample size.

Abstract

When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics, the dataset is often variable-rich but sample-starved: a regime where the number $n$ of acquired samples (statistical replicates) is far smaller than the number $p$ of observed variables (genes, neurons, voxels, or chemical constituents). Much recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity, however, has received relatively less attention, especially in the setting where the sample size $n$ is fixed and the dimension $p$ grows without bound. To address this gap, we develop a unified statistical framework that…
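
The sample-starved regime described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes i.i.d. standard Gaussian variables that are truly uncorrelated, holds $n$ fixed at 10, and reports the largest absolute sample correlation found among all pairs as $p$ grows. The function name `max_spurious_correlation` and the chosen values of $n$ and $p$ are hypothetical, picked only for illustration.

```python
import numpy as np


def max_spurious_correlation(n, p, seed=0):
    """Largest |sample correlation| among p independent variables seen n times.

    Assumption (not from the paper): every variable is i.i.d. standard
    Gaussian, so every population correlation is exactly zero and any large
    sample correlation is spurious.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))      # n samples (rows), p variables (columns)
    R = np.corrcoef(X, rowvar=False)     # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)             # ignore the trivial self-correlations
    return float(np.max(np.abs(R)))


# Fixed sample size n = 10, growing dimension p.
results = {p: max_spurious_correlation(10, p) for p in (10, 100, 1000)}

for p, r in results.items():
    print(f"n=10, p={p:5d}: max |r| = {r:.3f}")
```

Under these assumptions the maximum spurious correlation typically climbs toward 1 as $p$ increases while $n$ stays fixed, which is one way to see why sample complexity, and not only computational complexity, limits correlation mining.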
