Merging Two Cultures: Deep and Statistical Learning

Anindya Bhadra; Jyotishka Datta; Nick Polson; Vadim Sokolov; Jianeng Xu

arXiv:2110.11561·stat.ME·October 25, 2021·1 cites

TL;DR

This paper unifies deep learning and statistical modeling by framing deep architectures as nonlinear feature generators with probabilistic output layers, enabling scalable prediction and uncertainty quantification.

Contribution

It introduces a general framework combining sparse regularization, stochastic gradient optimization, and probabilistic output layers to merge deep and statistical learning benefits.

Findings

01. Deep models generate nonlinear features for statistical methods.

02. Probabilistic output layers enable uncertainty quantification.

03. Framework applies to regression, classification, and interpolation.

Abstract

Merging the two cultures of deep and statistical learning provides insights into structured high-dimensional data. Traditional statistical modeling is still a dominant strategy for structured tabular data. Deep learning can be viewed through the lens of generalized linear models (GLMs) with composite link functions. Sufficient dimensionality reduction (SDR) and sparsity perform nonlinear feature engineering. We show that prediction, interpolation and uncertainty quantification can be achieved using probabilistic methods at the output layer of the model. Thus a general framework for machine learning arises that first generates nonlinear features (a.k.a. factors) via sparse regularization and stochastic gradient optimisation and second uses a stochastic output layer for predictive uncertainty. Rather than using shallow additive architectures as in many statistical models, deep learning…
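The two-stage idea in the abstract (nonlinear features first, a probabilistic output layer second) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions: the hidden layer uses fixed random tanh weights rather than weights learned by sparse regularization and stochastic gradient optimisation, the output layer is a conjugate Bayesian linear regression, and the function names, prior precision `alpha`, noise variance `noise_var`, and the toy data are all hypothetical.

```python
import numpy as np

def tanh_features(X, W, b):
    """Stage 1 (assumed): map inputs to nonlinear features Z = tanh(XW + b)."""
    return np.tanh(X @ W + b)

def fit_bayes_output_layer(Z, y, alpha=1.0, noise_var=0.01):
    """Stage 2 (assumed): Gaussian posterior over the output weights.

    Prior w ~ N(0, I / alpha), likelihood y ~ N(Z w, noise_var).
    """
    precision = alpha * np.eye(Z.shape[1]) + Z.T @ Z / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ Z.T @ y / noise_var
    return mean, cov

def predict(Z_new, mean, cov, noise_var=0.01):
    """Predictive mean and variance for each row of Z_new."""
    mu = Z_new @ mean
    # Row-wise quadratic form z^T cov z, plus the observation noise.
    var = noise_var + np.einsum("ij,jk,ik->i", Z_new, cov, Z_new)
    return mu, var

# Toy data (hypothetical): a noisy sine curve on [-3, 3].
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

# Fixed random hidden layer with 50 units, standing in for a trained network.
W = rng.standard_normal((1, 50))
b = rng.standard_normal(50)

Z = tanh_features(X, W, b)
mean, cov = fit_bayes_output_layer(Z, y)

# One query inside the training range and one far outside it.
X_new = np.array([[0.0], [10.0]])
mu, var = predict(tanh_features(X_new, W, b), mean, cov)
train_mse = float(np.mean((Z @ mean - y) ** 2))
```

Here `mu` holds the point predictions and `var` the predictive variances. By construction, each variance is the assumed noise variance plus a nonnegative term that reflects uncertainty in the output weights.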

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Machine Learning and Data Classification

Methods: Tanh Activation · Sigmoid Activation · Principal Components Analysis · Long Short-Term Memory · Gaussian Process