On Statistical Efficiency in Learning

Jie Ding; Enmao Diao; Jiawei Zhou; Vahid Tarokh

arXiv:2012.13307·math.ST·December 25, 2020

TL;DR

This paper introduces a generalized information criterion for model selection that, under reasonable assumptions, provably achieves the optimal out-sample prediction loss asymptotically. The criterion applies to high-dimensional models and to streaming data.

Contribution

The paper proposes a generalized notion of Takeuchi's information criterion, proves that it is asymptotically optimal, and gives an online algorithm for streaming data.

Findings

01. Asymptotically achieves the optimal out-sample prediction loss.

02. Costs less to compute than cross-validation.

03. Applies to high-dimensional models and streaming data.

Abstract

A central issue of many statistical learning problems is to select an appropriate model from a set of candidate models. Large models tend to inflate the variance (or overfitting), while small models tend to cause biases (or underfitting) for a given fixed dataset. In this work, we address the critical challenge of model selection to strike a balance between model fitting and model complexity, thus gaining reliable predictive power. We consider the task of approaching the theoretical limit of statistical learning, meaning that the selected model has the predictive performance that is as good as the best possible model given a class of potentially misspecified candidate models. We propose a generalized notion of Takeuchi's information criterion and prove that the proposed method can asymptotically achieve the optimal out-sample prediction loss under reasonable assumptions. It is the first…
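As a rough illustration of the kind of criterion the abstract refers to, the sketch below scores candidate models with the classical Takeuchi-style penalty: in-sample loss plus tr(V⁻¹J)/n, where V is the Hessian of the average loss and J is the average outer product of per-sample gradients. It is not taken from the paper or its repository. The function names `tic_score` and `select_degree`, the squared-error loss, and the polynomial candidate family are all assumptions chosen for illustration, and the paper's generalized criterion may differ in its details.

```python
import numpy as np


def tic_score(X, y):
    """Classical TIC-style score for least squares (illustrative assumption).

    Per-sample loss: 0.5 * (y_i - x_i' b)^2.
    Score = in-sample loss + trace(V^{-1} J) / n.
    """
    n = X.shape[0]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
    resid = y - X @ beta
    in_sample_loss = 0.5 * np.mean(resid ** 2)

    V = X.T @ X / n              # Hessian of the average loss
    G = -X * resid[:, None]      # per-sample gradients, one row per sample
    J = G.T @ G / n              # average outer product of the gradients
    penalty = np.trace(np.linalg.solve(V, J)) / n
    return in_sample_loss + penalty


def select_degree(x, y, max_degree=6):
    """Pick the polynomial degree with the smallest score (hypothetical helper)."""
    scores = {}
    for degree in range(1, max_degree + 1):
        X = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
        scores[degree] = tic_score(X, y)
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, 200)
    y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(0.0, 0.3, 200)
    best, scores = select_degree(x, y)
    print("selected degree:", best)
```

Unlike cross-validation, this score needs only one fit per candidate model, which is the source of the computational saving listed under Findings.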

Code & Models

Repositories

JieGroup/On-Statistical-Efficiency-in-Learning (Official)
