A Modular Theory of Feature Learning

Daniel McNamara; Cheng Soon Ong; Robert C. Williamson

arXiv:1611.03125·cs.LG·November 11, 2016

Open Access

TL;DR

This paper introduces a theoretical framework for understanding when representation learning improves prediction. The framework rests on a risk gap, which compares a learner's risk on the learned features with its risk on the original inputs, together with sufficient conditions on the structure of the data. It is illustrated with manifold and clustering scenarios.

Contribution

It proposes a modular, risk-based approach to analyze the effectiveness of unsupervised representation learning, decomposing the problem into verifiable conditions.

Findings

01. Conditions for benefit depend on data structure and distribution

02. Analysis applies to manifold and clustering data scenarios

03. Provides a theoretical basis for semi-supervised learning effectiveness

Abstract

Learning representations of data, and in particular learning features for a subsequent prediction task, has been a fruitful area of research delivering impressive empirical results in recent years. However, relatively little is understood about what makes a representation 'good'. We propose the idea of a risk gap induced by representation learning for a given prediction context, which measures the difference in the risk of some learner using the learned features as compared to the original inputs. We describe a set of sufficient conditions for unsupervised representation learning to provide a benefit, as measured by this risk gap. These conditions decompose the problem of when representation learning works into its constituent parts, which can be separately evaluated using an unlabeled sample, suitable domain-specific assumptions about the joint distribution, and analysis of the feature…
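
The risk gap described above can be written down as a short sketch. The notation here is assumed for illustration and is not taken from the paper: $\ell$ is a loss, $P$ the joint distribution over inputs $x$ and labels $y$, $\phi$ the learned feature map, and $h_X$ and $h_\phi$ the predictors that the same learner returns when trained on the original inputs and on the learned features respectively. The sign convention (positive means the features help) is likewise an assumption.

```latex
% Assumed notation (not from the paper): R is expected loss under P,
% h_X is trained on raw inputs, h_phi on the learned features phi(x).
\[
R(h) = \mathbb{E}_{(x,y)\sim P}\,\ell\bigl(h(x),\, y\bigr),
\qquad
\Delta = R(h_X) - R(h_\phi \circ \phi)
\]
```

Under this convention, representation learning provides a benefit for the given prediction context when $\Delta > 0$.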

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications