Long-Tailed Learning Requires Feature Learning

Thomas Laurent, James H. von Brecht, and Xavier Bresson

arXiv:2205.14553 · cs.LG · January 2, 2023 · 1 citation

TL;DR

This paper introduces a simple data model inspired by natural data such as text and images, and uses it to show that good generalization on long-tailed distributions requires learning the correct features.

Contribution

The paper provides a long-tailed data model and non-asymptotic generalization error bounds that quantify the penalty for not learning features.

Findings

01. In the long-tailed setting studied, a learner generalizes well only if it identifies the correct features.

02. The paper derives explicit, non-asymptotic generalization error bounds tied to feature learning.

03. Failing to learn the correct features incurs a generalization penalty, which the bounds quantify.

Abstract

We propose a simple data model inspired by natural data such as text or images, and use it to study the importance of learning features in order to achieve good generalization. Our data model follows a long-tailed distribution in the sense that some rare subcategories have few representatives in the training set. In this context, we provide evidence that a learner succeeds if and only if it identifies the correct features, and moreover derive non-asymptotic generalization error bounds that precisely quantify the penalty that one must pay for not learning features.
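
To make the phrase "some rare subcategories have few representatives in the training set" concrete, here is a minimal Python sketch. It is not the paper's data model: it assumes, purely for illustration, that subcategory k is drawn with probability proportional to 1/k (a Zipf-like law), and the names `sample_long_tailed_counts` and `count_rare` are hypothetical helpers introduced here.

```python
import random
from collections import Counter


def sample_long_tailed_counts(num_subcategories, num_samples, exponent=1.0, seed=0):
    """Return how many training samples fall into each subcategory.

    Illustrative assumption (not taken from the paper): subcategory k
    (1-indexed) is drawn with probability proportional to 1 / k**exponent.
    """
    rng = random.Random(seed)
    weights = [1.0 / (k ** exponent) for k in range(1, num_subcategories + 1)]
    draws = rng.choices(range(num_subcategories), weights=weights, k=num_samples)
    counts = Counter(draws)
    return [counts.get(k, 0) for k in range(num_subcategories)]


def count_rare(counts, threshold):
    """Number of subcategories with at most `threshold` training samples."""
    return sum(1 for c in counts if c <= threshold)


counts = sample_long_tailed_counts(num_subcategories=1000, num_samples=5000)
print("samples in the most frequent subcategory:", counts[0])
print("subcategories with at most 2 samples:", count_rare(counts, threshold=2))
```

Under this assumed sampling law, a few head subcategories absorb most of the training set while a large share of subcategories appear only a handful of times or not at all, which is the regime the abstract describes.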

Videos

Long-Tailed Learning Requires Feature Learning · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Text and Document Classification Technologies