Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty

Thomas George; Guillaume Lajoie; Aristide Baratin

arXiv:2209.09658·cs.LG·April 19, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper compares the lazy (linear) and feature learning (non-linear) training regimes of deep networks. It shows that non-linear dynamics give more weight to easier examples, so these are learned faster than more difficult ones.

Contribution

The paper shows how non-linear training dynamics shape the order and speed at which examples are learned according to their difficulty, in contrast with linearized (lazy) models.

Findings

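01 In the feature learning (non-linear) regime, easier examples are given more weight and are learned faster than more difficult ones.

02 Non-linear dynamics tend to sequentialize learning, from easy examples to hard ones.

03 The phenomenon holds across several ways of quantifying difficulty: c-score, label noise, and easy-to-learn spurious correlations.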
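
One of the difficulty measures named in the abstract is label noise: examples whose labels were corrupted are treated as the hard group, and training progress can then be tracked separately for clean and noisy examples. The sketch below is a hypothetical illustration of that bookkeeping in plain Python. The helper names add_label_noise and group_accuracy, the uniform flipping scheme, and the 20% noise rate are assumptions made for this example, not taken from the paper or its repository.

```python
import random

def add_label_noise(labels, num_classes, noise_rate, seed=0):
    """Flip a fraction of labels to a different class chosen uniformly.

    Returns the noisy labels and a boolean mask marking flipped examples.
    (Hypothetical helper; the noise model is an assumption.)
    """
    rng = random.Random(seed)
    n = len(labels)
    flipped = set(rng.sample(range(n), round(noise_rate * n)))
    noisy = []
    for i, y in enumerate(labels):
        if i in flipped:
            # Pick any class except the true one, so the label really changes.
            noisy.append(rng.choice([c for c in range(num_classes) if c != y]))
        else:
            noisy.append(y)
    is_noisy = [i in flipped for i in range(n)]
    return noisy, is_noisy

def group_accuracy(predictions, targets, mask):
    """Accuracy restricted to the examples where mask is True."""
    idx = [i for i, m in enumerate(mask) if m]
    if not idx:
        return float("nan")
    return sum(predictions[i] == targets[i] for i in idx) / len(idx)

# Toy data: 100 examples, 10 classes, 20% of labels corrupted.
labels = [i % 10 for i in range(100)]
noisy_labels, is_noisy = add_label_noise(labels, num_classes=10, noise_rate=0.2)
is_clean = [not m for m in is_noisy]

# A predictor that outputs the true labels fits the clean group perfectly
# and none of the noisy group, measured against the noisy training targets.
acc_clean = group_accuracy(labels, noisy_labels, is_clean)
acc_noisy = group_accuracy(labels, noisy_labels, is_noisy)
```

Evaluating group_accuracy on both masks at each training step would give one curve per group, which is the kind of comparison the findings describe.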

Abstract

Among attempts at giving a theoretical account of the success of deep neural networks, a recent line of work has identified a so-called lazy training regime in which the network can be well approximated by its linearization around initialization. Here we investigate the comparative effect of the lazy (linear) and feature learning (non-linear) regimes on subgroups of examples based on their difficulty. Specifically, we show that easier examples are given more weight in feature learning mode, resulting in faster training compared to more difficult ones. In other words, the non-linear dynamics tends to sequentialize the learning of examples of increasing difficulty. We illustrate this phenomenon across different ways to quantify example difficulty, including c-score, label noise, and in the presence of easy-to-learn spurious correlations. Our results reveal a new understanding of how deep…
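
The lazy regime mentioned in the abstract rests on replacing the network by its first-order Taylor expansion in the parameters around initialization. As a minimal sketch of that idea, the plain-Python example below linearizes a toy scalar network and checks how the approximation degrades as the parameters move away from their initial values. The toy architecture, the function names f, grad_f, f_lin and perturb, and all numeric values are assumptions chosen for illustration, not the authors' setup.

```python
import math

def f(x, params):
    """Toy two-layer network: sum_j a_j * tanh(b_j * x)."""
    a, b = params
    return sum(aj * math.tanh(bj * x) for aj, bj in zip(a, b))

def grad_f(x, params):
    """Analytic gradient of f with respect to (a, b)."""
    a, b = params
    grad_a = [math.tanh(bj * x) for bj in b]
    grad_b = [aj * x * (1.0 - math.tanh(bj * x) ** 2) for aj, bj in zip(a, b)]
    return grad_a, grad_b

def f_lin(x, params, params0):
    """First-order Taylor expansion of f in the parameters around params0."""
    a, b = params
    a0, b0 = params0
    ga, gb = grad_f(x, params0)
    delta = sum(g * (p - p0) for g, p, p0 in zip(ga, a, a0))
    delta += sum(g * (p - p0) for g, p, p0 in zip(gb, b, b0))
    return f(x, params0) + delta

def perturb(params0, eps):
    """Move every parameter a distance eps away from initialization."""
    a0, b0 = params0
    return [p + eps for p in a0], [p - eps for p in b0]

params0 = ([0.5, -0.3, 0.8], [1.0, 0.2, -0.7])  # arbitrary "initialization"
x = 0.9

near = perturb(params0, 1e-3)  # parameters barely move: lazy-like
far = perturb(params0, 0.5)    # parameters move a lot: non-linear effects matter
gap_small = abs(f(x, near) - f_lin(x, near, params0))
gap_large = abs(f(x, far) - f_lin(x, far, params0))
```

With a small displacement the linearized model tracks the network closely, since the error is second order in the displacement. With a large displacement the gap grows, and that is where the feature learning dynamics studied in the paper can depart from the lazy ones.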

Code & Models

Repositories

tfjgeorge/lazy_vs_hasty
PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Neural Networks and Applications