The Tunnel Effect: Building Data Representations in Deep Neural Networks

Wojciech Masarczyk; Mateusz Ostaszewski; Ehsan Imani; Razvan Pascanu; Piotr Miłoś; Tomasz Trzciński

arXiv:2305.19753·cs.LG·October 31, 2023·1 cites

TL;DR

This paper shows that sufficiently deep neural networks develop a 'tunnel' in their later layers that compresses data representations while contributing little to performance. The tunnel forms early in training, its depth depends on the relation between network capacity and task complexity, and it degrades out-of-distribution generalization.

Contribution

It introduces the concept of the 'tunnel' in deep networks, showing its emergence, properties, and impact on generalization and continual learning.

Findings

01. The tunnel forms early during training.

02. Deeper networks have a more pronounced tunnel.

03. The tunnel's depth relates to network capacity and task complexity.

Abstract

Deep neural networks are widely known for their remarkable effectiveness across various tasks, with the consensus that deeper networks implicitly learn more complex data representations. This paper shows that sufficiently deep networks trained for supervised image classification split into two distinct parts that contribute to the resulting data representations differently. The initial layers create linearly-separable representations, while the subsequent layers, which we refer to as 'the tunnel', compress these representations and have a minimal impact on the overall performance. We explore the tunnel's behavior through comprehensive empirical studies, highlighting that it emerges early in the training process. Its depth depends on the relation between the network's capacity and task complexity. Furthermore, we show that the tunnel degrades out-of-distribution generalization and…
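The two properties the abstract attributes to the layers, linear separability and compression, can be illustrated with simple per-layer diagnostics. The sketch below is only an assumption about how one might measure them, not the paper's protocol: it uses the numerical rank of a layer's feature matrix as a proxy for compression and a ridge-regression linear probe as a proxy for linear separability. The function names (`numerical_rank`, `linear_probe_accuracy`) and the parameters `rel_tol` and `ridge` are hypothetical.

```python
import numpy as np


def numerical_rank(features, rel_tol=1e-3):
    """Count singular values larger than rel_tol times the largest one.

    features: array of shape (num_samples, num_features) holding one
    layer's activations. rel_tol is an illustrative threshold, not a
    value taken from the paper.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    if singular_values[0] == 0:
        return 0
    return int(np.sum(singular_values > rel_tol * singular_values[0]))


def linear_probe_accuracy(features, labels, num_classes, ridge=1e-3):
    """Fit a ridge-regression classifier on one-hot targets.

    Returns accuracy on the same data used for fitting, so it measures
    linear separability of these features, not generalization.
    """
    # Append a constant column so the probe can learn a bias term.
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    Y = np.eye(num_classes)[labels]
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    predictions = np.argmax(X @ W, axis=1)
    return float(np.mean(predictions == labels))
```

Applied to the activations of each layer in turn, a pattern consistent with the abstract would be probe accuracy that saturates after the initial layers while the numerical rank keeps falling in the later ones. Whether a given network shows this pattern is an empirical question the sketch does not answer.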

Videos

The Tunnel Effect: Building Data Representations in Deep Neural Networks (SlidesLive)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning