On the Origins of the Block Structure Phenomenon in Neural Network Representations

Thao Nguyen; Maithra Raghu; Simon Kornblith

arXiv:2202.07184 · cs.LG · February 16, 2022 · 6 cites

TL;DR

This paper traces the block structure phenomenon in neural network representations to dominant datapoints, a small group of examples that share similar image statistics (e.g. background color). The dominant datapoints differ across random seeds, so the block structure reflects dataset statistics while remaining specific to each trained model.

Contribution

It traces the block structure to dominant datapoints that share simple image statistics, and analyzes how the structure evolves during training and how it depends on training methods and random seed.

Findings

01

Block structure arises from dominant datapoints: a small group of examples that share similar image statistics (e.g. background color).

02

The dominant datapoints and shared features vary across random seeds.

03

Interventions can eliminate the block structure, affecting training dynamics.

Abstract

Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations. This block structure has two seemingly contradictory properties: on the one hand, its constituent layers exhibit highly similar dominant first principal components (PCs), but on the other hand, their representations, and their common first PC, are highly dissimilar across different random seeds. Our work seeks to reconcile these discrepant properties by investigating the origin of the block structure in relation to the data and training methods. By analyzing properties of the dominant PCs, we find that the block structure arises from dominant datapoints: a small group of examples that share similar image statistics (e.g. background color). However, the set of dominant datapoints, and the precise shared image statistic,…
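
The idea of a dominant first PC and its dominant datapoints can be illustrated with a minimal sketch. It is not the authors' code: the function name `first_pc_summary`, the synthetic activations, and the choice of NumPy are assumptions made for illustration only. The sketch centers a matrix of layer activations, measures the fraction of variance explained by the first PC, and ranks examples by the magnitude of their projection onto it.

```python
import numpy as np


def first_pc_summary(activations, top_k=5):
    """Summarize the first principal component of a layer's activations.

    activations: array of shape (n_examples, n_features).
    Returns (fraction of variance explained by the first PC,
             first PC direction,
             indices of the top_k examples with the largest |projection|).
    Hypothetical helper for illustration; not taken from the paper's code.
    """
    # Center each feature so the SVD yields principal components.
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions; s**2 is proportional to variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    frac_first = variance[0] / variance.sum()
    # Per-example projection onto the first PC.
    projections = centered @ vt[0]
    # Examples with the largest absolute projection are the candidates
    # for "dominant datapoints" in this toy setting.
    dominant = np.argsort(-np.abs(projections))[:top_k]
    return frac_first, vt[0], dominant


# Synthetic demo (assumed data): 200 examples, 32 features of unit noise,
# with examples 0..4 pushed far along one shared direction.
rng = np.random.default_rng(0)
n_examples, n_features = 200, 32
acts = rng.normal(size=(n_examples, n_features))
shared_direction = np.ones(n_features) / np.sqrt(n_features)
acts[:5] += 50.0 * shared_direction

frac, pc, found = first_pc_summary(acts, top_k=5)
print("variance explained by first PC:", round(float(frac), 3))
print("dominant example indices:", sorted(found.tolist()))
```

In this toy setting a handful of examples that share one strong feature are enough to make the first PC account for most of the variance, which is the kind of effect the abstract attributes to dominant datapoints. Applying the same function to real hidden-layer activations would require extracting them from a trained model, which is outside the scope of this sketch.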

Code & Models

Repositories

google-research/google-research/tree/master/do_wide_and_deep_networks_learn_the_same_things
jax

Taxonomy

Topics: Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks