On Implications of Scaling Laws on Feature Superposition

Pavan Katta

arXiv:2407.01459·cs.LG·July 2, 2024

Open Access

TL;DR

This theoretical note uses results from scaling laws to argue that two common assumptions cannot both hold: that the superposition hypothesis is a complete theory of feature representation, and that features are universal across models trained on the same data to equal performance.

Contribution

It gives a theoretical argument, based on scaling laws, that treating the superposition hypothesis as a complete theory of feature representation is incompatible with feature universality.

Findings

1. The superposition hypothesis, taken as a complete theory of feature representation, and feature universality cannot both be true.

2. Scaling laws impose constraints on feature representations.

3. The argument challenges existing assumptions in neural network interpretability.

Abstract

Using results from scaling laws, this theoretical note argues that the following two statements cannot be simultaneously true:

1. The superposition hypothesis, in which sparse features are linearly represented across a layer, is a complete theory of feature representation.

2. Features are universal, meaning two models trained on the same data and achieving equal performance will learn identical features.
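To make the first statement concrete, here is a minimal sketch of what "sparse features linearly represented across a layer" can look like. It is an illustration under stated assumptions, not the paper's construction: the sizes `N_FEATURES` and `DIM`, the random unit directions, and the helper names are all hypothetical. Each feature gets its own direction in a layer with fewer dimensions than features, an activation is the weighted sum of the active features' directions, and a feature's value is read back by projecting onto its direction.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction on the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def encode(active, directions, dim):
    """Linear representation: activation = sum_i value_i * direction_i."""
    h = [0.0] * dim
    for idx, value in active.items():
        for k in range(dim):
            h[k] += value * directions[idx][k]
    return h


def readout(h, directions):
    """Estimate every feature's value by projecting onto its direction."""
    return [dot(h, w) for w in directions]


# Hypothetical sizes: more features than dimensions, so the directions
# cannot all be orthogonal and readouts pick up interference.
N_FEATURES, DIM = 64, 16
rng = random.Random(0)
directions = [random_unit_vector(DIM, rng) for _ in range(N_FEATURES)]

# A sparse input: only 2 of the 64 features are active.
active = {3: 1.0, 41: 0.5}
h = encode(active, directions, DIM)
estimates = readout(h, directions)

for idx, value in active.items():
    print(f"feature {idx}: true={value:.2f} estimate={estimates[idx]:.2f}")
```

With a single active feature the readout recovers its value exactly, because each direction has unit norm. With several active features the estimates are off by the overlap between their directions, which is why sparsity matters in this picture.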

Taxonomy

Topics: Machine Learning and Data Classification · Image Retrieval and Classification Techniques · Graph Theory and Algorithms