I Dropped a Neural Net

Hyunwoo Park

arXiv:2602.19845 · cs.LG · February 24, 2026

TL;DR

This paper shows that the shuffled layers of a residual network can be restored to their exact original order, despite a search space of roughly $10^{122}$ arrangements, by exploiting stability conditions from training together with simple heuristics.

Contribution

The paper introduces a method to recover the exact order of shuffled neural network layers. Stability conditions supply a signal for pairing each block's input and output projections, and a heuristic initialization followed by hill-climbing orders the reassembled blocks.

Findings

01 Successfully recovered layer order in a neural network after shuffling

02 Stability conditions like dynamic isometry facilitate layer pairing

03 Heuristic initialization combined with hill-climbing achieves accurate reordering
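The hill-climbing step in Finding 03 can be illustrated with a minimal sketch of swap-based local search over a block ordering. The names `hill_climb_order`, `cost`, and `toy_cost` are hypothetical, and this page does not specify the paper's objective or its initialization heuristic. Here a seeded random shuffle stands in for the rough initial order, and a count of misplaced blocks stands in for whatever score the paper actually optimizes.

```python
import random
from typing import Callable, List, Sequence


def hill_climb_order(
    initial: Sequence[int],
    cost: Callable[[List[int]], float],
    max_sweeps: int = 100,
) -> List[int]:
    """Greedy pairwise-swap hill-climbing: keep a swap only if it lowers the cost."""
    order = list(initial)
    best = cost(order)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(order)):
            for j in range(i + 1, len(order)):
                order[i], order[j] = order[j], order[i]  # try the swap
                candidate = cost(order)
                if candidate < best:
                    best = candidate
                    improved = True
                else:
                    order[i], order[j] = order[j], order[i]  # undo the swap
        if not improved:
            break  # local optimum: no single swap helps
    return order


# Toy stand-in objective (an assumption, not the paper's): the hidden true
# order is 0..n-1 and the cost counts blocks sitting in the wrong position.
def toy_cost(order: List[int]) -> float:
    return float(sum(1 for pos, block in enumerate(order) if pos != block))


n_blocks = 8
start = list(range(n_blocks))
random.Random(0).shuffle(start)  # stand-in for a rough heuristic initialization
result = hill_climb_order(start, toy_cost)
print(result, toy_cost(result))
```

With this toy cost every non-optimal ordering has at least one improving swap, so the search cannot stall early. A realistic objective can have local optima, which is one reason the quality of the initial order matters.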

Abstract

A recent Dwarkesh Patel podcast with John Collison and Elon Musk featured an interesting puzzle from Jane Street: they trained a neural net, shuffled all 96 layers, and asked to put them back in order. Given unlabelled layers of a Residual Network and its training dataset, we recover the exact ordering of the layers. The problem decomposes into pairing each block's input and output projections ($48!$ possibilities) and ordering the reassembled blocks ($48!$ possibilities), for a combined search space of $(48!)^{2} \approx 10^{122}$, which is more than the atoms in the observable universe. We show that stability conditions during training like dynamic isometry leave the product $W_{out} W_{in}$ for correctly paired layers with a negative diagonal structure, allowing us to use diagonal dominance ratio as a signal for pairing. For ordering, we seed-initialize with a rough…
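The pairing signal described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation. The ratio used here (mean absolute diagonal over mean absolute off-diagonal) is one plausible definition of a diagonal dominance ratio, and the names `diag_dominance` and `pair_layers` are hypothetical. The weights are synthetic and deliberately constructed so that correctly paired products have a strong negative diagonal, so the demo only shows how the signal would be used, not that trained networks exhibit it.

```python
import numpy as np


def diag_dominance(product: np.ndarray) -> float:
    """Mean |diagonal| divided by mean |off-diagonal| of a square matrix."""
    diag = np.abs(np.diag(product))
    off_mask = ~np.eye(product.shape[0], dtype=bool)
    return float(diag.mean() / np.abs(product[off_mask]).mean())


def pair_layers(w_ins, w_outs):
    """For each input projection, pick the output projection with the highest score."""
    scores = np.array(
        [[diag_dominance(w_out @ w_in) for w_out in w_outs] for w_in in w_ins]
    )
    return scores.argmax(axis=1).tolist(), scores


# Synthetic data (an assumption for illustration only): each true output
# projection is roughly -0.1 * W_in^T, so W_out @ W_in has a negative diagonal.
rng = np.random.default_rng(0)
n_blocks, d_model, d_hidden = 6, 16, 64
w_ins = [rng.standard_normal((d_hidden, d_model)) for _ in range(n_blocks)]
true_outs = [
    -0.1 * w.T + 0.01 * rng.standard_normal((d_model, d_hidden)) for w in w_ins
]

# Shuffle the output projections: w_outs[k] belongs to block shuffle[k].
shuffle = rng.permutation(n_blocks)
w_outs = [true_outs[j] for j in shuffle]

recovered, scores = pair_layers(w_ins, w_outs)
# recovered[i] is the index into w_outs chosen for input projection i.
matched_blocks = [int(shuffle[k]) for k in recovered]
print(matched_blocks)
```

Taking the best score per row is the simplest choice and can assign one output projection to two inputs when scores are close. Solving a one-to-one assignment over the full score matrix would be a more robust alternative.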

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Model Reduction and Neural Networks