Extending $\mu$P: Spectral Conditions for Feature Learning Across Optimizers