A Proof of Learning Rate Transfer under $\mu$P