Residual Matrix Transformers: Scaling the Size of the Residual Stream