TEON: Tensorized Orthonormalization Beyond Layer-Wise Muon for Large Language Model Pre-Training