From SGD to Muon: Adaptive Optimization via Schatten-p Norms