Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition