Loading paper
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization | Tomesphere