Loading paper
A Hardware-Aware, Per-Layer Methodology for Post-Training Quantization of Large Language Models | Tomesphere