Quant.npu: Enabling Efficient Mobile NPU Inference for on-device LLMs via Fully Static Quantization