Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism