SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration