Garbage Attention in Large Language Models: BOS Sink Heads and Sink-aware Pruning