S2-Attention: Hardware-Aware Context Sharding Among Attention Heads