Max-Margin Token Selection in Attention Mechanism