Mechanics of Next Token Prediction with Self-Attention