Next-token prediction capacity: general upper bounds and a lower bound for transformers