From Small to Large: Generalization Bounds for Transformers on Variable-Size Inputs