Training data-efficient image transformers & distillation through attention