Are Pre-trained Convolutions Better than Pre-trained Transformers?