Loading paper
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Tomesphere