Token Dropping for Efficient BERT Pretraining