MILAN: Masked Image Pretraining on Language Assisted Representation