ESsEN: Training Compact Discriminative Vision-Language Transformers in a Low-Resource Setting