ELIP: Efficient Discriminative Language-Image Pre-training with Fewer Vision Tokens