Loading paper
Text encoders bottleneck compositionality in contrastive vision-language models | Tomesphere