Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?