Loading paper
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks | Tomesphere