MVP: Multimodality-guided Visual Pre-training