PiTL: Cross-modal Retrieval with Weakly-supervised Vision-language Pre-training via Prompting