Self-supervised pre-training and contrastive representation learning for multiple-choice video QA