Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training