Learning Video Representations using Contrastive Bidirectional Transformer