Fine-tuned CLIP Models are Efficient Video Learners