CenterCLIP: Token Clustering for Efficient Text-Video Retrieval