Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data