Loading paper
Localizing Events in Videos with Multimodal Queries | Tomesphere