Learning to Combine the Modalities of Language and Video for Temporal Moment Localization