VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding