ViGT: Proposal-free Video Grounding with Learnable Token in Transformer