Video Question Answering Using CLIP-Guided Visual-Text Attention