Object-centric Video Question Answering with Visual Grounding and Referring