Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning