Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning