Temporally-Grounded Language Generation: A Benchmark for Real-Time Vision-Language Models