Loading paper
Video Active Perception: Effective Inference-Time Long-Form Video Understanding with Vision-Language Models | Tomesphere