QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
Benjamin Schneider, Dongfu Jiang, Chao Du, Tianyu Pang, Wenhu Chen

TL;DR
QuickVideo introduces a system-algorithm co-design that significantly accelerates long-video understanding, enabling real-time processing for applications like surveillance and sports broadcasting by optimizing decoding and inference pipelines.
Contribution
The paper presents three novel components (QuickDecoder, QuickPrefill, and an overlapping scheme) that together reduce inference latency and memory usage for long videos.
Findings
Reduces inference time by up to a minute on long videos
Achieves 2-3x speedup in video decoding
Generalizes across different video durations and sampling rates
Abstract
Long-video understanding has emerged as a crucial capability in real-world applications such as video surveillance, meeting summarization, educational lecture analysis, and sports broadcasting. However, it remains computationally prohibitive for VideoLLMs, primarily due to two bottlenecks: 1) sequential video decoding, where converting the raw bit stream to RGB frames can take up to a minute for hour-long video inputs, and 2) costly prefilling of up to several million tokens for LLM inference, resulting in high latency and memory use. To address these challenges, we propose QuickVideo, a system-algorithm co-design that substantially accelerates long-video understanding to support real-time downstream applications. It comprises three key innovations: QuickDecoder, a parallelized CPU-based video decoder that achieves 2-3 times speedup by splitting videos into keyframe-aligned…
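The abstract is cut off before it finishes describing QuickDecoder, so the following is only a rough sketch of the general idea it names: splitting a video at keyframe boundaries so that independent workers can decode the pieces concurrently. It is not the authors' implementation. The function names (`keyframe_aligned_chunks`, `decode_chunk`, `decode_parallel`), the evenly spaced split targets, and the use of a thread pool are all assumptions made for illustration, and `decode_chunk` is a placeholder that returns frame indices instead of decoding real video.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def keyframe_aligned_chunks(keyframes, total_frames, num_chunks):
    """Split [0, total_frames) into at most num_chunks ranges that each start on a keyframe.

    `keyframes` is a sorted list of keyframe indices beginning with 0 (assumed input).
    """
    if not keyframes or keyframes[0] != 0:
        raise ValueError("expected sorted keyframe indices starting at frame 0")
    starts = []
    for i in range(num_chunks):
        target = i * total_frames // num_chunks  # ideal, evenly spaced boundary
        # Snap the boundary back to the last keyframe at or before the target.
        k = keyframes[bisect_right(keyframes, target) - 1]
        if not starts or k > starts[-1]:  # skip duplicates when keyframes are sparse
            starts.append(k)
    ends = starts[1:] + [total_frames]
    return list(zip(starts, ends))


def decode_chunk(chunk):
    """Hypothetical stand-in for a decoder: returns frame indices, not RGB frames."""
    start, end = chunk
    return list(range(start, end))


def decode_parallel(keyframes, total_frames, num_workers=4):
    """Decode all chunks concurrently and return the results in frame order."""
    chunks = keyframe_aligned_chunks(keyframes, total_frames, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        parts = list(pool.map(decode_chunk, chunks))  # map preserves chunk order
    return [frame for part in parts for frame in part]


if __name__ == "__main__":
    print(keyframe_aligned_chunks([0, 30, 60, 90, 120, 150], 180, 3))
```

Each chunk starts on a keyframe because a keyframe can be decoded without reference to earlier frames, so a worker can begin there without first decoding the preceding chunk. When keyframes are sparse, the sketch returns fewer chunks than requested instead of producing a chunk that starts mid-group.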
Taxonomy
Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Video Surveillance and Tracking Methods
Methods: Pruning
