LAVA: Language Driven Scalable and Versatile Traffic Video Analytics
Yanrui Yu, Tianfei Zhou, Jiaxin Sun, Lianpeng Qiao, Lizhong Ding, Ye Yuan, Guoren Wang

TL;DR
LAVA introduces a natural language-driven system for scalable, flexible traffic video analytics that outperforms traditional SQL-based methods in accuracy and speed by leveraging novel sampling, detection, and trajectory extraction techniques.
Contribution
The paper presents LAVA, a novel system enabling natural language queries for traffic video analysis, with new components for efficient sampling, open-world detection, and trajectory extraction.
Findings
Improves F1-scores for selection queries by 14%.
Reduces MPAE for aggregation queries by 0.39.
Achieves 86% top-k precision and 9.6x faster processing than baselines.
Abstract
In modern urban environments, camera networks generate massive amounts of operational footage (reaching petabytes each day), making scalable video analytics essential for efficient processing. Many existing approaches adopt an SQL-based paradigm for querying such large-scale video databases; however, this constrains queries to rigid patterns with predefined semantic categories, significantly limiting analytical flexibility. In this work, we explore a language-driven video analytics paradigm aimed at enabling flexible and efficient querying of high-volume video data driven by natural language. Particularly, we build LAVA, a system that accepts natural language queries and retrieves traffic targets across multiple levels of granularity and arbitrary categories. LAVA comprises three main components: 1) a multi-armed bandit-based efficient sampling method for video…
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Advanced Neural Network Applications
