LE-NeuS: Latency-Efficient Neuro-Symbolic Video Understanding via Adaptive Temporal Verification

Shawn Liang; Sahil Shah; Chengwei Zhou; SP Sharan; Harsh Goel; Arnab Sanyal; Sandeep Chinchali; Gourav Datta

arXiv:2602.23553·cs.CV·March 2, 2026

TL;DR

LE-NeuS introduces a latency-efficient neuro-symbolic framework for long-form video question answering that maintains accuracy while drastically reducing inference latency through adaptive sampling and parallel proposition detection.
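As a rough illustration of what parallel proposition detection could look like, the sketch below fans out every (frame, proposition) check to a thread pool instead of running them one after another. It is not the paper's implementation: `detect`, `detect_all`, and the representation of a frame as a set of labels are hypothetical stand-ins for a real VLM query.

```python
from concurrent.futures import ThreadPoolExecutor


def detect(frame, proposition):
    # Hypothetical detector: a frame is modelled as a set of labels.
    # In a real system this would be a (slow) VLM call.
    return proposition in frame


def detect_all(frames, propositions, max_workers=4):
    """Evaluate every proposition on every frame concurrently.

    Returns one dict per frame, mapping proposition -> bool.
    """
    tasks = [(f, p) for f in frames for p in propositions]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with tasks.
        flat = list(pool.map(lambda t: detect(*t), tasks))
    n = len(propositions)
    return [
        dict(zip(propositions, flat[i * n:(i + 1) * n]))
        for i in range(len(frames))
    ]


frames = [{"person", "door"}, {"door"}]
propositions = ["person", "door"]
print(detect_all(frames, propositions))
```

Threads suit this sketch because a remote or GPU-backed model call spends most of its time waiting; a batched model call would be an alternative design under the same idea.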

Contribution

The paper proposes LE-NeuS, a framework that reduces latency in neuro-symbolic video understanding by replacing sequential, dense proposition detection with CLIP-guided adaptive frame sampling and parallel proposition detection.

Findings

01 Reduces the latency gap relative to base VLM prompting from 90x to 10x on benchmarks.

02 Maintains over 10% accuracy gains on complex queries.

03 Provides theoretical latency bounds based on video and proposition complexity.

Abstract

Neuro-symbolic approaches to long-form video question answering (LVQA) have demonstrated significant accuracy improvements by grounding temporal reasoning in formal verification. However, existing methods incur prohibitive latency overheads, up to 90x slower than base VLM prompting, rendering them impractical for latency-sensitive edge deployments. We present LE-NeuS, a latency-efficient neuro-symbolic framework that preserves the accuracy benefits of temporal logic-guided video understanding while drastically reducing inference latency. Our key insight is that the dominant computational bottleneck arises from sequential and dense proposition detection across video frames during automaton construction. We address this through two principled optimizations: (1) CLIP-guided two-stage adaptive sampling that exploits visual redundancy to skip semantically similar frames while preserving…
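The abstract is cut off before it describes the sampling procedure, so the following is only a minimal sketch of the general idea of skipping semantically similar frames. It assumes per-frame embeddings (for example from CLIP) have already been computed, and it uses a single greedy pass rather than the paper's two-stage scheme. The function names and the 0.95 threshold are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_keyframes(embeddings, threshold=0.95):
    """Keep a frame only if it differs enough from the last kept frame.

    embeddings: list of per-frame vectors (assumed precomputed).
    threshold:  similarity at or above which a frame counts as redundant.
    Returns the indices of the frames to pass on to proposition detection.
    """
    kept = []
    last = None
    for i, emb in enumerate(embeddings):
        if last is None or cosine(last, emb) < threshold:
            kept.append(i)
            last = emb
    return kept


# Toy 2-D "embeddings": frames 1 and 3 nearly duplicate their predecessors.
frames = [
    [1.0, 0.0],
    [0.99, 0.05],
    [0.0, 1.0],
    [0.02, 1.0],
    [1.0, 0.0],
]
print(select_keyframes(frames))  # [0, 2, 4]
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents slow drift from letting a long run of gradually changing frames all be skipped.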

Taxonomy

Topics: Multimodal Machine Learning Applications · Ferroelectric and Negative Capacitance Devices · Topic Modeling