A Challenge to Build Neuro-Symbolic Video Agents

Sahil Shah; Harsh Goel; Sai Shankar Narasimhan; Minkyu Choi; S P Sharan; Oguzhan Akcin; Sandeep Chinchali

arXiv:2505.13851·cs.AI·May 21, 2025

TL;DR

This paper challenges the research community to develop neuro-symbolic video agents capable of reasoning about events over time, integrating search, interaction, and content generation for more intelligent and trustworthy video understanding.

Contribution

The paper introduces a grand challenge for creating neuro-symbolic video agents that combine perception, reasoning, and action, emphasizing temporal reasoning and structured event understanding.

Findings

01 Highlights the limitations of deep learning in temporal reasoning.

02 Proposes a neuro-symbolic framework for structured event analysis.

03 Calls for developing autonomous, interactive, and content-generating video agents.

Abstract

Modern video understanding systems excel at tasks such as scene classification, object detection, and short video retrieval. However, as video analysis becomes increasingly central to real-world applications, there is a growing need for proactive video agents: systems that not only interpret video streams but also reason about events and take informed actions. A key obstacle in this direction is temporal reasoning: while deep learning models have made remarkable progress in recognizing patterns within individual frames or short clips, they struggle to understand the sequencing and dependencies of events over time, which is critical for action-driven decision-making. Addressing this limitation demands moving beyond conventional deep learning approaches. We posit that tackling this challenge requires a neuro-symbolic perspective, where video queries are decomposed into atomic…
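
As a rough illustration of the kind of temporal reasoning the abstract describes, the sketch below checks whether a sequence of events occurs in order across video frames. It is not the authors' method: the per-frame label sets stand in for the output of a neural detector, and the function name `find_sequence`, the label strings, and the "strictly later frame" ordering rule are all assumptions made for this example.

```python
from typing import Optional, Sequence, Set, Tuple


def find_sequence(
    frames: Sequence[Set[str]], events: Sequence[str]
) -> Optional[Tuple[int, ...]]:
    """Return the earliest frame indices at which `events` occur in order.

    Each event must appear in a strictly later frame than the previous one.
    Returns None if the ordered sequence never occurs.
    """
    indices = []
    start = 0
    for event in events:
        # Scan forward from `start` for the first frame containing the event.
        hit = next(
            (i for i in range(start, len(frames)) if event in frames[i]), None
        )
        if hit is None:
            return None
        indices.append(hit)
        start = hit + 1  # the next event must come strictly later
    return tuple(indices)


# Hypothetical per-frame labels, standing in for a neural detector's output.
frames = [{"car"}, {"car", "person"}, {"person"}, {"person", "door_open"}, set()]

print(find_sequence(frames, ["car", "door_open"]))   # (0, 3)
print(find_sequence(frames, ["door_open", "car"]))   # None
```

The split is the point of the sketch: a learned model only has to answer the per-frame question "is this label present?", while the ordering constraint is checked symbolically and returns the frame indices that justify the answer.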

Code & Models

Repositories

utaustin-swarmlab/neuro-symbolic-agent-challenge
none · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games