Probabilistic Semantic Retrieval for Surveillance Videos with Activity Graphs
Yuting Chen, Joseph Wang, Yannan Bai, Gregory Castañón, and Venkatesh Saligrama

TL;DR
This paper introduces a probabilistic framework for retrieving complex activities in cluttered surveillance videos by matching semantic activity graphs, addressing challenges like data scarcity and detection errors.
Contribution
It proposes a novel CRF-based activity localization method and a subgraph matching algorithm for effective retrieval without extensive activity annotations.
Findings
Outperforms existing retrieval methods on benchmark datasets.
Effectively handles detection errors and data scarcity.
Provides a flexible semantic graph-based activity description framework.
Abstract
We present a novel framework for finding complex activities matching user-described queries in cluttered surveillance videos. The wide diversity of queries, coupled with the unavailability of annotated activity data, limits our ability to train activity models. To bridge the semantic gap, we propose to let users describe an activity as a semantic graph with object attributes and inter-object relationships associated with nodes and edges, respectively. We learn node/edge-level visual predictors during training and, at test time, propose to retrieve activities by identifying likely locations that match the semantic graph. We formulate a novel CRF-based probabilistic activity localization objective that accounts for mis-detections, mis-classifications and track losses, and outputs a likelihood score for a candidate grounded location of the query in the video. We seek groundings that maximize…
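The idea of scoring a candidate grounding of a query graph can be illustrated with a minimal sketch. This is not the paper's CRF objective: it assumes a simplified score that sums log-probabilities from node and edge predictors, uses a small floor probability as a stand-in for missed detections, and searches groundings by brute force. All names, tracks, and probability values below are hypothetical.

```python
import math
from itertools import permutations

# Hypothetical query graph: "a person is near a car".
# Nodes carry object attributes, edges carry inter-object relationships.
query_nodes = {"p": "person", "c": "car"}
query_edges = {("p", "c"): "near"}

# Hypothetical node-level predictor outputs: attribute probabilities per track.
track_attr_prob = {
    "t1": {"person": 0.9, "car": 0.05},
    "t2": {"person": 0.1, "car": 0.8},
    "t3": {"person": 0.6, "car": 0.3},
}

# Hypothetical edge-level predictor outputs for ordered track pairs.
pair_rel_prob = {
    ("t1", "t2"): {"near": 0.7},
    ("t3", "t2"): {"near": 0.2},
}

# Floor probability (an assumption of this sketch) so that a missing
# detection lowers the score instead of ruling the grounding out.
EPS = 1e-3


def grounding_score(grounding):
    """Sum log-probabilities over query nodes and edges for one grounding.

    `grounding` maps each query node to a track identifier.
    """
    score = 0.0
    for node, attr in query_nodes.items():
        score += math.log(track_attr_prob[grounding[node]].get(attr, EPS))
    for (u, v), rel in query_edges.items():
        pair = (grounding[u], grounding[v])
        score += math.log(pair_rel_prob.get(pair, {}).get(rel, EPS))
    return score


def best_grounding():
    """Brute-force search over assignments of distinct tracks to query nodes."""
    nodes = list(query_nodes)
    candidates = (
        dict(zip(nodes, tracks))
        for tracks in permutations(track_attr_prob, len(nodes))
    )
    return max(candidates, key=grounding_score)


if __name__ == "__main__":
    print(best_grounding())
```

Brute-force enumeration is only workable for very small queries; it is used here to keep the scoring idea visible, not as a substitute for a subgraph matching algorithm.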
Taxonomy
Methods: Conditional Random Field
