Joint Inference of Groups, Events and Human Roles in Aerial Videos
Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic, Song-Chun Zhu

TL;DR
This paper presents a joint inference framework for analyzing low-resolution aerial videos, enabling the simultaneous grouping, event recognition, and role assignment of people in large outdoor scenes, which is crucial for drone-based surveillance.
Contribution
It introduces a novel spatiotemporal inference method using AND-OR graphs and templates, specifically designed for low-resolution aerial video analysis, and provides a new dataset for evaluation.
Findings
Effective joint inference of groups, events, and roles in aerial videos
Improved accuracy over isolated inference methods
Demonstrated robustness in challenging low-resolution conditions
Abstract
With the advent of drones, aerial video analysis becomes increasingly important; yet, it has received scant attention in the literature. This paper addresses a new problem of parsing low-resolution aerial videos of large spatial areas, in terms of 1) grouping, 2) recognizing events and 3) assigning roles to people engaged in events. We propose a novel framework aimed at conducting joint inference of the above tasks, as reasoning about each in isolation typically fails in our setting. Given noisy tracklets of people and detections of large objects and scene surfaces (e.g., building, grass), we use a spatiotemporal AND-OR graph to drive our joint inference, using Markov Chain Monte Carlo and dynamic programming. We also introduce a new formalism of spatiotemporal templates characterizing latent sub-events. For evaluation, we have collected and released a new aerial videos dataset using a…
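The abstract names Markov Chain Monte Carlo as one of the inference tools for grouping people. As a rough illustration of that idea only, the sketch below runs a Metropolis-style sampler over group labels for a handful of 2D positions. It is not the authors' model: the pairwise energy, the distance threshold `tau`, the `temperature`, and the function names are all assumptions made for this toy example, and the paper's AND-OR graph, events, roles, and dynamic programming are not represented here.

```python
import math
import random


def energy(points, labels, tau):
    """Toy grouping energy (assumed): each same-group pair pays (distance - tau).

    Pairs closer than tau lower the energy when grouped; distant pairs raise it.
    """
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if labels[i] == labels[j]:
                total += math.dist(points[i], points[j]) - tau
    return total


def mcmc_grouping(points, tau=2.0, temperature=0.5, steps=5000, seed=0):
    """Metropolis sampling over group labels; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    n = len(points)
    labels = list(range(n))  # start with every point in its own group
    current = energy(points, labels, tau)
    best_labels, best_energy = labels[:], current
    for _ in range(steps):
        i = rng.randrange(n)
        old = labels[i]
        labels[i] = rng.randrange(n)  # symmetric proposal: relabel one point
        proposed = energy(points, labels, tau)
        delta = proposed - current
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = proposed
            if current < best_energy:
                best_labels, best_energy = labels[:], current
        else:
            labels[i] = old  # reject: restore the previous label
    return best_labels, best_energy


if __name__ == "__main__":
    # Two well-separated clusters of hypothetical ground positions.
    pts = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10), (10, 10.5)]
    groups, e = mcmc_grouping(pts)
    print(groups, round(e, 3))
```

Because the proposal is symmetric, the acceptance rule reduces to comparing energies, which keeps the sketch short. A joint model in the spirit of the paper would add terms coupling the group labels to event and role hypotheses.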
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
