ViewBirdiformer: Learning to recover ground-plane crowd trajectories and ego-motion from a single ego-centric view
Mai Nishimura, Shohei Nobuhara, Ko Nishino

TL;DR
ViewBirdiformer is a Transformer-based method that accurately recovers ground-plane crowd trajectories and ego-motion from a single ego-centric video, enabling real-time situational awareness in dense crowds.
Contribution
It introduces a novel Transformer model that decouples pedestrian and ego-motion trajectories from ego-centric videos in a single forward pass, improving speed and accuracy.
Findings
Achieves accuracy comparable to or better than state-of-the-art methods.
Reduces execution time by three orders of magnitude.
Enables real-time crowd trajectory recovery from a single view.
Abstract
We introduce a novel learning-based method for view birdification, the task of recovering ground-plane trajectories of pedestrians in a crowd and of their observer in the same crowd just from the observed ego-centric video. View birdification becomes essential for mobile robot navigation and localization in dense crowds where the static background is hard to see and reliably track. It is challenging mainly for two reasons: i) absolute trajectories of pedestrians are entangled with the movement of the observer, which needs to be decoupled from their observed relative movements in the ego-centric video, and ii) a crowd motion model describing the pedestrian movement interactions is specific to the scene yet unknown a priori. For this, we introduce a Transformer-based network referred to as ViewBirdiformer which implicitly models the crowd motion through self-attention and decomposes relative…
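The entanglement described in reason i) can be illustrated with plain 2D geometry. The sketch below is not the ViewBirdiformer network; it only shows, under assumed conventions, how a pedestrian position relative to the observer relates to its absolute ground-plane position once the observer's pose is known. The function names, the frame convention (observer x-axis pointing forward, heading measured counterclockwise in radians), and the sample values are hypothetical choices made for illustration.

```python
import math

def relative_to_ground(ego_xy, ego_heading, rel_xy):
    """Map a position in the observer's frame to the ground-plane frame.

    Assumed convention: rotate by the observer's heading, then translate
    by the observer's ground-plane position (a 2D rigid transform).
    """
    c, s = math.cos(ego_heading), math.sin(ego_heading)
    rx, ry = rel_xy
    return (ego_xy[0] + c * rx - s * ry, ego_xy[1] + s * rx + c * ry)

def ground_to_relative(ego_xy, ego_heading, ground_xy):
    """Inverse transform: what the observer would see for a ground-plane point."""
    c, s = math.cos(ego_heading), math.sin(ego_heading)
    dx, dy = ground_xy[0] - ego_xy[0], ground_xy[1] - ego_xy[1]
    return (c * dx + s * dy, -s * dx + c * dy)

# A pedestrian standing still at (2, 0) on the ground plane.
pedestrian = (2.0, 0.0)

# The observer walks forward from (0, 0) to (1, 0) without turning.
seen_before = ground_to_relative((0.0, 0.0), 0.0, pedestrian)
seen_after = ground_to_relative((1.0, 0.0), 0.0, pedestrian)

# The relative position changes even though the pedestrian did not move.
print(seen_before, seen_after)
```

In this toy case the observer's pose is given, so the inversion is trivial. In view birdification neither the observer's motion nor the pedestrians' absolute motion is known, which is why the observed relative movements alone are ambiguous without some model of how the crowd moves.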
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Pose and Action Recognition
