Spatial-Temporal Deep Embedding for Vehicle Trajectory Reconstruction from High-Angle Video
Tianya T. Zhang, Ph.D.; Peter J. Jin, Ph.D.; Han Zhou; Benedetto Piccoli, Ph.D.

TL;DR
This paper introduces a spatial-temporal deep embedding model for vehicle trajectory reconstruction from high-angle videos. It outperforms five baseline models on segmentation metrics and is robust to shadows, static noise, and overlaps.
Contribution
The paper proposes a new deep embedding approach with parity constraints for vehicle segmentation in STMaps, improving accuracy and robustness over existing methods.
Findings
Outperforms five baseline models in segmentation metrics
Robust against shadows, static noise, and overlaps
Successfully reconstructs vehicle trajectories from public videos
Abstract
Spatial-temporal map (STMap)-based methods have shown great potential for processing high-angle videos for vehicle trajectory reconstruction, which can meet the needs of various data-driven modeling and imitation learning applications. In this paper, we developed a Spatial-Temporal Deep Embedding (STDE) model that imposes parity constraints at both the pixel and instance levels to generate instance-aware embeddings for vehicle stripe segmentation on STMaps. At the pixel level, each pixel is encoded with its 8-neighbor pixels at different ranges, and this encoding is subsequently used to guide a neural network to learn the embedding mechanism. At the instance level, a discriminative loss function is designed to pull pixels belonging to the same instance closer together and to push the mean embeddings of different instances far apart in the embedding space. The output of the spatial-temporal affinity is then…
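The instance-level idea in the abstract (pull pixels of one instance toward their mean, push the means of different instances apart) can be illustrated with a small sketch. This is not the paper's loss: the hinge form, the margins `delta_v` and `delta_d`, and the function name `discriminative_loss` are assumptions chosen for illustration, following a commonly used formulation of discriminative embedding losses.

```python
import math
from collections import defaultdict


def discriminative_loss(embeddings, labels, delta_v=0.5, delta_d=1.5):
    """Return (pull, push) terms for per-pixel embeddings.

    embeddings: list of equal-length float vectors, one per pixel.
    labels:     instance id for each pixel (same length as embeddings).
    delta_v, delta_d: hypothetical margins, not taken from the paper.
    """
    # Group pixel embeddings by instance id.
    groups = defaultdict(list)
    for vec, lab in zip(embeddings, labels):
        groups[lab].append(vec)

    # Mean embedding of each instance.
    means = {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in groups.items()
    }

    # Pull term: penalize pixels farther than delta_v from their instance mean.
    pull = 0.0
    for lab, vecs in groups.items():
        per_pixel = [max(0.0, math.dist(v, means[lab]) - delta_v) ** 2 for v in vecs]
        pull += sum(per_pixel) / len(vecs)
    pull /= len(groups)

    # Push term: penalize pairs of instance means closer than 2 * delta_d.
    ids = list(means)
    push, pairs = 0.0, 0
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            gap = math.dist(means[ids[i]], means[ids[j]])
            push += max(0.0, 2 * delta_d - gap) ** 2
            pairs += 1
    if pairs:
        push /= pairs

    return pull, push


# Two well-separated, tight instances incur no penalty under these margins.
pull, push = discriminative_loss(
    [[0.0, 0.0], [0.0, 0.0], [10.0, 0.0], [10.0, 0.0]], [1, 1, 2, 2]
)
```

In a training setting the two terms would typically be weighted and summed, and the embeddings would come from the network's output for each STMap pixel. Those details are outside what the abstract states.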
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
