Synthetic Data Are as Good as the Real for Association Knowledge Learning in Multi-object Tracking
Yuchi Liu, Zhongdao Wang, Xiangxin Zhou, Liang Zheng

TL;DR
This paper demonstrates that synthetic 3D data can effectively replace real videos for training association modules in multi-object tracking, achieving comparable performance without domain adaptation.
Contribution
Introduces MOTX, a large-scale synthetic data engine for training multi-object tracking association modules, showing synthetic data can match real data performance in real-world tests.
Findings
Synthetic data achieves tracking performance similar to that of real data.
Motion factors are well simulated in synthetic videos, aiding association learning.
The appearance domain gap has minimal impact on association training.
Abstract
Association, aiming to link bounding boxes of the same identity in a video sequence, is a central component in multi-object tracking (MOT). To train association modules, e.g., parametric networks, real video data are usually used. However, annotating person tracks in consecutive video frames is expensive, and such real data, due to their inflexibility, offer us limited opportunities to evaluate the system performance w.r.t. changing tracking scenarios. In this paper, we study whether 3D synthetic data can replace real-world videos for association training. Specifically, we introduce a large-scale synthetic data engine named MOTX, where the motion characteristics of cameras and objects are manually configured to be similar to those in real-world datasets. We show that compared with real data, association knowledge obtained from synthetic data can achieve very similar performance on…
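To make the association step concrete, here is a minimal sketch of linking detections in a new frame to existing tracks. It is not the paper's method: the abstract describes learned (parametric) association modules, whereas this illustration swaps in a simple hand-written rule, greedy matching on box overlap (IoU). The names `iou`, `associate`, and the `min_iou` threshold are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

# Axis-aligned box as (x1, y1, x2, y2). This layout is an assumption for the sketch.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 if they do not overlap."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    min_iou: float = 0.3,  # hypothetical threshold, not taken from the paper
) -> Dict[int, int]:
    """Greedily map detection index -> track id by descending IoU.

    Detections left out of the result have no sufficiently overlapping
    track and would start a new identity in a full tracker.
    """
    pairs = sorted(
        (
            (iou(box, det), track_id, det_idx)
            for track_id, box in tracks.items()
            for det_idx, det in enumerate(detections)
        ),
        reverse=True,
    )
    assigned: Dict[int, int] = {}
    used_tracks = set()
    for score, track_id, det_idx in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if track_id in used_tracks or det_idx in assigned:
            continue  # each track and each detection is matched at most once
        assigned[det_idx] = track_id
        used_tracks.add(track_id)
    return assigned


# Usage: two existing tracks, three detections in the next frame.
tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (200, 200, 210, 210)]
matches = associate(tracks, detections)
```

In this example the first two detections are linked to tracks 2 and 1 respectively, and the third remains unmatched. A learned association module would replace the fixed IoU score with a similarity predicted from motion and appearance cues.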
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis
