Synthetic Data Are as Good as the Real for Association Knowledge Learning in Multi-object Tracking

Yuchi Liu; Zhongdao Wang; Xiangxin Zhou; Liang Zheng

arXiv:2106.16100·cs.CV·October 26, 2021

TL;DR

This paper demonstrates that synthetic 3D data can effectively replace real videos for training association modules in multi-object tracking, achieving comparable performance without domain adaptation.

Contribution

Introduces MOTX, a large-scale synthetic data engine for training multi-object tracking association modules, showing synthetic data can match real data performance in real-world tests.

Findings

1. Association modules trained on synthetic data achieve tracking performance similar to that of modules trained on real data.

2. Motion factors are well simulated in synthetic videos, which aids association learning.

3. The appearance domain gap has minimal impact on association training.

Abstract

Association, which aims to link bounding boxes of the same identity across a video sequence, is a central component of multi-object tracking (MOT). To train association modules, e.g., parametric networks, real video data are usually used. However, annotating person tracks in consecutive video frames is expensive, and such real data, due to their inflexibility, offer limited opportunities to evaluate system performance w.r.t. changing tracking scenarios. In this paper, we study whether 3D synthetic data can replace real-world videos for association training. Specifically, we introduce a large-scale synthetic data engine named MOTX, where the motion characteristics of cameras and objects are manually configured to be similar to those in real-world datasets. We show that compared with real data, association knowledge obtained from synthetic data can achieve very similar performance on…
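To make the notion of association concrete, here is a minimal sketch of linking boxes across two frames. It is not the paper's method: the abstract describes learned, parametric association modules, whereas this example swaps in a hand-written greedy rule based on box overlap (IoU). The function names `iou` and `associate`, the `min_iou` threshold, and the toy boxes are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedily link existing identities to new boxes by descending IoU.

    tracks: dict mapping identity -> last known box (hypothetical layout)
    detections: list of boxes in the current frame
    Returns a dict mapping identity -> index into detections.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be linked
        if tid in matches or j in used:
            continue  # each identity and each detection is used at most once
        matches[tid] = j
        used.add(j)
    return matches


# Toy example: two people move one pixel to the right; a third box is new.
tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (200, 200, 210, 210)]
matches = associate(tracks, detections)
```

In this toy case identity 1 is linked to the second detection and identity 2 to the first, while the third box stays unmatched and would start a new track. A learned association module replaces the fixed IoU score with a similarity predicted from motion and appearance cues, which is the component the paper studies training on synthetic data.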

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis