TL;DR
This paper introduces a new, realistic dataset for evaluating 6-DOF object trackers, with ground truth poses captured by a motion capture system, and improves a deep tracking architecture for better robustness and generalization.
Contribution
It provides a novel dataset with accurate ground truth for 6-DOF tracking and advances a deep architecture that generalizes to unseen objects.
Findings
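The dataset contains 297 calibrated sequences across three scenarios: stability, robustness to occlusion, and accuracy during challenging interactions between a person and the object.
The enhanced architecture shows improved robustness and generalization.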
The trained tracker outperforms previous methods on unseen objects.
Abstract
We present a challenging and realistic novel dataset for evaluating 6-DOF object tracking algorithms. Existing datasets show serious limitations (notably, unrealistic synthetic data, or real data with large fiducial markers), preventing the community from obtaining an accurate picture of the state-of-the-art. Using a data acquisition pipeline based on a commercial motion capture system for acquiring accurate ground truth poses of real objects with respect to a Kinect V2 camera, we build a dataset which contains a total of 297 calibrated sequences. They are acquired in three different scenarios to evaluate the performance of trackers: stability, robustness to occlusion, and accuracy during challenging interactions between a person and the object. We conduct an extensive study of a deep 6-DOF tracking architecture and determine a set of optimal parameters. We enhance the architecture and…
