TCDCaps: Visual Tracking via Cascaded Dense Capsules
Ding Ma, Xiangqian Wu

TL;DR
TCDCaps introduces a cascaded dense capsule architecture to enhance visual tracking by improving feature robustness and candidate quality, effectively addressing drift and IoU threshold challenges in tracking-by-detection methods.
Contribution
The paper proposes a novel cascaded dense capsule network (CDCaps) that captures appearance variations and improves candidate quality in visual tracking.
Findings
Demonstrates robustness on three popular benchmarks.
Effectively handles appearance variations and drift.
Improves candidate quality through sequential IoU threshold training.
Abstract
The critical challenge in the tracking-by-detection framework is avoiding drift during online learning: robust features that cope with a variety of appearance changes are difficult to learn, and a reasonable intersection over union (IoU) threshold for defining true and false positives is hard to set. This paper presents the TCDCaps method, which addresses both problems with a cascaded dense capsule architecture. To obtain robust features, we extend the original capsules with dense-connected routing; the resulting capsules are referred to as DCaps. Because Capsule Networks preserve part-whole relationships, our dense-connected capsules can capture a variety of appearance variations. In addition, to handle the IoU threshold issue, we propose a cascaded DCaps model (CDCaps) that improves the quality of candidates; it consists of sequential DCaps trained with increasing IoU thresholds…
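To make the role of the IoU threshold concrete, the following is a minimal Python sketch of how candidate boxes could be labeled as positives or negatives at each cascade stage. It is an illustration only, not the authors' implementation: the box format `(x1, y1, x2, y2)`, the helper names `iou`, `label_candidates`, and `labels_per_stage`, and the threshold values `(0.5, 0.6, 0.7)` are all assumptions, since the abstract does not state them. The sketch covers only the labeling step; the capsule stages themselves, and any refinement of candidates between stages, are left out.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]

# Hypothetical stage thresholds; the abstract only says they increase.
STAGE_THRESHOLDS = (0.5, 0.6, 0.7)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_candidates(candidates: Sequence[Box], target: Box,
                     threshold: float) -> List[bool]:
    """Mark a candidate as positive when its IoU with the target
    reaches the given threshold."""
    return [iou(c, target) >= threshold for c in candidates]


def labels_per_stage(candidates: Sequence[Box], target: Box,
                     thresholds: Sequence[float] = STAGE_THRESHOLDS
                     ) -> List[List[bool]]:
    """One label list per stage; later stages use stricter thresholds,
    so fewer candidates count as positives."""
    return [label_candidates(candidates, target, t) for t in thresholds]


if __name__ == "__main__":
    target = (0.0, 0.0, 10.0, 10.0)
    candidates = [
        (0.0, 0.0, 10.0, 10.0),    # identical to the target
        (0.0, 0.0, 10.0, 5.5),     # covers 55% of the target
        (0.0, 0.0, 10.0, 6.5),     # covers 65% of the target
        (20.0, 20.0, 30.0, 30.0),  # no overlap
    ]
    for t, labels in zip(STAGE_THRESHOLDS,
                         labels_per_stage(candidates, target)):
        print(t, labels)
```

A single low threshold admits loosely aligned positives, while a single high one leaves few positives to train on. Raising the threshold stage by stage, as the abstract describes, is one way to tighten the definition of a positive gradually.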
Taxonomy
Topics: Video Surveillance and Tracking Methods · Image Enhancement Techniques · Human Pose and Action Recognition
