YASMOT: Yet another stereo image multi-object tracker

Ketil Malde

arXiv:2506.17186 · cs.CV · June 23, 2025

TL;DR

YASMOT is a lightweight, flexible stereo and mono object tracker that enhances detection accuracy by tracking objects over time and generating consensus detections from multiple detectors.

Contribution

The paper presents yasmot, a lightweight and flexible object tracker that processes the output of popular object detectors, works with both monoscopic and stereoscopic camera configurations, and can generate consensus detections from ensembles of detectors.

Findings

01. Effective tracking over time improves detection stability.

02. Supports both stereo and mono camera configurations.

03. Enables ensemble detection consensus generation.

Abstract

There now exist many popular object detectors based on deep learning that can analyze images and extract locations and class labels for occurrences of objects. For image time series (i.e., video or sequences of stills), tracking objects over time and preserving object identity can help to improve object detection performance, and are necessary for many downstream tasks, including classifying and predicting behaviors and estimating total abundances. Here we present yasmot, a lightweight and flexible object tracker that can process the output from popular object detectors and track objects over time from either monoscopic or stereoscopic camera configurations. In addition, it includes functionality to generate consensus detections from ensembles of object detectors.
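To make the idea of tracking detector output over time concrete, here is a minimal sketch of one common way to associate bounding boxes between two consecutive frames: greedy matching by intersection over union (IoU). This is an illustration only and is not taken from yasmot; the function names `iou` and `match_frames`, the box layout `(x_min, y_min, x_max, y_max)`, and the `min_iou` threshold of 0.3 are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_frames(
    prev: List[Box], curr: List[Box], min_iou: float = 0.3
) -> List[Tuple[int, int]]:
    """Greedily pair boxes from the previous and current frame.

    Returns (prev_index, curr_index) pairs, best overlaps first claimed.
    Boxes left unpaired would start or end a track in a full tracker.
    """
    candidates = sorted(
        ((iou(p, c), i, j) for i, p in enumerate(prev) for j, c in enumerate(curr)),
        reverse=True,
    )
    used_prev, used_curr, pairs = set(), set(), []
    for score, i, j in candidates:
        if score < min_iou:
            break  # remaining candidates overlap too little to be the same object
        if i in used_prev or j in used_curr:
            continue  # each box may belong to at most one pair
        used_prev.add(i)
        used_curr.add(j)
        pairs.append((i, j))
    return sorted(pairs)


# Hypothetical detections: two objects shift slightly, one new object appears.
prev_boxes = [(0, 0, 10, 10), (50, 50, 60, 60)]
curr_boxes = [(51, 50, 61, 60), (1, 0, 11, 10), (100, 100, 110, 110)]
pairs = match_frames(prev_boxes, curr_boxes)
print(pairs)
```

Greedy matching is the simplest choice and can make suboptimal pairings in crowded scenes; an optimal assignment solver is a common alternative. The same overlap-based pairing idea could also be applied across detectors on a single frame rather than across frames, which is one plausible way to think about consensus detections.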

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition