CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation

Yiming Cui; Cheng Han; Dongfang Liu

arXiv:2311.00987 · cs.CV · November 3, 2023 · 1 citation

Open Access

TL;DR

This paper introduces CML-MOTS, a collaborative multi-task learning framework that simultaneously performs object detection, segmentation, and tracking in videos, improving multi-object video analysis for applications like autonomous driving.

Contribution

The paper proposes a novel end-to-end CNN with associative connections enabling collaborative learning across detection, segmentation, and tracking tasks, enhancing multi-object video analysis.

Findings

01. Achieved encouraging results on the KITTI MOTS and MOTS Challenge datasets.

02. Demonstrated improved performance through information sharing among tasks.

03. Validated the effectiveness of the collaborative multi-task learning approach.

Abstract

The advancement of computer vision has pushed visual analysis tasks from still images to the video domain. In recent years, video instance segmentation, which aims to track and segment multiple objects in video frames, has drawn much attention for its potential applications in emerging areas such as autonomous driving, intelligent transportation, and smart retail. In this paper, we propose an effective framework for instance-level visual analysis on video frames, which can simultaneously conduct object detection, instance segmentation, and multi-object tracking. The core idea of our method is collaborative multi-task learning, which is achieved by a novel structure, named associative connections, among the detection, segmentation, and tracking task heads in an end-to-end learnable CNN. These additional connections allow information propagation across multiple related tasks, so as to…
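The idea of task heads that pass information to one another can be illustrated with a toy sketch. This is not the paper's architecture: it uses plain Python lists rather than a CNN, and the class name `ToyCollaborativeHeads`, the helper functions, the feature sizes, and above all the direction of the cross-head links (detection feeds segmentation, and both feed tracking) are assumptions made purely for illustration. The only point it shows is that each head can consume the outputs of related heads in addition to the shared features.

```python
import random


def make_weights(in_dim, out_dim, rng):
    """Return a randomly initialised weight matrix of shape (out_dim, in_dim)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_head(weights, x):
    """Matrix-vector product: one toy linear 'head' applied to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class ToyCollaborativeHeads:
    """Three task heads over one shared feature vector.

    Assumption: the link direction (detection -> segmentation -> tracking)
    is hypothetical and chosen only to show cross-head information flow.
    """

    def __init__(self, feat_dim=8, det_dim=4, seg_dim=6, track_dim=3, seed=0):
        rng = random.Random(seed)
        # Detection head sees only the shared features.
        self.det = make_weights(feat_dim, det_dim, rng)
        # Segmentation head sees shared features plus the detection output.
        self.seg = make_weights(feat_dim + det_dim, seg_dim, rng)
        # Tracking head sees shared features plus both other outputs.
        self.track = make_weights(feat_dim + det_dim + seg_dim, track_dim, rng)

    def forward(self, shared):
        det_out = apply_head(self.det, shared)
        seg_out = apply_head(self.seg, shared + det_out)  # list concatenation
        track_out = apply_head(self.track, shared + det_out + seg_out)
        return det_out, seg_out, track_out


heads = ToyCollaborativeHeads()
features = [0.5] * 8  # stand-in for backbone features of one object
det_out, seg_out, track_out = heads.forward(features)
print(len(det_out), len(seg_out), len(track_out))  # prints: 4 6 3
```

Because the segmentation and tracking heads take the earlier outputs as input, a change in the detection head alters what the downstream heads compute. In a trainable network, the same links would also let gradients from one task's loss reach the other heads.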

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Visual Attention and Saliency Detection