Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation

Hanrong Ye; Dan Xu

arXiv:2304.00971 · cs.CV · April 7, 2023 · 1 citation

TL;DR

This paper introduces TaskPrompter, a unified multi-task learning framework that jointly performs 3D detection, segmentation, and depth estimation on Cityscapes-3D, and reports new state-of-the-art results in 3D detection and depth estimation.

Contribution

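The paper proposes a multi-task prompting framework that unifies the learning of task-generic representations, task-specific representations, and cross-task interactions in a single model, which reduces the need for hand-tuned structure design and strengthens multi-task representation learning for 3D perception.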

Findings

01. Achieves new state-of-the-art results in 3D detection and depth estimation.

02. Demonstrates strong multi-task performance on Cityscapes-3D.

03. Unifies multiple perception tasks in a single model.

Abstract

This report serves as a supplementary document for TaskPrompter, detailing its implementation on a new joint 2D-3D multi-task learning benchmark based on Cityscapes-3D. TaskPrompter presents an innovative multi-task prompting framework that unifies the learning of (i) task-generic representations, (ii) task-specific representations, and (iii) cross-task interactions, as opposed to previous approaches that separate these learning objectives into different network modules. This unified approach not only reduces the need for meticulous empirical structure design but also significantly enhances the multi-task network's representation learning capability, as the entire model capacity is devoted to optimizing the three objectives simultaneously. TaskPrompter introduces a new multi-task benchmark based on the Cityscapes-3D dataset, which requires the multi-task model to concurrently generate…
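The abstract says that one set of network weights handles task-generic features, task-specific features, and cross-task interaction together. The sketch below illustrates that general idea only and is not the authors' implementation: it assumes one learnable prompt token per task, concatenated with image patch tokens and passed through a shared transformer encoder. The class name `PromptedEncoder`, the dimensions, and the channel-wise modulation step are all hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn


class PromptedEncoder(nn.Module):
    """Illustrative sketch (hypothetical, not the paper's architecture):
    task prompt tokens and patch tokens share one transformer encoder."""

    def __init__(self, num_tasks=3, dim=64, depth=2, heads=4):
        super().__init__()
        self.num_tasks = num_tasks
        # Assumption: one learnable prompt vector per task.
        self.task_prompts = nn.Parameter(torch.randn(num_tasks, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, patch_tokens):
        # patch_tokens: (batch, num_patches, dim)
        batch = patch_tokens.shape[0]
        prompts = self.task_prompts.unsqueeze(0).expand(batch, -1, -1)
        # Self-attention over the joint sequence lets prompts attend to
        # patches (task-specific), patches attend to each other
        # (task-generic), and prompts attend to each other (cross-task).
        tokens = self.encoder(torch.cat([prompts, patch_tokens], dim=1))
        task_tokens = tokens[:, : self.num_tasks]  # (batch, tasks, dim)
        shared = tokens[:, self.num_tasks :]       # (batch, patches, dim)
        # Assumption: derive per-task features by scaling the shared patch
        # features channel-wise with each task token.
        task_feats = shared.unsqueeze(1) * task_tokens.unsqueeze(2)
        return shared, task_feats                  # task_feats: (batch, tasks, patches, dim)


if __name__ == "__main__":
    model = PromptedEncoder().eval()
    with torch.no_grad():
        shared, task_feats = model(torch.randn(2, 16, 64))
    print(shared.shape, task_feats.shape)
```

In a full model, each slice `task_feats[:, t]` would feed a task-specific head (for example 3D detection, segmentation, or depth); those heads are omitted here because this page does not describe them.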

Code & Models

Repositories

prismformore/multi-task-transformer
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Vision and Imaging · Domain Adaptation and Few-Shot Learning