Generative Modeling for Multi-task Visual Learning

Zhipeng Bao; Martial Hebert; Yu-Xiong Wang

arXiv:2106.13409 · cs.CV · June 28, 2021 · 1 cites

Open Access

TL;DR

This paper introduces a multi-task oriented generative modeling (MGM) framework that uses synthesized images paired with only weak annotations (image-level scene labels) to improve multiple visual perception tasks, with gains reported on the NYUv2 and Taskonomy benchmarks.

Contribution

The paper proposes a multi-task oriented generative modeling (MGM) framework that couples a discriminative multi-task network with a generative network to learn a shared generative model useful across visual perception tasks.

Findings

01 Improves performance on challenging multi-task benchmarks, including NYUv2 and Taskonomy.

02 Outperforms state-of-the-art multi-task approaches.

03 Enables training on synthesized images paired with only weak, image-level scene labels.

Abstract

Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task oriented generative modeling (MGM) framework, by coupling a discriminative multi-task network with a generative network. While it is challenging to synthesize both RGB images and pixel-level annotations in multi-task scenarios, our framework enables us to use synthesized images paired with only weak annotations (i.e., image-level scene labels) to facilitate multiple visual tasks. Experimental evaluation on challenging multi-task benchmarks, including NYUv2 and Taskonomy, demonstrates that our MGM…
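The idea of mixing fully annotated real images with synthesized images that carry only a scene label can be illustrated with a small loss-routing sketch. This is not the authors' implementation: the excerpt above does not describe the MGM architecture or its losses, so the dictionary layout, the function names (`sample_loss`, `cross_entropy`, `l1_loss`), the L1 pixel loss, and the `scene_weight` parameter are all assumptions made for illustration. It shows only that a weakly labeled sample can still contribute a training signal through its image-level label.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def l1_loss(pred, target):
    """Mean absolute error over a flat list of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def sample_loss(sample, scene_weight=1.0):
    """Loss for one training sample (hypothetical dict layout, not from the paper).

    Real images are assumed to carry pixel-level targets for each task.
    Synthesized images are assumed to carry only an image-level scene
    label, so only the scene-classification term contributes for them.
    """
    loss = scene_weight * cross_entropy(sample["scene_logits"], sample["scene_label"])
    # Pixel-level terms are added only when pixel targets exist.
    for task, target in sample.get("pixel_targets", {}).items():
        loss += l1_loss(sample["pixel_preds"][task], target)
    return loss


# A synthesized sample: scene label only.
synth = {"scene_logits": [0.0, 0.0], "scene_label": 0}

# A real sample: scene label plus one pixel-level task (named "depth" here as a placeholder).
real = {
    "scene_logits": [0.0, 0.0],
    "scene_label": 0,
    "pixel_preds": {"depth": [1.0, 2.0]},
    "pixel_targets": {"depth": [1.5, 2.5]},
}

print(sample_loss(synth), sample_loss(real))
```

In this sketch a synthesized sample never touches the pixel-level terms, which is one simple way to read "paired with only weak annotations"; the framework in the paper may route that signal quite differently.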

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques