TL;DR
This paper introduces CompositeTasking, a novel approach for spatially distributed image understanding that uses a single network conditioned on pixel-wise task requests, enabling multi-tasking with sparse supervision and task editing.
Contribution
The paper proposes a unified encoder-decoder model for multi-task image understanding conditioned on spatial task requests, allowing task composition and sparse supervision.
Findings
Achieves performance comparable to dense supervision baselines.
Supports task editing and composition within a single network.
Learns from sparse labels across tasks rather than requiring dense annotation for every task.
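One common way to train from sparse labels is to compute the loss only at pixels that actually carry an annotation. The sketch below illustrates that idea with NumPy. The function name `masked_l1_loss`, the L1 form of the loss, and the `valid` mask are assumptions made for illustration; they are not taken from the paper, which may use a different loss.

```python
import numpy as np

def masked_l1_loss(pred, target, valid):
    """Mean absolute error over annotated pixels only (hypothetical helper).

    pred, target: arrays of the same shape.
    valid: same shape, nonzero where a label exists.
    """
    valid = valid.astype(bool)
    n = valid.sum()
    if n == 0:
        # No labelled pixels: contribute nothing to the loss.
        return 0.0
    # Unlabelled pixels are excluded from both the sum and the count.
    return float(np.abs(pred - target)[valid].sum() / n)
```

Because unlabelled pixels are dropped from both the numerator and the denominator, an image with few annotations is not penalised for the regions that lack them.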
Abstract
We define the concept of CompositeTasking as the fusion of multiple, spatially distributed tasks, for various aspects of image understanding. Learning to perform spatially distributed tasks is motivated by the frequent availability of only sparse labels across tasks, and the desire for a compact multi-tasking network. To facilitate CompositeTasking, we introduce a novel task conditioning model -- a single encoder-decoder network that performs multiple, spatially varying tasks at once. The proposed network takes an image and a set of pixel-wise dense task requests as inputs, and performs the requested prediction task for each pixel. Moreover, we also learn the composition of tasks that needs to be performed according to some CompositeTasking rules, which includes the decision of where to apply which task. It not only offers us a compact network for multi-tasking, but also allows for…
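The input/output contract described in the abstract (an image plus a pixel-wise task request map in, the requested prediction at each pixel out) can be sketched with a toy example. Everything here is hypothetical: the two stand-in "tasks", the `TASKS` table, and `composite_output` are illustrative names. The sketch also runs each task separately and then selects per pixel, whereas the paper describes a single encoder-decoder network conditioned on the requests, so this shows only the shape of the interface, not the model.

```python
import numpy as np

def toy_edges(img):
    # Stand-in task 0: absolute horizontal gradient, zero-padded to keep H x W.
    grad = np.abs(np.diff(img, axis=1))
    return np.pad(grad, ((0, 0), (0, 1)), mode="constant")

def toy_threshold(img):
    # Stand-in task 1: binary mask of bright pixels.
    return (img > 0.5).astype(img.dtype)

# Hypothetical task table: task id -> function from H x W image to H x W map.
TASKS = {0: toy_edges, 1: toy_threshold}

def composite_output(img, task_map):
    """Return, at each pixel, the output of the task requested in task_map."""
    out = np.zeros_like(img)
    for task_id, fn in TASKS.items():
        mask = task_map == task_id
        out[mask] = fn(img)[mask]
    return out

img = np.array([[0.0, 1.0],
                [0.2, 0.9]])
# Request task 0 on the diagonal and task 1 elsewhere.
task_map = np.array([[0, 1],
                     [1, 0]])
result = composite_output(img, task_map)
```

Editing `task_map` changes which task is produced where without touching the image, which is the sense in which the request map acts as a second input.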
