PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing
Dan Xu, Wanli Ouyang, Xiaogang Wang, Nicu Sebe

TL;DR
PAD-Net is a multi-task network that predicts auxiliary tasks to enhance depth estimation and scene parsing, using multi-modal distillation for improved accuracy in complex scenes.
Contribution
The paper introduces PAD-Net, a novel multi-task guided network that leverages auxiliary task predictions and multi-modal distillation to improve joint depth estimation and scene parsing.
Findings
Effective on NYUD-v2 and Cityscapes datasets.
Improves accuracy of depth and scene parsing tasks.
Demonstrates robustness through extensive experiments.
Abstract
Depth estimation and scene parsing are two particularly important tasks in visual scene understanding. In this paper we tackle the problem of simultaneous depth estimation and scene parsing in a joint CNN. The task can be typically treated as a deep multi-task learning problem [42]. Different from previous methods directly optimizing multiple tasks given the input training data, this paper proposes a novel multi-task guided prediction-and-distillation network (PAD-Net), which first predicts a set of intermediate auxiliary tasks ranging from low level to high level, and then the predictions from these intermediate auxiliary tasks are utilized as multi-modal input via our proposed multi-modal distillation modules for the final tasks. During the joint learning, the intermediate tasks not only act as supervision for learning more robust deep representations but also provide rich multi-modal…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications
