Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering

Elliot Meyerson; Risto Miikkulainen

arXiv:1711.00108 · cs.LG · February 14, 2018 · 20 cites

TL;DR

This paper introduces a soft layer ordering method for deep multitask learning, allowing flexible sharing of layers across tasks, which improves performance over traditional parallel sharing approaches.

Contribution

The paper proposes a soft ordering approach that learns task-specific arrangements of shared layers, making deep MTL models more flexible and more effective.

Findings

01. Soft ordering outperforms parallel ordering in multiple domains.

02. Flexible layer arrangements enable more effective sharing.

03. Deep MTL benefits from learning generalizable building blocks.

Abstract

Existing deep multitask learning (MTL) approaches align layers shared between tasks in a parallel ordering. Such an organization significantly constricts the types of shared structure that can be learned. The necessity of parallel ordering for deep MTL is first tested by comparing it with permuted ordering of shared layers. The results indicate that a flexible ordering can enable more effective sharing, thus motivating the development of a soft ordering approach, which learns how shared layers are applied in different ways for different tasks. Deep MTL with soft ordering outperforms parallel ordering methods across a series of domains. These results suggest that the power of deep MTL comes from learning highly general building blocks that can be assembled to meet the demands of each task.
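To make the contrast between parallel and soft ordering concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that, for each task, every depth of the network applies all shared layers and blends their outputs with softmax-normalized, task-specific weights. The function names, the ReLU dense layers, and the logit parameterization are illustrative assumptions that do not appear in the abstract.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def parallel_ordering_forward(x, shared_weights):
    """Baseline: every task applies the shared layers in one fixed order."""
    h = x
    for W in shared_weights:
        h = np.maximum(h @ W, 0.0)  # dense layer + ReLU (assumed layer type)
    return h


def soft_ordering_forward(x, shared_weights, task_logits):
    """Soft ordering sketch for a single task.

    x:              (batch, width) input features
    shared_weights: list of (width, width) matrices shared by all tasks
    task_logits:    (depth, num_layers) learnable logits for this task
                    (hypothetical parameterization); row k controls how
                    strongly each shared layer is used at depth k
    """
    mix = softmax(task_logits, axis=1)  # each depth's weights sum to 1
    h = x
    for k in range(mix.shape[0]):
        # Apply every shared layer to the current activation ...
        outputs = [np.maximum(h @ W, 0.0) for W in shared_weights]
        # ... then blend the results with this task's weights for depth k.
        h = sum(mix[k, j] * out for j, out in enumerate(outputs))
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [rng.normal(size=(4, 4)) for _ in range(3)]
    x = rng.normal(size=(2, 4))

    # Logits sharply peaked on the diagonal select layer k at depth k,
    # so soft ordering reduces to the parallel ordering baseline.
    peaked = 50.0 * np.eye(3)
    print(np.allclose(soft_ordering_forward(x, layers, peaked),
                      parallel_ordering_forward(x, layers), atol=1e-6))
```

Under these assumptions, logits peaked on a permutation matrix instead of the identity would give a permuted ordering, and intermediate logit values blend layers at each depth. In this sketch the shared weights are common to all tasks while each task keeps its own logits, which is one way to read the abstract's "building blocks that can be assembled to meet the demands of each task."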

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Stochastic Gradient Optimization Techniques