Learning Compact Neural Networks with Deep Overparameterised Multitask Learning

Shen Ren; Haosen Shi

arXiv:2308.13300·cs.LG·August 28, 2023

TL;DR

This paper introduces an overparameterisation approach for multitask neural networks: the model is overparameterised during training and the overparameterised parameters are shared across tasks, which improves optimisation and generalisation. The method is evaluated on the NYUv2 and COCO multitask datasets.

Contribution

It proposes a simple and effective overparameterised neural network design for multitask learning that enhances optimisation and performance.

Findings

01 Improved performance on NYUv2 and COCO datasets.

02 Effective across various convolutional architectures.

03 Enhances training efficiency and model generalisation.

Abstract

Compact neural networks offer many benefits for real-world applications. However, it is usually challenging to train compact networks with small parameter sizes and low computational costs to match or exceed the performance of more complex and powerful architectures. This is particularly true for multitask learning, where different tasks compete for resources. We present a simple, efficient and effective overparameterised neural network design for multitask learning: the model architecture is overparameterised during training, and the overparameterised parameters are shared more effectively across tasks, for better optimisation and generalisation. Experiments on two challenging multitask datasets (NYUv2 and COCO) demonstrate the effectiveness of the proposed method across various convolutional networks and parameter sizes.
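
The abstract does not spell out how the weights are overparameterised, so the following is only a minimal sketch of the general idea and not the authors' design. It assumes a single linear layer whose weight is trained as a product of three matrices, with the outer factors `U` and `V` shared across tasks and a middle factor `M[task]` kept per task. All names, shapes, and the task labels are hypothetical. The point it illustrates is that the extra parameters exist only during training: the factors can be multiplied out into one compact weight matrix that gives the same output at inference.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def n_params(m):
    """Number of scalar entries in a matrix."""
    return len(m) * len(m[0])

rng = random.Random(0)
d_in, d_out, hidden = 4, 3, 16  # hidden > max(d_in, d_out): overparameterised

# Assumption: outer factors are shared by every task.
U = rand_matrix(d_out, hidden, rng)
V = rand_matrix(hidden, d_in, rng)
# Assumption: one middle factor per task (task names are illustrative).
M = {task: rand_matrix(hidden, hidden, rng) for task in ("segmentation", "depth")}

def forward_train(task, x):
    """Training-time forward pass through the factorised weight U @ M[task] @ V."""
    return matmul(U, matmul(M[task], matmul(V, x)))

def collapse(task):
    """Multiply the factors out into one compact d_out x d_in weight matrix."""
    return matmul(matmul(U, M[task]), V)

x = rand_matrix(d_in, 1, rng)  # one input, stored as a column vector
train_params = n_params(U) + n_params(V) + sum(n_params(m) for m in M.values())
compact_params = 0
max_diff = 0.0
for task in M:
    W = collapse(task)
    compact_params += n_params(W)
    y_train = forward_train(task, x)
    y_infer = matmul(W, x)
    diff = max(abs(a[0] - b[0]) for a, b in zip(y_train, y_infer))
    max_diff = max(max_diff, diff)

print("parameters during training:", train_params)
print("parameters after collapsing:", compact_params)
print("largest output difference:", max_diff)
```

With these sizes the factorised form holds 624 trainable values while the collapsed weights hold 24, and the two forward passes agree up to floating-point rounding. Whether the paper's factors are collapsed per task or kept shared at inference is not stated in the abstract.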

Taxonomy

Topics: Machine Learning and ELM · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning