Less is More -- Towards parsimonious multi-task models using structured sparsity

Richa Upadhyay; Ronald Phlypo; Rajkumar Saini; Marcus Liwicki

arXiv:2308.12114 · cs.CV · December 1, 2023

TL;DR

This paper introduces a structured sparsity approach using channel-wise l1/l2 regularization to create highly sparse, efficient multi-task deep learning models that can outperform dense models on standard vision datasets.

Contribution

The work proposes a novel structured sparsity method with channel-wise l1/l2 regularization for multi-task models, achieving high sparsity and improved performance.

Findings

01 Multi-task models with ~70% sparsity outperform dense counterparts.
02 Sparsification reduces inference time significantly.
03 Model performance varies with degree of sparsity.

Abstract

Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also have the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the parameters (or weights) of the shared convolutional layers of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two…
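
To make the channel-wise l1/l2 idea concrete, here is a minimal pure-Python sketch, not the authors' implementation. It treats each output channel of a convolutional weight as one group, takes the l2 norm inside each group, and sums those norms (an l1 norm across groups). The function names, the toy weight, and the values of `lam` and `threshold` are hypothetical. The group soft-thresholding step is the standard proximal operator for this kind of penalty and is included only as an assumption about how whole channels can end up exactly zero; the abstract does not say which optimizer the paper uses.

```python
import math


def channel_norms(weight):
    """Return the l2 norm of every output channel.

    `weight` is a nested list shaped [out_channels][in_channels][kh][kw],
    mirroring the usual layout of a 2D convolution weight.
    """
    norms = []
    for channel in weight:
        squared = sum(w * w for in_ch in channel for row in in_ch for w in row)
        norms.append(math.sqrt(squared))
    return norms


def group_sparsity_penalty(weight, lam):
    """l1/l2 penalty: lam times the sum (l1) of per-channel l2 norms."""
    return lam * sum(channel_norms(weight))


def group_soft_threshold(weight, threshold):
    """Shrink each channel's norm by `threshold`.

    A channel whose norm is at most `threshold` becomes exactly zero,
    which is what allows it to be pruned as a whole group.
    """
    shrunk = []
    for channel, norm in zip(weight, channel_norms(weight)):
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append(
            [[[w * scale for w in row] for row in in_ch] for in_ch in channel]
        )
    return shrunk


def channel_sparsity(weight):
    """Fraction of output channels whose weights are all zero."""
    norms = channel_norms(weight)
    return sum(1 for n in norms if n == 0.0) / len(norms)


# Toy weight: 3 output channels, 1 input channel, 2x2 kernels (made-up values).
weight = [
    [[[3.0, 0.0], [0.0, 4.0]]],  # channel norm 5
    [[[0.1, 0.0], [0.0, 0.0]]],  # channel norm 0.1
    [[[0.0, 0.0], [0.0, 0.0]]],  # already empty
]

penalty = group_sparsity_penalty(weight, lam=0.5)
pruned = group_soft_threshold(weight, threshold=1.0)

print("penalty:", penalty)
print("sparsity before:", channel_sparsity(weight))
print("sparsity after:", channel_sparsity(pruned))
```

In this toy case the weak second channel is removed entirely while the strong first channel is only shrunk, which illustrates the split described in the abstract: group selection comes from the l1 part, weight shrinkage from the l2 part.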

Code & Models

Repositories

ricupa/less-is-more-towards-parsimonious-multi-task-models-using-structured-sparsity
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Data Classification