Less is More -- Towards parsimonious multi-task models using structured sparsity
Richa Upadhyay, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki

TL;DR
This paper introduces a structured sparsity approach using channel-wise l1/l2 regularization to create highly sparse, efficient multi-task deep learning models that can outperform dense models on standard vision datasets.
Contribution
The work proposes a novel structured sparsity method with channel-wise l1/l2 regularization for multi-task models, achieving high sparsity and improved performance.
Findings
Multi-task models with ~70% sparsity outperform dense counterparts.
Sparsification reduces inference time significantly.
Model performance varies with degree of sparsity.
Abstract
Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also have the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the parameters (or weights) of the shared convolutional layers of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two…
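The channel-wise l1/l2 penalty described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the grouping by output channel, the weight layout (out_channels, in_channels, kH, kW), the coefficient `lam`, the tolerance `tol`, and both function names are assumptions made for illustration. The penalty takes the l2 norm of each channel's weights and sums those norms (an l1 norm across groups), which is what pushes whole channels toward zero.

```python
import numpy as np

def group_l1_l2_penalty(weight, lam=1e-3):
    """Hypothetical helper: lam * sum over channels of the channel's l2 norm."""
    # Assumed layout: (out_channels, in_channels, kH, kW); one group per output channel.
    per_channel = weight.reshape(weight.shape[0], -1)
    channel_norms = np.sqrt((per_channel ** 2).sum(axis=1))  # l2 norm within each group
    return lam * channel_norms.sum()                          # l1 norm across groups

def channel_sparsity(weight, tol=1e-8):
    """Hypothetical helper: fraction of output channels whose l2 norm is at most tol."""
    per_channel = weight.reshape(weight.shape[0], -1)
    channel_norms = np.linalg.norm(per_channel, axis=1)
    return float((channel_norms <= tol).mean())

# Toy layer with 4 output channels, 2 of which are entirely zero.
w = np.zeros((4, 2, 3, 3))
w[0] = 1.0  # 18 weights equal to 1 -> channel norm sqrt(18)
w[2] = 2.0  # 18 weights equal to 2 -> channel norm 2 * sqrt(18)

penalty = group_l1_l2_penalty(w, lam=1.0)  # 3 * sqrt(18)
sparsity = channel_sparsity(w)             # 0.5
```

In training, a term like this would be added to the sum of task losses for each shared convolutional layer. Channels whose norm reaches zero can then be removed from the layer, which is where the reduction in parameters and inference time would come from.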
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Data Classification
