Adding New Tasks to a Single Network with Weight Transformations using Binary Masks

Massimiliano Mancini; Elisa Ricci; Barbara Caputo; Samuel Rota Bulò

arXiv:1805.11119·cs.CV·June 15, 2018

TL;DR

This paper introduces a method for incrementally adapting deep neural networks to new tasks using learned binary masks and affine transformations, achieving high performance with minimal additional parameters.

Contribution

The paper combines learned binary masks with affine transformations of the pretrained convolutional weights for task adaptation, outperforming previous methods and setting new state-of-the-art results on benchmarks.

Findings

01. Reaches higher levels of adaptation to new tasks while storing slightly more than 1 bit per network parameter per additional task.

02. Outperforms existing methods on the Visual Decathlon Challenge.

03. Enables scalable incremental learning without catastrophic forgetting.
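
To put the storage claim in perspective, here is a rough back-of-the-envelope sketch. It assumes a fine-tuned copy of the network would store each parameter as a 32-bit float, and it ignores whatever small extra cost makes the real figure "slightly more than" 1 bit. The network size and the helper names are hypothetical, not taken from the paper.

```python
def extra_bits_per_task(num_params, bits_per_param):
    """Bits that must be stored for one additional task."""
    return num_params * bits_per_param


def as_mebibytes(bits):
    """Convert a bit count to MiB for readability."""
    return bits / 8 / 2**20


# Hypothetical network size, chosen only for illustration.
num_params = 10_000_000

# One binary mask entry per parameter (about 1 bit each).
mask_bits = extra_bits_per_task(num_params, bits_per_param=1)

# Assumption: a fine-tuned copy stores every parameter as float32.
full_copy_bits = extra_bits_per_task(num_params, bits_per_param=32)

ratio = full_copy_bits / mask_bits

print(f"mask per task:      {as_mebibytes(mask_bits):.2f} MiB")
print(f"full copy per task: {as_mebibytes(full_copy_bits):.2f} MiB")
print(f"ratio:              {ratio:.0f}x")
```

Under these assumptions, each new task costs roughly 1/32 of what a separately fine-tuned copy would cost, which is why the approach scales as tasks are added.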

Abstract

Visual recognition algorithms are required today to exhibit adaptive abilities. Given a deep model trained on a specific, given task, it would be highly desirable to be able to adapt incrementally to new tasks, preserving scalability as the number of new tasks increases, while at the same time avoiding catastrophic forgetting issues. Recent work has shown that masking the internal weights of a given original conv-net through learned binary variables is a promising strategy. We build upon this intuition and take into account more elaborated affine transformations of the convolutional weights that include learned binary masks. We show that with our generalization it is possible to achieve significantly higher levels of adaptation to new tasks, enabling the approach to compete with fine tuning strategies by requiring slightly more than 1 bit per network parameter per additional task.…
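
The idea of an affine transformation of the weights that includes a binary mask can be illustrated with a minimal sketch. The abstract shown here does not give the exact formula, so the form below (a scaled copy of the base weights, a constant offset, and a scaled masked copy) is an assumption made for illustration. The names `binarize`, `transform_weights`, `k0`, `k1`, and `k2` are hypothetical, and weights are flattened into plain Python lists to keep the example self-contained.

```python
def binarize(real_mask, threshold=0.0):
    """Hard-threshold real-valued mask scores into a {0, 1} mask."""
    return [1.0 if m > threshold else 0.0 for m in real_mask]


def transform_weights(base_weights, real_mask, k0, k1, k2):
    """Return task-specific weights k0*W + k1 + k2*(M * W), elementwise.

    base_weights: frozen pretrained weights, shared by all tasks.
    real_mask:    per-task real-valued scores, binarized before use.
    k0, k1, k2:   per-task scalars (assumed form, hypothetical names).
    """
    mask = binarize(real_mask)
    return [k0 * w + k1 + k2 * m * w for w, m in zip(base_weights, mask)]


base = [0.5, -1.0, 2.0, 0.25]      # pretrained weights (never modified)
scores = [0.3, -0.2, 0.7, -0.9]    # per-task mask scores

# Plain masking is the special case k0 = 0, k1 = 0, k2 = 1.
masked = transform_weights(base, scores, k0=0.0, k1=0.0, k2=1.0)

# A more general affine combination of the same base weights and mask.
adapted = transform_weights(base, scores, k0=1.0, k1=0.5, k2=-1.0)

print(masked)
print(adapted)
```

Because the base weights are only read and never overwritten, the original task is unaffected, and each new task needs to store only its binary mask plus a handful of scalars.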
