Medusa: Universal Feature Learning via Attentional Multitasking
Jaime Spencer, Richard Bowden, Simon Hadfield

TL;DR
Medusa introduces a universal feature learning framework with dual attention mechanisms, enabling flexible multi-task learning and better feature sharing across tasks, leading to improved performance and efficiency.
Contribution
Medusa proposes a novel multitasking architecture with dual attention heads for universal feature learning, reducing retraining needs and enhancing task generalization.
Findings
+13.18% improvement in universal feature learning
Maintains multi-task learning performance
25% more efficient than previous methods
Abstract
Recent approaches to multi-task learning (MTL) have focused on modelling connections between tasks at the decoder level. This leads to a tight coupling between tasks, which need retraining if a new task is inserted or removed. We argue that MTL is a stepping stone towards universal feature learning (UFL), which is the ability to learn generic features that can be applied to new tasks without retraining. We propose Medusa to realize this goal, designing task heads with dual attention mechanisms. The shared feature attention masks relevant backbone features for each task, allowing it to learn a generic representation. Meanwhile, a novel Multi-Scale Attention head allows the network to better combine per-task features from different scales when making the final prediction. We show the effectiveness of Medusa in UFL (+13.18% improvement), while maintaining MTL performance and being 25%…
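To make the two attention mechanisms in the abstract concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation. It assumes the shared feature attention acts as an element-wise sigmoid gate on backbone features, and that the Multi-Scale Attention head combines scales with softmax weights. All function names, the gating form, and the numeric values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def shared_feature_attention(backbone_feats, mask_logits):
    # Assumption: a per-task soft mask gates the shared backbone features
    # element-wise, so the backbone itself stays task-agnostic.
    return [f * sigmoid(a) for f, a in zip(backbone_feats, mask_logits)]

def multi_scale_attention(per_scale_feats, scale_logits):
    # Assumption: per-task features from each scale are fused as a
    # softmax-weighted sum (one weight per scale).
    weights = softmax(scale_logits)
    dim = len(per_scale_feats[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, per_scale_feats))
        for i in range(dim)
    ]

# Toy data: two scales of a 3-dimensional shared feature vector, one task.
backbone = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
# Hypothetical learned parameters for this task.
task_mask_logits = [[0.0, 10.0, -10.0], [0.0, 0.0, 0.0]]
task_scale_logits = [0.0, 0.0]

# Step 1: mask the shared features separately at each scale.
masked = [
    shared_feature_attention(f, m)
    for f, m in zip(backbone, task_mask_logits)
]
# Step 2: fuse the masked per-scale features into one task feature.
fused = multi_scale_attention(masked, task_scale_logits)
```

Under these assumptions, adding a task only means adding a new set of mask and scale parameters on top of the same backbone features, which is the decoupling the abstract argues for.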
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Advanced Neural Network Applications
