Adapting Auxiliary Losses Using Gradient Similarity

Yunshu Du; Wojciech M. Czarnecki; Siddhant M. Jayakumar; Mehrdad Farajtabar; Razvan Pascanu; Balaji Lakshminarayanan

arXiv:1812.02224 · stat.ML · November 30, 2020 · 95 cites

TL;DR

This paper introduces a method that uses gradient cosine similarity to adaptively weight auxiliary losses, improving neural network training by identifying when auxiliary tasks are beneficial.

Contribution

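The paper proposes an adaptive weighting scheme for auxiliary losses based on the cosine similarity between task gradients. The scheme is guaranteed to converge to critical points of the main task and is shown to be practically useful in supervised and reinforcement learning domains.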

Findings

01 Improves multi-task learning on ImageNet subsets

02 Enhances reinforcement learning performance in gridworld and Atari

03 Guarantees convergence to critical points of the main task

Abstract

One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations. However, it is not always trivial to know if an auxiliary task will be helpful for the main task and when it could start hurting. We propose to use the cosine similarity between gradients of tasks as an adaptive weight to detect when an auxiliary loss is helpful to the main loss. We show that our approach is guaranteed to converge to critical points of the main task and demonstrate the practical usefulness of the proposed algorithm in a few domains: multi-task supervised learning on subsets of ImageNet, reinforcement learning on gridworld, and reinforcement learning on Atari games.
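The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes the auxiliary gradient is scaled by the cosine similarity clipped at zero, so an auxiliary gradient that points against the main gradient is dropped; the abstract does not spell out the exact rule, and the function names (`cosine_similarity`, `combine_gradients`, `sgd_step`) and the flat-list gradient representation are hypothetical choices made for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length gradient vectors (flat lists)."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between u and v; 0.0 if either is (near) zero."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u < eps or norm_v < eps:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)


def combine_gradients(grad_main, grad_aux):
    """Return (combined gradient, weight applied to the auxiliary gradient).

    Assumption: the weight is max(0, cos(grad_main, grad_aux)), so the
    auxiliary task contributes only when its gradient agrees with the
    main task's gradient.
    """
    weight = max(0.0, cosine_similarity(grad_main, grad_aux))
    combined = [gm + weight * ga for gm, ga in zip(grad_main, grad_aux)]
    return combined, weight


def sgd_step(params, grad_main, grad_aux, lr=0.1):
    """One plain gradient-descent step using the combined gradient."""
    combined, weight = combine_gradients(grad_main, grad_aux)
    new_params = [p - lr * g for p, g in zip(params, combined)]
    return new_params, weight
```

Under this assumed rule, the auxiliary term is added only when its inner product with the main gradient is non-negative, so the combined direction never has a negative inner product with the main gradient. That is the intuition behind the claim that the main task can still reach its critical points; the formal argument is the paper's, not this sketch's.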

Code & Models

Repositories

AvivNavon/AuxiLearn
pytorch

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning