MTAdam: Automatic Balancing of Multiple Training Loss Terms

Itzik Malkiel; Lior Wolf

arXiv:2006.14683 · cs.LG · June 29, 2020

Open Access · 1 Repo

TL;DR

MTAdam is a novel optimization algorithm that automatically balances multiple loss terms during neural network training by dynamically adjusting gradient magnitudes per layer, reducing manual tuning effort.

Contribution

The paper introduces MTAdam, a generalized Adam optimizer that balances multiple loss terms automatically, adapting to training dynamics and layer-specific needs.

Findings

1. MTAdam achieves training results comparable to or better than those of conventional training with manually tuned loss weights.

2. It reduces the need for manual hyperparameter tuning of loss weights.

3. The method adapts dynamically to changing loss trade-offs during training.

Abstract

When training neural models, it is common to combine multiple loss terms. The balancing of these terms requires considerable human effort and is computationally demanding. Moreover, the optimal trade-off between the loss terms can change as training progresses, especially for adversarial terms. In this work, we generalize the Adam optimization algorithm to handle multiple loss terms. The guiding principle is that for every layer, the gradient magnitude of the terms should be balanced. To this end, the Multi-Term Adam (MTAdam) computes the derivative of each loss term separately, infers the first and second moments per parameter and loss term, and calculates a first moment for the magnitude per layer of the gradients arising from each loss. This magnitude is used to continuously balance the gradients across all layers, in a manner that both varies from one layer to the next and…
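
The per-layer balancing idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a minimal pure-Python illustration covering only the magnitude-balancing step for a single layer, and it omits the per-parameter first and second moments. The function names, the `beta3` decay rate and its value, and the choice of the first loss term as the reference magnitude are all assumptions made for this example.

```python
import math

def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))

def balance_layer_gradients(grads, norm_ema, beta3=0.9, eps=1e-12):
    """Rescale the per-loss gradients of ONE layer so their magnitudes match.

    grads    : list of gradient vectors, one per loss term
    norm_ema : running magnitude estimate per loss term (updated in place)
    beta3    : decay rate of the running magnitude (hypothetical name and value)
    Returns the rescaled gradient vectors.
    """
    # Update the running average (a first moment) of each gradient magnitude.
    for i, g in enumerate(grads):
        norm_ema[i] = beta3 * norm_ema[i] + (1.0 - beta3) * l2_norm(g)

    # Assumption for this sketch: the first loss term is the reference.
    anchor = norm_ema[0]
    balanced = []
    for g, n in zip(grads, norm_ema):
        scale = anchor / (n + eps)  # eps guards against division by zero
        balanced.append([scale * x for x in g])
    return balanced

# Two loss terms whose gradients differ by a factor of 1000 in this layer.
grads = [[0.3, -0.4], [300.0, 400.0]]
norm_ema = [0.0, 0.0]
balanced = balance_layer_gradients(grads, norm_ema)
norms = [l2_norm(g) for g in balanced]
```

After the call, both entries of `norms` are approximately equal, so neither loss term dominates the update for this layer. Because the running magnitudes are tracked separately for each layer and updated at every step, the scaling can differ between layers and change over the course of training.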

Code & Models

Repositories

ItzikMalkiel/MTAdam
PyTorch · Official

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning

Methods: Adam