Adaptive Mixing of Auxiliary Losses in Supervised Learning

Durga Sivasubramanian, Ayush Maheshwari, Pradeep Shenoy, Prathosh AP, and Ganesh Ramakrishnan

arXiv:2202.03250 · cs.LG · December 8, 2022

TL;DR

This paper introduces AMAL, a meta-learning approach that adaptively combines auxiliary losses at the instance level in supervised learning, improving performance in knowledge distillation and rule-denoising tasks.

Contribution

The paper proposes a bi-level optimization framework that learns instance-level loss-mixing weights during training, together with a practical meta-learning procedure for solving it that applies across several supervised learning scenarios.
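
A bi-level objective of this kind can be written roughly as follows. This is only a sketch: the convex-combination form and the symbols (model f with parameters θ, per-instance weights λ_i, primary loss ℓ_p, auxiliary loss ℓ_a, auxiliary targets a_i, and the sets D_train of size n and D_val) are assumptions for illustration, not notation confirmed by this page.

```latex
\begin{aligned}
\lambda^{*} &= \arg\min_{\lambda \in [0,1]^{n}} \;
  \frac{1}{|D_{\mathrm{val}}|} \sum_{(x_j, y_j) \in D_{\mathrm{val}}}
  \ell_{p}\bigl(f_{\theta^{*}(\lambda)}(x_j),\, y_j\bigr), \\
\theta^{*}(\lambda) &= \arg\min_{\theta} \;
  \frac{1}{n} \sum_{i=1}^{n}
  \Bigl[ (1-\lambda_i)\, \ell_{p}\bigl(f_{\theta}(x_i),\, y_i\bigr)
       + \lambda_i\, \ell_{a}\bigl(f_{\theta}(x_i),\, a_i\bigr) \Bigr].
\end{aligned}
```

Under this reading, a_i would be the teacher's output in knowledge distillation and the weak label produced by labeling functions in rule denoising.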

Findings

01. AMAL outperforms baseline methods in knowledge distillation tasks.

02. AMAL improves rule-denoising accuracy over existing approaches.

03. Empirical analysis reveals how adaptive loss mixing enhances learning performance.

Abstract

In several supervised learning scenarios, auxiliary losses are used in order to introduce additional information or constraints into the supervised learning objective. For instance, knowledge distillation aims to mimic outputs of a powerful teacher model; similarly, in rule-based approaches, weak labeling information is provided by labeling functions which may be noisy rule-based approximations to true labels. We tackle the problem of learning to combine these losses in a principled manner. Our proposal, AMAL, uses a bi-level optimization criterion on validation data to learn optimal mixing weights, at an instance level, over the training data. We describe a meta-learning approach towards solving this bi-level objective and show how it can be applied to different scenarios in supervised learning. Experiments in a number of knowledge distillation and rule-denoising domains show that AMAL…
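
To make the idea of learning instance-level mixing weights from validation data concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: it assumes a scalar linear model `theta * x`, squared-error losses, a per-instance convex combination `(1 - lam) * primary + lam * auxiliary`, and a one-step look-ahead approximation to the bi-level problem. The names `mixed_grad` and `meta_step` are hypothetical.

```python
from typing import List, Sequence, Tuple

TrainItem = Tuple[float, float, float]  # (x, primary target y, auxiliary target a)
ValItem = Tuple[float, float]           # (x, primary target y)


def mixed_grad(theta: float, x: float, y: float, a: float, lam: float) -> float:
    """d/dtheta of (1 - lam) * (theta*x - y)**2 + lam * (theta*x - a)**2."""
    return (1.0 - lam) * 2.0 * x * (theta * x - y) + lam * 2.0 * x * (theta * x - a)


def meta_step(
    theta: float,
    lams: Sequence[float],
    train: Sequence[TrainItem],
    val: Sequence[ValItem],
    lr: float = 0.1,
    meta_lr: float = 1.0,
) -> Tuple[float, List[float]]:
    """One look-ahead update of theta, then one update of each mixing weight."""
    n = len(train)

    # Inner step: gradient descent on the mixed training loss.
    g_train = sum(
        mixed_grad(theta, x, y, a, lam) for (x, y, a), lam in zip(train, lams)
    ) / n
    theta_new = theta - lr * g_train

    # Outer signal: gradient of the primary loss on validation data,
    # evaluated at the look-ahead parameter theta_new.
    g_val = sum(2.0 * x * (theta_new * x - y) for x, y in val) / len(val)

    new_lams = []
    for (x, y, a), lam in zip(train, lams):
        # Chain rule: d(theta_new)/d(lam_i) = -(lr / n) * d(grad_i)/d(lam_i),
        # and d(grad_i)/d(lam_i) = 2 * x * (y - a) for the squared losses above.
        dtheta_dlam = -(lr / n) * 2.0 * x * (y - a)
        lam = lam - meta_lr * g_val * dtheta_dlam
        new_lams.append(min(1.0, max(0.0, lam)))  # keep weights in [0, 1]
    return theta_new, new_lams


if __name__ == "__main__":
    # Toy data (made up): the second instance has a misleading auxiliary target.
    train_data = [(1.0, 2.0, 2.0), (1.0, 2.0, 5.0)]
    val_data = [(1.0, 2.0)]
    print(meta_step(2.0, [0.5, 0.5], train_data, val_data))
```

In this toy setup the weight of an instance whose auxiliary target agrees with its label does not move, while the weight of the instance with the misleading auxiliary target is pushed down, because following that target makes the validation loss worse.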

Code & Models

Repositories

durgas16/AMAL (PyTorch, official)

Videos

Adaptive mixing of auxiliary losses in supervised learning · Underline

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Imbalanced Data Classification Techniques

Methods: Knowledge Distillation · Adaptive Robust Loss