Self-Regulated Data-Free Knowledge Amalgamation for Text Classification

Prashanth Vijayaraghavan; Hongzhi Wang; Luyao Shi; Tyler Baldwin; David Beymer; Ehsan Degan

arXiv:2406.15476 · cs.CL · June 25, 2024

TL;DR

This paper introduces STRATANET, a novel framework for data-free knowledge amalgamation that enables training compact text classifiers from multiple pre-trained teachers without access to original data.

Contribution

It proposes a new data-free approach combining a steerable data generator and a self-regulative amalgamation module for effective knowledge transfer.

Findings

01. STRATANET outperforms baseline models on benchmark datasets.

02. The method effectively amalgamates knowledge from multiple teachers.

03. It works under both data-driven and data-free scenarios.

Abstract

Recently, there has been a growing availability of pre-trained text models on various model repositories. These models greatly reduce the cost of training new models from scratch as they can be fine-tuned for specific tasks or trained on large datasets. However, these datasets may not be publicly accessible due to privacy, security, or intellectual property issues. In this paper, we aim to develop a lightweight student network that can learn from multiple teacher models without accessing their original training data. Hence, we investigate Data-Free Knowledge Amalgamation (DFKA), a knowledge-transfer task that combines insights from multiple pre-trained teacher models and transfers them effectively to a compact student network. To accomplish this, we propose STRATANET, a modeling framework comprising: (a) a steerable data generator that produces text data tailored to each teacher and…

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Data Stream Mining Techniques