FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Tianyuan Zou; Yang Liu; Peng Li; Jianqing Zhang; Jingjing Liu; Ya-Qin Zhang

arXiv:2406.12527·cs.CL·June 19, 2024

TL;DR

FuseGen introduces a multi-PLM, feedback-driven data generation framework that significantly improves zero-shot learning by enhancing synthetic dataset quality and reducing distribution bias.

Contribution

It presents a novel subset selection and iterative feedback mechanism using multiple PLMs and trained STMs to improve synthetic data quality for zero-shot learning.

Findings

01

Outperforms existing methods across diverse tasks.

02

Effectively reduces distribution bias in synthetic datasets.

03

Enhances STM performance in a PLM-agnostic manner.

Abstract

Data generation-based zero-shot learning, although effective in training Small Task-specific Models (STMs) via synthetic datasets generated by Pre-trained Language Models (PLMs), is often limited by the low quality of such synthetic datasets. Previous solutions have primarily focused on single-PLM settings, where synthetic datasets are typically restricted to specific sub-spaces and often deviate from real-world distributions, leading to severe distribution bias. To mitigate such bias, we propose FuseGen, a novel data generation-based zero-shot learning framework that introduces a new criterion for subset selection from synthetic datasets, using multiple PLMs and trained STMs. The chosen subset provides in-context feedback to each PLM, enhancing dataset quality through iterative data generation. Trained STMs are then used for sample re-weighting as well, further improving data…
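The loop described in the abstract (several PLMs generate data, a subset is selected across all of them, and that subset is fed back as in-context examples for the next round) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `make_toy_generator` is a hypothetical stand-in for a PLM, `toy_stm_score` is a hypothetical stand-in for a trained STM's judgment, and the top-k rule in `select_subset` is a placeholder, since the abstract does not specify the actual selection criterion. It is not the authors' implementation.

```python
import random
from typing import Callable, List, Tuple

Sample = Tuple[str, int]  # (text, label)
Generator = Callable[[List[Sample], int], List[Sample]]


def make_toy_generator(name: str, seed: int) -> Generator:
    """Hypothetical stand-in for a PLM: emits n labelled strings.

    A real PLM would condition its prompt on the feedback samples;
    here we only record how many feedback samples were supplied.
    """
    rng = random.Random(seed)

    def generate(feedback: List[Sample], n: int) -> List[Sample]:
        return [
            (f"{name}-sample-{rng.randint(0, 9999)}-fb{len(feedback)}",
             rng.randint(0, 1))
            for _ in range(n)
        ]

    return generate


def toy_stm_score(sample: Sample) -> float:
    """Hypothetical stand-in for an STM-based quality score in [0, 1)."""
    text, _label = sample
    return (sum(ord(c) for c in text) % 100) / 100.0


def select_subset(pool: List[Sample],
                  score: Callable[[Sample], float],
                  k: int) -> List[Sample]:
    """Placeholder criterion (assumed): keep the k highest-scoring samples."""
    return sorted(pool, key=score, reverse=True)[:k]


def fuse_generate(generators: List[Generator],
                  score: Callable[[Sample], float],
                  rounds: int,
                  per_round: int,
                  k: int) -> Tuple[List[List[Sample]], List[Sample]]:
    """Iterate: each generator adds data, then a shared subset is re-selected."""
    datasets: List[List[Sample]] = [[] for _ in generators]
    feedback: List[Sample] = []
    for _ in range(rounds):
        # Every generator sees the same cross-generator feedback subset.
        for data, gen in zip(datasets, generators):
            data.extend(gen(feedback, per_round))
        # Pool samples from all generators before selecting the next subset.
        pool = [s for data in datasets for s in data]
        feedback = select_subset(pool, score, k)
    return datasets, feedback


if __name__ == "__main__":
    gens = [make_toy_generator("plm_a", 0), make_toy_generator("plm_b", 1)]
    datasets, feedback = fuse_generate(gens, toy_stm_score,
                                       rounds=3, per_round=4, k=2)
    print([len(d) for d in datasets], feedback)
```

The point of pooling before selection is that the feedback subset can mix samples from different generators, which is the cross-PLM aspect the abstract emphasizes. The sample re-weighting step mentioned in the abstract is not shown, because the abstract gives no detail on how the weights are computed.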

Code & Models

Repositories

LindaLydia/FuseGen
PyTorch · Official

Videos

FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI