Towards Better Multi-task Learning: A Framework for Optimizing Dataset Combinations in Large Language Models

Zaifu Zhan; Rui Zhang

arXiv:2412.11455·cs.CL·May 6, 2025

TL;DR

This paper introduces a neural-network-based framework that predicts which dataset combinations work best for multi-task learning in large language models, making the search more efficient. It is evaluated on 12 biomedical datasets across four tasks.

Contribution

It presents a novel, model- and dataset-independent method for selecting optimal dataset combinations to enhance multi-task learning performance.

Findings

01

Effectively identifies better dataset combinations for biomedical tasks.

02

Improves the efficiency of the dataset selection process.

03

Validates the framework across multiple tasks and datasets.

Abstract

To efficiently select optimal dataset combinations for enhancing multi-task learning (MTL) performance in large language models, we propose a novel framework that leverages a neural network to predict the best dataset combinations. The framework iteratively refines the selection, greatly improving efficiency, while being model-, dataset-, and domain-independent. Through experiments on 12 biomedical datasets across four tasks (named entity recognition, relation extraction, event extraction, and text classification), we demonstrate that our approach effectively identifies better combinations, even for tasks that may seem unpromising from a human perspective. This verifies that our framework provides a promising solution for maximizing MTL potential.
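The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of iteratively refining a selection with a learned predictor. It makes several assumptions that are not from the paper: a small linear regressor trained by SGD stands in for the neural network, dataset combinations are encoded as binary inclusion masks, and `toy_evaluate` with its `HYPOTHETICAL_GAIN` values is an invented stand-in for fine-tuning and scoring a real model. All function names and parameters are hypothetical.

```python
import itertools
import random

# Hypothetical per-dataset effects on the target task (invented numbers, not results).
HYPOTHETICAL_GAIN = (0.04, -0.02, 0.03, -0.01, 0.05, -0.03)

def toy_evaluate(mask):
    """Stand-in for 'fine-tune on this dataset combination and score the model'."""
    return 0.6 + sum(g * x for g, x in zip(HYPOTHETICAL_GAIN, mask))

def predict(weights, bias, mask):
    """Surrogate prediction of the score for one inclusion mask."""
    return bias + sum(w * x for w, x in zip(weights, mask))

def fit_linear(scored, n, epochs=200, lr=0.05):
    """Fit the surrogate by plain SGD on squared error (linear model, not a real NN)."""
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for mask, score in scored.items():
            err = predict(weights, bias, mask) - score
            bias -= lr * err
            for i, x in enumerate(mask):
                weights[i] -= lr * err * x
    return weights, bias

def iterative_selection(n_datasets, evaluate, rounds=4, per_round=6, seed=0):
    """Alternate between fitting the surrogate and evaluating its top-ranked picks."""
    rng = random.Random(seed)
    # Every non-empty subset of the datasets, as a tuple of 0/1 flags.
    all_masks = [m for m in itertools.product((0, 1), repeat=n_datasets) if any(m)]
    # Initial round: evaluate a random sample of combinations.
    scored = {m: evaluate(m) for m in rng.sample(all_masks, per_round)}
    for _ in range(rounds):
        weights, bias = fit_linear(scored, n_datasets)
        candidates = [m for m in all_masks if m not in scored]
        if not candidates:
            break
        # Only the combinations the surrogate ranks highest get the expensive evaluation.
        candidates.sort(key=lambda m: predict(weights, bias, m), reverse=True)
        for m in candidates[:per_round]:
            scored[m] = evaluate(m)
    best = max(scored, key=scored.get)
    return best, scored[best], len(scored)

best_mask, best_score, n_evals = iterative_selection(6, toy_evaluate)
print(best_mask, round(best_score, 3), n_evals)
```

In this sketch the efficiency gain comes from evaluating at most `per_round * (rounds + 1)` combinations instead of all `2**n_datasets - 1`. In a real setting each call to `evaluate` would be a full training run, which is why limiting the number of calls matters.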

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling