TapWeight: Reweighting Pretraining Objectives for Task-Adaptive Pretraining

Ruiyi Zhang; Sai Ashish Somayajula; Pengtao Xie

arXiv:2410.10006·cs.LG·October 15, 2024

TL;DR

TapWeight introduces an automated method to optimize the importance of multiple pretraining objectives for task-adaptive pretraining, improving performance on downstream tasks in NLP and molecular property prediction.

Contribution

The paper proposes a multi-level optimization framework that automatically reweights pretraining objectives based on downstream feedback, reducing manual tuning and its computational cost.

Findings

01. Significantly outperforms baseline methods in molecular property prediction.

02. Achieves superior results in natural language understanding tasks.

03. Demonstrates robustness and generalizability across domains.

Abstract

Large-scale general domain pretraining followed by downstream-specific finetuning has become a predominant paradigm in machine learning. However, discrepancies between the pretraining and target domains can still lead to performance degradation in certain cases, underscoring the need for task-adaptive continued pretraining (TAP). TAP methods typically involve continued pretraining on task-specific unlabeled datasets or introducing additional unsupervised learning objectives to enhance model capabilities. While many TAP methods perform continued pretraining with multiple pretraining objectives, they often determine the tradeoff parameters between objectives manually, resulting in suboptimal outcomes and higher computational costs. In this paper, we propose TapWeight, a task-adaptive pretraining framework which automatically determines the optimal importance of each pretraining objective…
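The idea of learning the tradeoff weights from downstream feedback, rather than setting them by hand, can be illustrated with a toy two-level sketch. This is not the paper's algorithm: the quadratic "objectives", the softmax parameterization of the weights, the finite-difference outer gradient, and every name below (`pretrain`, `downstream_loss`, `reweight`) are assumptions chosen only to keep the example small and runnable. The inner level minimizes a weighted sum of pretraining losses; the outer level adjusts the weights to reduce a downstream loss evaluated at the inner solution.

```python
import math


def softmax(logits):
    """Map unconstrained logits to positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pretrain(weights, targets):
    """Inner level (toy): each pretraining objective i is (theta - a_i)^2.

    The minimizer of sum_i w_i * (theta - a_i)^2 has a closed form,
    the weighted mean of the targets a_i.
    """
    return sum(w * a for w, a in zip(weights, targets)) / sum(weights)


def downstream_loss(theta, downstream_target):
    """Outer level (toy): squared error of the pretrained parameter."""
    return (theta - downstream_target) ** 2


def outer_objective(logits, targets, downstream_target):
    """Downstream loss as a function of the objective-weight logits."""
    theta = pretrain(softmax(logits), targets)
    return downstream_loss(theta, downstream_target)


def reweight(targets, downstream_target, steps=100, lr=0.5, eps=1e-5):
    """Gradient descent on the logits using central finite differences."""
    logits = [0.0] * len(targets)  # start from uniform weights
    for _ in range(steps):
        grads = []
        for i in range(len(logits)):
            up = list(logits)
            down = list(logits)
            up[i] += eps
            down[i] -= eps
            diff = (outer_objective(up, targets, downstream_target)
                    - outer_objective(down, targets, downstream_target))
            grads.append(diff / (2 * eps))
        logits = [l - lr * g for l, g in zip(logits, grads)]
    return softmax(logits)


if __name__ == "__main__":
    targets = [0.0, 1.0, 5.0]   # hypothetical optima of three objectives
    downstream_target = 0.9     # hypothetical downstream optimum
    weights = reweight(targets, downstream_target)
    print("learned weights:", [round(w, 3) for w in weights])
    print("downstream loss:",
          downstream_loss(pretrain(weights, targets), downstream_target))
```

In this toy setting the learned weights shift away from the third objective, whose optimum lies far from the downstream target. A realistic setting would replace the closed-form inner solution with actual continued pretraining and the finite differences with a hypergradient method, which is where the computational cost of multi-level optimization comes from.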

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research