Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training

Zheheng Luo; Xin Zhang; Xiao Liu; Haoling Li; Yeyun Gong; Chen Qi; Peng Cheng

arXiv:2411.14318·cs.CL·May 21, 2025

TL;DR

Velocitune is a dynamic domain reweighting framework for continual pre-training. It measures how quickly each domain is being learned, using a scaling law to set per-domain learning targets, and shifts data proportions toward slower-learning domains.

Contribution

The paper proposes Velocitune, a method that adjusts domain data proportions during pre-training based on each domain's learning velocity and a scaling law, targeting the challenges of domain-adaptive continual pre-training.

Findings

01 Improves performance on math and code reasoning tasks.

02 Improves results on command-line generation benchmarks.

03 Its effectiveness is attributed to target loss prediction and data ordering.

Abstract

It is well known that a diverse corpus, typically constructed from a mixture of various domains, is critical for training large language models. Previous efforts generally either sample training data from different domains with static proportions or adjust data proportions during training. However, few methods have addressed the complexities of domain-adaptive continual pre-training. To fill this gap, we propose Velocitune, a novel framework that dynamically assesses learning velocity and adjusts data proportions accordingly, favoring slower-learning domains while shunning faster-learning ones. The adjustment is guided by a scaling law that indicates the desired learning goal for each domain at a lower associated cost. To evaluate the effectiveness of Velocitune, we conduct experiments on a reasoning-focused dataset with CodeLlama, as well as in a corpus specialised for system…
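The reweighting idea in the abstract can be illustrated with a small sketch. This is not the paper's exact formulation: the function name `reweight_domains`, the "remaining gap" measure, and the exponential update are assumptions chosen for illustration. The target losses are passed in as plain numbers here, whereas the abstract says they are indicated by a scaling law.

```python
import math


def reweight_domains(weights, init_loss, cur_loss, target_loss):
    """Hypothetical velocity-style reweighting step (illustrative only).

    For each domain, compute the fraction of the loss gap still open:
    1.0 means no progress since the start, 0.0 means the target is reached.
    Domains with a larger remaining gap (slower learners) are upweighted.
    Assumes init_loss > target_loss for every domain.
    """
    remaining = [
        (cur - tgt) / (init - tgt)
        for init, cur, tgt in zip(init_loss, cur_loss, target_loss)
    ]
    # Multiplicative update: an assumed form, not taken from the source.
    scaled = [w * math.exp(r) for w, r in zip(weights, remaining)]
    total = sum(scaled)
    # Renormalize so the sampling proportions sum to 1.
    return [s / total for s in scaled]


# Two domains starting with equal proportions. The first has closed most of
# its gap; the second has barely moved, so it should receive more data.
new_weights = reweight_domains(
    weights=[0.5, 0.5],
    init_loss=[2.0, 2.0],
    cur_loss=[1.2, 1.8],
    target_loss=[1.0, 1.0],
)
print(new_weights)
```

In this sketch the second domain ends up with the larger sampling proportion, which matches the abstract's description of favoring slower-learning domains.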

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Flow Measurement and Analysis