TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism
Seonghye Cho, Jaemin Han, Hyunjin Kim, Euisoo Jung, Jae-Gil Lee

TL;DR
TimelyFreeze is an adaptive parameter freezing method that models the pipeline schedule as a DAG and uses linear programming to optimize freeze ratios, improving training throughput by up to 40% on LLaMA-8B while keeping accuracy comparable.
Contribution
TimelyFreeze introduces a linear programming approach that determines optimal parameter freeze ratios, balancing throughput and accuracy in pipeline parallelism.
Findings
Achieves up to 40% training throughput improvement on LLaMA-8B.
Maintains comparable accuracy while reducing training time.
Generalizes across various pipeline-parallel configurations.
Abstract
Pipeline parallelism enables training models that exceed single-device memory, but practical throughput remains limited by pipeline bubbles. Although parameter freezing can improve training throughput by adaptively skipping backward computation, existing methods often over-freeze parameters, resulting in unnecessary accuracy degradation. To address this issue, we propose TimelyFreeze, which models the pipeline schedule as a directed acyclic graph and solves a linear program to compute optimal freeze ratios that minimize batch execution time under accuracy constraints. Experiments show that TimelyFreeze achieves up to 40% training throughput improvement on LLaMA-8B with comparable accuracy. Overall, it enables faster large-scale model training without compromising convergence and generalizes across diverse pipeline-parallel settings.
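To make the DAG view in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It builds a small GPipe-style schedule (all forwards, then all backwards on each stage) as a dependency graph and computes the batch execution time as the longest path through it. The function names, the unit forward time, and the backward-cost model (a backward pass costs 2 units unfrozen, of which 1 unit scales away linearly with the stage's freeze ratio) are all assumptions chosen for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_schedule(num_stages, num_microbatches, freeze_ratio):
    """Return (durations, deps) for a GPipe-style schedule DAG.

    Nodes are ("F" | "B", stage, microbatch). freeze_ratio[s] in [0, 1] is
    the assumed fraction of stage s parameters that are frozen.
    """
    durations, deps = {}, {}
    last_stage = num_stages - 1
    for s in range(num_stages):
        for m in range(num_microbatches):
            fwd, bwd = ("F", s, m), ("B", s, m)
            durations[fwd] = 1.0
            # Assumed cost model: 1 unit of input-gradient work always runs,
            # 1 unit of weight-gradient work shrinks with the freeze ratio.
            durations[bwd] = 1.0 + 1.0 * (1.0 - freeze_ratio[s])

            deps[fwd] = set()
            if s > 0:
                deps[fwd].add(("F", s - 1, m))  # activations from previous stage
            if m > 0:
                deps[fwd].add(("F", s, m - 1))  # same device runs in order

            # Backwards on a stage start after its last forward has finished.
            deps[bwd] = {("F", s, num_microbatches - 1)}
            if s < last_stage:
                deps[bwd].add(("B", s + 1, m))  # gradients from next stage
            if m > 0:
                deps[bwd].add(("B", s, m - 1))  # same device runs in order
    return durations, deps


def batch_time(durations, deps):
    """Length of the critical (longest) path through the schedule DAG."""
    finish = {}
    for node in TopologicalSorter(deps).static_order():
        start = max((finish[p] for p in deps.get(node, ())), default=0.0)
        finish[node] = start + durations[node]
    return max(finish.values())


if __name__ == "__main__":
    for ratios in ([0.0, 0.0], [0.5, 0.5], [1.0, 1.0]):
        d, g = build_schedule(2, 2, ratios)
        print(ratios, batch_time(d, g))
```

Under this assumed cost model, every node duration is linear in the freeze ratios and every precedence constraint is linear in the start times, so minimizing the batch time subject to a cap on the freeze ratios can be written as a linear program. The abstract reports that TimelyFreeze solves such a program; the exact formulation and the form of its accuracy constraints are not given here.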
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Network Packet Processing and Optimization
