Hi-ZFO: Hierarchical Zeroth- and First-Order LLM Fine-Tuning via Importance-Guided Tensor Selection

Feihu Jin; Ying Tan

arXiv:2601.05501 · cs.LG · January 12, 2026

TL;DR

Hi-ZFO introduces a hierarchical hybrid optimization method combining zeroth- and first-order techniques, adaptively applying them to different model layers to improve fine-tuning efficiency and generalization of large language models.

Contribution

This paper proposes Hi-ZFO, a novel hybrid framework that adaptively combines zeroth- and first-order optimization for LLM fine-tuning, enhancing performance and reducing training time.

Findings

01 Achieves superior performance across diverse tasks.

02 Reduces training time significantly.

03 Effectively escapes local minima during training.

Abstract

Fine-tuning large language models (LLMs) using standard first-order (FO) optimization often drives training toward sharp, poorly generalizing minima. Conversely, zeroth-order (ZO) methods offer stronger exploratory behavior without relying on explicit gradients, yet suffer from slow convergence. More critically, our analysis reveals that in generative tasks, the vast output and search space significantly amplify estimation variance, rendering ZO methods both noisy and inefficient. To address these challenges, we propose Hi-ZFO (Hierarchical Zeroth- and First-Order optimization), a hybrid framework designed to synergize the precision of FO gradients with the exploratory capability of ZO estimation. Hi-ZFO adaptively partitions the model through layer-wise importance profiling, applying precise FO updates to critical layers while leveraging ZO…
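To make the general idea concrete, here is a minimal sketch of a hybrid update on a toy two-tensor linear model. It is not the authors' implementation, and the abstract above does not specify the details, so several pieces are assumptions: the importance score (gradient norm on a probe batch), the two-point SPSA-style ZO estimator, the learning rates, and all function names (`select_fo_tensors`, `zo_grads`, `hybrid_step`, `run_demo`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def loss_fn(params, x, y):
    """Mean squared error of a toy two-layer linear model."""
    pred = x @ params["w1"] @ params["w2"]
    return float(np.mean((pred - y) ** 2))

def fo_grads(params, x, y):
    """Exact (first-order) gradients of loss_fn for every tensor."""
    h = x @ params["w1"]
    err = 2.0 * (h @ params["w2"] - y) / y.size
    return {"w1": x.T @ (err @ params["w2"].T), "w2": h.T @ err}

def select_fo_tensors(params, x, y, k=1):
    """ASSUMPTION: rank tensors by gradient norm and treat the top k as 'critical'."""
    grads = fo_grads(params, x, y)
    scores = {name: float(np.linalg.norm(g)) for name, g in grads.items()}
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def zo_grads(params, names, x, y, rng, eps=1e-3):
    """Two-point (SPSA-style) gradient estimate for the tensors in `names`."""
    z = {n: rng.standard_normal(params[n].shape) for n in names}
    plus, minus = dict(params), dict(params)
    for n in names:
        plus[n] = params[n] + eps * z[n]
        minus[n] = params[n] - eps * z[n]
    # Directional derivative along z, estimated from two loss evaluations.
    scale = (loss_fn(plus, x, y) - loss_fn(minus, x, y)) / (2.0 * eps)
    return {n: scale * z[n] for n in names}

def hybrid_step(params, fo_names, x, y, rng, lr_fo=0.02, lr_zo=0.002):
    """One update: exact gradients on FO tensors, ZO estimates on the rest."""
    # The toy computes all analytic gradients for brevity; a real setup
    # would backpropagate only through the FO tensors.
    g_fo = fo_grads(params, x, y)
    zo_names = [n for n in params if n not in fo_names]
    g_zo = zo_grads(params, zo_names, x, y, rng)
    return {
        n: params[n] - (lr_fo * g_fo[n] if n in fo_names else lr_zo * g_zo[n])
        for n in params
    }

def run_demo(steps=300, seed=0):
    """Fit a random linear target and return (initial loss, final loss, FO set)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((64, 4))
    y = x @ rng.standard_normal((4, 1))
    params = {
        "w1": 0.5 * rng.standard_normal((4, 4)),
        "w2": 0.5 * rng.standard_normal((4, 1)),
    }
    fo_names = select_fo_tensors(params, x, y, k=1)
    initial = loss_fn(params, x, y)
    for _ in range(steps):
        params = hybrid_step(params, fo_names, x, y, rng)
    return initial, loss_fn(params, x, y), fo_names
```

Calling `run_demo()` returns the loss before and after training together with the set of tensors that received FO updates. The ZO branch needs only two forward passes per step and no backward pass through those tensors, which is where a method of this kind would be expected to save memory and compute. The smaller ZO learning rate is a common precaution against the estimator's variance, not a value taken from the paper.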

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis