Fine-Tuning without Performance Degradation

Han Wang; Adam White; Martha White

arXiv:2505.00913·cs.LG·May 5, 2025

TL;DR

This paper introduces a novel fine-tuning algorithm that minimizes performance degradation and accelerates learning during the transition from offline to online policy adaptation.

Contribution

The paper proposes a new fine-tuning method based on Jump Start that enables gradual exploration and reduces initial performance drops.
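
As a rough illustration of what gradual exploration on top of a Jump Start style scheme could look like, the sketch below lets the offline-learned policy act for the first steps of each episode and hands control to the policy being fine-tuned afterwards. The hand-off point moves earlier only while recent returns stay close to the offline baseline. This is not the authors' algorithm: the function names, the tolerance rule, and the one-step decrement are all assumptions made for illustration.

```python
def mixed_action(step, handoff_step, guide_policy, explore_policy, state):
    """Pick an action: the offline (guide) policy acts before handoff_step,
    the policy being fine-tuned acts from handoff_step onward."""
    if step < handoff_step:
        return guide_policy(state)
    return explore_policy(state)


def update_handoff(handoff_step, recent_return, baseline_return,
                   tolerance=0.05, decrement=1):
    """Hypothetical schedule (an assumption, not taken from the paper):
    hand over control one step earlier only if the recent return is within
    `tolerance` (relative) of the offline baseline; otherwise hold steady."""
    threshold = baseline_return - tolerance * abs(baseline_return)
    if recent_return >= threshold:
        return max(0, handoff_step - decrement)
    return handoff_step
```

With `handoff_step = 3`, steps 0 to 2 of an episode use the guide policy and step 3 onward uses the fine-tuned policy. Calling `update_handoff` after each evaluation round then shrinks the guided prefix only when performance has not dropped, which is one simple way to trade exploration against degradation.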

Findings

01. Significantly reduces performance degradation during fine-tuning
02. Achieves faster fine-tuning compared to existing algorithms
03. Demonstrates effectiveness across various settings

Abstract

Fine-tuning policies learned offline remains a major challenge in application domains. Monotonic performance improvement during fine-tuning is often challenging, as agents typically experience performance degradation at the early fine-tuning stage. The community has identified multiple difficulties in fine-tuning a learned network online; however, the majority of progress has focused on improving learning efficiency during fine-tuning. In practice, this comes at a serious cost during fine-tuning: initially, agent performance degrades as the agent explores and effectively overrides the policy learned offline. We show that, across a range of settings, many offline-to-online algorithms exhibit either (1) performance degradation or (2) slow learning (sometimes effectively no improvement) during fine-tuning. We introduce a new fine-tuning algorithm, based on an algorithm called Jump Start,…
