PROFIT: A Specialized Optimizer for Deep Fine Tuning

Anirudh S Chakravarthy; Shuai Kyle Zheng; Xin Huang; Sachithra Hemachandra; Xiao Zhang; Yuning Chai; Zhao Chen

arXiv:2412.01930 · cs.CV · November 3, 2025

TL;DR

PROFIT is a novel optimizer specifically designed for fine-tuning pre-trained models, improving performance across diverse tasks by leveraging properties of converged models and employing a temporal gradient-orthogonalization process.

Contribution

The paper introduces PROFIT, a specialized optimizer for incrementally fine-tuning converged models. PROFIT explicitly takes the properties of the converged model into account and outperforms traditional optimizers such as SGD and Adam in various applications.

Findings

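01. PROFIT outperforms traditional optimizers on tasks ranging from image classification to multimodal language model training to large-scale motion prediction.

02. It regularizes the fine-tuning process by explicitly taking the properties of the converged model into account.

03. It is easy to integrate into existing training pipelines.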

Abstract

The fine-tuning of pre-trained models has become ubiquitous in generative AI, computer vision, and robotics. Although much attention has been paid to improving the efficiency of fine-tuning models, there has been less scholarship around fine-tuning specifically for improved model performance. To remedy this gap, we present PROFIT, one of the first optimizers designed to incrementally fine-tune converged models on new tasks and/or datasets. Unlike traditional optimizers such as SGD or Adam, which make minimal assumptions due to random initializations, PROFIT takes the properties of a converged model into account explicitly to regularize the optimization process. Employing a temporal gradient-orthogonalization process, PROFIT outperforms fine-tuning methods in various tasks, from image classification to multimodal language model training to large-scale motion prediction. Moreover, PROFIT…
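The abstract does not spell out the temporal gradient-orthogonalization procedure, so the following is only a generic illustration of what orthogonalizing one gradient against a reference direction can look like. It is not the authors' algorithm. The function name `orthogonalize`, the choice of `reference` (for example, a parameter displacement accumulated over earlier steps), and the rule of projecting only when the two vectors conflict (a PCGrad-style projection) are all assumptions made for this sketch.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(
    grad: Sequence[float],
    reference: Sequence[float],
    eps: float = 1e-12,
) -> List[float]:
    """Hypothetical helper, not taken from the paper.

    If `grad` conflicts with `reference` (negative dot product), remove the
    component of `grad` that lies along `reference`, so the result is
    orthogonal to `reference`. Otherwise return `grad` unchanged.
    """
    ref_sq = dot(reference, reference)
    if ref_sq < eps:
        # A (near-)zero reference carries no direction to project against.
        return list(grad)
    overlap = dot(grad, reference)
    if overlap >= 0:
        # No conflict under this sketch's assumed rule: leave the gradient as is.
        return list(grad)
    scale = overlap / ref_sq
    return [g - scale * r for g, r in zip(grad, reference)]


# Assumed toy setting: `reference` stands in for a direction derived from
# earlier optimization steps, `grad` for the current gradient.
projected = orthogonalize([1.0, -1.0], [0.0, 1.0])
unchanged = orthogonalize([1.0, 1.0], [0.0, 1.0])
```

In this toy case the conflicting component along the reference is removed, while a gradient that already agrees with the reference passes through untouched. How PROFIT actually chooses its reference direction and when it projects is defined in the paper itself, not here.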

Videos

PROFIT: A Specialized Optimizer for Deep Fine Tuning · SlidesLive

Taxonomy

Topics: Experimental Learning in Engineering

Methods: Stochastic Gradient Descent · Focus · Adam