Iterative Foundation Model Fine-Tuning on Multiple Rewards

Pouya M. Ghari; Simone Sciabola; Ye Wang

arXiv:2511.00220 · cs.LG · November 4, 2025

TL;DR

This paper introduces an iterative reinforcement learning method for fine-tuning foundation models on multiple reward signals, improving output quality in domains such as text, biology, and chemistry.

Contribution

The paper presents a novel iterative multi-reward RL fine-tuning approach, supports it with theoretical analysis, and reports stronger empirical performance than existing methods.

Findings

01 Effective across text, biological, and chemical domains

02 Outperforms state-of-the-art baselines

03 Provides theoretical insights into multi-reward RL

Abstract

Fine-tuning foundation models has emerged as a powerful approach for generating objects with specific desired properties. Reinforcement learning (RL) provides an effective framework for this purpose, enabling models to generate outputs that maximize a given reward function. However, in many applications such as text generation and drug discovery, it can be suboptimal to optimize using a single reward signal, as multiple evaluation criteria are often necessary. This paper proposes a novel reinforcement learning-based method for fine-tuning foundation models using multiple reward signals. By employing an iterative fine-tuning strategy across these rewards, our approach generalizes state-of-the-art RL-based methods. We further provide a theoretical analysis that offers insights into the performance of multi-reward RL fine-tuning. Experimental results across diverse domains including text,…
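
The abstract does not spell out the algorithm, so the following is only a toy sketch of one possible reading of "iterative fine-tuning across rewards", not the authors' method. It assumes a categorical policy over four hypothetical candidate outputs and two made-up reward vectors, and it cycles through the rewards, applying a KL-regularized update on one reward at a time. The function names, the reward values, and the `beta` and `rounds` parameters are all assumptions introduced for illustration.

```python
import math


def kl_regularized_step(policy, reward, beta):
    """One fine-tuning step on a single reward (toy, finite output set).

    Returns the closed-form maximizer of
        E_p[reward] - beta * KL(p || policy),
    which is p(x) proportional to policy(x) * exp(reward(x) / beta).
    """
    weights = [p * math.exp(r / beta) for p, r in zip(policy, reward)]
    total = sum(weights)
    return [w / total for w in weights]


def iterative_multi_reward(policy, rewards, beta=1.0, rounds=1):
    """Cycle through the reward signals, updating on one reward at a time.

    This cycling order is an assumption; the source only says the
    strategy is iterative across rewards.
    """
    for _ in range(rounds):
        for reward in rewards:
            policy = kl_regularized_step(policy, reward, beta)
    return policy


def expected_reward(policy, reward):
    """Expected value of a reward under a categorical policy."""
    return sum(p * r for p, r in zip(policy, reward))


# Hypothetical setup: four candidate outputs, uniform starting policy.
base_policy = [0.25, 0.25, 0.25, 0.25]
reward_a = [1.0, 0.0, 1.0, 0.0]  # made-up scores for a first criterion
reward_b = [0.0, 0.0, 1.0, 1.0]  # made-up scores for a second criterion

tuned = iterative_multi_reward(
    base_policy, [reward_a, reward_b], beta=0.5, rounds=3
)
```

In this toy setting the tuned policy concentrates on the third candidate, the only one that scores well on both criteria, while `beta` controls how far each step may move from the current policy. A real foundation model would replace the closed-form step with a gradient-based RL update.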

Videos

Iterative Foundation Model Fine-Tuning on Multiple Rewards · SlidesLive

Taxonomy

Topics: Machine Learning in Materials Science · Machine Learning and Data Classification · Topic Modeling