RLHFless: Serverless Computing for Efficient RLHF

Rui Wei; Hanfei Yu; Shubham Jain; Yogarajan Sivakumar; Devesh Tiwari; Jian Li; Seung-Jong Park; Hao Wang

arXiv:2602.22718·cs.AI·February 27, 2026

Open Access

TL;DR

RLHFless introduces a serverless training framework for reinforcement learning from human feedback, significantly improving efficiency and reducing costs by adapting to dynamic resource demands and optimizing workload distribution.

Contribution

RLHFless is presented as the first scalable serverless framework for synchronous RLHF training. It targets the fine-grained resource variability and idle-time overhead that serverful infrastructures handle poorly.

Findings

01. Achieves up to 1.35x speedup over baseline.
02. Reduces training costs by 44.8%.
03. Effectively adapts to dynamic resource demands.

Abstract

Reinforcement Learning from Human Feedback (RLHF) has been widely applied to Large Language Model (LLM) post-training to align model outputs with human preferences. Recent models, such as DeepSeek-R1, have also shown RLHF's potential to improve LLM reasoning on complex tasks. In RL, inference and training co-exist, creating dynamic resource demands throughout the workflow. Compared to traditional RL, RLHF further challenges training efficiency due to expanding model sizes and resource consumption. Several RLHF frameworks aim to balance flexible abstraction and efficient execution. However, they rely on serverful infrastructures, which struggle with fine-grained resource variability. As a result, during synchronous RLHF training, idle time between or within RL components often causes overhead and resource wastage. To address these issues, we present RLHFless, the first scalable…
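The idle-time argument in the abstract can be illustrated with a toy accounting model. The sketch below is not taken from the paper: the phase names, durations, and GPU counts are hypothetical values, and the two allocation policies are simplified stand-ins chosen only to show why a fixed (serverful) allocation pays for its peak GPU count during every phase, while a per-phase (serverless-style) allocation pays only for what each phase can keep busy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str         # hypothetical phase label, not from the paper
    seconds: float    # wall-clock duration of the phase
    gpus_needed: int  # GPUs the phase can actually keep busy


def serverful_gpu_seconds(phases):
    """Fixed allocation: the peak GPU count is held for the whole step."""
    peak = max(p.gpus_needed for p in phases)
    return peak * sum(p.seconds for p in phases)


def elastic_gpu_seconds(phases):
    """Per-phase allocation: each phase holds only the GPUs it needs."""
    return sum(p.gpus_needed * p.seconds for p in phases)


def idle_fraction(phases):
    """Share of the fixed allocation that sits idle during the step."""
    fixed = serverful_gpu_seconds(phases)
    return (fixed - elastic_gpu_seconds(phases)) / fixed


# Hypothetical numbers for one synchronous RLHF step (assumed, not measured).
step = [
    Phase("rollout", 60.0, 8),
    Phase("reward", 10.0, 2),
    Phase("train", 30.0, 4),
]

if __name__ == "__main__":
    print(f"fixed allocation:     {serverful_gpu_seconds(step):.0f} GPU-seconds")
    print(f"per-phase allocation: {elastic_gpu_seconds(step):.0f} GPU-seconds")
    print(f"idle share of fixed:  {idle_fraction(step):.1%}")
```

The model deliberately ignores effects a real serverless system must pay for, such as cold starts and model loading between phases, so the gap it shows is an upper bound on the savings for these assumed numbers rather than an estimate of the paper's results.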

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Multimodal Machine Learning Applications