Combining Trained Models in Reinforcement Learning

Ujjwal Patil; Javad Ghofrani

arXiv:2605.02159·cs.LG·May 5, 2026


TL;DR

This paper systematically reviews empirical studies on reusing pretrained models in deep reinforcement learning, highlighting patterns, limitations, and the need for standardized benchmarking.

Contribution

The paper provides a focused, qualitative synthesis of empirical research on transfer and ensemble methods in DRL, and proposes a provisional independence spectrum for future evaluation.

Findings

01. Positive results often occur when source and target tasks share structure.

02. Ensemble and federated methods show promise but are limited in scope.

03. Comparisons against from-scratch baselines are infrequent, which makes efficiency claims hard to verify.

Abstract

Deep reinforcement learning (DRL) has delivered strong results in domains such as Atari and Go, but it still suffers from high sample cost and weak transfer beyond the training setting. A common response is to reuse information from previously trained models through transfer, distillation, ensemble methods, or federated training instead of learning each target task from random initialization. The literature on these mechanisms is fragmented, and published comparisons are hard to interpret because tasks, baselines, and compute budgets differ. This paper presents a PRISMA-guided systematic review of empirical studies on pretrained knowledge reuse in DRL. Starting from 589 records retrieved from IEEE Xplore, the ACM Digital Library, and citation tracing, we screened 570 unique records and assessed 89 full texts. After applying the final eligibility criteria, 15 empirical studies remained…
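The screening counts reported in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the function name `prisma_flow` and the stage labels are hypothetical and not taken from the paper, and the abstract does not say why records were dropped at each stage (the gap between 589 retrieved and 570 unique records is presumably duplicate removal, but that is an assumption).

```python
# Record counts as reported in the abstract.
RETRIEVED = 589   # records from IEEE Xplore, ACM Digital Library, citation tracing
SCREENED = 570    # unique records screened
FULL_TEXT = 89    # full texts assessed
INCLUDED = 15     # empirical studies remaining after eligibility criteria


def prisma_flow(retrieved, screened, full_text, included):
    """Return how many records drop out between consecutive review stages.

    The stage labels are illustrative; the abstract gives only the counts.
    """
    stages = [retrieved, screened, full_text, included]
    # Each stage can only keep or discard records, never add them.
    if any(earlier < later for earlier, later in zip(stages, stages[1:])):
        raise ValueError("stage counts must be non-increasing")
    return {
        # Assumed to be duplicates, since the abstract calls the 570 "unique".
        "removed_before_screening": retrieved - screened,
        "excluded_at_screening": screened - full_text,
        "excluded_at_full_text": full_text - included,
    }


flow = prisma_flow(RETRIEVED, SCREENED, FULL_TEXT, INCLUDED)
inclusion_rate = INCLUDED / SCREENED  # share of screened records that were included

for stage, count in flow.items():
    print(f"{stage}: {count}")
print(f"inclusion rate: {inclusion_rate:.1%}")
```

Under these counts, 19 records drop out before screening, 481 at screening, and 74 at full-text assessment, so roughly 2.6% of the screened records make up the final sample.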
