Combining Trained Models in Reinforcement Learning
Ujjwal Patil, Javad Ghofrani

TL;DR
This paper systematically reviews empirical studies on reusing pretrained models in deep reinforcement learning, highlighting patterns, limitations, and the need for standardized benchmarking.
Contribution
It provides a focused, qualitative synthesis of existing research on transfer and ensemble methods in DRL, proposing a provisional independence spectrum for future evaluation.
Findings
Positive results often occur when source and target tasks share structure.
Ensemble and federated methods show promise but are limited in scope.
Comparisons against from-scratch baselines are infrequent, which makes efficiency claims hard to verify.
Abstract
Deep reinforcement learning (DRL) has delivered strong results in domains such as Atari and Go, but it still suffers from high sample cost and weak transfer beyond the training setting. A common response is to reuse information from previously trained models through transfer, distillation, ensemble methods, or federated training instead of learning each target task from random initialization. The literature on these mechanisms is fragmented, and published comparisons are hard to interpret because tasks, baselines, and compute budgets differ. This paper presents a PRISMA-guided systematic review of empirical studies on pretrained knowledge reuse in DRL. Starting from 589 records retrieved from IEEE Xplore, the ACM Digital Library, and citation tracing, we screened 570 unique records and assessed 89 full texts. After applying the final eligibility criteria, 15 empirical studies remained…
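The screening counts reported in the abstract imply how many records dropped out at each stage. The following is a minimal Python sketch of that arithmetic, using only the four counts stated above. The stage labels and the `attrition` helper are illustrative names, not taken from the paper, and the reasons for exclusion at each stage are not given here, so the sketch computes differences only.

```python
# Record counts as reported in the abstract; labels are illustrative.
stages = [
    ("records retrieved", 589),
    ("unique records screened", 570),
    ("full texts assessed", 89),
    ("studies included", 15),
]


def attrition(stages):
    """Return (from_stage, to_stage, dropped) for each consecutive pair of stages."""
    return [
        (a_name, b_name, a_count - b_count)
        for (a_name, a_count), (b_name, b_count) in zip(stages, stages[1:])
    ]


drops = attrition(stages)
for a_name, b_name, dropped in drops:
    print(f"{a_name} -> {b_name}: {dropped} dropped")

# Sanity check: per-stage drops must account for the full gap between
# the first and last counts.
total_dropped = sum(d for _, _, d in drops)
assert total_dropped == stages[0][1] - stages[-1][1]
```

Under these counts, 19 records drop out before screening, 481 between screening and full-text assessment, and 74 at the final eligibility step.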
