TL;DR
This paper presents an improved, data-efficient version of Rainbow DQN that maintains competitive performance on Atari games while drastically reducing training data and time requirements, making RL research more accessible.
Contribution
We propose modifications to Rainbow that significantly reduce data and compute needs without sacrificing performance, validated through extensive experiments and ablation studies.
Findings
Achieves median human-normalized scores close to original Rainbow
Uses 20 times less data than standard Rainbow
Requires only 7.5 hours of training on a single GPU
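For readers unfamiliar with the metric in the first finding, a human-normalized score rescales a raw game score so that 0 corresponds to a random policy and 1 to the human reference, and the median is then taken across games. The following is a minimal sketch of that calculation. The function names and all numbers are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
from statistics import median


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Rescale a raw score so that random play maps to 0 and human play to 1."""
    return (agent - random) / (human - random)


def median_human_normalized(scores: dict) -> float:
    """Median of per-game human-normalized scores.

    `scores` maps a game name to a tuple (agent, random, human) of raw scores.
    """
    normalized = [human_normalized_score(a, r, h) for a, r, h in scores.values()]
    return median(normalized)


# Hypothetical raw scores for illustration only (not taken from the paper).
example_scores = {
    "game_a": (50.0, 0.0, 100.0),     # normalized: 0.5
    "game_b": (300.0, 100.0, 200.0),  # normalized: 2.0
    "game_c": (10.0, 10.0, 20.0),     # normalized: 0.0
}

print(median_human_normalized(example_scores))
```

The median is less sensitive than the mean to a few games with very large normalized scores, which is one reason it is commonly reported for Atari benchmarks.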
Abstract
Across the Arcade Learning Environment, Rainbow achieves a level of performance competitive with humans and modern RL algorithms. However, attaining this level of performance requires large amounts of data and hardware resources, making research in this area computationally expensive and use in practical applications often infeasible. This paper's contribution is threefold: we (1) propose an improved version of Rainbow that seeks to drastically reduce Rainbow's data, training time, and compute requirements while maintaining its competitive performance; (2) empirically demonstrate the effectiveness of our approach through experiments on the Arcade Learning Environment; and (3) conduct a number of ablation studies to investigate the effect of the individual proposed modifications. Our improved version of Rainbow reaches a median human-normalized score close to classic Rainbow's, while…
