A Review for Deep Reinforcement Learning in Atari: Benchmarks, Challenges, and Solutions

Jiajun Fan

arXiv:2112.04145 · cs.AI · February 28, 2023 · 5 cites

Open Access

TL;DR

This paper reviews deep reinforcement learning in Atari, critiques current evaluation metrics, proposes a new benchmark based on human world records, and discusses challenges and solutions for surpassing human performance.

Contribution

It introduces a novel Atari benchmark based on human world records and analyzes the limitations of current evaluation criteria in RL research.

Findings

01. Current evaluation metrics underestimate human performance.

02. The proposed benchmark raises the bar for RL agents.

03. The paper identifies four key challenges hindering superhuman performance.

Abstract

The Arcade Learning Environment (ALE) was proposed as an evaluation platform for empirically assessing the generality of agents across dozens of Atari 2600 games. ALE offers a variety of challenging problems and has drawn significant attention from the deep reinforcement learning (RL) community. From Deep Q-Networks (DQN) to Agent57, RL agents seem to achieve superhuman performance in ALE. However, is this the case? To explore this question, we first review the current evaluation metrics in the Atari benchmarks and then show that the current criteria for claiming superhuman performance are inappropriate, because they underestimate human performance relative to what is possible. To address these problems and promote the development of RL research, we propose a novel Atari benchmark based on human world records (HWR), which places higher demands on RL agents…
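The abstract's argument turns on which human reference an agent's score is normalized against. As a minimal sketch, assuming the commonly used normalization that maps a random policy to 0 and the reference score to 1, the snippet below compares an average-human baseline with a world-record baseline. The function name and every score are hypothetical placeholders chosen for illustration, not results from the paper or from any real game.

```python
def normalized_score(agent: float, random: float, reference: float) -> float:
    """Rescale a raw game score so random play maps to 0 and the reference to 1."""
    if reference == random:
        raise ValueError("reference and random scores must differ")
    return (agent - random) / (reference - random)


# Placeholder scores for one hypothetical game (NOT real Atari results).
random_score = 0.0
average_human_score = 100.0
world_record_score = 1000.0
agent_score = 150.0

# Same agent score, two different human references.
hns = normalized_score(agent_score, random_score, average_human_score)
hwr_ns = normalized_score(agent_score, random_score, world_record_score)

print(f"human-normalized score:        {hns:.2f}")     # 1.50
print(f"world-record-normalized score: {hwr_ns:.2f}")  # 0.15
```

With these placeholder numbers the agent looks superhuman against the average-human baseline (above 1.0) but reaches only a fraction of the world record, which is the kind of gap the proposed benchmark is meant to expose.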

Taxonomy

Topics: Reinforcement Learning in Robotics · Digital Games and Media · Artificial Intelligence in Games