Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning

Hongpeng Cao, Mirco Theile, Federico G. Wyrwal, and Marco Caccamo

arXiv:2203.02230·cs.LG·January 2, 2023

Open Access

TL;DR

This paper introduces a cloud-edge architecture for training deep reinforcement learning agents on physical systems in real time, so that a policy pretrained in simulation can adapt to real-world dynamics and narrow the reality gap.

Contribution

The paper proposes a distributed cloud-edge framework that separates inference from training, allowing pretrained DRL policies to adapt to real-world dynamics in real time.
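
The separation of inference and training can be illustrated with a minimal sketch. It assumes, as the architecture's name suggests, that inference runs on the edge device and training runs in the cloud. Everything here is hypothetical and not taken from the paper: the names `EdgeActor`, `CloudLearner`, and `run_demo`, the in-process queues standing in for the network link, the scalar toy plant standing in for the physical system, and the update rule, which is a placeholder gradient step rather than a DRL algorithm.

```python
import queue
import random
import threading

SENTINEL = None  # marks the end of the experience stream


class EdgeActor:
    """Edge side (hypothetical): inference only, on a local policy copy."""

    def __init__(self, gain=0.0):
        self.gain = gain      # toy "policy": u = -gain * x
        self.version = 0

    def act(self, state):
        return -self.gain * state

    def pull_weights(self, weight_q):
        # Non-blocking: adopt the newest weights if any have arrived.
        while True:
            try:
                self.version, self.gain = weight_q.get_nowait()
            except queue.Empty:
                return


class CloudLearner:
    """Cloud side (hypothetical): consumes experience, publishes weights."""

    def __init__(self, gain=0.0, lr=0.01, batch_size=10):
        self.gain, self.lr, self.batch_size = gain, lr, batch_size
        self.version = 0
        self.seen = 0
        self.batch = []

    def consume(self, experience_q, weight_q):
        while True:
            item = experience_q.get()          # blocks until data arrives
            if item is SENTINEL:
                return
            self.seen += 1
            self.batch.append(item)
            if len(self.batch) == self.batch_size:
                self._update()
                weight_q.put((self.version, self.gain))

    def _update(self):
        # Placeholder for a DRL update: one gradient step on the mean of
        # next_state**2 with respect to the gain, where d(next_state)/d(gain) = -x.
        grad = sum(-2.0 * nx * x for x, _u, nx in self.batch) / len(self.batch)
        self.gain -= self.lr * grad
        self.version += 1
        self.batch.clear()


def run_demo(steps=200, batch_size=10, seed=0):
    rng = random.Random(seed)
    experience_q, weight_q = queue.Queue(), queue.Queue()
    actor = EdgeActor()
    learner = CloudLearner(batch_size=batch_size)
    worker = threading.Thread(target=learner.consume,
                              args=(experience_q, weight_q))
    worker.start()
    for _ in range(steps):
        actor.pull_weights(weight_q)           # never blocks the control loop
        x = rng.uniform(-1.0, 1.0)             # toy state sample
        u = actor.act(x)
        next_x = 1.2 * x + u                   # toy plant, assumed dynamics
        experience_q.put((x, u, next_x))
    experience_q.put(SENTINEL)
    worker.join()
    actor.pull_weights(weight_q)               # pick up the final weights
    return actor, learner
```

In this sketch the control loop never waits on the learner: the actor only checks for new weights without blocking, so a slow or delayed training step would cost policy freshness rather than control-loop timing.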

Findings

01 Successfully applied to an inverted-pendulum control system

02 Demonstrated effective adaptation to unseen dynamics

03 Achieved efficient real-time training in a physical environment

Abstract

Deep reinforcement learning (DRL) is a promising approach to solve complex control tasks by learning policies through interactions with the environment. However, the training of DRL policies requires large amounts of training experiences, making it impractical to learn the policy directly on physical systems. Sim-to-real approaches leverage simulations to pretrain DRL policies and then deploy them in the real world. Unfortunately, the direct real-world deployment of pretrained policies usually suffers from performance deterioration due to the different dynamics, known as the reality gap. Recent sim-to-real methods, such as domain randomization and domain adaptation, focus on improving the robustness of the pretrained agents. Nevertheless, the simulation-trained policies often need to be tuned with real-world data to reach optimal performance, which is challenging due to the high cost of…

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research