HyperPPO: A scalable method for finding small policies for robotic control

Shashank Hegde; Zhehui Huang; Gaurav S. Sukhatme

arXiv:2309.16663·cs.RO·September 29, 2023

Open Access

TL;DR

HyperPPO is a scalable reinforcement learning method that finds small, high-performing neural network policies for robotic control by using graph hypernetworks to estimate the weights of multiple architectures simultaneously.

Contribution

The paper introduces HyperPPO, an on-policy RL algorithm that uses graph hypernetworks to estimate the weights of multiple neural architectures concurrently, so that small, performant policies can be found in a single training run rather than by training each candidate architecture separately.
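To make the idea of one model emitting weights for many architectures concrete, here is a minimal sketch in plain Python. It is not the paper's graph hypernetwork: there is no graph neural network or message passing, no PPO update, and no bias terms. The names `ToyHypernetwork`, `MAX_WIDTH`, the hand-crafted layer descriptor, and the example architectures are all hypothetical. The sketch only shows that a single set of shared parameters can produce weight matrices for MLPs of different shapes.

```python
import math
import random

MAX_WIDTH = 8   # hypothetical cap on the width of any layer
EMBED_DIM = 4   # size of the hand-crafted layer descriptor below


class ToyHypernetwork:
    """Shared parameters that map a layer descriptor to a weight matrix.

    Simplifying assumption: a linear decoder over a fixed descriptor,
    standing in for the graph hypernetwork used in the paper.
    """

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # One EMBED_DIM-vector per entry of a MAX_WIDTH x MAX_WIDTH weight tile.
        self.decoder = [
            [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)]
             for _ in range(MAX_WIDTH)]
            for _ in range(MAX_WIDTH)
        ]

    def embed(self, layer_idx, fan_in, fan_out):
        # Hand-crafted descriptor: constant, depth, and normalised widths.
        return [1.0, layer_idx / 4.0, fan_in / MAX_WIDTH, fan_out / MAX_WIDTH]

    def layer_weights(self, layer_idx, fan_in, fan_out):
        e = self.embed(layer_idx, fan_in, fan_out)
        # Decode only the top-left fan_out x fan_in block of the shared tile.
        return [
            [sum(d * x for d, x in zip(self.decoder[o][i], e))
             for i in range(fan_in)]
            for o in range(fan_out)
        ]

    def generate(self, layer_sizes):
        # One weight matrix per consecutive pair of layer sizes.
        pairs = zip(layer_sizes, layer_sizes[1:])
        return [self.layer_weights(k, n_in, n_out)
                for k, (n_in, n_out) in enumerate(pairs)]


def forward(weights, obs):
    """Run an MLP (tanh hidden layers, linear output) with generated weights."""
    x = list(obs)
    for k, w in enumerate(weights):
        x = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
        if k < len(weights) - 1:
            x = [math.tanh(v) for v in x]
    return x


hyper = ToyHypernetwork()
architectures = [[3, 4, 2], [3, 8, 8, 2], [3, 2, 2]]  # obs dim 3, action dim 2
obs = [0.1, -0.2, 0.3]
actions = {tuple(a): forward(hyper.generate(a), obs) for a in architectures}
```

In a full training loop, gradients from the policy loss of every sampled architecture would flow back into the shared parameters (here, `decoder`), which is what lets many architectures be trained at once.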

Findings

01. HyperPPO scales well: more training resources produce faster convergence to higher-performing architectures.

02. It produces small neural policies suitable for resource-constrained robots.

03. The learned policies can effectively control a Crazyflie2.1 quadrotor.

Abstract

Models with fewer parameters are necessary for the neural control of memory-limited, performant robots. Finding these smaller neural network architectures can be time-consuming. We propose HyperPPO, an on-policy reinforcement learning algorithm that utilizes graph hypernetworks to estimate the weights of multiple neural architectures simultaneously. Our method estimates weights for networks that are much smaller than those in common-use networks yet encode highly performant policies. We obtain multiple trained policies at the same time while maintaining sample efficiency and provide the user the choice of picking a network architecture that satisfies their computational constraints. We show that our method scales well - more training resources produce faster convergence to higher-performing architectures. We demonstrate that the neural policies estimated by HyperPPO are capable of…
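The step of picking an architecture that satisfies a computational constraint can be sketched as a filter over candidate architectures by parameter count. The helper names `mlp_param_count` and `pick_architecture`, the candidate shapes, and the scores are hypothetical placeholders for illustration only; they are not results from the paper.

```python
def mlp_param_count(layer_sizes):
    """Count weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def pick_architecture(candidates, max_params):
    """Return the best-scoring architecture within the parameter budget.

    `candidates` maps an architecture (tuple of layer sizes) to a score,
    e.g. an evaluation return. Returns None if nothing fits the budget.
    """
    feasible = [arch for arch in candidates
                if mlp_param_count(arch) <= max_params]
    if not feasible:
        return None
    return max(feasible, key=lambda arch: candidates[arch])


# Placeholder scores, made up purely to exercise the function.
candidates = {
    (3, 4, 2): 0.6,
    (3, 8, 8, 2): 0.9,
    (3, 2, 2): 0.4,
}

best_small = pick_architecture(candidates, max_params=50)
best_any = pick_architecture(candidates, max_params=1000)
```

With a budget of 50 parameters only the two smaller networks qualify (26 and 14 parameters), so the higher-scoring of those is returned; with a loose budget the 122-parameter network wins.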

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning