Meta-Evolve: Continuous Robot Evolution for One-to-many Policy Transfer

Xingyu Liu; Deepak Pathak; Ding Zhao

arXiv:2405.03534 · cs.RO · May 7, 2024

TL;DR

Meta-Evolve introduces a continuous robot evolution framework that efficiently transfers policies from a source robot to multiple target robots by sharing evolution paths, significantly reducing simulation costs.

Contribution

The paper presents a novel tree-structured robot evolution method for efficient multi-robot policy transfer, outperforming naive approaches in simulation cost reduction.

Findings

01

Achieved up to 3.2× efficiency improvement in manipulation policy transfer.

02

Achieved up to 2.4× efficiency improvement in agile locomotion transfer.

03

Demonstrated effectiveness of robot evolution trees in policy transfer tasks.

Abstract

We investigate the problem of transferring an expert policy from a source robot to multiple different robots. To solve this problem, we propose a method named Meta-Evolve that uses continuous robot evolution to efficiently transfer the policy to each target robot through a set of tree-structured evolutionary robot sequences. The robot evolution tree allows the robot evolution paths to be shared, so our approach can significantly outperform naive one-to-one policy transfer. We present a heuristic approach to determine an optimized robot evolution tree. Experiments have shown that our method is able to improve the efficiency of one-to-three transfer of manipulation policy by up to $3.2\times$ and one-to-six transfer of agile locomotion policy by $2.4\times$ in terms of simulation cost over the baseline of launching multiple independent one-to-one policy transfers.
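The benefit of sharing evolution paths can be illustrated with a toy calculation. The sketch below is not the authors' algorithm: it assumes each robot is a point in a normalized parameter space, that simulation cost is proportional to the L1 length of the evolution path, and that a single shared intermediate robot is chosen by brute-force grid search. The function names, the example coordinates, and the grid search are all hypothetical stand-ins for the paper's heuristic tree construction.

```python
from itertools import product


def l1(a, b):
    """L1 distance between two robots in the (assumed) parameter space."""
    return sum(abs(x - y) for x, y in zip(a, b))


def independent_cost(source, targets):
    """Total path length of separate one-to-one transfers (the baseline)."""
    return sum(l1(source, t) for t in targets)


def shared_cost(source, targets, meta):
    """Path length when all transfers share the source -> meta segment."""
    return l1(source, meta) + sum(l1(meta, t) for t in targets)


def best_meta_on_grid(source, targets, steps=10):
    """Brute-force search for one intermediate robot on a regular grid.

    Assumption: parameters are normalized to [0, 1]. This exhaustive search
    is only an illustration, not the heuristic described in the paper.
    """
    grid = [i / steps for i in range(steps + 1)]
    candidates = product(grid, repeat=len(source))
    best = min(candidates, key=lambda m: shared_cost(source, targets, m))
    return best, shared_cost(source, targets, best)


# Hypothetical example: one source robot and three similar target robots.
source = (0.0, 0.0)
targets = [(1.0, 0.8), (0.8, 1.0), (1.0, 1.0)]

if __name__ == "__main__":
    baseline = independent_cost(source, targets)
    meta, tree = best_meta_on_grid(source, targets)
    print(f"independent paths: {baseline:.2f}")
    print(f"shared tree via {meta}: {tree:.2f}")
    print(f"ratio: {baseline / tree:.2f}x")
```

When the targets are close to each other but far from the source, most of the path is shared, so the tree is much shorter than the sum of independent paths. The ratio printed here is a property of the toy coordinates only and says nothing about the speedups reported in the paper.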

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

- The idea of using p-Steiner trees to represent the robot kinematic evolution tree is interesting, intuitive, simple and seems effective.
- The method clearly accelerates transfer learning compared to the baselines provided.
- The paper is generally well-written and the main messages are effectively conveyed.

Weaknesses

- My main concern is the practicality of the proposed method. In other words, how can this be used in real(ish) world applications? I explain further:
  - First, as the name suggests the method explores in the kinematic space of the robots. What happens if the dynamics differ drastically? An example would be, having a robot that has the same kinematic structure but twice the masses. Can the method handle this? It seems that no. Isn't changing the dynamics but keeping the kinematics the same a

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 3

Strengths

The paper is very well written with nice graphical explanations of the technique and clear and well reasoned theoretical work. The authors were careful to make sure that the technical details were written unambiguously and the descriptions of the underlying mathematics were excellent. The results of the work seem compelling and are well described.

Weaknesses

It is not obvious to me, as someone who does not deal much with the realm of physical robots, how common the problem addressed in this paper is—my uninformed guess would be that it is not so common, but the technique does not suffer much for this. It is not obvious that the work has much to do with evolution, in either the biological or computational senses, except in its use of trees which are reminiscent of the tree of life. It is also not obvious why the word “meta” was chosen, especially giv

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

The paper highlights the effectiveness of Meta-Evolve in inter-robot policy transfer, as demonstrated through experiments on hand manipulation tasks and agile locomotion tasks (very interesting). The results show that Meta-Evolve outperforms one-to-one policy transfer baselines in terms of training and simulation costs.

Weaknesses

The experiments are mainly limited to hand manipulation tasks, which can be easily represented by tree structures. Does your main idea still work on modular robots (like 3D voxel-based robots)? Reference: Nick Cheney, Robert MacCurdy, Jeff Clune, and Hod Lipson. Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding. In GECCO ’13, 2013.

Videos

Meta-Evolve: Continuous Robot Evolution for One-to-many Policy Transfer · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Sparse Evolutionary Training