Finetuning Deep Reinforcement Learning Policies with Evolutionary Strategies for Control of Underactuated Robots

Marco Calì; Alberto Sinigaglia; Niccolò Turcato; Ruggero Carli; Gian Antonio Susto

arXiv:2507.10030 · cs.RO · July 15, 2025

TL;DR

This paper presents a method that fine-tunes Deep Reinforcement Learning control policies for underactuated robots with Evolutionary Strategies, improving performance and robustness.

Contribution

It refines a Deep RL policy trained with Soft Actor-Critic (SAC) on a surrogate reward by applying the Separable Natural Evolution Strategy (SNES), a zero-order optimizer that targets the original task score directly.

Findings

01 Significant performance improvements over baseline policies.

02 Enhanced robustness of control strategies.

03 Competitive scores achieved in AI Olympics tasks.

Abstract

Deep Reinforcement Learning (RL) has emerged as a powerful method for addressing complex control problems, particularly those involving underactuated robotic systems. However, in some cases, policies may require refinement to achieve optimal performance and robustness aligned with specific task objectives. In this paper, we propose an approach for fine-tuning Deep RL policies using Evolutionary Strategies (ES) to enhance control performance for underactuated robots. Our method involves initially training an RL agent with Soft Actor-Critic (SAC) using a surrogate reward function designed to approximate complex specific scoring metrics. We subsequently refine this learned policy through a zero-order optimization step employing the Separable Natural Evolution Strategy (SNES), directly targeting the original score. Experimental evaluations conducted in the context of the 2nd AI Olympics…
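
The SNES refinement step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the names `snes_finetune`, `score_fn`, and `theta0` are hypothetical, the initial step size `sigma0` is an arbitrary choice, and the population size, learning rates, and rank-based utilities follow the defaults commonly cited for SNES. Here `theta0` stands for the flattened parameters of an already trained policy, and `score_fn` for any black-box function that returns the task score of a parameter vector (higher is better).

```python
import numpy as np


def snes_utilities(pop_size):
    """Rank-based utilities (best rank first); they sum to zero."""
    ranks = np.arange(1, pop_size + 1)
    raw = np.maximum(0.0, np.log(pop_size / 2 + 1) - np.log(ranks))
    return raw / raw.sum() - 1.0 / pop_size


def snes_finetune(score_fn, theta0, sigma0=0.05, generations=200,
                  pop_size=None, seed=0):
    """Maximize score_fn starting from theta0 with a separable NES.

    score_fn: hypothetical black-box score of a parameter vector.
    theta0:   flattened parameters of the pretrained policy (assumed).
    sigma0:   initial per-parameter step size (arbitrary assumption).
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(theta0, dtype=float).copy()
    dim = mu.size
    if pop_size is None:
        pop_size = 4 + int(3 * np.log(dim))
    sigma = np.full(dim, float(sigma0))
    lr_sigma = (3 + np.log(dim)) / (5 * np.sqrt(dim))
    utilities = snes_utilities(pop_size)

    # Keep the best vector seen so far, starting with the pretrained policy.
    best_theta, best_score = mu.copy(), float(score_fn(mu))

    for _ in range(generations):
        # Sample a population around the current mean (diagonal Gaussian).
        noise = rng.standard_normal((pop_size, dim))
        candidates = mu + sigma * noise
        scores = np.array([float(score_fn(c)) for c in candidates])

        # Sort best first so utilities line up with ranks.
        order = np.argsort(-scores)
        noise_sorted = noise[order]
        if scores[order[0]] > best_score:
            best_score = scores[order[0]]
            best_theta = candidates[order[0]].copy()

        # Natural-gradient estimates for the mean and per-dimension scale.
        grad_mu = utilities @ noise_sorted
        grad_sigma = utilities @ (noise_sorted ** 2 - 1.0)

        mu = mu + sigma * grad_mu  # learning rate 1 for the mean
        sigma = sigma * np.exp(0.5 * lr_sigma * grad_sigma)

    return best_theta, best_score
```

Because the update uses only score rankings, `score_fn` does not need to be differentiable, which is what allows a step like this to target a competition score directly instead of the surrogate reward used during RL training. In practice each call to `score_fn` would run one or more simulated rollouts of the policy, so the population size and number of generations determine the simulation budget.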

Taxonomy

Topics: Reinforcement Learning in Robotics