$\mathrm{E^{2}CFD}$: Towards Effective and Efficient Cost Function Design for Safe Reinforcement Learning via Large Language Model

Zepeng Wang, Chao Ma, Linjiang Zhou, Libing Wu, Lei Yang, Xiaochuan Shi, and Guojun Peng

arXiv:2407.05580·cs.LG·July 9, 2024

TL;DR

This paper introduces $\mathrm{E^{2}CFD}$, a framework that uses large language models to generate and iteratively refine cost functions for safe reinforcement learning, improving policy performance across diverse safety scenarios.

Contribution

The paper presents a novel framework leveraging LLMs and a fast evaluation method to automatically generate and optimize cost functions tailored to specific safety scenarios in reinforcement learning.

Findings

01. Policies trained with $\mathrm{E^{2}CFD}$ outperform traditional methods.

02. The framework effectively adapts to various safety scenarios.

03. Iterative refinement improves cost function suitability.

Abstract

Different classes of safe reinforcement learning algorithms have shown satisfactory performance in various types of safety requirement scenarios. However, existing methods mainly address one or a few specific classes of safety requirement scenarios and cannot be applied to arbitrary ones. In addition, the optimization objectives of existing reinforcement learning algorithms are misaligned with the task requirements. To address these issues, we propose $\mathrm{E^{2}CFD}$, an effective and efficient cost function design framework. $\mathrm{E^{2}CFD}$ leverages the capabilities of a large language model (LLM) to comprehend various safety scenarios and generate corresponding cost functions. It incorporates the fast performance evaluation (FPE) method to facilitate rapid and iterative updates to the generated cost function.…
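
The generate, evaluate, and update cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `propose` is a hypothetical stand-in for the LLM call, `quick_score` is a hypothetical surrogate for fast performance evaluation (it only checks agreement with a handful of labelled states rather than evaluating a trained policy), and the threshold-based cost function, the 1.05 safety limit, and the sample data are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

CostFn = Callable[[float], float]  # maps a distance-to-hazard to a cost


def make_cost(threshold: float) -> CostFn:
    """Candidate cost function: cost 1.0 when closer to the hazard than `threshold`."""
    return lambda distance: 1.0 if distance < threshold else 0.0


def propose(best_threshold: float, step: float) -> List[float]:
    """Hypothetical stand-in for the LLM: suggest candidates around the current best."""
    return [max(0.0, best_threshold - step), best_threshold, best_threshold + step]


def quick_score(cost: CostFn, labelled: List[Tuple[float, bool]]) -> float:
    """Hypothetical fast evaluation: fraction of states where the cost agrees with the label."""
    hits = sum((cost(distance) > 0.0) == unsafe for distance, unsafe in labelled)
    return hits / len(labelled)


def refine(labelled: List[Tuple[float, bool]],
           initial_threshold: float = 3.0,
           step: float = 1.0,
           rounds: int = 6) -> Tuple[float, float]:
    """Iteratively keep the best-scoring candidate and narrow the search."""
    best = initial_threshold
    best_score = quick_score(make_cost(best), labelled)
    for _ in range(rounds):
        for candidate in propose(best, step):
            score = quick_score(make_cost(candidate), labelled)
            if score > best_score:  # accept only strict improvements
                best, best_score = candidate, score
        step /= 2.0  # smaller adjustments in later rounds
    return best, best_score


# Assumed toy data: distances 0.0 to 3.9, labelled unsafe when below 1.05.
LABELLED = [(d / 10.0, d / 10.0 < 1.05) for d in range(40)]

threshold, score = refine(LABELLED)
print(threshold, score)
```

In this sketch the cheap scoring step is what makes repeated refinement affordable; in a real system each candidate would presumably be a full cost function in code, and the evaluation would need to reflect actual policy behaviour.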
