From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning

Pusen Dong; Tianchen Zhu; Yue Qiu; Haoyi Zhou; Jianxin Li

arXiv:2412.08920·cs.CL·August 6, 2025

Open Access

TL;DR

This paper introduces TTCT (Trajectory-level Textual Constraints Translator), a method for safe reinforcement learning that uses natural language text both to specify constraints and as a training signal. It removes the need for manually designed cost functions and improves adaptability to constraint shifts.

Contribution

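The paper presents TTCT, a textual constraint translator that exploits the dual role of text (constraint specification and training signal) to replace hand-designed cost functions in safe RL, enabling zero-shot transfer and automatic constraint comprehension.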

Findings

01

Policies trained with TTCT achieve a lower violation rate than policies trained with the standard, manually designed cost function.

02

TTCT demonstrates zero-shot transfer to new constraint environments.

03

Empirical results show that TTCT effectively comprehends both textual constraints and trajectories.

Abstract

Safe reinforcement learning (RL) requires the agent to finish a given task while obeying specific constraints. Giving constraints in natural language form has great potential for practical scenarios due to its flexible transfer capability and accessibility. Previous safe RL methods with natural language constraints typically need a manually designed cost function for each constraint, which requires domain expertise and lacks flexibility. In this paper, we harness the dual role of text in this task, using it not only to provide the constraint but also as a training signal. We introduce the Trajectory-level Textual Constraints Translator (TTCT) to replace the manually designed cost function. Our empirical results demonstrate that TTCT effectively comprehends textual constraints and trajectories, and that policies trained with TTCT can achieve a lower violation rate than those trained with the standard cost function.…
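To make the "replace the manually designed cost function" idea concrete, here is a minimal sketch of a pluggable cost interface and a violation-rate metric. It is not the paper's implementation. The `CostFn` signature, the toy string-based trajectories, the `handwritten_lava_cost` function, and the `budget` parameter are all assumptions made for illustration. A learned text-to-trajectory cost model would, under these assumptions, simply be another callable with the same signature.

```python
from typing import Callable, List, Sequence, Tuple

# Toy trajectory: a list of (observation, action) pairs, both plain strings (assumed format).
Trajectory = List[Tuple[str, str]]
# Hypothetical interface: a cost function maps (constraint text, trajectory) to a cost >= 0.
CostFn = Callable[[str, Trajectory], float]


def handwritten_lava_cost(constraint: str, trajectory: Trajectory) -> float:
    """Manually designed cost for one specific constraint: 1.0 per step observed on lava.

    The constraint text is ignored, which is the inflexibility a learned,
    text-conditioned cost model is meant to remove.
    """
    return float(sum(1 for obs, _ in trajectory if obs == "lava"))


def violation_rate(cost_fn: CostFn, constraint: str,
                   trajectories: Sequence[Trajectory], budget: float = 0.0) -> float:
    """Fraction of trajectories whose total cost exceeds the budget."""
    if not trajectories:
        return 0.0
    violations = sum(1 for t in trajectories if cost_fn(constraint, t) > budget)
    return violations / len(trajectories)


trajectories = [
    [("floor", "right"), ("floor", "right"), ("goal", "stay")],  # never touches lava
    [("floor", "right"), ("lava", "right"), ("goal", "stay")],   # one lava step
]
rate = violation_rate(handwritten_lava_cost, "Do not step on lava.", trajectories)
print(rate)
```

Because `violation_rate` depends only on the `CostFn` signature, swapping the handwritten function for a text-conditioned model would leave the evaluation code unchanged.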

Taxonomy

Topics: Multi-Agent Systems and Negotiation