Natural Language Reinforcement Learning
Xidong Feng, Ziyu Wan, Mengyue Yang, Ziyan Wang, Girish A. Koushik, Yali Du, Ying Wen, Jun Wang

TL;DR
This paper introduces Natural Language Reinforcement Learning (NLRL), a framework that recasts core RL concepts in natural language and implements them with large language models such as GPT-4, with the aim of improving interpretability and efficiency.
Contribution
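It redefines core RL concepts (task objectives, policy, value function, Bellman equation, and policy iteration) in natural language space and shows how they can be implemented with LLMs, targeting three RL limitations: low sample efficiency, lack of interpretability, and sparse supervision signals.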
Findings
NLRL improves interpretability of RL policies.
Initial experiments on tabular MDPs show NLRL's effectiveness and efficiency.
NLRL leverages LLMs like GPT-4 for practical implementation.
Abstract
Reinforcement Learning (RL) has shown remarkable abilities in learning policies for decision-making tasks. However, RL is often hindered by issues such as low sample efficiency, lack of interpretability, and sparse supervision signals. To tackle these limitations, we take inspiration from the human learning process and introduce Natural Language Reinforcement Learning (NLRL), which innovatively combines RL principles with natural language representation. Specifically, NLRL redefines RL concepts like task objectives, policy, value function, Bellman equation, and policy iteration in natural language space. We present how NLRL can be practically implemented with the latest advancements in large language models (LLMs) like GPT-4. Initial experiments over tabular MDPs demonstrate the effectiveness, efficiency, and also interpretability of the NLRL framework.
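For orientation, the classical numeric versions of the concepts the abstract names (policy, value function, Bellman backup, policy iteration) can be sketched on a tabular MDP as below. This is not the NLRL method: it is only the standard baseline that the abstract says NLRL redefines in natural language space. The 3-state chain MDP, the reward of 1.0 for reaching the terminal state, the discount of 0.9, and all function names are assumptions made up for illustration.

```python
# Classical tabular policy iteration on a tiny HYPOTHETICAL 3-state chain MDP.
# States 0, 1, 2 (state 2 is terminal). Actions: 0 = left, 1 = right.
# None of these values come from the paper; they are illustrative only.
GAMMA = 0.9
N_STATES, N_ACTIONS, TERMINAL = 3, 2, 2


def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    if state == TERMINAL:
        return state, 0.0
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def evaluate(policy, sweeps=100):
    """Policy evaluation: repeatedly apply the Bellman expectation backup."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s == TERMINAL:
                continue
            s2, r = step(s, policy[s])
            values[s] = r + GAMMA * values[s2]
    return values


def improve(values):
    """Policy improvement: act greedily with respect to one-step lookahead."""
    policy = []
    for s in range(N_STATES):
        q = []
        for a in range(N_ACTIONS):
            s2, r = step(s, a)
            q.append(r + GAMMA * values[s2])
        policy.append(q.index(max(q)))
    return policy


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * N_STATES  # start by always moving left
    while True:
        values = evaluate(policy)
        new_policy = improve(values)
        if new_policy == policy:
            return policy, values
        policy = new_policy


if __name__ == "__main__":
    print(policy_iteration())
```

Per the abstract, NLRL keeps this evaluate-then-improve structure but expresses the policy, value function, and Bellman step in natural language, implemented with an LLM; how that is done is described in the paper, not in this sketch.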
Taxonomy
Topics: Robotics and Automated Systems
Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Softmax · Byte Pair Encoding · Linear Layer · Dropout · Multi-Head Attention
