DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning
Runpeng Xie, Quanwei Wang, Hao Hu, Zherui Zhou, Ni Mu, Xiyun Li, Yiqin Yang, Shuang Xu, Qianchuan Zhao, Bo XU

TL;DR
DAIL combines a distributional policy with semantic alignment to improve language-conditioned reinforcement learning, resolving instruction ambiguities and improving task performance.
Contribution
The paper proposes DAIL, a new method that leverages distributional value estimation and semantic alignment to address language ambiguity in reinforcement learning tasks.
Findings
DAIL outperforms baseline methods on structured and visual benchmarks.
Theoretical analysis shows value distribution improves task differentiability.
Semantic alignment effectively links trajectories with instructions.
Abstract
Comprehending natural language and following human instructions are critical capabilities for intelligent agents. However, the flexibility of linguistic instructions induces substantial ambiguity across language-conditioned tasks, severely degrading algorithmic performance. To address these limitations, we present a novel method named DAIL (Distributional Aligned Learning), featuring two key components: a distributional policy and semantic alignment. Specifically, we provide theoretical results showing that the value distribution estimation mechanism enhances task differentiability. Meanwhile, the semantic alignment module captures the correspondence between trajectories and linguistic instructions. Extensive experimental results on both structured and visual observation benchmarks demonstrate that DAIL effectively resolves instruction ambiguities, achieving superior performance to baseline…
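The claim that value distribution estimation enhances task differentiability can be illustrated with a toy example. The sketch below is not the paper's construction or its theorem: the return atoms, the two task distributions, and the helper names `expected_value` and `wasserstein1` are all hypothetical, chosen only to show that two instructions can share the same expected return while their return distributions remain clearly separable.

```python
from itertools import accumulate

# Hypothetical shared return support (atoms), in the style of categorical
# distributional RL. These values are illustrative assumptions.
ATOMS = [0.0, 0.5, 1.0]


def expected_value(probs, atoms=ATOMS):
    """Scalar value: the mean of a categorical return distribution."""
    return sum(p * z for p, z in zip(probs, atoms))


def wasserstein1(p, q, atoms=ATOMS):
    """1-Wasserstein distance between two categorical distributions
    defined on the same sorted atoms (area between their CDFs)."""
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    return sum(
        abs(a - b) * (atoms[i + 1] - atoms[i])
        for i, (a, b) in enumerate(zip(cdf_p[:-1], cdf_q[:-1]))
    )


# Two hypothetical instructions whose returns have the same mean.
task_a = [0.0, 1.0, 0.0]  # always yields a return of 0.5
task_b = [0.5, 0.0, 0.5]  # yields 0 or 1 with equal probability

mean_gap = abs(expected_value(task_a) - expected_value(task_b))
dist_gap = wasserstein1(task_a, task_b)

print(f"gap between expected values: {mean_gap}")
print(f"gap between distributions:   {dist_gap}")
```

In this toy setting a scalar critic sees no difference between the two tasks, while a distributional critic does, which is the intuition behind using value distributions to tell ambiguous instructions apart.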
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning
