DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning

Runpeng Xie; Quanwei Wang; Hao Hu; Zherui Zhou; Ni Mu; Xiyun Li; Yiqin Yang; Shuang Xu; Qianchuan Zhao; Bo XU

arXiv:2510.19562 · cs.AI · October 24, 2025

TL;DR

DAIL combines a distributional policy with semantic alignment to resolve instruction ambiguity in language-conditioned reinforcement learning and improve task performance.

Contribution

The paper proposes DAIL, a new method that leverages distributional value estimation and semantic alignment to address language ambiguity in reinforcement learning tasks.

Findings

01. DAIL outperforms baseline methods on both structured-observation and visual-observation benchmarks.

02. Theoretical analysis shows that value distribution estimation improves task differentiability.

03. Semantic alignment captures the correspondence between trajectories and linguistic instructions.

Abstract

Comprehending natural language and following human instructions are critical capabilities for intelligent agents. However, the flexibility of linguistic instructions induces substantial ambiguity across language-conditioned tasks, severely degrading algorithmic performance. To address these limitations, we present a novel method named DAIL (Distributional Aligned Learning), featuring two key components: a distributional policy and semantic alignment. Specifically, we provide theoretical results showing that the value distribution estimation mechanism enhances task differentiability. Meanwhile, the semantic alignment module captures the correspondence between trajectories and linguistic instructions. Extensive experimental results on both structured and visual observation benchmarks demonstrate that DAIL effectively resolves instruction ambiguities, achieving superior performance to baseline…
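
The claim that estimating a value distribution helps tell tasks apart can be illustrated with a toy example. The sketch below is not the paper's construction or its theorem. It assumes two hypothetical tasks whose returns are categorical distributions over the same three atoms (`atoms`, `task_a`, and `task_b` are illustrative names), and it shows that a scalar expected value can coincide for both tasks while a distance between the full return distributions stays strictly positive.

```python
def expected_value(atoms, probs):
    """Scalar value: the mean of a categorical return distribution."""
    return sum(a * p for a, p in zip(atoms, probs))


def wasserstein1(atoms, p, q):
    """1-Wasserstein distance between two categorical distributions
    defined on the same sorted atoms (area between the two CDFs)."""
    dist = 0.0
    cdf_p = 0.0
    cdf_q = 0.0
    for i in range(len(atoms) - 1):
        cdf_p += p[i]
        cdf_q += q[i]
        dist += abs(cdf_p - cdf_q) * (atoms[i + 1] - atoms[i])
    return dist


# Hypothetical return atoms and two hypothetical tasks (assumptions, not from the paper).
atoms = [0.0, 0.5, 1.0]
task_a = [0.0, 1.0, 0.0]  # always yields a return of 0.5
task_b = [0.5, 0.0, 0.5]  # yields a return of 0 or 1 with equal probability

mean_gap = abs(expected_value(atoms, task_a) - expected_value(atoms, task_b))
dist_gap = wasserstein1(atoms, task_a, task_b)

print("gap between expected values:", mean_gap)
print("1-Wasserstein gap between return distributions:", dist_gap)
```

In this toy setting the expected values are identical, so a scalar critic alone could not separate the two tasks, whereas the distributional view can. How DAIL actually parameterizes and compares value distributions is not specified in the abstract above.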

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning