DCT: Dual Channel Training of Action Embeddings for Reinforcement Learning with Large Discrete Action Spaces
Pranavi Pathakota, Hardik Meisheri, Harshad Khadilkar

TL;DR
This paper introduces a dual channel training framework for action embeddings in reinforcement learning, enabling efficient handling of large discrete action spaces and improving policy learning in noisy environments.
Contribution
The paper proposes an encoder-decoder architecture for action embeddings, trained with a dual channel loss that balances action reconstruction against state prediction accuracy.
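To make the dual loss idea concrete, here is a minimal sketch of how two such loss channels might be combined. The specific choices are assumptions made for illustration and are not taken from the paper: cross-entropy for the action channel, mean squared error for the state channel, and a convex weighting through a hypothetical parameter `eta`. All function names are hypothetical as well.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_reconstruction_loss(action_logits, true_action):
    """Cross-entropy between the decoded action distribution and the
    index of the action actually taken (assumed loss form)."""
    probs = softmax(action_logits)
    return -math.log(probs[true_action])


def state_prediction_loss(predicted_next_state, true_next_state):
    """Mean squared error between predicted and observed next state
    (assumed loss form)."""
    n = len(true_next_state)
    return sum(
        (p - t) ** 2 for p, t in zip(predicted_next_state, true_next_state)
    ) / n


def dual_channel_loss(action_logits, true_action,
                      predicted_next_state, true_next_state, eta=0.5):
    """Weighted sum of the two channels. `eta` is a hypothetical
    balancing weight in [0, 1]; the paper's weighting may differ."""
    l_action = action_reconstruction_loss(action_logits, true_action)
    l_state = state_prediction_loss(predicted_next_state, true_next_state)
    return eta * l_action + (1.0 - eta) * l_state
```

With `eta=1.0` only action reconstruction is penalized, and with `eta=0.0` only state prediction is, so intermediate values trade one objective against the other.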
Findings
Outperforms two competitive baselines in two environments with large discrete action spaces
Produces cleaner, more effective action embeddings
Achieves earlier convergence in policy learning
Abstract
The ability to learn robust policies while generalizing over large discrete action spaces is an open challenge for intelligent systems, especially in noisy environments that face the curse of dimensionality. In this paper, we present a novel framework to efficiently learn action embeddings that simultaneously allow us to reconstruct the original action as well as to predict the expected future state. We describe an encoder-decoder architecture for action embeddings with a dual channel loss that balances between action reconstruction and state prediction accuracy. We use the trained decoder in conjunction with a standard reinforcement learning algorithm that produces actions in the embedding space. Our architecture is able to outperform two competitive baselines in two diverse environments: a 2D maze environment with more than 4000 discrete noisy actions, and a product recommendation…
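The abstract describes a policy that outputs actions in the embedding space, with a trained decoder mapping them back to discrete actions. The sketch below illustrates that mapping under stated assumptions: the decoder is taken to be a single linear layer followed by an argmax, and the weights, the bias, and the function name `decode_action` are hypothetical placeholders rather than the paper's actual decoder.

```python
def decode_action(embedding, decoder_weights, decoder_bias):
    """Map a continuous embedding-space action to a discrete action index.

    Assumes a linear decoder: one weight row and one bias per discrete
    action. The action with the highest logit is selected.
    """
    logits = [
        sum(w * e for w, e in zip(row, embedding)) + b
        for row, b in zip(decoder_weights, decoder_bias)
    ]
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical decoder for 4 discrete actions over a 2-D embedding.
weights = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
bias = [0.0, 0.0, 0.0, 0.0]

# A policy output in embedding space (made-up values for illustration).
policy_output = [0.2, -0.9]
chosen = decode_action(policy_output, weights, bias)
```

In this toy setup the policy only needs to output a 2-D vector, however many discrete actions the decoder distinguishes, which is the property that makes embedding-space policies attractive for large action sets.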
Taxonomy
Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics
