A Convolution and Attention Based Encoder for Reinforcement Learning under Partial Observability
Wuhao Wang, Zhiyong Chen

TL;DR
This paper introduces a lightweight convolution and attention-based encoder for reinforcement learning in partially observable environments, improving scalability and robustness without complex recurrent or Transformer models.
Contribution
The paper presents a temporal encoder that combines depthwise separable convolution with self-attention, and integrates it into an actor-critic framework to improve performance under partial observability.
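To make the combination concrete, the following is a minimal NumPy sketch of one way a depthwise separable convolution and a single-head self-attention layer could be chained over an observation history. It is not the authors' architecture: the layer order, the causal zero-padding, the ReLU, the single attention head, the choice of the last time step as the summary vector, and all names (`depthwise_separable_conv1d`, `self_attention`, `encode_history`, `params`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def depthwise_separable_conv1d(x, depthwise_w, pointwise_w):
    """x: (T, C) history; depthwise_w: (K, C); pointwise_w: (C, D)."""
    T, C = x.shape
    K = depthwise_w.shape[0]
    # Assumption: left-pad with zeros so the output keeps length T and is causal.
    padded = np.vstack([np.zeros((K - 1, C)), x])
    # Depthwise step: each channel is filtered with its own length-K kernel.
    depthwise_out = np.stack(
        [(padded[t:t + K] * depthwise_w).sum(axis=0) for t in range(T)]
    )
    # Pointwise step: a 1x1 convolution mixes channels, i.e. a matrix product.
    return depthwise_out @ pointwise_w


def self_attention(h, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the time axis."""
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over time steps
    return weights @ v


def encode_history(history, params):
    """Map a (T, C) observation history to a D-dimensional feature vector."""
    h = depthwise_separable_conv1d(history, params["dw"], params["pw"])
    h = np.maximum(h, 0.0)  # ReLU, an assumed nonlinearity
    h = self_attention(h, params["wq"], params["wk"], params["wv"])
    # Assumption: use the features at the most recent step as the summary.
    return h[-1]


# Hypothetical sizes: 8 steps of 4-dimensional observations, 16 features.
rng = np.random.default_rng(0)
T, C, D, K = 8, 4, 16, 3
params = {
    "dw": rng.normal(size=(K, C)),
    "pw": rng.normal(size=(C, D)),
    "wq": rng.normal(size=(D, D)),
    "wk": rng.normal(size=(D, D)),
    "wv": rng.normal(size=(D, D)),
}
history = rng.normal(size=(T, C))
z = encode_history(history, params)
print(z.shape)
```

The convolutional part uses K·C + C·D weights, against K·C·D for a standard convolution with the same kernel size and channel counts, which is the usual reason a depthwise separable layer is considered lightweight. In an actor-critic setup, a vector like `z` would be fed to both the policy and value heads.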
Findings
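The method achieves superior performance on continuous control benchmarks under partial observability.
It encodes fixed-length observation histories efficiently, without recurrent or Transformer-based models.
The results indicate that lightweight temporal encoding can improve the scalability of AI systems under uncertainty.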
Abstract
Partially Observable Markov Decision Processes (POMDPs) remain a core challenge in reinforcement learning due to incomplete state information. We address this by reformulating POMDPs as fully observable processes with fixed-length observation histories as augmented states. To efficiently encode these histories, we propose a lightweight temporal encoder based on depthwise separable convolution and self-attention, avoiding the overhead of recurrent and Transformer-based models. Integrated into an actor-critic framework, our method achieves superior performance on continuous control benchmarks under partial observability. More broadly, this work shows that lightweight temporal encoding can improve the scalability of AI systems under uncertainty. It advances the development of agents capable of reasoning robustly in real-world environments where information is incomplete or delayed.
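The reformulation described in the abstract, where the last few observations stand in for the hidden state, can be illustrated with a small sliding-window buffer. This is a sketch under stated assumptions rather than the authors' code: the class name `ObservationHistory`, the zero-padding at the start of an episode, and the choice to store observations only (no actions or rewards) are all hypothetical.

```python
from collections import deque


class ObservationHistory:
    """Keeps the last `length` observations as a fixed-size augmented state."""

    def __init__(self, length, obs_dim):
        self.length = length
        self.obs_dim = obs_dim
        self.buffer = deque(maxlen=length)
        self.reset()

    def reset(self):
        # Assumption: pad with zero vectors until real observations arrive.
        self.buffer.clear()
        for _ in range(self.length):
            self.buffer.append([0.0] * self.obs_dim)

    def push(self, observation):
        if len(observation) != self.obs_dim:
            raise ValueError("observation has the wrong dimension")
        # deque(maxlen=...) drops the oldest entry automatically.
        self.buffer.append(list(observation))
        return self.state()

    def state(self):
        # Oldest observation first, newest last; always `length` rows.
        return [list(obs) for obs in self.buffer]


hist = ObservationHistory(length=3, obs_dim=2)
first = hist.push([1.0, 2.0])
hist.push([3.0, 4.0])
hist.push([5.0, 6.0])
latest = hist.push([7.0, 8.0])
print(first)   # [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0]]
print(latest)  # [[3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
```

Because the window always has the same shape, the agent can treat it as an ordinary state and pass it to a temporal encoder at every step, at the cost of discarding anything older than `length` observations.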
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Applications
Methods: Linear Layer · Attention Is All You Need · Convolution · Pointwise Convolution · Softmax · Depthwise Convolution · Multi-Head Attention · Depthwise Separable Convolution
