A Convolution and Attention Based Encoder for Reinforcement Learning under Partial Observability

Wuhao Wang; Zhiyong Chen

arXiv:2505.23857 · cs.LG · September 16, 2025

TL;DR

This paper introduces a lightweight convolution and attention-based encoder for reinforcement learning in partially observable environments, improving scalability and robustness without complex recurrent or Transformer models.

Contribution

It presents a novel temporal encoder combining depthwise separable convolution and self-attention, integrated into an actor-critic framework for better performance under partial observability.
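
As a rough illustration of how such an encoder could be wired together, the NumPy sketch below applies a depthwise separable 1D convolution along the time axis and then single-head self-attention. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the causal zero padding, the single attention head, the use of the last time step as the summary vector, and all dimensions are hypothetical choices made for the example.

```python
import numpy as np

def depthwise_separable_conv1d(x, depthwise_kernels, pointwise_weights):
    """x: (T, C) history; depthwise_kernels: (K, C); pointwise_weights: (C, D)."""
    T, C = x.shape
    K = depthwise_kernels.shape[0]
    # Assumption: left-pad with zeros so the output keeps length T and
    # each time step only sees current and past observations.
    padded = np.vstack([np.zeros((K - 1, C)), x])
    # Depthwise step: every channel is filtered along time with its own kernel.
    depthwise = np.stack(
        [(padded[t:t + K] * depthwise_kernels).sum(axis=0) for t in range(T)]
    )
    # Pointwise step: a 1x1 convolution mixes channels, i.e. a matrix product.
    return depthwise @ pointwise_weights

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over T time steps."""
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

def encode_history(history, params):
    h = depthwise_separable_conv1d(history, params["dw"], params["pw"])
    attended, weights = self_attention(h, params["wq"], params["wk"], params["wv"])
    # Assumption: the feature at the most recent step summarizes the history.
    return attended[-1], weights

# Hypothetical sizes: T steps, C observation channels, D features, kernel width K.
rng = np.random.default_rng(0)
T, C, D, K = 8, 3, 4, 3
params = {
    "dw": rng.normal(size=(K, C)),
    "pw": rng.normal(size=(C, D)),
    "wq": rng.normal(size=(D, D)),
    "wk": rng.normal(size=(D, D)),
    "wv": rng.normal(size=(D, D)),
}
history = rng.normal(size=(T, C))
z, attn = encode_history(history, params)
```

Here `z` is a `D`-dimensional summary that an actor or critic network could consume, and `attn` is the `T × T` attention matrix. The depthwise step uses `K * C` weights and the pointwise step `C * D`, compared with `K * C * D` for a standard convolution of the same shape, which is where the parameter savings of the separable form come from.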

Findings

01. Achieves superior performance on continuous control benchmarks

02. Efficiently encodes observation histories without heavy models

03. Enhances scalability of AI in uncertain environments

Abstract

Partially Observable Markov Decision Processes (POMDPs) remain a core challenge in reinforcement learning due to incomplete state information. We address this by reformulating POMDPs as fully observable processes with fixed-length observation histories as augmented states. To efficiently encode these histories, we propose a lightweight temporal encoder based on depthwise separable convolution and self-attention, avoiding the overhead of recurrent and Transformer-based models. Integrated into an actor-critic framework, our method achieves superior performance on continuous control benchmarks under partial observability. More broadly, this work shows that lightweight temporal encoding can improve the scalability of AI systems under uncertainty. It advances the development of agents capable of reasoning robustly in real-world environments where information is incomplete or delayed.
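
The reformulation described in the abstract, where a fixed-length observation history serves as the augmented state, can be sketched with a small rolling buffer. This is an illustrative sketch only: the class name `ObservationHistory`, the zero padding at the start of an episode, and the choice to store observations alone (without past actions or rewards) are assumptions, since the abstract does not specify these details.

```python
from collections import deque

class ObservationHistory:
    """Keeps the last `history_len` observations as an augmented state."""

    def __init__(self, obs_dim, history_len):
        self.obs_dim = obs_dim
        self.history_len = history_len
        # A deque with maxlen drops the oldest entry automatically on append.
        self.buffer = deque(maxlen=history_len)

    def reset(self, first_obs):
        # Assumption: pad with zero observations until real history is available.
        self.buffer.clear()
        for _ in range(self.history_len - 1):
            self.buffer.append([0.0] * self.obs_dim)
        self.buffer.append(list(first_obs))
        return self.augmented_state()

    def step(self, obs):
        self.buffer.append(list(obs))
        return self.augmented_state()

    def augmented_state(self):
        # Oldest observation first, most recent last; shape (history_len, obs_dim).
        return [list(o) for o in self.buffer]

# Usage with a hypothetical 2-dimensional observation and a window of 3 steps.
hist = ObservationHistory(obs_dim=2, history_len=3)
s0 = hist.reset([1.0, 2.0])
s1 = hist.step([3.0, 4.0])
hist.step([5.0, 6.0])
s3 = hist.step([7.0, 8.0])
```

Because the window length is fixed, the augmented state always has the same shape, so a standard actor-critic algorithm can treat it as an ordinary fully observable state.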

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural Networks and Applications

Methods: Linear Layer · Attention Is All You Need · Convolution · Pointwise Convolution · Softmax · Depthwise Convolution · Multi-Head Attention · Depthwise Separable Convolution