Sequential Knockoffs for Variable Selection in Reinforcement Learning

Tao Ma; Jin Zhu; Hengrui Cai; Zhengling Qi; Yunxiao Chen; Chengchun Shi; Eric B. Laber

arXiv:2303.14281 · stat.ML · July 31, 2024 · 1 citation

Open Access

TL;DR

This paper introduces SEEK, a sequential knockoffs method for identifying minimal sufficient states in high-dimensional reinforcement learning, improving policy learning efficiency and accuracy.

Contribution

The paper proposes a novel SEEK algorithm for estimating minimal sufficient states in complex MDPs, with theoretical guarantees and practical advantages.

Findings

01

SEEK achieves selection consistency in large samples.

02

The method outperforms competitors in variable selection accuracy.

03

Empirical results show improved regret and policy performance.

Abstract

In real-world applications of reinforcement learning, it is often challenging to obtain a state representation that is parsimonious and satisfies the Markov property without prior knowledge. Consequently, it is common practice to construct a state larger than necessary, e.g., by concatenating measurements over contiguous time points. However, needlessly increasing the dimension of the state may slow learning and obfuscate the learned policy. We introduce the notion of a minimal sufficient state in a Markov decision process (MDP) as the subvector of the original state under which the process remains an MDP and shares the same reward function as the original process. We propose a novel SEquEntial Knockoffs (SEEK) algorithm that estimates the minimal sufficient state in a system with high-dimensional complex nonlinear dynamics. In large samples, the proposed method achieves selection…
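
The abstract mentions building a state "larger than necessary" by concatenating measurements over contiguous time points. The snippet below is a minimal sketch of that construction only, not of the SEEK algorithm itself. The function name `stack_lags`, the window length `k`, and the array layout (one row per time point) are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def stack_lags(obs, k):
    """Build an enlarged state by concatenating k contiguous observations.

    obs : array-like of shape (T, d), one row per time point (assumed layout).
    k   : number of contiguous time points to concatenate (hypothetical parameter).

    Returns an array of shape (T - k + 1, k * d) whose row i is
    [obs[i], obs[i + 1], ..., obs[i + k - 1]] flattened into one vector.
    """
    obs = np.asarray(obs, dtype=float)
    if obs.ndim != 2:
        raise ValueError("obs must be a 2-D array of shape (T, d)")
    T, d = obs.shape
    if not 1 <= k <= T:
        raise ValueError("k must satisfy 1 <= k <= T")
    # Each slice obs[j : T - k + 1 + j] has T - k + 1 rows; stacking the k
    # slices side by side places k consecutive observations in each row.
    return np.hstack([obs[j:T - k + 1 + j] for j in range(k)])


if __name__ == "__main__":
    raw = np.arange(8).reshape(4, 2)   # T = 4 time points, d = 2 measurements
    big_state = stack_lags(raw, k=3)   # each state now has 3 * 2 = 6 coordinates
    print(big_state.shape)
```

With `d` measurements per time point, the stacked state has `k * d` coordinates, so its dimension grows linearly in the window length. This is the kind of enlarged state from which, according to the abstract, a variable selection method would aim to recover a smaller sufficient subvector.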

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Neural dynamics and brain function