Sequential Knockoffs for Variable Selection in Reinforcement Learning
Tao Ma, Jin Zhu, Hengrui Cai, Zhengling Qi, Yunxiao Chen, Chengchun Shi, Eric B. Laber

TL;DR
This paper introduces SEEK, a sequential knockoffs method for identifying minimal sufficient states in high-dimensional reinforcement learning, improving policy learning efficiency and accuracy.
Contribution
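The paper defines the minimal sufficient state of a Markov decision process (MDP) and proposes SEEK (SEquEntial Knockoffs), an algorithm that estimates it in systems with high-dimensional, complex nonlinear dynamics and is selection-consistent in large samples.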
Findings
SEEK achieves selection consistency in large samples.
The method outperforms competitors in variable selection accuracy.
Empirical results show improved regret and policy performance.
Abstract
In real-world applications of reinforcement learning, it is often challenging to obtain a state representation that is parsimonious and satisfies the Markov property without prior knowledge. Consequently, it is common practice to construct a state larger than necessary, e.g., by concatenating measurements over contiguous time points. However, needlessly increasing the dimension of the state may slow learning and obfuscate the learned policy. We introduce the notion of a minimal sufficient state in a Markov decision process (MDP) as the subvector of the original state under which the process remains an MDP and shares the same reward function as the original process. We propose a novel SEquEntial Knockoffs (SEEK) algorithm that estimates the minimal sufficient state in a system with high-dimensional complex nonlinear dynamics. In large samples, the proposed method achieves selection…
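The definition of a minimal sufficient state in the abstract can be written out as a conditional-independence condition. The block below is a sketch of one way to formalize it, not necessarily the paper's own statement: the symbols $S_t$, $A_t$, $R_t$, the index set $G$, and the subvector notation $S_{t,G}$ are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the source page):
%   S_t \in \mathbb{R}^d : state at time t,  A_t : action,  R_t : reward
%   G \subseteq \{1,\dots,d\} : index set,  G^c : its complement
%   S_{t,G} : subvector of S_t with coordinates in G
%
% S_{t,G} is a sufficient state if, for every t, the reward and the
% next value of the subvector do not depend on the discarded coordinates
% once the retained coordinates and the action are known:
\[
  \bigl(R_t,\; S_{t+1,G}\bigr) \;\perp\!\!\!\perp\; S_{t,G^c}
  \;\bigm|\; \bigl(S_{t,G},\, A_t\bigr).
\]
% Under this condition the reduced process (S_{t,G}, A_t, R_t) is still
% an MDP and has the same reward function as the original process.
%
% A minimal sufficient state is a sufficient state of smallest dimension:
\[
  G^{\ast} \in \operatorname*{arg\,min}_{G \,:\, S_{t,G}\ \text{sufficient}} \; |G| .
\]
```

For example, if the state is built by concatenating measurements from several consecutive time points but only the most recent measurement drives the reward and the transition, then under this formalization the minimal sufficient state would consist of the coordinates of that most recent measurement alone.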
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Neural dynamics and brain function
