Computational Hardness of Reinforcement Learning with Partial $q^{\pi}$-Realizability

Shayan Karimi; Xiaoqi Tan

arXiv:2510.21888 · cs.AI · October 31, 2025

TL;DR

This paper shows that reinforcement learning with partial $q^{\pi}$-realizability is computationally hard, establishing NP-hardness and exponential lower bounds that highlight fundamental limitations of this approximation regime.

Contribution

It introduces the partial $q^{\pi}$-realizability framework and proves its computational hardness, extending complexity results to a more practical RL setting.

Findings

01. NP-hardness under a parameterized greedy (argmax) policy set

02. Exponential lower bound (in feature vector dimension) with softmax policies

03. Hardness results mirror those in $q^{*}$-realizability

Abstract

This paper investigates the computational complexity of reinforcement learning in a novel linear function approximation regime, termed partial $q^{\pi}$-realizability. In this framework, the objective is to learn an $\epsilon$-optimal policy with respect to a predefined policy set $\Pi$, under the assumption that all value functions for policies in $\Pi$ are linearly realizable. The assumptions of this framework are weaker than those in $q^{\pi}$-realizability but stronger than those in $q^{*}$-realizability, providing a practical model where function approximation naturally arises. We prove that learning an $\epsilon$-optimal policy in this setting is computationally hard. Specifically, we establish NP-hardness under a parameterized greedy policy set (argmax) and show that, unless NP = RP, an exponential lower bound (in feature vector dimension) holds when the policy set contains…
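To make the two policy families named in the abstract concrete, the following is a minimal Python sketch of a greedy (argmax) policy and a softmax policy over linear scores. It is an illustration under assumptions, not the paper's construction: the feature vectors `features`, the parameter vector `theta`, the `temperature` argument, and all function names are hypothetical, and the truncated abstract does not give the exact parameterization the authors use.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_scores(features: Sequence[Vector], theta: Vector) -> List[float]:
    """Score each action as <phi(s, a), theta> (assumed linear form)."""
    return [dot(phi, theta) for phi in features]


def greedy_action(features: Sequence[Vector], theta: Vector) -> int:
    """Greedy (argmax) policy: pick the highest-scoring action.

    Ties are broken in favor of the lowest action index.
    """
    scores = linear_scores(features, theta)
    return max(range(len(scores)), key=scores.__getitem__)


def softmax_policy(
    features: Sequence[Vector], theta: Vector, temperature: float = 1.0
) -> List[float]:
    """Softmax policy: action probabilities proportional to exp(score / temperature)."""
    scores = [s / temperature for s in linear_scores(features, theta)]
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical state with three actions and 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
theta = [0.5, -0.2]

action = greedy_action(features, theta)
probs = softmax_policy(features, theta)
```

With these illustrative numbers the linear scores are 0.5, -0.2, and 0.3, so the greedy policy selects action 0 and the softmax policy places the most probability on that same action while keeping every action's probability positive.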

Videos

Computational Hardness of Reinforcement Learning with Partial $q^{\pi}$-Realizability · SlidesLive