Non-Myopic Active Feature Acquisition via Pathwise Policy Gradients
Linus Aronsson, Morteza Haghir Chehreghani

TL;DR
This paper introduces NM-PPG (non-myopic pathwise policy gradients), an active feature acquisition (AFA) method that trains a non-myopic acquisition policy end to end with pathwise policy gradients, trading off feature acquisition cost against prediction accuracy.
Contribution
It presents a continuous relaxation of the AFA acquisition process that enables pathwise gradients through the full acquisition trajectory, avoiding the high variance of standard score-function policy gradients when optimizing a non-myopic policy.
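To make the variance argument concrete, here is a minimal toy sketch, not the paper's method. It compares a score-function estimator and a pathwise (reparameterized) estimator of the same gradient, d/dmu E[x^2] with x ~ N(mu, 1), whose true value is 2*mu. The Gaussian setup, the objective x^2, and the function name `gradient_estimates` are all assumptions chosen for illustration; the actual AFA relaxation is not specified in this summary.

```python
import random
import statistics


def gradient_estimates(mu, n_samples, seed=0):
    """Per-sample estimates of d/dmu E[x^2] for x ~ N(mu, 1) (toy setup, assumed)."""
    rng = random.Random(seed)
    score, pathwise = [], []
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        x = mu + eps  # reparameterized sample from N(mu, 1)
        # Score-function estimator: f(x) * d/dmu log p(x; mu) = x^2 * (x - mu)
        score.append(x * x * (x - mu))
        # Pathwise estimator: d/dmu f(mu + eps) = 2 * (mu + eps)
        pathwise.append(2.0 * x)
    return score, pathwise


score, pathwise = gradient_estimates(mu=1.0, n_samples=20000)
print("score-function: mean", statistics.mean(score), "variance", statistics.variance(score))
print("pathwise:       mean", statistics.mean(pathwise), "variance", statistics.variance(pathwise))
```

Both estimators are unbiased for the same gradient, but in this toy case the pathwise estimator has a much smaller per-sample variance, which is the general motivation for differentiating through a relaxed trajectory rather than relying on score-function gradients.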
Findings
NM-PPG outperforms state-of-the-art AFA baselines on synthetic and real datasets.
The method effectively balances feature acquisition costs with prediction accuracy.
The approach stabilizes training through entropy regularization and staged temperature sharpening.
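The last finding can be illustrated with a small sketch under stated assumptions. It treats the relaxed acquisition policy as a temperature-scaled softmax over hypothetical action scores, lowers the temperature in stages, and adds an entropy bonus to a loss. The softmax relaxation, the stage boundaries, the coefficient `beta`, and every function name here are illustrative assumptions; this summary does not give the paper's actual schedule or regularizer.

```python
import math


def softmax(logits, temperature):
    """Temperature-scaled softmax; lower temperature gives a sharper distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def staged_temperature(step, stages=((0, 2.0), (100, 1.0), (200, 0.5), (300, 0.1))):
    """Piecewise-constant schedule (assumed): use the value of the last stage reached."""
    tau = stages[0][1]
    for start, value in stages:
        if step >= start:
            tau = value
    return tau


def regularized_loss(task_loss, probs, beta):
    """Loss to minimize; the entropy bonus discourages premature collapse (assumed form)."""
    return task_loss - beta * entropy(probs)


# Hypothetical scores over three features plus a "stop and predict" action.
logits = [1.0, 0.5, -0.3, 0.2]
for step in (0, 100, 200, 300):
    tau = staged_temperature(step)
    probs = softmax(logits, tau)
    print(step, tau, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

As the temperature drops, the relaxed distribution approaches a hard, one-hot choice, so the policy seen late in training resembles the discrete acquisitions made at deployment, while the entropy term keeps early training from committing to one action too soon.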
Abstract
Active feature acquisition (AFA) considers prediction problems in which features are costly to obtain and the learner adaptively decides which feature values to acquire for each instance and when to stop and predict. AFA can be formulated as a partially observable Markov decision process (POMDP), which naturally admits a sequential decision-making perspective. In this paper, we present non-myopic pathwise policy gradients (NM-PPG), a new AFA method built around this formulation. We introduce a continuous relaxation of the acquisition process that enables pathwise gradients through the full acquisition trajectory, avoiding the high variance of standard score-function policy gradients while allowing end-to-end optimization of a non-myopic acquisition policy. To better align training with deployment, we further develop a straight-through rollout scheme that follows hard feature…
