Policy Search with High-Dimensional Context Variables
Voot Tangkaratt, Herke van Hoof, Simone Parisi, Gerhard Neumann, Jan Peters, Masashi Sugiyama

TL;DR
This paper introduces a novel model-based policy search method that effectively handles high-dimensional context variables by integrating supervised linear dimensionality reduction, improving learning efficiency and performance in complex tasks.
Contribution
It presents a new contextual policy search approach combining model-based stochastic search with supervised dimensionality reduction for high-dimensional contexts.
Findings
The proposed method outperforms PCA-based dimensionality reduction of the context variables.
It learns better policies when the context variables are high-dimensional.
It surpasses existing state-of-the-art contextual policy search methods.
Abstract
Direct contextual policy search methods learn to improve policy parameters and simultaneously generalize these parameters to different context or task variables. However, learning from high-dimensional context variables, such as camera images, is still a prominent problem in many real-world tasks. A naive application of unsupervised dimensionality reduction methods to the context variables, such as principal component analysis, is insufficient as task-relevant input may be ignored. In this paper, we propose a contextual policy search method in the model-based relative entropy stochastic search framework with integrated dimensionality reduction. We learn a model of the reward that is locally quadratic in both the policy parameters and the context variables. Furthermore, we perform supervised linear dimensionality reduction on the context variables by nuclear norm regularization. The…
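The abstract describes fitting a quadratic reward model while encouraging low rank through nuclear norm regularization. The sketch below illustrates that general idea only and is not the authors' algorithm: it assumes a simplified model that is quadratic in the context alone, r ≈ cᵀAc, and fits A by proximal gradient descent with singular value thresholding, a standard solver that the abstract does not name. The function names (`svt`, `fit_low_rank_quadratic`), the penalty weight `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np


def svt(M, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def fit_low_rank_quadratic(C, r, lam=0.1, n_iter=500):
    """Fit r_i ~ c_i^T A c_i with a nuclear-norm penalty on A (proximal gradient).

    C: (n, d) array of context samples, r: (n,) array of observed rewards.
    This is an illustrative simplification, not the paper's full reward model.
    """
    n, d = C.shape
    A = np.zeros((d, d))
    # Upper bound on the Lipschitz constant of the gradient of the mean squared error.
    lipschitz = 2.0 * np.mean(np.sum(C ** 2, axis=1) ** 2)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        pred = np.einsum("ni,ij,nj->n", C, A, C)
        grad = (2.0 / n) * np.einsum("n,ni,nj->ij", pred - r, C, C)
        A = svt(A - step * grad, step * lam)
        A = 0.5 * (A + A.T)  # remove numerical asymmetry
    return A


# Synthetic example (assumed data): the reward depends on one direction w of a 10-D context.
rng = np.random.default_rng(0)
contexts = rng.normal(size=(400, 10))
w = rng.normal(size=10)
rewards = -(contexts @ w) ** 2
A_hat = fit_low_rank_quadratic(contexts, rewards, lam=0.1)
# The leading singular vectors of A_hat span the learned low-dimensional context subspace.
print(np.linalg.svd(A_hat, compute_uv=False))
```

Because the penalty acts on singular values, directions of the context that do not help predict the reward are shrunk toward zero, which is what makes the reduction supervised rather than variance-driven as in PCA.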
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
