Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning

Suzan Ece Ada; Emre Ugur

arXiv:2506.04399 · cs.LG · June 6, 2025

TL;DR

This paper presents UMCNP, a hybrid meta-RL method that adapts to new tasks without a reward signal during meta-testing. It combines parameterized policy gradient-based (PPG) meta-RL with task inference based on Conditional Neural Processes (CNPs).

Contribution

The paper introduces UMCNP, which combines parameterized policy gradient-based (PPG) meta-RL with task inference based on Conditional Neural Processes (CNPs) to make meta-testing sample-efficient when no reward signal is available.

Findings

01. UMCNP adapts with fewer samples than the baselines.

02. UMCNP is effective on the 2D-Point benchmark and on continuous control benchmarks.

03. UMCNP reduces the number of online interactions required during meta-testing.

Abstract

We introduce Unsupervised Meta-Testing with Conditional Neural Processes (UMCNP), a novel hybrid few-shot meta-reinforcement learning (meta-RL) method that uniquely combines, yet distinctly separates, parameterized policy gradient-based (PPG) and task inference-based few-shot meta-RL. Tailored for settings where the reward signal is missing during meta-testing, our method increases sample efficiency without requiring additional samples in meta-training. UMCNP leverages the efficiency and scalability of Conditional Neural Processes (CNPs) to reduce the number of online interactions required in meta-testing. During meta-training, samples previously collected through PPG meta-RL are efficiently reused for learning task inference in an offline manner. UMCNP infers the latent representation of the transition dynamics model from a single test task rollout with unknown parameters. This…
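
The task-inference step described in the abstract can be illustrated with a minimal sketch of a generic Conditional Neural Process applied to transitions. This is not the authors' implementation: the layer sizes, the untrained random weights, the toy dynamics, and the function names `infer_latent` and `predict_next_state` are all assumptions made for illustration. The sketch only shows the general CNP pattern: each (state, action, next state) tuple from one reward-free rollout is encoded, the encodings are averaged into a single latent vector, and a decoder conditions on that latent to predict the next state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
STATE_DIM, ACTION_DIM, HIDDEN_DIM, LATENT_DIM = 2, 2, 16, 4


def init_mlp(in_dim, hidden_dim, out_dim):
    """Randomly initialised two-layer MLP (untrained, for illustration)."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.1, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(x @ params["w1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"]


encoder = init_mlp(2 * STATE_DIM + ACTION_DIM, HIDDEN_DIM, LATENT_DIM)
decoder = init_mlp(LATENT_DIM + STATE_DIM + ACTION_DIM, HIDDEN_DIM, STATE_DIM)


def infer_latent(states, actions, next_states):
    """Encode every transition, then mean-aggregate into one latent vector.

    No reward appears in the context; only the transition tuples are used.
    """
    context = np.concatenate([states, actions, next_states], axis=-1)
    return mlp(encoder, context).mean(axis=0)


def predict_next_state(latent, state, action):
    """Decode a next-state prediction conditioned on the task latent."""
    query = np.concatenate([latent, state, action], axis=-1)
    return mlp(decoder, query)


# A toy rollout with made-up dynamics: next_state = state + 0.1 * action.
T = 20
states = rng.normal(size=(T, STATE_DIM))
actions = rng.normal(size=(T, ACTION_DIM))
next_states = states + 0.1 * actions

latent = infer_latent(states, actions, next_states)
prediction = predict_next_state(latent, states[0], actions[0])
```

Because the aggregation is a mean, the latent does not depend on the order of the transitions in the rollout, which is the property that lets a CNP treat a rollout as an unordered context set. In a trained model the encoder and decoder weights would be learned from previously collected samples rather than drawn at random as they are here.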
