When to ASK: Uncertainty-Gated Language Assistance for Reinforcement Learning

Juarez Monteiro; Nathan Gavenski; Gianlucca Zuin; Adriano Veloso

arXiv:2604.02226·cs.AI·April 3, 2026

TL;DR

This paper presents ASK, a method that pairs smaller language models with trained reinforcement learning policies to improve out-of-distribution generalization, querying the language model only when the policy's estimated uncertainty is high.

Contribution

ASK introduces a hybrid approach built on uncertainty gating: the trained RL policy acts by default, and a smaller LM is queried only when uncertainty exceeds a threshold. This improves RL agents' OOD performance without retraining.

Findings

01. ASK achieves a reward of 0.95 in transfer tasks.

02. Selective querying maintains efficiency while improving OOD navigation.

03. Effective hybridization requires careful orchestration, not just simple combination.

Abstract

Reinforcement learning (RL) agents often struggle with out-of-distribution (OOD) scenarios, leading to high uncertainty and random behavior. While language models (LMs) contain valuable world knowledge, larger ones incur high computational costs, hindering real-time use, and exhibit limitations in autonomous planning. We introduce Adaptive Safety through Knowledge (ASK), which combines smaller LMs with trained RL policies to enhance OOD generalization without retraining. ASK employs Monte Carlo Dropout to assess uncertainty and queries the LM for action suggestions only when uncertainty exceeds a set threshold. This selective use preserves the efficiency of existing policies while leveraging the language model's reasoning in uncertain situations. In experiments on the FrozenLake environment, ASK shows no improvement in-domain, but demonstrates robust navigation in transfer tasks,…
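The gating mechanism described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy `DropoutPolicy` network, the use of predictive entropy as the uncertainty score, the `ask_lm` callable, and all parameter values are hypothetical and chosen only to show the control flow (several stochastic forward passes with dropout active, then a threshold test that decides whether to query the LM).

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class DropoutPolicy:
    """Hypothetical two-layer policy with dropout left ON at inference."""

    def __init__(self, n_states, n_actions, hidden=32, p_drop=0.2, seed=0):
        self.rng = np.random.default_rng(seed)
        # Random weights stand in for a trained policy (assumption).
        self.w1 = self.rng.normal(0.0, 0.5, (n_states, hidden))
        self.w2 = self.rng.normal(0.0, 0.5, (hidden, n_actions))
        self.p_drop = p_drop
        self.n_states = n_states

    def forward(self, state):
        # One-hot encode a discrete state, as in a small grid world.
        x = np.zeros(self.n_states)
        x[state] = 1.0
        h = np.maximum(x @ self.w1, 0.0)
        # Sample a fresh dropout mask on every call (inverted dropout).
        mask = self.rng.random(h.shape) >= self.p_drop
        h = h * mask / (1.0 - self.p_drop)
        return softmax(h @ self.w2)


def mc_dropout_entropy(policy, state, n_samples=30):
    # Average the action distribution over several stochastic passes and
    # score uncertainty as the entropy of that average (an assumed metric).
    probs = np.stack([policy.forward(state) for _ in range(n_samples)])
    mean_probs = probs.mean(axis=0)
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12))
    return mean_probs, float(entropy)


def gated_action(policy, state, ask_lm, threshold, n_samples=30):
    # Returns (action, queried_lm). The LM is consulted only when the
    # uncertainty score exceeds the threshold; otherwise the policy acts.
    mean_probs, entropy = mc_dropout_entropy(policy, state, n_samples)
    if entropy > threshold:
        return int(ask_lm(state)), True
    return int(np.argmax(mean_probs)), False
```

In use, `ask_lm` would wrap a call to a language model that maps a textual description of the state to an action index; here it can be any callable, for example `lambda state: 2` as a stub. The threshold trades LM cost against coverage: with four actions the entropy score is bounded by `log(4)`, so a threshold near that bound almost never queries the LM, while a threshold near zero queries it on nearly every step.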
