KAGE-Bench: Fast Known-Axis Visual Generalization Evaluation for Reinforcement Learning

Egor Cherepanov; Daniil Zelezetsky; Alexey K. Kovalev; Aleksandr I. Panov

arXiv:2601.14232·cs.LG·January 21, 2026

Open Access

TL;DR

KAGE-Bench introduces a fast, systematic benchmark for evaluating visual generalization in reinforcement learning by isolating individual visual shifts using a novel environment and axis-factorized observations.

Contribution

The paper presents KAGE-Env, a JAX-native platform that factorizes visual axes, and KAGE-Bench, a benchmark suite for analyzing visual generalization in RL, enabling controlled and rapid evaluation.
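A known-axis train-evaluation pair can be pictured as two visual configurations that differ along exactly one axis. The sketch below is illustrative only: `VisualConfig`, its field names, and `make_pair` are hypothetical and are not taken from KAGE-Env's actual API.

```python
from dataclasses import dataclass, fields, replace

@dataclass(frozen=True)
class VisualConfig:
    # Hypothetical axis names chosen for illustration,
    # not KAGE-Env's real configuration parameters.
    background: str = "plain"
    brightness: float = 1.0
    agent_sprite: str = "default"

def make_pair(train, axis, eval_value):
    """Return (train, eval) configs that differ only along `axis`."""
    return train, replace(train, **{axis: eval_value})

def changed_axes(a, b):
    """List the names of the axes on which two configs disagree."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]

train_cfg, eval_cfg = make_pair(VisualConfig(), "background", "noise")
print(changed_axes(train_cfg, eval_cfg))
```

Because every other field is copied from the training configuration, any performance difference between the two settings can be attributed to the single shifted axis.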

Findings

01. Background and photometric shifts often cause agent failure.
02. Agent-appearance shifts are less impactful on performance.
03. Visual shifts can break task completion without affecting reward.

Abstract

Pixel-based reinforcement learning agents often fail under purely visual distribution shift even when latent dynamics and rewards are unchanged, but existing benchmarks entangle multiple sources of shift and hinder systematic analysis. We introduce KAGE-Env, a JAX-native 2D platformer that factorizes the observation process into independently controllable visual axes while keeping the underlying control problem fixed. By construction, varying a visual axis affects performance only through the induced state-conditional action distribution of a pixel policy, providing a clean abstraction for visual generalization. Building on this environment, we define KAGE-Bench, a benchmark of six known-axis suites comprising 34 train-evaluation configuration pairs that isolate individual visual shifts. Using a standard PPO-CNN baseline, we observe strong axis-dependent failures, with background and…
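The factorization described in the abstract can be illustrated with a toy example in which dynamics and reward depend only on the latent state, while rendering depends on the latent state plus visual parameters. This is a minimal plain-Python sketch under stated assumptions, not the paper's JAX environment: the 1D "screen", `step`, `render`, and the visual parameter names are all hypothetical.

```python
from typing import Dict, List, Tuple

WIDTH = 8  # length of the toy 1D "screen" (assumption for illustration)

def step(x: int, action: int) -> Tuple[int, float]:
    """Latent dynamics and reward: depend only on state and action."""
    nx = max(0, min(WIDTH - 1, x + action))
    reward = 1.0 if nx == WIDTH - 1 else 0.0
    return nx, reward

def render(x: int, background: int, agent: int, brightness: float) -> List[int]:
    """Observation process: latent state plus visual parameters -> pixels."""
    row = [background] * WIDTH
    row[x] = agent
    return [min(255, int(p * brightness)) for p in row]

def rollout(actions: List[int], visual: Dict) -> Tuple[int, float, List[List[int]]]:
    """Run a fixed action sequence and record the rendered frames."""
    x, total, frames = 0, 0.0, []
    for a in actions:
        frames.append(render(x, **visual))
        x, r = step(x, a)
        total += r
    return x, total, frames

actions = [1] * 7
train_visual = dict(background=0, agent=200, brightness=1.0)
eval_visual = dict(background=90, agent=200, brightness=1.0)  # background axis shifted

x_tr, ret_tr, frames_tr = rollout(actions, train_visual)
x_ev, ret_ev, frames_ev = rollout(actions, eval_visual)
```

With the action sequence held fixed, the final state and return are identical under both visual settings and only the frames differ. A pixel policy chooses its actions from those frames, so that is the only channel through which a visual shift can change performance in this toy setup.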

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Neural Networks and Reservoir Computing