KAGE-Bench: Fast Known-Axis Visual Generalization Evaluation for Reinforcement Learning
Egor Cherepanov, Daniil Zelezetsky, Alexey K. Kovalev, Aleksandr I. Panov

TL;DR
KAGE-Bench is a fast, systematic benchmark for evaluating visual generalization in reinforcement learning. It isolates individual visual shifts by building on KAGE-Env, an environment whose observations are factorized into independently controllable visual axes.
Contribution
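The paper presents KAGE-Env, a JAX-native 2D platformer that factorizes the observation process into independently controllable visual axes while keeping the underlying control problem fixed, and KAGE-Bench, a benchmark of six known-axis suites (34 train-evaluation configuration pairs) for controlled and rapid analysis of visual generalization in RL.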
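The idea of factorizing visual axes away from the control problem can be illustrated with a toy sketch. This is not KAGE-Env's API: the `VisualConfig` fields, the 1-D track, and the function names are all hypothetical, and plain Python is used instead of JAX to keep the example dependency-free. The point is only that `step` never sees the visual configuration, while `render` depends on nothing else besides the latent state.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class VisualConfig:
    """Hypothetical visual axes; names are illustrative, not KAGE-Env's API."""
    background: int = 0       # grey level of empty cells
    agent_color: int = 200    # grey level of the agent cell
    brightness: float = 1.0   # photometric scale applied to every pixel

WIDTH, GOAL = 8, 7  # length of the 1-D track and index of the goal cell

def step(x: int, action: int) -> tuple[int, float, bool]:
    """Latent dynamics and reward. No visual parameter appears here."""
    new_x = max(0, min(WIDTH - 1, x + action))
    reward = float(new_x - x)  # reward forward progress
    done = new_x == GOAL
    return new_x, reward, done

def render(x: int, cfg: VisualConfig) -> list[int]:
    """Observation process: a function of the latent state and the visual config."""
    row = [cfg.background] * WIDTH
    row[x] = cfg.agent_color
    return [min(255, int(p * cfg.brightness)) for p in row]

def rollout(actions: list[int], cfg: VisualConfig):
    """Replay a fixed action sequence and record states, rewards, and frames."""
    x, states, rewards, frames = 0, [], [], []
    for a in actions:
        frames.append(render(x, cfg))
        x, r, _ = step(x, a)
        states.append(x)
        rewards.append(r)
    return states, rewards, frames

actions = [1, 1, -1, 1, 1]
train_cfg = VisualConfig()
eval_cfg = VisualConfig(background=120, brightness=0.5)  # shift two visual axes

s_train, r_train, f_train = rollout(actions, train_cfg)
s_eval, r_eval, f_eval = rollout(actions, eval_cfg)
```

With the action sequence held fixed, the latent states and rewards are identical under both configurations and only the frames differ. A pixel policy, which picks actions from frames, is therefore the only channel through which a visual shift could change performance in this toy setup.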
Findings
Background and photometric shifts often cause agent failure.
Agent-appearance shifts are less impactful on performance.
Visual shifts can break task completion without affecting reward.
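The last finding suggests reporting success rate alongside return. A minimal sketch of such a comparison follows, assuming each episode is summarized as a `(return, reached_goal)` pair. The helper names and the episode values are made up for illustration and are not results from the paper.

```python
from statistics import mean

def summarize(episodes):
    """Mean return and success rate over (episode_return, reached_goal) pairs."""
    returns = [ret for ret, _ in episodes]
    successes = [1.0 if ok else 0.0 for _, ok in episodes]
    return mean(returns), mean(successes)

def gaps(train_eps, eval_eps):
    """Train-minus-eval gap for both metrics."""
    tr_ret, tr_succ = summarize(train_eps)
    ev_ret, ev_succ = summarize(eval_eps)
    return {"return_gap": tr_ret - ev_ret, "success_gap": tr_succ - ev_succ}

# Made-up episodes for illustration only (NOT results from the paper).
train_eps = [(10.0, True), (9.0, True), (10.0, True), (9.0, False)]
eval_eps = [(9.0, False), (9.0, False), (10.0, True), (8.0, False)]

result = gaps(train_eps, eval_eps)
```

In this toy data the mean return barely moves (9.5 to 9.0) while the success rate falls from 0.75 to 0.25, which is the kind of divergence a return-only report would hide.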
Abstract
Pixel-based reinforcement learning agents often fail under purely visual distribution shift even when latent dynamics and rewards are unchanged, but existing benchmarks entangle multiple sources of shift and hinder systematic analysis. We introduce KAGE-Env, a JAX-native 2D platformer that factorizes the observation process into independently controllable visual axes while keeping the underlying control problem fixed. By construction, varying a visual axis affects performance only through the induced state-conditional action distribution of a pixel policy, providing a clean abstraction for visual generalization. Building on this environment, we define KAGE-Bench, a benchmark of six known-axis suites comprising 34 train-evaluation configuration pairs that isolate individual visual shifts. Using a standard PPO-CNN baseline, we observe strong axis-dependent failures, with background and…
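The "induced state-conditional action distribution" can be written out as follows. The notation is an assumption made here for illustration, not necessarily the paper's: $s$ is the latent state, $\xi$ a visual configuration, $O_\xi(o \mid s)$ the observation kernel under $\xi$, and $\pi(a \mid o)$ a reactive (memoryless) pixel policy.

```latex
% Policy induced on latent states by rendering under visual configuration \xi
\pi_\xi(a \mid s) \;=\; \sum_{o} O_\xi(o \mid s)\, \pi(a \mid o)

% Expected return: transition kernel P and reward r do not depend on \xi,
% so \xi enters only through \pi_\xi
J(\pi; \xi) \;=\; \mathbb{E}_{\tau \sim (P,\, \pi_\xi)}
  \Big[ \sum_{t \ge 0} \gamma^{t}\, r(s_t, a_t) \Big]
```

Under these assumptions, two visual configurations that induce the same $\pi_\xi$ yield the same return, so any train-evaluation gap must come from a change in the induced policy.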
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Neural Networks and Reservoir Computing
