Deep Intrinsic Surprise-Regularized Control (DISRC): A Biologically Inspired Mechanism for Efficient Deep Q-Learning in Sparse Environments

Yash Kini; Shiv Davay; Shreya Polavarapu

arXiv:2601.17598 · cs.NE · January 27, 2026

Open Access

TL;DR

DISRC enhances deep Q-learning in sparse environments by dynamically scaling updates based on a surprise signal, leading to faster and more stable learning compared to standard methods.

Contribution

This paper introduces DISRC, a biologically inspired mechanism that modulates Q-learning updates using a surprise measure, improving efficiency and stability in sparse-reward settings.

Findings

01. DISRC reaches successful episodes 33% faster than baseline.

02. DISRC achieves higher final rewards and AUC in tested environments.

03. DISRC demonstrates more consistent and stable learning performance.

Abstract

Deep reinforcement learning (DRL) has driven major advances in autonomous control. Still, standard Deep Q-Network (DQN) agents tend to rely on fixed learning rates and uniform update scaling, even as updates are modulated by temporal-difference (TD) error. This rigidity destabilizes convergence, especially in sparse-reward settings where feedback is infrequent. We introduce Deep Intrinsic Surprise-Regularized Control (DISRC), a biologically inspired augmentation to DQN that dynamically scales Q-updates based on latent-space surprise. DISRC encodes states via a LayerNorm-based encoder and computes a deviation-based surprise score relative to a moving latent setpoint. Each update is then scaled in proportion to both TD error and surprise intensity, promoting plasticity during early exploration and stability as familiarity increases. We evaluate DISRC on two sparse-reward MiniGrid…
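The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `SurpriseScaler`, the exponential-moving-average setpoint, the Euclidean deviation, and the `1 + beta * tanh(surprise)` scaling rule are all assumptions chosen for illustration, and a scalar Q-value stands in for the deep network. It shows only the general idea of normalizing a state encoding, measuring its deviation from a moving latent setpoint, and using that deviation to scale a TD update.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (LayerNorm without learned gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


class SurpriseScaler:
    """Hypothetical helper: tracks a moving latent setpoint and turns deviation into an update scale."""

    def __init__(self, dim, momentum=0.99, beta=1.0):
        self.setpoint = [0.0] * dim   # moving latent setpoint (assumed to start at zero)
        self.momentum = momentum      # assumed EMA momentum for the setpoint
        self.beta = beta              # assumed weight of surprise in the scale factor

    def surprise(self, latent):
        # Assumed surprise measure: Euclidean distance from the current setpoint.
        return math.sqrt(sum((z - s) ** 2 for z, s in zip(latent, self.setpoint)))

    def scale(self, latent):
        # Compute the scale from the current surprise, then move the setpoint toward the latent.
        s = self.surprise(latent)
        m = self.momentum
        self.setpoint = [m * sp + (1.0 - m) * z for sp, z in zip(self.setpoint, latent)]
        # Assumed bounded scaling: 1 for familiar states, up to 1 + beta for novel ones.
        return 1.0 + self.beta * math.tanh(s)


def scaled_q_update(q, td_target, lr, scale):
    """Scalar stand-in for a Q-update: the step is proportional to TD error and the surprise scale."""
    td_error = td_target - q
    return q + lr * scale * td_error
```

Under these assumptions, repeatedly visiting the same encoded state pulls the setpoint toward it, so the scale factor decays toward 1. This mirrors the abstract's description of plasticity during early exploration and stability as familiarity increases.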

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)