Never Reset Again: A Mathematical Framework for Continual Inference in Recurrent Neural Networks

Bojian Yin; Federico Corradi

arXiv:2412.15983·cs.LG·December 23, 2024

Open Access

TL;DR

This paper introduces a novel adaptive loss function for RNNs that enables continual inference without resets, maintaining accuracy over long sequences and improving streaming application performance.

Contribution

The paper presents a loss function that combines cross-entropy with Kullback-Leibler divergence so that RNNs can run continuously without hidden-state resets, removing the need to synchronize resets with input boundaries at inference.
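To make the idea of mixing cross-entropy with a Kullback-Leibler term concrete, here is a minimal sketch in plain Python. It is not the paper's loss: the exact formulation is not given on this page, so the sketch assumes one simple reading in which informative time steps are scored with cross-entropy against the label and uninformative (noise) steps are scored with the KL divergence from the prediction to a uniform distribution. The names `mixed_loss`, `informative`, and `kl_weight` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_to_uniform(probs):
    """KL(probs || uniform): zero when the prediction is uniform."""
    k = len(probs)
    return sum(p * math.log(p * k) for p in probs if p > 0.0)


def mixed_loss(logits, label, informative, kl_weight=1.0):
    """Assumed mixing rule (hypothetical, not taken from the paper):
    cross-entropy on informative steps, a KL pull toward the uniform
    distribution on noise steps."""
    probs = softmax(logits)
    if informative:
        return cross_entropy(probs, label)
    return kl_weight * kl_to_uniform(probs)


# A confident prediction is rewarded on an informative step
# and penalized on a noise step.
confident = [5.0, 0.0, 0.0]
loss_on_signal = mixed_loss(confident, label=0, informative=True)
loss_on_noise = mixed_loss(confident, label=0, informative=False)
```

Under this assumption, the KL term discourages the network from committing to a class when the input carries no information. The hard `informative` switch is the simplest possible choice; a soft, per-step weighting would be another option.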

Findings

01. Outperforms reset-based methods in continual tasks

02. Maintains stable representations over extended sequences

03. Enhances RNN capabilities for streaming applications

Abstract

Recurrent Neural Networks (RNNs) are widely used for sequential processing but face fundamental limitations with continual inference due to state saturation, requiring disruptive hidden state resets. However, reset-based methods impose synchronization requirements with input boundaries and increase computational costs at inference. To address this, we propose an adaptive loss function that eliminates the need for resets during inference while preserving high accuracy over extended sequences. By combining cross-entropy and Kullback-Leibler divergence, the loss dynamically modulates the gradient based on input informativeness, allowing the network to differentiate meaningful data from noise and maintain stable representations over time. Experimental results demonstrate that our reset-free approach outperforms traditional reset-based methods when applied to a variety of RNNs, particularly…
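The synchronization requirement mentioned in the abstract can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not the authors' setup: it uses a single scalar tanh recurrent unit with hypothetical weights `w` and `u`, and a hypothetical `reset_at` argument listing the time steps at which the hidden state is zeroed. Reset-based inference needs those boundary indices at run time; reset-free inference does not.

```python
import math


def step(h, x, w=0.9, u=1.0):
    """One update of a scalar tanh recurrent unit (toy model)."""
    return math.tanh(w * h + u * x)


def run_stream(xs, reset_at=()):
    """Process a stream, zeroing the state at the given indices.

    With an empty reset_at this is reset-free inference: the state
    is carried across every input boundary.
    """
    resets = set(reset_at)
    h = 0.0
    states = []
    for t, x in enumerate(xs):
        if t in resets:
            h = 0.0  # requires knowing where the new input starts
        h = step(h, x)
        states.append(h)
    return states


# Two back-to-back inputs; the second one starts at index 3.
stream = [1.0, 1.0, 1.0, -1.0, -1.0]
with_reset = run_stream(stream, reset_at=[3])
no_reset = run_stream(stream)
```

With a reset at index 3, the states for the second input match those of running it on its own, but only because the boundary was known in advance. Without the reset, the state at index 3 still carries the first input, which is the situation a reset-free training objective has to make the network robust to.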

Taxonomy

Topics: Neural Networks and Applications

Methods: Adaptive Robust Loss