NoRIN: Backbone-Adaptive Reversible Normalization for Time-Series Forecasting

Shun Zhang; Yuyang Xiao

arXiv:2605.10823 · cs.LG · May 12, 2026

TL;DR

NoRIN is a non-linear reversible normalization for time-series forecasting. Unlike affine methods, it can reshape the input distribution (tailedness and skewness), and it optimizes its shape parameters separately from backbone training.

Contribution

The paper proposes a non-linear reversible normalization whose shape parameters are optimized separately from the backbone, which avoids the degeneration problem that arises when they are trained jointly.

Findings

01 Decoupled shape optimization improves forecasting performance.

02 Shape parameters vary systematically across backbones and datasets.

03 NoRIN outperforms affine RevIN in diverse time-series tasks.

Abstract

Reversible instance normalization (RevIN) and its successors (Dish-TS, SAN, FAN) have become the de facto plug-in for time-series forecasting, yet the map they apply to each data point is strictly affine, $x \mapsto a x + b$, so they cannot reshape the underlying distribution: heavy tails remain heavy and skewness remains uncorrected. We propose NoRIN, a non-linear reversible normalization based on the arcsinh-form Johnson $S_{U}$ transform with two shape parameters $(\delta, \epsilon)$ that control tailedness and skewness; the linear $Z$-score used by RevIN is recovered only in the limit $\delta \to \infty$. Training $(\delta, \epsilon)$ jointly with the backbone via gradient descent reliably pushes them toward this linear limit within a few epochs. We name this phenomenon the degeneration problem: the forecasting loss is locally indifferent to shape, and the high-capacity backbone…
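To make the idea of an arcsinh-based reversible normalization concrete, here is a minimal NumPy sketch. The abstract excerpt does not give the exact parameterization, so the form used here, $y = \delta \,\mathrm{arcsinh}((u - \epsilon)/\delta)$ applied to the per-instance $Z$-score $u$, is an assumption chosen only because it is invertible and becomes affine as $\delta \to \infty$. The function names `asinh_normalize` and `asinh_denormalize` are hypothetical and are not taken from the paper.

```python
import numpy as np

def asinh_normalize(x, delta, eps, tiny=1e-8):
    """Per-instance z-score followed by an arcsinh reshaping (assumed form)."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True) + tiny
    u = (x - mean) / std                       # affine step, as in RevIN
    y = delta * np.arcsinh((u - eps) / delta)  # non-linear step (assumption)
    return y, (mean, std)

def asinh_denormalize(y, stats, delta, eps):
    """Exact inverse of asinh_normalize, using the stored instance statistics."""
    mean, std = stats
    u = delta * np.sinh(y / delta) + eps       # undo the arcsinh reshaping
    return u * std + mean                      # undo the z-score

rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=(4, 96))  # heavy-tailed toy windows

# Round trip: normalize, then denormalize with the same parameters.
y, stats = asinh_normalize(x, delta=1.5, eps=0.2)
x_rec = asinh_denormalize(y, stats, delta=1.5, eps=0.2)
print(np.allclose(x_rec, x))

# Large delta: the map approaches the plain z-score (here with eps = 0).
y_lin, (mean, std) = asinh_normalize(x, delta=1e6, eps=0.0)
print(np.allclose(y_lin, (x - mean) / std))
```

In this sketch a small `delta` compresses large deviations, while a very large `delta` leaves the $Z$-score essentially unchanged, which mirrors the linear limit described in the abstract. How the paper actually selects the shape parameters outside of backbone training is not covered by the excerpt and is not modeled here.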
