Layer-wise Derivative Controlled Networks
Rowan Martnishn, Sean Anderson

TL;DR
ChainzRule introduces a derivative-aware neural architecture that balances accuracy, efficiency, and stability by regularizing intermediate derivatives, outperforming standard models with fewer parameters.
Contribution
The paper proposes ChainzRule, a novel neural architecture using Differential Regularization (DREG) to control derivatives, enhancing stability without sacrificing representational power.
Findings
Outperformed standard models on benchmarks while using 15.5x fewer parameters.
Reduced peak gradient volatility by 23.1% on MNIST.
Achieved 70.17% accuracy on Yelp Full ordinal regression.
Abstract
As machine learning models grow in complexity, they increasingly struggle with three conflicting demands: the need for high accuracy, the requirement for hardware efficiency, and the necessity of functional stability. Traditional architectures often achieve performance at the expense of spiky or unpredictable behavior, where small changes in input lead to massive swings in output -- a critical flaw for real-world deployment in sensitive environments. This paper introduces ChainzRule (CR), a novel neural architecture designed to harmonize these competing goals. ChainzRule replaces standard piecewise-linear activations with a Polynomial Engine governed by Differential Regularization (DREG). Unlike traditional methods that impose global, coarse-grained constraints on a model's Lipschitz constant, DREG acts as a targeted regularization on intermediate derivatives. This approach suppresses…
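To make the idea of regularizing intermediate derivatives concrete, here is a minimal pure-Python sketch. It is not the paper's implementation: the abstract excerpt does not give the form of the Polynomial Engine or the DREG term, so the learnable polynomial activation, the mean-squared-derivative penalty, the weight `lam`, and every function name below are assumptions chosen for illustration.

```python
# Illustrative sketch only. The penalty form and all names are assumptions,
# not definitions taken from the ChainzRule paper.

def poly(coeffs, z):
    """Evaluate p(z) = c0 + c1*z + c2*z^2 + ... with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * z + c
    return result


def poly_derivative(coeffs, z):
    """Evaluate p'(z) = c1 + 2*c2*z + 3*c3*z^2 + ..."""
    deriv_coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return poly(deriv_coeffs, z)


def derivative_penalty(coeffs_per_layer, preacts_per_layer):
    """Mean squared activation derivative per layer, averaged over layers.

    Assumed stand-in for a layer-wise derivative regularizer: it grows when
    any layer's activation responds steeply at the inputs it actually sees.
    """
    layer_terms = []
    for coeffs, preacts in zip(coeffs_per_layer, preacts_per_layer):
        squared = [poly_derivative(coeffs, z) ** 2 for z in preacts]
        layer_terms.append(sum(squared) / len(squared))
    return sum(layer_terms) / len(layer_terms)


def total_loss(task_loss, coeffs_per_layer, preacts_per_layer, lam=0.01):
    """Task loss plus the weighted derivative penalty (lam is hypothetical)."""
    return task_loss + lam * derivative_penalty(coeffs_per_layer, preacts_per_layer)
```

The point of the sketch is the contrast drawn in the abstract: the penalty is computed per layer at the pre-activations each layer receives, rather than as a single global bound on the whole network's Lipschitz constant.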
