AdaCubic: An Adaptive Cubic Regularization Optimizer for Deep Learning

Ioannis Tsingalis, Constantine Kotropoulos, and Corentin Briat

arXiv:2604.09437·cs.LG·April 13, 2026

TL;DR

AdaCubic is an adaptive optimizer for deep learning that dynamically adjusts the weight of the cubic term in the cubically regularized Newton method. It inherits that method's local convergence guarantees and performs competitively with a single fixed set of hyperparameters.

Contribution

It introduces AdaCubic, the first scalable deep learning optimizer leveraging cubic regularization with adaptive weight adjustment and fixed hyperparameters.

Findings

01

AdaCubic outperforms or matches existing optimizers in vision, NLP, and signal processing tasks.

02

It inherits the local convergence guarantees of the cubically regularized Newton method.

03

AdaCubic is evaluated with a single fixed set of hyperparameters and no per-task tuning, which simplifies practical deployment.
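
For orientation, the classical cubically regularized Newton step that these findings build on can be written as below. This is the standard textbook form (Nesterov and Polyak), not necessarily the paper's notation: the symbols f, x_k, s, and M are assumptions introduced here, and M is the cubic-term weight that AdaCubic is described as adapting rather than fixing.

```latex
% Standard cubic-regularized Newton step (notation assumed, not taken from the paper)
\[
  s_k \;=\; \arg\min_{s \in \mathbb{R}^n}\;
  \nabla f(x_k)^{\top} s
  \;+\; \tfrac{1}{2}\, s^{\top} \nabla^2 f(x_k)\, s
  \;+\; \tfrac{M}{6}\, \lVert s \rVert^{3},
  \qquad
  x_{k+1} \;=\; x_k + s_k .
\]
```

A larger M shrinks the step and a smaller M moves it toward the plain Newton step, which is why the choice of this weight matters in practice.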

Abstract

A novel regularization technique, AdaCubic, is proposed that adapts the weight of the cubic term. The heart of AdaCubic is an auxiliary optimization problem with cubic constraints that dynamically adjusts the weight of the cubic term in Newton's cubic regularized method. We use Hutchinson's method to approximate the Hessian matrix, thereby reducing computational cost. We demonstrate that AdaCubic inherits the cubically regularized Newton method's local convergence guarantees. Our experiments in Computer Vision, Natural Language Processing, and Signal Processing tasks demonstrate that AdaCubic outperforms or competes with several widely used optimizers. Unlike other adaptive algorithms that require hyperparameter fine-tuning, AdaCubic is evaluated with a fixed set of hyperparameters, rendering it a highly attractive optimizer in settings where fine-tuning is infeasible. This makes…
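
The Hutchinson step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes the common variant that estimates the Hessian diagonal from Hessian-vector products with random Rademacher vectors; the abstract does not say which part of the Hessian AdaCubic estimates, so this is an illustration of the general technique and not the authors' implementation. The function name `hutchinson_diag`, the test matrix `A`, and the sample count are all hypothetical.

```python
import numpy as np

def hutchinson_diag(hvp, dim, num_samples=100, seed=0):
    """Estimate diag(H) using only Hessian-vector products.

    hvp: callable mapping a vector v to H @ v (assumed to be available,
         e.g. from automatic differentiation in a real training loop).
    """
    rng = np.random.default_rng(seed)
    est = np.zeros(dim)
    for _ in range(num_samples):
        # Rademacher vector: entries are +1 or -1 with equal probability.
        z = rng.integers(0, 2, size=dim) * 2.0 - 1.0
        # E[z * (H z)] = diag(H), because E[z_i z_j] = 0 for i != j.
        est += z * hvp(z)
    return est / num_samples

# Toy check on a quadratic whose Hessian A is known explicitly (hypothetical data).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
diag_est = hutchinson_diag(lambda v: A @ v, dim=3, num_samples=2000)
print(diag_est)  # approaches [4, 3, 2] as the sample count grows
```

The appeal of this estimator is that it never forms the full Hessian: each sample costs one Hessian-vector product, which automatic differentiation frameworks can compute at roughly the cost of a few gradient evaluations.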
