NysAct: A Scalable Preconditioned Gradient Descent using Nystrom Approximation

Hyunseok Seung; Jaewoo Lee; Hyunsuk Ko

arXiv:2506.08360·cs.LG·June 11, 2025

TL;DR

NysAct is a scalable preconditioning method that uses Nystrom approximation to improve optimization efficiency and generalization in neural networks, balancing the benefits of first- and second-order methods.

Contribution

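Introduces NysAct, a gradient preconditioning method that approximates the activation covariance matrix with an eigenvalue-shifted Nystrom method and uses that approximation as the preconditioner, giving second-order-like updates at close to first-order cost.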

Findings

01  Achieves higher test accuracy than both the first-order and the second-order methods it is compared against.

02  Requires considerably less computation and memory than existing second-order methods.

03  The Nystrom approximation of the activation covariance matrix lowers time and memory complexity with minimal impact on test accuracy.

Abstract

Adaptive gradient methods are computationally efficient and converge quickly, but they often suffer from poor generalization. In contrast, second-order methods enhance convergence and generalization but typically incur high computational and memory costs. In this work, we introduce NysAct, a scalable first-order gradient preconditioning method that strikes a balance between state-of-the-art first-order and second-order optimization methods. NysAct leverages an eigenvalue-shifted Nystrom method to approximate the activation covariance matrix, which is used as a preconditioning matrix, significantly reducing time and memory complexities with minimal impact on test accuracy. Our experiments show that NysAct not only achieves improved test accuracy compared to both first-order and second-order methods but also demands considerably less computational resources than existing second-order…
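
The preconditioning step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes uniform column sampling, a fixed shift `rho` added to the Nystrom approximation to keep it invertible, and a right-multiplication of a layer's weight gradient by the inverse of the shifted approximation. The names `nystrom_columns` and `precondition` and all sizes are hypothetical, and the paper's exact eigenvalue-shifting scheme may differ.

```python
import numpy as np

def nystrom_columns(A, rank, rng):
    """Sample `rank` columns of a symmetric PSD matrix A (uniformly, an assumption).

    Returns C (n x rank) and W (rank x rank) so that A is approximated by
    C @ inv(W) @ C.T, the standard Nystrom approximation.
    """
    n = A.shape[0]
    idx = rng.choice(n, size=rank, replace=False)
    C = A[:, idx]   # sampled columns
    W = C[idx, :]   # intersection block of the sampled rows and columns
    return C, W

def precondition(grad, C, W, rho):
    """Return grad @ inv(C @ inv(W) @ C.T + rho * I) without forming an n x n inverse.

    Uses the Woodbury identity (assumes W is invertible and rho > 0):
        inv(rho*I + C inv(W) C^T) = (I - C inv(rho*W + C^T C) C^T) / rho
    so only a rank x rank system is solved.
    """
    small = rho * W + C.T @ C                               # rank x rank
    correction = (grad @ C) @ np.linalg.solve(small, C.T)   # same shape as grad
    return (grad - correction) / rho

# Toy data: activations of a hypothetical linear layer with 16 inputs, 4 outputs.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 16))    # batch of input activations
A = X.T @ X / X.shape[0]              # activation covariance (16 x 16)
grad = rng.standard_normal((4, 16))   # weight gradient, out_features x in_features

C, W = nystrom_columns(A, rank=8, rng=rng)
update = precondition(grad, C, W, rho=0.1)
```

Only `C` and `W` need to be stored, which is where the memory saving over keeping a full covariance inverse would come from in a sketch like this. When `rank` equals the full dimension and `A` is positive definite, the approximation is exact and the result matches `grad @ inv(A + rho * I)`.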

Code & Models

Repositories

hseung88/nysact (PyTorch, official)