L-SR1: Learned Symmetric-Rank-One Preconditioning

Gal Lifshitz; Shahar Zuler; Ori Fouks; Dan Raviv

arXiv:2508.12270·cs.LG·August 19, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces L-SR1, a learned second-order optimizer with a trainable preconditioning unit that improves convergence and generalization in optimization tasks, demonstrated on Monocular Human Mesh Recovery.

Contribution

It presents a novel learned second-order optimizer that integrates a trainable preconditioning unit into the classical SR1 algorithm, enhancing efficiency and applicability.

Findings

01. Outperforms existing learned optimizers on HMR task

02. Requires no annotated data or fine-tuning

03. Offers strong generalization and lightweight design

Abstract

End-to-end deep learning has achieved impressive results but remains limited by its reliance on large labeled datasets, poor generalization to unseen scenarios, and growing computational demands. In contrast, classical optimization methods are data-efficient and lightweight but often suffer from slow convergence. While learned optimizers offer a promising fusion of both worlds, most focus on first-order methods, leaving learned second-order approaches largely unexplored. We propose a novel learned second-order optimizer that introduces a trainable preconditioning unit to enhance the classical Symmetric-Rank-One (SR1) algorithm. This unit generates data-driven vectors used to construct positive semi-definite rank-one matrices, aligned with the secant constraint via a learned projection. Our method is evaluated through analytic experiments and on the real-world task of Monocular Human…
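For readers unfamiliar with the classical baseline the abstract builds on, the following is a minimal sketch of the textbook SR1 update applied to an inverse-Hessian approximation (a preconditioner). It is not the paper's learned method: the function names `matvec`, `dot`, and `sr1_inverse_update`, the `eps` safeguard threshold, and the toy quadratic are all assumptions made for illustration. The sketch shows that the updated matrix satisfies the secant condition `H_new y = s`. In classical SR1 the rank-one term can have either sign, so positive semi-definiteness is not guaranteed; the abstract states that the learned unit instead constructs positive semi-definite rank-one matrices.

```python
def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(a, b):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def sr1_inverse_update(H, s, y, eps=1e-8):
    """Classical SR1 update of an inverse-Hessian approximation H.

    s = x_new - x_old (step), y = grad_new - grad_old (gradient change).
    Returns H_new = H + r r^T / (r^T y) with r = s - H y, so that
    H_new y = s (the secant condition). The update is skipped when the
    denominator is too small, which is the usual SR1 safeguard.
    """
    r = [si - hi for si, hi in zip(s, matvec(H, y))]
    denom = dot(r, y)
    if abs(denom) <= eps * (dot(r, r) ** 0.5) * (dot(y, y) ** 0.5):
        return [row[:] for row in H]  # skip: keep the previous approximation
    n = len(r)
    return [[H[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]


# Toy example (hypothetical): f(x) = 0.5 * x^T A x with A = diag(2, 4),
# so a step s produces the gradient change y = A s.
H0 = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 1.0]
y = [2.0, 4.0]
H1 = sr1_inverse_update(H0, s, y)
secant_residual = [a - b for a, b in zip(matvec(H1, y), s)]
```

In this toy case the denominator `r^T y` is negative, so the rank-one correction is negative semi-definite even though the secant condition holds, which illustrates why a construction that keeps the correction positive semi-definite is of interest.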

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 4

Strengths

The paper introduces a lightweight, self-supervised learned optimizer that integrates a trainable preconditioning unit into the SR1 framework. It also introduces a learned projection mechanism that enforces both the secant condition and positive semi-definiteness, preserving core quasi-Newton properties within a learned architecture. Experiments show that the proposed method works well on HMR. The paper is well written and easy to read.

Weaknesses

The claimed generalization of the proposed Learned-SR1 is not effectively validated. Currently, there is only a simple evaluation on the HMR task, and the compared baselines do not represent the current state of the art for HMR. The paper lacks comparison with more optimization algorithms, e.g., AdamW, AdaHessian, etc. Moreover, the theoretical analysis of the learned projection mechanism is insufficient. Currently, the evaluation is conducted only on a single dataset (3DPW), which fails to demonst

Reviewer 02 · Rating 6 · Confidence 3

Strengths

The theoretical foundation of this work is presented with clarity.

Weaknesses

The method is tested only on HMR, a specific application in image processing. It is not known whether the method is applicable to other tasks.

Reviewer 03 · Rating 2 · Confidence 3

Strengths

- Principled design bridging QN and learning. The method is explicitly grounded in the QN update, the secant condition, and the need for PSD preconditioners for descent directions; the learned projection aims to satisfy both simultaneously.
- Lightweight, limited-memory, dimension-invariant formulation. L-SR1 uses rank-one outer products with a fixed-size buffer and element-wise modules to generalize across problem sizes without retraining.
- Learned projection and per-coordinate step sizes

Weaknesses

- **Insufficient Comparative Analysis**: The paper should compare against a wider set of HMR methods (both optimization-based and modern learned/learnable refiners beyond LGD/SPIN) and report more metrics (e.g., MPJPE, PA-MPJPE, PVE, jitter/contact, temporal stability) on more datasets (e.g., Human3.6M, EHF, AGORA) under matched settings. As written, the HMR main table includes only a few baselines. Moreover, the paper fails to provide an analysis of the computational cost and runtime of the pro

Taxonomy

Topics: Probabilistic and Robust Engineering Design · Nuclear reactor physics and engineering · Educational Robotics and Engineering