Efficient and provably convergent end-to-end training of deep neural networks with linear constraints

Zonglin Yang; Zhexuan Gu; Yancheng Yuan

arXiv:2605.11526·math.OC·May 13, 2026

TL;DR

This paper introduces an efficiently computable HS-Jacobian for projection layers, which allows deep neural networks whose layer outputs must satisfy linear constraints to be trained end to end with efficient backpropagation and convergence guarantees. Experiments in finance, vision, and architecture design report superior performance.

Contribution

We develop an HS-Jacobian for projection layers, prove that it is conservative, and integrate it into backpropagation, which yields the first convergence guarantees for constrained deep network training.
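For readers unfamiliar with the two terms used here, the standard definitions can be written as follows. The notation ($C$, $A$, $b$, $B$, $d$, $\Pi_C$, $D$, $F$, $\gamma$) is assumed for illustration and is not taken from the paper; the conservativity condition is the usual one from the nonsmooth automatic differentiation literature, not a statement of the paper's specific result.

```latex
% Euclidean projection onto a polyhedral set (notation assumed)
\Pi_C(x) \;=\; \operatorname*{arg\,min}_{z \in C} \; \tfrac{1}{2}\,\lVert z - x \rVert_2^2,
\qquad
C \;=\; \{\, z \in \mathbb{R}^n : A z = b,\; B z \le d \,\}.

% Conservative mapping: a set-valued map D with nonempty compact values and
% closed graph is conservative for a locally Lipschitz F if, for every
% absolutely continuous curve \gamma : [0,1] \to \mathbb{R}^n,
\frac{\mathrm{d}}{\mathrm{d}t}\, F(\gamma(t)) \;=\; V\,\dot{\gamma}(t)
\quad \text{for all } V \in D(\gamma(t)), \ \text{for almost every } t \in [0,1].
```

The second condition is a chain rule along curves, which is what lets an element of $D$ stand in for a derivative inside backpropagation.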

Findings

01. HS-Jacobian enables seamless backpropagation with linear constraints.

02. The proposed Adam-based algorithm converges for linearly constrained networks.

03. Experiments show superior performance in finance, vision, and architecture design.

Abstract

Training a deep neural network with the outputs of selected layers satisfying linear constraints is required in many contemporary data-driven applications. While this can be achieved by incorporating projection layers into the neural network, end-to-end training of the resulting network remains challenging due to the lack of rigorous theory and efficient algorithms for backpropagation. A key difficulty in developing such theory and algorithms arises from the nonsmoothness of the solution mapping of the projection layer. To address this bottleneck, we introduce an efficiently computable HS-Jacobian for the projection layer. Importantly, we prove that the HS-Jacobian is a conservative mapping for the projection operator onto the polyhedral set, enabling its seamless integration into the nonsmooth automatic differentiation framework for backpropagation. Therefore, many efficient…
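The nonsmoothness mentioned in the abstract can be seen in the simplest polyhedral set, a box. The sketch below is an illustration only and is not the paper's HS-Jacobian construction: it projects onto a box, picks one element of the generalized Jacobian (a 0/1 diagonal), and uses it for a backward pass. The function names `project_box`, `jacobian_diag`, and `backward` are hypothetical.

```python
def project_box(x, lo, hi):
    """Euclidean projection of x onto the box {z : lo <= z <= hi}.

    For a box the projection separates by coordinate and reduces to clipping.
    """
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def jacobian_diag(x, lo, hi):
    """Diagonal of ONE generalized-Jacobian element of project_box at x.

    Strictly inside the bounds the projection is the identity (entry 1.0);
    strictly outside it is constant (entry 0.0). Exactly on a bound the
    scalar clip function is not differentiable and any value in [0, 1] is a
    valid Clarke subgradient; choosing 0.0 there is an assumption of this
    sketch.
    """
    return [1.0 if l < xi < h else 0.0 for xi, l, h in zip(x, lo, hi)]


def backward(x, lo, hi, upstream):
    """Vector-Jacobian product: pass the upstream gradient through the layer."""
    return [d * g for d, g in zip(jacobian_diag(x, lo, hi), upstream)]


if __name__ == "__main__":
    x = [-0.5, 0.3, 1.7]
    lo = [0.0, 0.0, 0.0]
    hi = [1.0, 1.0, 1.0]
    print(project_box(x, lo, hi))                 # [0.0, 0.3, 1.0]
    print(backward(x, lo, hi, [2.0, 3.0, 4.0]))   # [0.0, 3.0, 0.0]
```

Gradients are blocked on coordinates where a bound is active and pass unchanged elsewhere. For a general polyhedral set the projection has no closed form and the active constraints couple the coordinates, which is the setting the abstract addresses.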
