Towards Understanding Gradient Approximation in Equality Constrained Deep Declarative Networks
Stephen Gould, Ming Xu, Zhiwei Xu, Yanbin Liu

TL;DR
This paper investigates when approximating gradients in deep declarative networks with equality constraints is effective, providing theoretical insights and practical examples to guide efficient training.
Contribution
The paper offers a theoretical analysis of the conditions under which gradient approximation is valid in deep declarative networks with linear equality constraints and normalization constraints, and highlights the practical implications for training.
Findings
Approximate gradients that ignore constraint terms can often be used while still yielding a descent direction for the global loss.
Theoretical conditions for successful gradient approximation are established.
Examples demonstrate both effectiveness and limitations of the approximation.
Abstract
We explore conditions under which the gradient of a deep declarative node can be approximated by ignoring constraint terms while still yielding a descent direction for the global loss function. This has important practical applications when training deep learning models, since the approximation is often much more computationally efficient than the true gradient calculation. We provide theoretical analysis for problems with linear equality constraints and normalization constraints, and show examples where the approximation works well in practice as well as some cautionary tales for when it fails.
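To make the idea of "ignoring constraint terms" concrete, the following is a minimal sketch and is not taken from the paper's code. It assumes the standard implicit-differentiation result for a node y(x) = argmin_u f(x, u) subject to A u = b, namely Dy = -H^{-1}B + H^{-1}A^T (A H^{-1} A^T)^{-1} A H^{-1} B, where H is the Hessian of f in u and B is the mixed second derivative of f in x and u. The function name `declarative_jacobians`, the variable names, and the toy problem (Euclidean projection onto an affine set, so H = I and B = -I) are all assumptions chosen for illustration.

```python
import numpy as np

def declarative_jacobians(H, B, A):
    """Return (exact, approx) Jacobians dy/dx for
    y(x) = argmin_u f(x, u) subject to A u = b.

    H: (m, m) Hessian of f with respect to u at the solution.
    B: (m, n) mixed second derivative of f with respect to x and u.
    A: (p, m) constraint matrix, assumed to have full row rank.
    """
    H_inv_B = np.linalg.solve(H, B)
    H_inv_At = np.linalg.solve(H, A.T)
    # Approximation: keep only the unconstrained term -H^{-1} B.
    approx = -H_inv_B
    # Constraint term: H^{-1} A^T (A H^{-1} A^T)^{-1} A H^{-1} B.
    correction = H_inv_At @ np.linalg.solve(A @ H_inv_At, A @ H_inv_B)
    exact = approx + correction
    return exact, approx

# Toy problem (an assumption for illustration): f(x, u) = 0.5 * ||u - x||^2,
# so H = I and B = -I, and the node projects x onto {u : A u = b}.
rng = np.random.default_rng(0)
n, p = 5, 2
A = rng.standard_normal((p, n))
H = np.eye(n)
B = -np.eye(n)
exact, approx = declarative_jacobians(H, B, A)

# Back-propagate a hypothetical downstream gradient v = dJ/dy.
v = rng.standard_normal(n)
g_exact = exact.T @ v
g_approx = approx.T @ v

# The exact Jacobian keeps y feasible to first order: A @ Dy = 0.
print("A @ exact is ~0:", np.allclose(A @ exact, 0.0))
# A non-negative inner product means -g_approx is not an ascent direction.
print("inner product of exact and approximate gradients:", g_exact @ g_approx)
```

In this particular toy case the exact Jacobian is the orthogonal projector onto the null space of A, so the inner product equals the squared norm of the projected v and cannot be negative. The sketch gives no such guarantee for other choices of H and B, which is consistent with the abstract's note that the approximation can fail.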
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Topological and Geometric Data Analysis
