Learning to Initialize Gradient Descent Using Gradient Descent
Kartik Ahuja, Amit Dhurandhar, Kush R. Varshney

TL;DR
This paper introduces a learning-based method for initialization in gradient descent algorithms, improving performance on non-convex problems by leveraging previous solutions instead of relying on random or handcrafted initializations.
Contribution
It proposes a novel approach to learn initialization rules from past solutions, with theoretical guarantees and demonstrated improvements across multiple non-convex tasks.
Findings
Consistent performance gains over traditional initialization methods
Theoretical guarantees for the proposed initialization approach
Effective application across diverse non-convex problems
Abstract
Non-convex optimization problems are challenging to solve; the success and computational expense of a gradient descent algorithm or variant depend heavily on the initialization strategy. Often, either random initialization is used or initialization rules are carefully designed by exploiting the nature of the problem class. As a simple alternative to hand-crafted initialization rules, we propose an approach for learning "good" initialization rules from previous solutions. We provide theoretical guarantees in the form of conditions under which our approach performs better than random initialization; these conditions are sufficient in all cases and also necessary in some. We apply our methodology to various non-convex problems such as generating adversarial examples, generating post hoc explanations for black-box machine learning models, and allocating communication spectrum, and show consistent gains…
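To make the idea of learning an initialization rule from previous solutions concrete, here is a minimal sketch in Python. It is not the authors' algorithm. The toy objective f(x; a) = (x^2 - a)^2, the linear regression used as the initialization rule, and all names (`gradient_descent`, `learned_init`, `a_train`) are assumptions chosen for illustration only.

```python
import numpy as np

def grad(x, a):
    # Gradient of the hypothetical toy non-convex objective f(x; a) = (x**2 - a)**2.
    return 4.0 * x * (x**2 - a)

def gradient_descent(x0, a, lr=0.01, tol=1e-6, max_iters=10_000):
    # Plain gradient descent; returns the final iterate and the number of steps taken.
    x = x0
    for step in range(max_iters):
        g = grad(x, a)
        if abs(g) < tol:
            return x, step
        x = x - lr * g
    return x, max_iters

# "Previous solutions": solve a batch of past problem instances from a fixed default start.
a_train = np.linspace(1.0, 4.0, 20)
x_train = np.array([gradient_descent(3.0, a)[0] for a in a_train])

# Learn an initialization rule a -> x0 (assumed here to be a least-squares line).
slope, intercept = np.polyfit(a_train, x_train, 1)

def learned_init(a):
    # Predict a starting point for a new instance from its parameter a.
    return slope * a + intercept

# Compare the default start with the learned start on an unseen instance.
a_new = 2.5
x_default, steps_default = gradient_descent(3.0, a_new)
x_learned, steps_learned = gradient_descent(learned_init(a_new), a_new)
print(steps_default, steps_learned)
```

In this toy setting the learned rule starts close to a minimizer, so gradient descent needs fewer steps than it does from the fixed default start. The paper's own problem classes, model for the initialization rule, and guarantees may differ from this sketch.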
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
