An Essay on Optimization Mystery of Deep Learning
Eugene Golikov

TL;DR
This essay reviews the optimization mystery of deep learning, one of two mysteries it highlights alongside the generalization mystery, and draws connections between several selected works that address it.
Contribution
It reviews and connects several selected works on the optimization mystery in deep learning.
Findings
Deep learning models generalize well despite overparameterization
Optimization processes exhibit implicit regularization effects
Connections between different theoretical approaches are identified
Abstract
Despite the huge empirical success of deep learning, theoretical understanding of the learning process of neural networks is still lacking. This is why some of its features seem "mysterious". We emphasize two mysteries of deep learning: the generalization mystery and the optimization mystery. In this essay, we review and draw connections between several selected works concerning the latter.
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Machine Learning and ELM
