A Relaxation Argument for Optimization in Neural Networks and Non-Convex Compressed Sensing
G. Welper

TL;DR
This paper uses a relaxation argument to enlarge neural networks and non-convex compressed sensing problems. The enlarged problem can run several partial copies of the original in parallel, with the aim of matching the best outcome of many random initializations.
Contribution
The paper proposes a relaxation argument for widening and deepening networks so that the enlarged network can run partial copies of the original in parallel, which raises the chance that optimization succeeds. The argument applies both to neural networks and to non-convex compressed sensing.
Findings
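Enlarged networks can run partial copies of the original network in parallel, which simulates multiple random initializations.
They can potentially achieve the best training error among those initializations, although it is not immediately clear whether gradient descent realizes this.
Applied to non-convex compressed sensing, the same layered construction increases the chance of reaching a global optimum.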
Abstract
It has been observed in practical applications and in theoretical analysis that over-parametrization helps to find good minima in neural network training. Similarly, in this article we study widening and deepening neural networks by a relaxation argument so that the enlarged networks are rich enough to run copies of parts of the original network in parallel, without necessarily achieving zero training error as in over-parametrized scenarios. The partial copies can be combined in possible ways for layer width . Therefore, the enlarged networks can potentially achieve the best training error of random initializations, but it is not immediately clear if this can be realized via gradient descent or similar training methods. The same construction can be applied to other optimization problems by introducing a similar layered structure. We apply this idea to…
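To make the idea of running copies in parallel concrete, here is a minimal NumPy sketch. It is not the paper's construction. It only illustrates, for a hypothetical two-layer ReLU network, that a wider network with stacked first-layer weights and block-diagonal second-layer weights reproduces the outputs of several independently initialized small networks. The names small_net and widen and all dimensions are assumptions chosen for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def small_net(x, W1, W2):
    """Two-layer ReLU network: x -> W2 @ relu(W1 @ x)."""
    return W2 @ relu(W1 @ x)

def widen(copies):
    """Merge several small networks into one wide network (illustrative helper).

    First-layer weights are stacked vertically; second-layer weights are
    placed on a block diagonal so the copies do not interact.
    """
    W1_wide = np.vstack([W1 for W1, _ in copies])
    out_dims = [W2.shape[0] for _, W2 in copies]
    hid_dims = [W2.shape[1] for _, W2 in copies]
    W2_wide = np.zeros((sum(out_dims), sum(hid_dims)))
    row = col = 0
    for (_, W2), o, h in zip(copies, out_dims, hid_dims):
        W2_wide[row:row + o, col:col + h] = W2
        row += o
        col += h
    return W1_wide, W2_wide

rng = np.random.default_rng(0)
# Assumed sizes: input dim, hidden width, output dim, number of copies.
d, h, o, r = 4, 5, 2, 3
copies = [(rng.standard_normal((h, d)), rng.standard_normal((o, h)))
          for _ in range(r)]
W1_wide, W2_wide = widen(copies)

x = rng.standard_normal(d)
wide_out = small_net(x, W1_wide, W2_wide)
separate_out = np.concatenate([small_net(x, W1, W2) for W1, W2 in copies])

# The wide network computes all copies at once.
parallel_ok = np.allclose(wide_out, separate_out)
```

In this sketch the wide network is just one point in the enlarged parameter space. Training the enlarged network is free to leave the block structure, which is why the abstract notes that reaching the best of the simulated initializations by gradient descent is not guaranteed.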
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · CCD and CMOS Imaging Sensors · Analog and Mixed-Signal Circuit Design
