
TL;DR
Persistent neurons introduce a trajectory-based optimization strategy that leverages previous solutions to escape local minima and improve neural network training outcomes across various architectures.
Contribution
This paper proposes persistent neurons, a novel trajectory-based optimization method that enhances neural network training by utilizing information from previous solutions.
Findings
Persistent neurons can converge to better solutions than the previously found ones.
They improve performance under various initializations.
They are effective across different neural network architectures.
Abstract
Neural network (NN)-based learning algorithms are strongly affected by the choices of initialization and data distribution. Different optimization strategies have been proposed for improving the learning trajectory and finding better optima. However, designing improved optimization strategies is a difficult task under the conventional landscape view. Here, we propose persistent neurons, a trajectory-based strategy that optimizes the learning task using information from previous converged solutions. More precisely, we utilize the end of trajectories and let the parameters explore new landscapes by penalizing the model for converging to the previous solutions under the same initialization. Persistent neurons can be regarded as a stochastic gradient method with informed bias where individual updates are corrupted by deterministic error terms. Specifically, we show that persistent…
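The idea of penalizing convergence to a previous solution under the same initialization can be illustrated with a toy sketch. This is not the paper's implementation: the one-dimensional loss, the Gaussian-shaped penalty, and every name and value below (`loss`, `penalty_grad`, `train`, `strength`, `width`, the learning rate) are assumptions chosen for illustration. The penalty gradient plays the role of a deterministic bias term added to each update.

```python
import math

# Toy non-convex loss (an assumption, not from the paper): it has a shallow
# minimum at positive x and a deeper minimum at negative x.
def loss(x):
    return x**4 - 3 * x**2 + x

def loss_grad(x):
    return 4 * x**3 - 6 * x + 1

# Hypothetical penalty: one Gaussian bump centred on each previous solution.
# Its gradient pushes the parameter away from those solutions.
def penalty_grad(x, previous, strength=4.0, width=1.0):
    total = 0.0
    for p in previous:
        diff = x - p
        total += -strength * diff / width**2 * math.exp(-diff**2 / (2 * width**2))
    return total

def train(x0, previous=(), lr=0.01, steps=5000):
    """Plain gradient descent on loss plus the penalty on previous solutions."""
    x = x0
    for _ in range(steps):
        # The penalty gradient acts as a deterministic bias on every update.
        x -= lr * (loss_grad(x) + penalty_grad(x, previous))
    return x

x_init = 0.5                                   # same initialization for both runs
x_first = train(x_init)                        # unpenalized run
x_second = train(x_init, previous=[x_first])   # run penalized near x_first

print(f"first run:  x = {x_first:.3f}, loss = {loss(x_first):.3f}")
print(f"second run: x = {x_second:.3f}, loss = {loss(x_second):.3f}")
```

With these particular settings, the first run settles in the shallower basin at positive x, while the penalized second run is pushed out of that basin and ends in the deeper one at negative x with a lower loss. The outcome depends on the assumed penalty shape and strength; a weaker or narrower bump would leave the second run in the same basin.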
Taxonomy
Topics: Neural dynamics and brain function
