Distributed Training and Optimization Of Neural Networks
Jean-Roch Vlimant, Junqi Yin

TL;DR
This paper reviews methods for parallel computation in training large neural networks, emphasizing their application in high energy physics to address resource and time challenges.
Contribution
It provides a comprehensive overview of parallel training techniques and contextualizes their use in high energy physics research.
Findings
Parallel training methods reduce computation time.
Effective hyper-parameter optimization requires distributed computing.
Application in high energy physics enhances model efficiency.
Abstract
Deep learning models are yielding increasingly better performance thanks to multiple factors. To be successful, a model may have a large number of parameters or a complex architecture and be trained on a large dataset. This leads to large requirements on computing resources and turnaround time, even more so when hyper-parameter optimization is done (e.g., a search over model architectures). While this is a challenge that goes beyond particle physics, we review the various ways to do the necessary computations in parallel, and put them in the context of high energy physics.
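As an illustration of one way to "do the necessary computations in parallel", the sketch below shows synchronous data-parallel gradient descent on a toy linear model. It is not taken from the paper: the function names (`gradient`, `data_parallel_step`), the toy dataset, and the learning rate are all assumptions chosen for illustration, and the workers are simulated sequentially in a single process rather than run on separate devices.

```python
import random

def gradient(w, b, shard):
    """Mean-squared-error gradient of y ~ w*x + b over one data shard."""
    n = len(shard)
    gw = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    gb = sum(2 * (w * x + b - y) for x, y in shard) / n
    return gw, gb

def data_parallel_step(w, b, shards, lr):
    """One synchronous step: every (simulated) worker computes a gradient on
    its own shard, the gradients are averaged, and one update is applied."""
    # On real hardware these calls would run concurrently, one per worker.
    grads = [gradient(w, b, shard) for shard in shards]
    # Averaging plays the role of the all-reduce between workers.
    gw = sum(g[0] for g in grads) / len(grads)
    gb = sum(g[1] for g in grads) / len(grads)
    return w - lr * gw, b - lr * gb

# Hypothetical toy data: noiseless samples of y = 3x + 1.
rng = random.Random(0)
data = [(x, 3.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(64))]

num_workers = 4
shards = [data[i::num_workers] for i in range(num_workers)]  # equal-sized shards

w, b = 0.0, 0.0
for _ in range(500):
    w, b = data_parallel_step(w, b, shards, lr=0.1)
```

Because the shards have equal size, the averaged gradient is the same as the gradient over the full dataset, so this scheme reproduces single-process training while splitting the per-step work across workers. Real systems add communication cost for the averaging step, which this sketch ignores.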
Taxonomy
Topics: Advanced Neural Network Applications · Particle physics theoretical and experimental studies · Particle Detector Development and Performance
