Distributed Training and Optimization Of Neural Networks

Jean-Roch Vlimant; Junqi Yin

arXiv:2012.01839 · cs.LG · December 20, 2022 · 1 citation

Open Access

TL;DR

This paper reviews methods for parallel computation in training large neural networks, emphasizing their application in high energy physics to address resource and time challenges.

Contribution

It provides a comprehensive overview of parallel training techniques and contextualizes their use in high energy physics research.

Findings

01. Parallel training methods reduce computation time.

02. Effective hyper-parameter optimization requires distributed computing.

03. Application in high energy physics enhances model efficiency.

Abstract

Deep learning models are yielding increasingly better performance thanks to multiple factors. To be successful, a model may have a large number of parameters or a complex architecture and be trained on a large dataset. This leads to large requirements on computing resources and turnaround time, even more so when hyper-parameter optimization is done (e.g., a search over model architectures). While this is a challenge that goes beyond particle physics, we review the various ways to do the necessary computations in parallel and put them in the context of high energy physics.

Taxonomy

Topics: Advanced Neural Network Applications · Particle physics theoretical and experimental studies · Particle Detector Development and Performance