On Optimal Batch Size in Coded Computing
Swapnil Saha, Emina Soljanin, Philip Whiting

TL;DR
This paper investigates how to optimally choose batch size and redundancy level in coded computing systems to minimize expected job completion time, considering different service-time distributions and system parameters.
Contribution
It introduces a joint optimization framework for batch size and redundancy in coded computing, providing insights into their impact on execution time.
Findings
The optimal batch size depends on the redundancy level, given a fixed number of parallel servers and the other system parameters.
Joint optimization of redundancy and batch size reduces expected completion time.
Simulation results support the theoretical findings.
Abstract
We consider computing systems that partition jobs into tasks, add redundancy through coding, and assign the encoded tasks to different computing nodes for parallel execution. The expected execution time depends on the level of redundancy. The computing nodes execute large jobs in batches of tasks. We show that the expected execution time depends on the batch size as well. The optimal batch size that minimizes the execution time depends on the level of redundancy under a fixed number of parallel servers and other system parameters. Furthermore, we show how to (jointly) optimize the redundancy level and batch size to reduce the expected job completion time for two service-time distributions. The simulation presented helps us appreciate the claims.
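To make the dependence on redundancy concrete, here is a minimal sketch of the simplest textbook setting, not the paper's own model. It assumes a job is encoded into n tasks, one per server, that the job finishes as soon as any k tasks complete, and that task times are i.i.d. exponential with rate mu. Under those assumptions the expected completion time is the mean of the k-th order statistic, (H_n - H_{n-k}) / mu, where H_m is the m-th harmonic number. Batching is not modeled here, and the function names are illustrative.

```python
import random


def harmonic(m):
    """Return the m-th harmonic number H_m = 1 + 1/2 + ... + 1/m (H_0 = 0)."""
    return sum(1.0 / i for i in range(1, m + 1))


def expected_kth_of_n_exp(n, k, mu=1.0):
    """Mean of the k-th smallest of n i.i.d. Exp(mu) task times.

    Assumption: the job completes when any k of the n coded tasks finish.
    """
    return (harmonic(n) - harmonic(n - k)) / mu


def simulate_kth_of_n_exp(n, k, mu=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, for a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = sorted(rng.expovariate(mu) for _ in range(n))
        total += times[k - 1]  # time at which the k-th task finishes
    return total / trials


if __name__ == "__main__":
    n = 10
    for k in (2, 5, 8, 10):
        exact = expected_kth_of_n_exp(n, k)
        approx = simulate_kth_of_n_exp(n, k)
        print(f"n={n} k={k} closed form={exact:.4f} simulated={approx:.4f}")
```

In this toy setting, lowering k for a fixed n (more redundancy) shortens the wait for the k-th finisher, but in a real system each task would then carry a larger share of the job. That trade-off, together with the batch size, is what the paper optimizes.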
