SimEngine: A Modular Framework for Statistical Simulations in R
Avi Kenny, Charles J. Wolock

TL;DR
SimEngine is an open-source R package designed for efficient, parallelized statistical simulations on high-performance computing systems, offering advanced features like automatic error calculation and cross-replicate information sharing.
Contribution
The paper introduces an R package designed for running simulations in parallel via job schedulers on cluster computing systems, with automatic Monte Carlo error calculation and information sharing across simulation replicates.
Findings
Supports simulation on local and cluster environments
Provides automatic Monte Carlo error calculation
Enables information sharing across simulation replicates
Abstract
This article describes SimEngine, an open-source R package for structuring, maintaining, running, and debugging statistical simulations on both local and cluster-based computing environments. Several R packages exist for structuring simulations, but SimEngine is the only package specifically designed for running simulations in parallel via job schedulers on high-performance cluster computing systems. The package provides structure and functionality for common simulation tasks, such as setting simulation levels, managing seeds for random number generation, and calculating summary metrics (such as bias and confidence interval coverage). SimEngine also brings several unique features, such as automatic calculation of Monte Carlo error and information-sharing across simulation replicates. We provide an overview of the package and demonstrate some of its advanced functionality.
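To make the summary metrics named in the abstract concrete, here is a minimal sketch of how bias, confidence interval coverage, and their Monte Carlo standard errors can be computed from a set of replicates. It is written in plain Python and does not use or reproduce SimEngine's R interface. The function names (`run_replicate`, `summarize`), the normal data model, the sample size, and the seed are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def run_replicate(n, true_mean, rng):
    """One replicate: draw a sample, return the estimate and a 95% normal-approximation CI."""
    sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
    est = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return est, est - 1.96 * se, est + 1.96 * se

def summarize(results, true_mean):
    """Bias and CI coverage across replicates, each with a Monte Carlo standard error."""
    reps = len(results)
    estimates = [est for est, _, _ in results]
    bias = statistics.fmean(estimates) - true_mean
    # MC error of the bias: SD of the estimates divided by sqrt(number of replicates)
    bias_mcse = statistics.stdev(estimates) / math.sqrt(reps)
    covered = [1.0 if lo <= true_mean <= hi else 0.0 for _, lo, hi in results]
    coverage = statistics.fmean(covered)
    # MC error of a proportion: sqrt(p * (1 - p) / number of replicates)
    coverage_mcse = math.sqrt(coverage * (1.0 - coverage) / reps)
    return {
        "bias": bias,
        "bias_mcse": bias_mcse,
        "coverage": coverage,
        "coverage_mcse": coverage_mcse,
    }

# A single seeded generator keeps the whole simulation reproducible (hypothetical settings).
rng = random.Random(2024)
results = [run_replicate(n=50, true_mean=0.0, rng=rng) for _ in range(1000)]
summary = summarize(results, true_mean=0.0)
print(summary)
```

The Monte Carlo standard errors show how much of an observed bias or coverage gap could be due to the finite number of replicates rather than to the estimator itself.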
Taxonomy
Topics: Data Analysis with R
