McRunjob: A High Energy Physics Workflow Planner for Grid Production Processing
Gregory E. Graham (Fermilab), Dave Evans, Iain Bertram (Lancaster University)

TL;DR
McRunjob is a versatile grid workflow manager designed for high energy physics, enabling efficient large-scale job management, metadata handling, and workflow automation across diverse computing environments.
Contribution
It introduces a modular, extensible workflow management system with a rich metadata language for high energy physics grid processing.
Findings
Used at the DZero and CMS experiments to manage large-scale Monte Carlo production since 1999
Supports complex metadata with expressions, dependencies, and ontologies
Facilitates parallelization and modular expansion in grid environments
Abstract
McRunjob is a powerful grid workflow manager used to manage the generation of large numbers of production processing jobs in High Energy Physics. In use at both the DZero and CMS experiments, McRunjob has been used to manage large Monte Carlo production processing since 1999 and is being extended to uses in regular production processing for analysis and reconstruction. Described at CHEP 2001, McRunjob converts core metadata into jobs submittable in a variety of environments. The powerful core metadata description language includes methods for converting the metadata into persistent forms, job descriptions, multi-step workflows, and data provenance information. The language features allow for structure in the metadata by including full expressions, namespaces, functional dependencies, site specific parameters in a grid environment, and ontological definitions. It also has simple control…
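The abstract's idea of core metadata with namespaces and functional dependencies being converted into submittable jobs can be illustrated with a minimal Python sketch. This is not McRunjob's actual API or metadata language: the `Metadata` class, the `render_job` helper, the key names, and the `generate_events` executable are all hypothetical and chosen only for illustration.

```python
class Metadata:
    """Hypothetical namespaced key/value store (not McRunjob's real interface).

    A value is either a constant or a callable. A callable receives the
    store's `get` function, which lets it depend on other keys.
    """

    def __init__(self):
        self._items = {}
        self._resolving = set()  # keys currently being evaluated

    def set(self, namespace, key, value):
        self._items[f"{namespace}.{key}"] = value

    def get(self, qualified_key):
        value = self._items[qualified_key]
        if not callable(value):
            return value
        if qualified_key in self._resolving:
            raise ValueError(f"circular dependency on {qualified_key}")
        self._resolving.add(qualified_key)
        try:
            return value(self.get)  # evaluate the functional dependency
        finally:
            self._resolving.discard(qualified_key)


def render_job(md, run_number):
    """Turn the metadata into a shell job script for one run (illustrative)."""
    md.set("job", "run", run_number)
    command = (
        f"{md.get('generator.executable')}"
        f" --events {md.get('generator.events')}"
        f" --seed {md.get('generator.seed')}"
        f" --output {md.get('site.output_dir')}/{md.get('job.output_file')}"
    )
    return "\n".join(["#!/bin/sh", command])


md = Metadata()
# Site-specific parameter, kept in its own namespace.
md.set("site", "output_dir", "/scratch/mc")
# Generator settings; the seed is derived from the run number.
md.set("generator", "executable", "generate_events")
md.set("generator", "events", 500)
md.set("generator", "seed", lambda get: 1000 + get("job.run"))
# The output file name also depends on the run number.
md.set("job", "output_file", lambda get: f"run{get('job.run'):04d}.dat")

# One metadata description yields many jobs that differ only in derived values.
scripts = [render_job(md, run) for run in range(1, 4)]
```

Keeping derived values as functions of other keys means a single description can be expanded into any number of jobs, and swapping the `site` namespace retargets the same workflow to a different environment.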
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Advanced Data Storage Technologies
