Generic framework for data-race-free many-particle simulations on shared memory hardware
Julian Jeggle, Raphael Wittkowski

TL;DR
This paper introduces a formal model and a domain-specific language inspired by Rust, enabling data-race-free, shared-memory many-particle simulations with deadlock prevention, demonstrated on molecular dynamics primitives.
Contribution
It presents a novel abstract model and programming language ensuring data-race freedom in shared-memory particle simulations, including deadlock prevention strategies.
Findings
Successfully guarantees data-race freedom in simulations.
Demonstrates applicability on molecular dynamics primitives.
Provides deadlock prevention method via graph representation.
Abstract
Recently, there has been much progress in the formulation and implementation of methods for generic many-particle simulations. These models, however, typically either do not utilize shared memory hardware or do not guarantee data-race freedom for arbitrary particle dynamics. Here, we present both a abstract formal model of particle dynamics and a corresponding domain-specific programming language that can guarantee data-race freedom. The design of both the model and the language are heavily inspired by the Rust programming language that enables data-race-free general-purpose parallel computation. We also present a method of preventing deadlocks within our model by a suitable graph representation of a particle simulation. Finally, we demonstrate the practicability of our model on a number of common numerical primitives from molecular dynamics.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsParallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
Generic framework for data-race-free many-particle simulations on shared memory hardware
Julian Jeggle
Institut für Theoretische Physik, Center for Soft Nanoscience, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany
Raphael Wittkowski
Institut für Theoretische Physik, Center for Soft Nanoscience, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany
Abstract
Recently, there has been much progress in the formulation and implementation of methods for generic many-particle simulations. These models, however, typically either do not utilize shared memory hardware or do not guarantee data-race freedom for arbitrary particle dynamics. Here, we present both a abstract formal model of particle dynamics and a corresponding domain-specific programming language that can guarantee data-race freedom. The design of both the model and the language are heavily inspired by the Rust programming language that enables data-race-free general-purpose parallel computation. We also present a method of preventing deadlocks within our model by a suitable graph representation of a particle simulation. Finally, we demonstrate the practicability of our model on a number of common numerical primitives from molecular dynamics.
I Introduction
From the movement of gravitationally coupled celestial objects [1] to swarming animals [2] to bacteria [3] and colloids [4] to molecules in a fluid [5], there are countless examples of naturally occurring and synthetically produced many-particle systems, i.e., systems of interacting discrete entities. While it is typically easy to describe the dynamics of single particles analytically, the dynamics of many interacting particles is often only accessible numerically. Therefore, simulations of many-particle systems have become a standard tool for scientific computation and have numerous applications in physics [6, 7], chemistry [8, 9], biology [10, 11], and mathematics [12, 13].
Despite the fact that these kinds of simulation methods have been in use since the dawn of scientific computer simulations [14], there is still active development in the field [15, 16, 17]. One reason for this is that many-particle simulations frequently involve very computationally intensive calculations and therefore require careful algorithmic optimization. The complexity of crafting efficient implementations of many-particle dynamics has led to the development of numerous software packages, which promise to alleviate some of this complexity. This is especially the case in the field of molecular dynamics which has brought forward software packages such as GROMACS [18], HOOMD-blue [19], LAMMPS [20], or NAMD [21].
In recent times, there has been an increased interest in more general particle simulation software. This has given rise to software packages like FDPS [22], PPM [23], or OpenFPM [24]. These software packages are not simply a library of common routines like many molecular dynamics packages, but rather a framework for implementing custom particle dynamics, thus giving the user much more control over the exact numerical details of the simulation. Most of these existing software solutions for many-particle simulations utilize the distributed memory model for parallelization, i.e., the particle system is decomposed into multiple subsystems, which are then processed in parallel. While this means that the simulation can be spread evenly across multiple processing nodes, it also comes with an inherent communication overhead due to the need to send information between particle subsystems [25].
At the same time, we see a trend in computer hardware towards increasingly parallel computing devices, e.g., in the form of processors with increasingly large numbers of cores or specialized hardware such as graphics processors. This makes it viable to utilize a shared memory model on a single processing node instead, where all threads have access to the same memory. Some modern molecular dynamics frameworks such as OpenMM [26], thus target shared-memory CPU and GPU architectures to exploit current hardware to its fullest extent. While the shared-memory model eliminates most of the communication overhead, it comes at the cost of a parallelism model that is much more prone to errors, in particular a class of errors called data races [27].
While data races are highly problematic in general computation, their impact can be greatly reduced by limiting computation to a specific problem domain. The most popular example of this phenomenon are graphics shaders, which by the nature of graphics processors run on shared memory hardware, but are rarely affected by data races. This is due to the design of the render pipeline that restricts access to data as it is being processed in parallel [28]. This approach to preventing data races is taken even further in the novel systems programming language Rust [29], which generalizes the idea of access restrictions on variables such that general-purpose computation is possible.
Inspired by the success of Rust, we present in this article a domain-specific model and programming language for many-particle simulations with the goal of data-race freedom on shared-memory hardware. We show that many common particle dynamics can be expressed within this model and can therefore be parallelized safely. Our work can be seen as a continuation of Ref. [30], however, we found it more productive to construct our model independently instead of maintaining compatibility with the notation used in Ref. [30]. Due to a very broad definition of what constitutes a particle dynamics, the resulting model for safely parallel molecular dynamics surpasses the flexibility of existing shared-memory implementations such as OpenMM [26].
This article is structured as follows: In Sec. II, we briefly review the concept of data races and strategies to prevent them. Then we formulate a model to ensure data-race-free particle dynamics in Sec. III, before concretizing this model in form of a domain-specific programming language in Sec. IV. We conclude in Sec. V.
II Data races and data-race freedom
In this section, we briefly introduce the problem of data races in parallel programming and present a number of methods to ensure their absence in programs [31, 29].
Generally speaking, programs can be viewed as sets of tasks.111Sometimes these tasks are also called instructions, which, however, might be confused with machine instructions, which have additional technical connotations in computer architectures. In a sequential program, these tasks are executed in a predefined sequence called a thread of execution, with only one task being worked on at the same time. In contrast, concurrent programs do not have just a single thread of execution, but are composed of multiple threads of execution that can at least in principle progress independently. Notably, this does not imply that multiple threads of execution advance at the same time. The program could instead simply switch between threads of execution periodically and advance them one at a time. A concurrent program, where multiple threads of execution do in fact advance simultaneously, is called a parallel program. [29]
Compared to sequential programming, creating concurrent programs requires more diligence as one needs to take into account every possibility of the different threads of execution interleaving accesses to shared resources (e.g., memory). If the outcome of executing a program depends on the order of these accesses, the program is said to exhibit a race condition as two or more threads race for first access to a resource. A particularly significant case of this is a situation where two or more threads access the same location in memory concurrently with at least one thread writing to memory and at least one access being unsynchronized, i.e., not being forced into sequential execution with the other accesses. This situation is referred to as a data race [29]. Most programming language models give no guarantees to program behavior if a data race occurs (undefined behavior). This, combined with the erratic nature of race conditions in general, makes data races a class of programming errors that poses a significant threat to program correctness.
Consequently, data race detection and avoidance is an important field of research [32, 33]. One solution to avoid data races is to simply not share any memory. This is often used in the message-passing scheme of parallel programming, where threads cannot share data, but can send data between each other. However, message passing is less efficient compared to sharing memory as the data sent between threads typically needs to be copied first to form a contiguous message. This creates a communication overhead that can degrade performance if too much information needs to be communicated.
Here, we want to focus on the solution to the problem of data races utilized by the Rust Programming Language (henceforth simply called “Rust”). Rust is a modern systems programming language with a focus on reliability and efficiency. On the one hand, Rust offers low-level access to details of execution, e.g., memory management or operating system level multithreading, like typical systems programming languages such as C or C++. On the other hand, it also comes with strong correctness guarantees via a powerful type system and static analysis checks. In fact, in the so-called Safe Rust subset of Rust, it is impossible to trigger undefined behavior in any legal program.
This safety is possible in part due to the concepts of memory ownership and reference lifetimes being explicit and pervasive in Rust programs. At compile time the Rust compiler validates programs with respect to a set of rules regarding these concepts that are designed specifically to prevent data races as well as other memory-related sources of undefined behavior. In brief, Rust prohibits aliasing of values, i.e., at every point in time every value (i.e., some block of memory filled with data) of a Rust program is bound to exactly one variable. While the variable is in scope, the value must remain allocated, and conversely, when the variable drops out of scope, the value is freed. This prevents use-after-free errors without the overhead of memory management via garbage collection [34]. Besides value ownership, Rust also allows borrowing a value from a variable as a reference. To prevent use-after-free errors, the Rust compiler uses an elaborate system of tracking lifetimes of references to prove that no reference cannot outlive the variable it is derived from.
Of particular importance for this work is secondary restriction imposed on references in Rust, the so-called rules of borrowing. These state that at every point in time, there can be either an arbitrary number of read-only references or a single mutable reference. In either case, data races are impossible: there is either no writing access or only a single thread carries a reference to the same location in memory. It is this idea of regulating concurrent access and mutability of data that forms the basis for our domain-specific model for safely parallelizable particle simulations. One should keep in mind that all this reasoning has to happen at compile time as to not incur a management overhead that degrades performance.
It is outside the scope of this article to explain the intricacies of Rust in more detail, so we instead refer the interested reader to Refs. [35, 36, 37, 29] for in-depth explanations on the design of Rust.
At this point it should be highlighted that Rust does not provide its safety guarantees for free. Programmers have to handle the increased complexity of designing their programs in such a fashion that the compiler can track ownership and reference lifetimes. Because this is a special case of static program analysis, the Rust compiler is doomed by the impossibility of perfect static program analysis [38] to always reject some programs even though they are in fact (provably) safe.222In fact, it is trivial to find such a program in Rust. Because Rust treats arrays as a single value with respect to the aliasing rules the Rust compiler will reject a multithreaded program even if each thread accesses only a mutually disjoint subset of the array. To circumvent this, Rust allows programmers to selectively opt out of some static analysis checks via so-called Unsafe Rust. This, of course, places the responsibility of validating safety on the programmer, but can be used very effectively to create safe abstractions over inherently unsafe operations. This “escape hatch” design of Rust is another aspect we take inspiration from in this article.
Before we discuss the specific semantics of simulating particle systems numerically we will first present a more general formalization of the idea of safe parallelism via thread-exclusive mutability of variables. For this purpose, we first define the notion of a program state as the collective state of all variables of a program. To distinguish between different variables we identify each variable with a natural number. In the interest of simplicity we do not differentiate between different variable types but instead only assume all variables can take values from some generic set.
Definition 1**.**
A program state is a function that maps a finite index set to the set of possible variable values . The value of the -th variable is represented as . The set of all possible program states with a given index set and value set is , i.e., the set of all functions that map from to .
It should be noted that unlike typical automata models such as register machines this definition of a program state lacks any notion of a instruction counter or a similar construct to track the flow of execution. Instead, we describe program flow extrinsically as a sequence of fixed program steps.
Definition 2**.**
A program step for a set of possible variable values is defined as a tuple with an index set , a variable index and an update function which must satisfy the condition . Conceptually, this encodes a change in a program state with index set where the -th variable is substituted by . If , the variable is newly created and initialized by and if the variable is deleted instead. We define the output index set as
[TABLE]
To map input program states to output program states we define the execution function as
[TABLE]
Definition 3**.**
Let be a finite sequence of length such that is a program step for all . We call this sequence a program if the index set of each program step is the same as the output index set of the previous program step, i.e., for all it holds that . We define the execution function of the program , i.e., the function that maps from the initial state to the output state after executing each program step in order, as the composition of the execution functions of each program step
[TABLE]
Furthermore we define the input index set of the program as and the output index set of the program as .
This way of modeling programs comes with the obvious drawback that the program flow must be the same regardless of program input. While this is a severe limitation for general computing, it is much less problematic in the context of molecular dynamics, where the program flow is typically determined only by the underlying model equations and not the concrete state of a given physical system. It should also be noted that the static nature program flow in the above definition does not rule out branching computation in general since any kind of computation can be incorporated into the update functions .333Naturally, for this model to be of any practical use we have to assume that is computable (if ). As we will see later, this will form an “escape hatch” in our domain-specific programming language similar to Unsafe Rust.
Using the definition of a program in terms of its access patterns to variables we can now formalize the concepts of a program depending on and mutating variables.
Definition 4**.**
Let be a program. We define the set of variables mutated by as such that for all there must be a with . In other words, a variable is mutable with respect to a program if there is at least one program step in this program that updates it.
Definition 5**.**
Let be a program step. We say that this step depends on a variable if there are two program states with and for all such that , i.e., there are two program states that differ only in the value for the -th variable but produce different output states.
We emphasize again that we do not make any assumption on how is computed. Therefore, a program step not depending on the -th variable does not imply that every implementation of will be free of reading operations for this variable, but merely implies the existence of an implementation without such an operation. In practice, it is usually sufficient to analyze syntactically if an implementation of reads a given variable, e.g., by finding the corresponding symbol for the variable in the program code.444This method of analysis allows for false positives as it does not verify the reachability of the expression containing the symbol in question. Therefore it does not violate the general undecidability of (perfect) static analysis.
Definition 6**.**
Let be a program. For each variable we can split the program into two programms and such that the following three properties hold:
** 2. 2.
* contains no element with * 3. 3.
* has maximum length*
In other words, is the part of program before there is a write to the -th variable and contains the remaining program steps of . We then define the set of variables depends on as where if and only if contains a step that depends on the variable . Conceptually, depending on a variable means that during the execution of the value of the variable is read before it is first written to.
Finally, we use Def. 4 and 6 to formally define the notion of data-race freedom by exclusive mutability as follows.
Definition 7**.**
Let be a program composed of subprograms with for all . We say that can be parallelized without data races via if for all variables the following conditions hold:
The variable is mutated by at most one subprogram, i.e., there is at most one such that . 2. 2.
If the variable is mutated by one subprogram, no other subprogram may depend on this variable, i.e., if there is a such that it is implied that for all with it holds that .
The strategy we used to obtain Def. 7 serves as a guideline for the next section, where we consider the problem domain of particle simulations.
III Data-race-free particle simulations
In this section, we derive a general model for particle simulations for which we can make guarantees in terms of data-race freedom. To this end, we first define three concepts: physical quantities associated with particles, particle types, and particle systems. With these we can then define particle dynamics in a very general fashion and classify certain particle dynamics as particularly useful.
III.1 Modelling static particle systems
Conceptually, a physical quantity of a particle is a single, semantically atomic property of the particle, e.g., its position, mass, charge, velocity, or orientation. We make no assumptions on these quantities or their physical meaning aside from the fact that they can be represented by a finite number of real numbers.555For the sake of simplicity, we ignore the fact that physical quantities typically also possess a unit.
Definition 8**.**
Let . Then we call a physical quantity type of dimensionality . is called a physical quantity of type . Furthermore, we call the set of all physical quantity types and the set of all physical quantities , i.e., implies the existence of an such that .
Generally, we can express the idea of a particle as an entity that ties together a position in space with a varying number of physical quantities, e.g., orientation, (angular) velocity, mass, or charge. For a truly general model of particles we make no assumptions on the nature of these quantities except that their number must be finite.666Strictly speaking, this can be seen as a loss of generality. One could, e.g., imagine a particle that keeps a memory of its past states, e.g. to implement self-avoiding dynamics. Since a full representation of all past states of a particle cannot be expressed in a finite number of quantities, our model cannot capture these kinds of particles. However, practical implementations of particles with memory typically have either a temporal decay of memory (thus limiting the number of previous states that are remembered) or utilize global fields instead of per-particle memory to enable memory. When defining this idea formally, we must distinguish between the concepts of particle types and particle states. The first can be seen as a blueprint for a set of particles equal in structure, while the second one encapsulates the idea of a single concrete particle at a given point in time.
Definition 9**.**
Let be a finite set with and let be a function. Then is a particle type with the index set and the quantity function . For a particle type we call the position quantity type of and the dimensionality of the particle type. Furthermore, we define as the set of all particle types and as the set of all particle types of dimensionality .
Definition 10**.**
Let be a particle type. Then we call a particle state of type if for all it holds that , i.e., a particle state maps the index set to physical quantities in accordance to the quantity types defined in the corresponding particle type. For a particle state we define as the position of the particle state. Furthermore, we define as the set of all particle states of particle type as well as as the set of all particle states of any particle type and as the set of all particle states of any particle type of dimensionality .
The purpose of the index set , which associates every quantity of a particle state with a unique index, may seem questionable to a reader at first glance. As we will see later, it is crucial, however, to reason about partial particle states, i.e., restrictions of a particle state onto a subset of quantities. can also serve to give physical meaning to quantities by some form of indexing scheme. This makes reasoning within the particle model much easier than, e.g., modelling particle states as (unordered) sets of quantities would do.
Finally, we can use the notion of particle types and states to define particle systems and their states. Conceptually, a particle system contains two pieces of information. The first information contains the various particle types found in the system as well as how many particles of each type are contained within the system. Secondly, a particle system may also have physical information that is not bound to individual particles, i.e., information that is global to the system. Examples of this are the simulation time or external fields. Again, we make no assumption on these global information except for the fact that they must be representable by a finite set of real numbers. One might ask at this point if the global state could not simply be represented as a single particle of a unique particle type. As we will see in the next section, separating the particle system from the global state is worthwhile to exploit the fact that for many numerical simulations of particle systems the global state is mutated separately from the state of the particle system.
Definition 11**.**
Let be a finite set, with and be functions as well as for some . Then is a particle system of dimensionality with the index set , the particle type function , the particle number function , and the global-state space . For each we say that the system contains particles of type .
Definition 12**.**
Let be a particle system, be a function mapping every element of to a multiset777See Appendix A for a complete overview of the multiset formalism used here. of particle states of any type and . Then we call a state of the particle system if for all and for all it holds that and . In other words, is a function that maps the index for each particle type to a multiset containing the states for each particle of this type in the particle system. We define as the set of all possible states of a particle system , i.e. if and only if is a state of .
One should note that in the definitions above there is no concept of any ordering of particles, as the particle states for each particle type in the system are represented by a multiset. For the same reason, there is also no notion of particle identity inherent in the model, i.e., two or more particles in the same state are indistinguishable when represented in the state of a particle system. However, if desired, one can express both an ordering of particles and an inherent distinguishability of particles by extending the particle type with a (non-physical) quantity signifying particle identity (e.g., by using a quantity to enumerate particles).
To illustrate the above formalism, let us consider a particle system comprised of three point masses, i.e., particles that have position, velocity, and mass, in three spatial dimensions. Additionally, two of these particles shall also have an electric charge.888At this point, one could ask if it was not more sensible to unify all particles into one type, by expressing the uncharged particles as particles with zero charge. While sensible for this tiny example system, in larger systems, where the number of uncharged particles is high compared to that of the charged particles, it becomes computationally wasteful to use the more complicated dynamics of charged particles for the uncharged particles as well. We say the particles have a position , a velocity , a mass , and in the case of the charged particles a charge where represents the uncharged particle and the charged particles. Both uncharged and charged particles can then be represented by these particle types, respectively,
[TABLE]
and we can define the particle system as
[TABLE]
Then we can express the state of the particle system as
[TABLE]
with the particle states defined by
[TABLE]
III.2 Modelling particle dynamics
Using the definitions from the previous section, we can define particle dynamics simply as transitions between states of particle systems.
Definition 13**.**
Let and be particle systems. Then a function is called a particle dynamics. For we call the evolved state of under .
It should be noted that we do not impose any restrictions on particle dynamics other than the fact that it maps from all valid states of some particle system to valid states of some other particle system. In particular, we do not assume that the particle system being mapped from is the same as the one being mapped to. This allows a particle dynamics to, e.g., change the number of particles in the system, alter particle types, and redefine global state.
An important algebraic property of particle dynamics as defined above is that they are closed under composition under the condition that input and output states are compatible. This is relevant since in practice particle simulations are often formulated as a loop over multiple operations each of which is considered elementary within some numerical scheme (e.g., applying a force all particles of a type or advancing particle positions). Our model can express this by defining a particle dynamics for each of these elementary operation and then compose them into more complex dynamics. As we will see later, this is highly useful as these elementary operations can typically be reasoned about more strongly than their composition into a complex particle dynamics.
On its own, the above definition of particle dynamics is far too generic to be of practical use. In a sense it is simply a (convoluted) way of expressing functions mapping between vector spaces of finite dimension. For this we cannot make any general statements regarding parallelism. Similar to how Safe Rust carves out of all possible programs only those programs for which it can guarantee safety, we can carve out of all possible particle dynamics only those which we can safely process in parallel in some fashion.
The first useful classification of particle dynamics distinguishes between dynamics that conserve global state and those that do not.
Definition 14**.**
Let be a particle dynamics with and being particle systems. is a global-state preserving particle dynamics if and for all it holds that . In other words, the global state is invariant under the dynamics.
This separation is useful as the time evolution of global state can be driven by any numerical procedure. For example, the global state might contain discretized fields that are integrated in time by a numerical scheme such as finite differences. Parallel implementations of these generic algorithms are not specific to many-particle simulations and thus outside the scope of our simulation model for particle dynamics. Instead, it is more sensible to delegate the analysis and implementation of these kinds of programs to general purpose programming languages such as Rust.
Another classification of particle dynamics can be made by looking at the definition of particle types in the two particle systems connected by a particle dynamics. From Def. 13 there is no requirement for both particle systems to contain the same particle types. In some special instances this can be useful, e.g., for initializing a system subject to a complex dynamics via a simple dynamics. For instance, one can use a simple diffusion dynamics to relax a system of particles into a homogeneous state to then run a more complex particle dynamics (with different particle types) from there. However, like in the case for dynamics not preserving global state, dynamics that do not preserve particle types are not algorithmically restricted enough to make statements about safe parallelism. We therefore characterize particle-type preserving dynamics as particle dynamics that do not alter which particle types are found in a given particle system.
Definition 15**.**
Let be a particle dynamics with and being particle systems. is a particle-type preserving particle dynamics if and .
If we look at particle dynamics that are both global-state preserving and particle-type preserving, we can identify two subclasses of interest.
We consider first a class of particle dynamics that add particles of existing types to the particle system, but leaves existing particles untouched. To add particles to the system, we require the number of new particles as well as an initialization function for each particle type. Of particular interest for parallelization are dynamics of this kind, where the initialization of each new particle can be performed in parallel, i.e., independently from one another.
Definition 16**.**
Let be a global-state preserving and particle-type preserving particle dynamics with and being particle systems. Also, let and be functions indicating the number of new particles and their initial state for each particle type, respectively. We then call an independently initializing insertion if and .
Definition 16 formalizes the idea of an independent initialization of new particles by generating a dynamics from two functions and where determines the number of particles to create for each particle type and initializes a new particle for a given type based on an index enumerating each new particle as well as the state of the particle system before adding any particles. Since only depends on the original system state and the number of particles to create for each type is known in advance, all evaluations of can be performed in parallel without a data race as long as the new particles are only added to the system state after all particles have been initialized.
Next, we consider a class of particle deleting dynamics based on a predicate function that selects the particles to delete. Again, we can parallelize this dynamics safely if the individual evaluations of this predicate only depend on the individual particle state as well as the original system state.
Definition 17**.**
Let be a global-state preserving and particle-type preserving particle dynamics with and being particle systems and being a state of . Also, let be a Boolean predicate which flags particles for deletion. We then call an independently selecting deletion if and .
It should be noted that while for an independently initializing insertion the evolved particle system only depends on the original particle system, for an independently selecting deletion the evolved particle system depends both on the original particle system and its state due to the fact that the number of deleted particles might depend on this state. Therefore, an independently selecting deletion as defined in Def. 17 can only be meaningfully applied to a single state even if it is technically defined on all states of the original particle system. This is largely irrelevant for the domain-specific language developed later as we will design this language to be generic over particle numbers.
Until now, we have considered particle dynamics that can change the number of particles. Let us now look at particle-number conserving dynamics.
Definition 18**.**
Let be a particle-type preserving particle dynamics with and being particle systems. is a particle-number preserving particle dynamics if .
By itself this class does not directly provide a parallelization opportunity as we have not specified how calculates the evolved state. Ideally, one would like to process every particle in parallel as this will provide ample opportunity for parallelization for most real-world systems. To this end we define a new subclass of particle dynamics.
Definition 19**.**
Let be a global-state, particle-type and particle-number preserving particle dynamics with being a particle system. We call a particle-parallel particle dynamics under if there is a function such that . In other words, for every particle type the function produces a function that takes the state of a single particle of type as well as the initial state of the particle system and produces the new state for this particle.
It is noteworthy that, as a consequence of enforcing this form of dynamics, particles of the same type in the same state are necessarily mapped to the same state in the evolved system. We also enforce particles of the same type to be subject to the same state-mapping function, i.e., the notion of a particle type is now not only used to unify particles with a common set of quantities associated with them, but also particles that follow the same kind of behavior within a particle dynamics. While formally a restriction, this form of dynamics does not diminish the expressiveness of the particle model in practice. In particular, dynamics with completely unique behavior for each particle can be implemented by using a unique particle type for each particle. Similarly, if the restriction for particles of the same type in the same state is undesired, one can simply introduce a new quantity in the particle type to make particles of this type distinguishable.
A particle-parallel particle dynamics can be trivially parallelized by evaluating the respective for each particle in parallel. However, also has a secondary dependency on the state of the entire particle system. This is necessary to allow the dynamics of a single particle to depend not only on its own original state but also on the state of other particles, i.e., this enables particle interactions. If we wish to mutate the system state in-place (i.e., without making a copy first)999Making a full copy of the entire particle system in each step of the simulation would be prohibitively expensive in terms of memory and runtime in many applications. we might cause data races if we naively allow arbitrary functions to run in parallel. To formulate a restriction on such that data-race freedom is guaranteed for in-place state updates, we first introduce a number of helpful definitions related to particle systems and particle dynamics.
First, we formalize the idea of extracting partial information from a particle system by only considering some subset of the quantities of each particle type.
Definition 20**.**
Let be a particle system with and be a function such that for all it holds that . Then is called the subsystem of with respect to (written as ) if for all it holds that . In other words, is the particle system obtained by taking only those quantities for each type where the respective index of the quantity is in .
Definition 21**.**
Let be a state of a particle system and be a subsystem of . Then is called the substate of with respect to (written as ) if is a state of and for all it holds that .
Next, we formalize the idea of a subsystem being invariant under a particle-parallel particle dynamics.
Definition 22**.**
Let be a particle system and let be a particle dynamics that is particle-parallel under . Furthermore, let be a particle type index and be the corresponding particle type. We then call an immutable subset of the particle type under if for all it holds that
[TABLE]
In other words, there must be no state of where would mutate the state of any particle of type such that a quantity with an index in is changed.
Definition 23**.**
Let be a particle system and let be a particle dynamics that is particle-parallel under . Furthermore, let be a function. Then we call an immutable subsystem of under if for all it holds that is an immutable subset of the particle type under .
The notion of an immutable subsystem can be used to express the idea of extracting from a particle-system state only information that is invariant under a specific particle dynamics. This allows us to formulate a data-race-free subclass of the particle-parallel particle dynamics defined in Def. 19 in the case of in-place state manipulation. Similar to Def. 7 we demand that each quantity of any particle in the system must either remain unchanged by the dynamics, or, if it is mutated, may only influence the dynamics of the particle it is associated with while being irrelevant to the dynamics of any other particle. In other words, the particle dynamics has exclusive mutability.
Definition 24**.**
Let be a particle system and let be a particle dynamics that is particle-parallel under . Also, let be a function such that is an immutable subsystem of under . We then say that possesses exclusive mutability via the immutable subsystem and the restricted update function if
[TABLE]
In other words, for a dynamics that is particle parallel under to possess exclusive mutability, there must be another function that can reproduce the results of while only depending on quantities of the respective particle and those quantities of other particles that are immutable under .
The reasoning why this implies data-race freedom is the same as in the case of the borrow rules of Rust. If a particle quantity is immutable under a particle dynamics, there is no writing access to it that would be required for a data race. Conversely, if the quantity is mutated, its visibility is restricted to a single thread of execution, namely that of the particle the quantity is associated with. This makes a data race, which requires at least two threads to access the same value, impossible.
While in Def. 24 we have reached the goal of defining a general class of particle dynamics that we can parallelize without the risk of data races, its usefulness still remains to be proven. To show that real-world particle dynamics can be expressed within this class, we will look at a number of special cases of particle dynamics with exclusive mutability that relate directly to real-world applications.
First, we consider the special case of a particle dynamics with exclusive mutability where each evolved particle state only depends on one particle state in the original system, i.e., the particle dynamics only processes information local to a single particle.
Definition 25**.**
Let be a particle system and let be a particle dynamics that possesses exclusive mutability under . Then we call a particle-local dynamics if for all there is a function such that . In other words, we drop the dependency of on the immutable substate of the system for all particle types .
This kind of dynamics implies that no information is shared between particles or, in a more physical interpretation, that the particles do not interact. These kinds of dynamics can therefore be found, e.g., in the simulation of ideal gases.
In many real-world particle dynamics, however, we find that particles do interact with each other. To limit the computational expense of simulating these dynamics, the interaction of each particle is typically restricted to a small subset of the particle system. This means that information between particles is not shared over the whole system, but only between small groups of particles. To define this formally, we first introduce the analog of a power set for states of particle systems.
Definition 26**.**
Let be a particle system and be a state of . We define the power set of the particle-system state as the set of all particle-system states101010Note that in general is not a state of in this context. with which fulfill the condition that for all it holds that is a submultiset of . In physical terms, this means that all particles found in appear in the same state in , but not vice versa.
Definitions 26 allows us to formalize the idea that calculating new particle states may not require knowledge of the entire immutable substate of the particle system, but only of a (potentially small) selection of particles from this state.
Definition 27**.**
Let be a particle system and let be a particle dynamics that possesses exclusive mutability via the immutable subsystem and the restricted update function . Furthermore, let for all be a family of Boolean predicates that encode group association between particles. Then we call a group-local dynamics under with respect to if there is a function such that
[TABLE]
In other words, we restrict the dependencies of such that for each particle only information from other particles for which indicates group association is necessary to calculate the new state of the particle.
An important detail in this definition is the fact that not just the evolved state, but also group association, must be decidable based on the information in an immutable substate of the particle system to prevent data races. It should also be noted that every particle dynamics that possesses exclusive mutability is a group-local dynamics with respect to some group predicate, e.g., in the case of a particle-local dynamics a group predicate or in the case of unrestricted particle interaction . However, the formalism allows us to express two other relevant subclasses of particle dynamics with exclusive mutability besides particle-local dynamics.
On the one hand, we can have group-local dynamics where the group association of particles is determined by some form of particle identity.
Definition 28**.**
Let be a particle system and let be a particle dynamics that is group-local with respect to . Furthermore, let be a family of identifying function for all such that
[TABLE]
In other words, for every possible state of the function must map every possible state of a particle of type to a unique number. We then call a fixed-group local particle dynamics if there is a function such that . Conceptually, this means that the decision of group association does not need to inspect the whole state of each particle, but only the unique identifications produced by the function family and the indices of the respective particle types.
This class of particle dynamics is commonly used to enforce an association between particles that cannot be derived from within the simulation context and is therefore imposed on the system extrinsically. Typically these kinds of associations between particles are also considered unbreakable within the context of the particle simulation, which gives rise to the name of this category of particle dynamics. The most notable example for this are (fixed) chemical bonds. Notably, unlike many other molecular dynamics models, we make no assumption on the number of atoms involved in the bond. Typically, groups of up to four atoms (dihedral angles) are considered in molecular dynamics.
On the other hand, we can also have dynamic group associations based on parts of the particle state that vary over the course of a many-particle simulation. Since particle interactions typically decay with distance between particles, by far the most common case of this is a group predicate using the distance between particles.
Definition 29**.**
Let be a particle system and let be a particle dynamics that is group-local with respect to . Furthermore, let be a cutoff distance and Boolean matrix indicating which particle types are allowed to interact. We then call a neighborhood-local dynamics if
[TABLE]
In practice, usually a special case of group-local dynamics is used in the form of pairwise particle interactions, e.g., in the form of pairwise forces between particles. These interactions represent a particular method of calculating the evolved state of each particle from the state of all particles interacting with it.
First, we find each particle that is interacting with the particle being updated. Then, we calculate a single quantity for each of these interaction partners using only the initial state of the particle being updated and the interaction partner. Finally, we use some reduction scheme to calculate the evolved state of particle being updated from its initial state and all of the previously calculated pair quantities. In the example case of pairwise forces, this corresponds to first calculating the force for each pair of interacting particles and then summing these forces to obtain the net force for each particle. We can formalize this concept as follows.
Definition 30**.**
Let be a particle system and let be a particle dynamics that is group-local under . Furthermore, for let be a family of physical quantity types, be a function family to map pairs of particle states to physical quantities and be family of reduction functions. We then call a pairwise interaction with the mapping functions and the reduction functions if
[TABLE]
In practice, pairwise interactions are often also neighborhood-local, e.g., in the case of the popular Weeks-Chandler-Anderson potential for particle interactions.
When trying to map real-world particle simulations into the models developed in this section, we typically cannot find a suitably specialized classification of the entire particle simulation for parallel execution. However, as stated before, real world particle simulations are often composed of multiple sub-dynamics which can be expressed as, e.g., particle-local dynamics or pairwise interactions. In these cases, one can use the parallel schemes developed here as long as there is a synchronization barrier between each sub-dynamics.111111This is a result of the third condition for data races demanding that the memory accesses must be unsynchronized.
An overview of the different kinds of particle dynamics presented in this section and their relationship towards one another can be found in Fig. 1.
IV A domain-specific language for data-race-free particle simulations
In this section, we present a practical application of the results of Sec. III in the form of a domain-specific language for many-particle simulations that can be parallelized while guaranteeing the absence of data races. First, we define the grammar of this language in Sec. IV.1 before discussing how to ensure deadlock freedom in Sec. IV.2 and finally presenting a number of example simulation codes to demonstrate the practicability of our approach in Sec. IV.3.
IV.1 Language grammar
The purpose of this subsection is to translate the abstract models developed in Sec. III into the practical form of a (minimalistic) programming language for expressing concrete particle dynamics in a fashion that can then be compiled into a runnable program. Much like the concepts in Sec. III lean heavily on the borrow rules of Rust, here we lean on Rust also on a syntactical level. To describe this formally, we use a slightly modified version of the EBNF notation (see Appendix B for an overview) to define the grammar of the language. It should be noted that the language we describe in the following is meant to be stand-alone, but rather supposed to cooperate with some kind of environment hosting it. This greatly simplifies the design as we can delegate tasks requiring general computation interfaces to this hosting environment.
It is useful to define some common syntax elements first before getting to the more unique design points of the language. For one, we define the following basic non-terminals by regular expressions in the syntax of Perl compatible regular expressions (PCRE) [39]
Identifier = [A-Za-z][A-Za-z0-9_]* Nat = 0|([1-9][0-9]) Integer = (+|-)(0|([1-9][0-9])) Float = [+-]?[0-9]+(.[0-9]*([eE][+-]?[0-9]+)?)?
which in order of appearance denote identifiers (i.e., symbol names), natural numbers, integer numbers, and floating point numbers.
Another common syntax element is that of an expression, which we define by the EBNF rule
Expression = Expression ("+" | "-" | "*" | "/") Expression | Identifier "(" (Expression ** ",") ")" | Identifier ("." Identifier)? ("[" Nat "]")? | "[" (Expression ** ",") "]" | Float | Integer
where the productions (in order of appearance) encode binary arithmetic expressions, function calls, variables (including options for namespacing and static indexing), array literals, and primitive number literals. In terms of semantics, the only noteworthy aspect of expressions is that for arithmetic expressions involving array types both vector addition and scalar multiplication are defined. Otherwise the typical semantics for imperative languages apply.
Finally, we define types via the EBNF rule
Type = "i64" | "f64" | "[" Type ";" Nat "]" | "position"
which in order of appearance correspond to 64-bit integer numbers, 64-bit floating point numbers, array types of fixed length, and the special position type which is equivalent to an array type of floating point numbers with a length corresponding to the dimensionality of the system. The position type might seem redundant, but compared to the general array type it encodes additional semantic meaning as it can be used to mark a quantity as the distinguished position quantity of a particle type.
From these syntax elements, we first construct a representation of the global-state space introduced in Def. 11. Unless stated otherwise, the syntax constructs introduced in the following are all top-level constructs, i.e., can coexist simply separated by whitespace in a single unit of compilation. While for considerations of data-race freedom modeling the global state as a single lumped vector was sufficient, for practical purposes it is more sensible to allow splitting this vector into multiple distinct physical quantities. These can be expressed syntactically as a top-level construct similar to that of global variables:
GlobalStateMember = "global" Identifier ":" "mut"? Type ";"
Here, the optional mut keyword indicates whether this quantity of the global state can be changed after compilation.121212This allows certain optimization strategies such as constant folding or rewriting of expensive operations like divisions as a sequence of cheaper operations like additions and bitwise shifts.
While the syntax of the global state we propose here is similar as for global variables, the semantics is not. As discussed in Sec. III.2, the evolution of the global state can be better described by a general-purpose language. Therefore, we delegate the task of initializing and mutating the global state to the hosting environment.
Next, we look at the state space spanned by each particle type, which consists of an indexed set of physical quantities. For practical purposes, it is preferable to use symbolic names over numeric indices to distinguish both the quantities within a particle type and the particles types within a particle system. With this we can use a syntactical construct similar to heterogeneous product types (i.e., struct types) in Rust:
ParticleDefinition = "particle" Identifier "{" (ParticleQuantity ** ",") "}" ParticleQuantity = Identifier ":" ("mut")? Type
The optional mut qualifier is again a concession to possible optimizations such as eliding redundant storage of quantities that are constant for all particles of the same type. It should be noted that while the syntax is similar to struct types in Rust, the semantics is very different as the definition of a particle type neither introduces a new data type nor makes any statements about the layout of particle data in memory.
With the definitions of the static state structure being complete, we proceed with the syntactic encoding of particle dynamics.
As discussed at the end of Sec. III.2, real-world particle dynamics typically consist of a loop containing multiple of the different classes of dynamics discussed in Sec. III.2. To express this syntactically, a simulation schedule is required that defines each of the loop iterations for each particle type. It is also useful to allow filtering of dynamics based on the loop counter (i.e., the simulation step number), e.g., to switch on and off certain parts of the dynamics at different times or to schedule evaluation steps that calculate statistical data from the state of the particle system. Overall, we propose the following syntax rules:
Simulation = "simulation" Identifier "{" ScheduleFilter* "}" ScheduleFilter = "once" Nat "{" ParticleFilter "}" | "step" StepRange "{" ParticleFilter "}" StepRange = Nat | Nat? ".." Nat? ("," Nat)? ParticleFilter = "particle" Identifier "{" (Statement ** ";") "}"
A simulation is thus described as a two-layered nested structure of statement blocks where the first layer filters by simulation step and the second layer filters by particle type. For the simulation step filters, up to three numbers are required to express a step range where starting at step and ending at step (not inclusive) every -th step is taken. We suggest that a single number is to be interpreted as a value of with and , while the notation with up to three numbers should indicate , and in this order with defaults of , , and being used if the respective number is omitted. The particle filter construct should express that the contents of its block of statements only applies to the particle type indicated by its symbolic name. To decide whether a simulation should terminate, control should (temporarily) be passed back to the hosting environment after each iteration of the simulation loop.
Nested within the simulation schedule, we use a set of statements to both express particle-local dynamics and provide a mechanism to “escape” the restrictions of the syntax we propose to the more expressive host environment. To this end, we define the following syntax rules for statements:
Statement = "let" Identifier ":" Type "=" Expression | Identifier ("[" Nat "]")? "=" Expression | "call" Identifier | "update" Identifier ("." Identifier)?
The first two rules for statements are used to implement particle-local dynamics with the first one allowing to introduce new (temporary) variables and the second one allowing to either write a value to one of the previously declared variables or to mutate a particle quantity of the particle type of the current particle filter. The optional natural number in square brackets can be used to assign values to individual fields of vector variables or quantities.
Statements beginning with the keyword call indicate that control should be handed over to the hosting environment temporarily at this point in the simulation. This is likely to be implemented as a form of callback, i.e., a function pointer from the hosting environment registered under a symbolic name which can then be referenced in this kind of statement after the call keyword.
Finally, statements beginning with the update keyword implement general group-local interactions. Since interactions can be defined between two or more particle types, it is sensible to place these syntax constructs outside the simulation schedule (which is filtered by particle types) and reference them by a symbolic name after the update keyword. In this article, we only develop syntax for neighborhood-local, pairwise interactions and fixed-group local dynamics with a constant group size known at compile time as these are by far the most common primitives for molecular dynamics in particular and can be implemented with a relatively simple grammar. Both of these can be represented as follows:
PairInteraction = "pair interaction" Identifier "(" Identifier ":" Identifier "," Identifier ":" Identifier ")" "for" ("|" Identifier "|" "=")? Identifier "<" Expression InteractionBody FixedGroupInteraction = "fixed group interaction" Identifier "(" (Identifier ":" Identifier) ** "," ")" InteractionBody
Here, the dynamics is first bound to a name following the pair interaction or fixed-group interaction keywords, respectively. After this, the two interacting particle types for pairwise interactions or an arbitrary number of particle types are specified. This is done with a syntax similar to function parameters in Rust as a pair of identifiers per particle type, where the first identifier defines a namespace for the quantities of each of the types and the second identifier specifies the particle type by its symbolic name. This again represents a syntactic similarity of particle types and structured data types even though the semantics of both are very different. Binding a namespace to each particle involved in the dynamics is necessary as there is no guarantee that quantity names are unique between different particle types.
In the case of a pairwise interaction, following the particle type specification, after the for keyword is the required information for defining the particle neighborhood, which, as discussed in Def. 29, is defined by a cutoff distance. This distance is given as an expression after the symbol <. An implementation has to verify that this expression evaluates to a numeric value and only depends on constant quantities from the global state. Before this, one or two identifiers have to be placed in a notation resembling the equation . Both of these identifiers serve the purpose of binding symbolic names to the distance between two interacting particles with the first, optional identifier denoting the distance vector and the second identifier denoting the scalar distance. Binding the distance (vector) to a name in such a prominent position is motivated by the fact that many-particle interactions depend on the inter-particle distance to determine the interaction strength. In the case of periodic boundary conditions, this also allows to differentiate between the position of the particle in the simulation domain and the distance vector that might correspond to one of the images of the particle.
The last part of both the definition of a pairwise interaction and a fixed-group local dynamics is a section containing the information on what calculations are involved in the respective dynamics and which quantities are to be mutated. We propose the following syntax rules for this task:
InteractionBody = "{" ("common" "{" (InteractionStatement ** ";")* "}")? InteractionQuantity* "}" InteractionQuantity = "quantity" Identifier "-[" ReductionMethod "]->" TargetSpecifier "{" (InteractionStatement ** ";")+ ";" Expression "}" InteractionStatement = "let" Identifier ":" Type "=" Expression | Identifier ("[" Nat "]")? "=" Expression ReductionMethod = "sum" | "max" | "min" TargetSpecifier = (("-")? Identifier "." Identifier) ** ("=" | ",")
Here, an InteractionBody is composed of one optional common block as well as multiple definitions of interaction quantities, i.e., the physical quantities calculated for every pair of interacting particles. The purpose of the common block is to allow the definition of variables used for the calculation of more than one interaction quantities. The definitions of interaction quantities follows very closely to Def. 30.
Following the symbolic name of the interaction quantity is a specification of the reduction method, i.e., the equivalent to the function in Def. 30 which transforms the individual results of all pairwise interactions of a particle into a single physical quantity. Here, we suggest only three common reduction operations as examples, namely summing all pairwise interaction results or taking either the minimum or maximum of all interaction results.
After the specification of the reduction method follows a specification of the target particle quantities that this interaction quantity should be written to. The syntax for the target quantities can be seen as a variation of the pattern-matching syntax of Rust. Here, this pattern is matched against the result of the last expression in the block following the target specification. The target specification itself is composed of multiple namespaced particle quantities separated either by equality signs or commas. Here, an equality sign indicates that a single result from the calculation of the interaction quantity is written to multiple particle quantities, while a comma indicates that the result is an array which is to be split into its elements which are then written to distinct particle quantities. Both variants have the option of negating the result, which is a convenience for force-like quantities which are subject to Newton’s third law as these can then simply be expressed as p1.F = -p2.F for two particle namespaces p1 and p2 each of which contains a particle quantity F. Notably, target specifiers can contain both target quantities separated by equality signs and targets separated by commas.131313This is only useful for fixed-group local dynamics as pairwise interactions cannot have more than two targets. For example, an interaction quantity for an interaction of three particles with namespaces p1, p2, and p3, respectively, and a particle quantity F in each namespace can have the target specifier p1.F = -p3.F, p2.F. This indicates that the calculation of the interaction quantity produces a vector of two values, the first of which is written to p1.F and in negated form to p3.F while the second one is written to p2.F.
Finally, after the target specifier, the actual calculation of each interaction quantity follows in the form of a block of statements terminated by an expression that determines the value of the interaction quantity. In this context, the statements are limited to variable definitions and assignments. In all expressions appearing in this section, it is essential to respect the constraints of Def. 24, i.e., to forbid access to any particle quantities that are mutated by this interaction, i.e., all quantities appearing as part of a target specification in this interaction.
It should be noted that the fact that each interaction quantity can only be written to a single particle quantity is a restriction compared to the more general reduction function in Def. 30. The reason for this is the necessity for an implementation to statically analyze which quantities of each particle is immutable under the interaction and therefore visible in the computation of the interaction quantities. With the grammar above this task becomes trivial as the programmer must explicitly indicate that a variable is not immutable. A similar mandatory opt-in for mutability can also be found in the design of Rust.
For the two data-race-free, particle number altering dynamics – namely independently initializing insertions and independently selecting deletions – it is necessary to implement them in particularly tight integration to the hosting environment as they often require general purpose functionality (such as I/O operations to load data for initialization or random number generation for probabilistic deletion of particles). Therefore, we do not propose a special syntax for these dynamics but assume that they can be implemented by a suitable mechanism in the hosting environment (e.g., function pointers for dynamic dispatch or generics for static dispatch).
One aspect a reader might find surprising in the syntax developed in this section is the lack of any control-flow structures such as conditionals or loops. The primary reason for this is the fact that branching control flow makes static program analysis much harder if not impossible. In particular, for reasons explained in the following section, it is required that the sequence of elementary particle dynamics for each particle type must be known at compile time to guarantee the absence of deadlocks. Allowing conditional control-flow would require verifying all possible control flow paths which is impractical as their number grows exponentially with the number of control flow statements. This problem could be worked around by preventing conditional control flow to “leak” beyond the boundaries of each elementary particle dynamics, e.g. by using syntax elements such as ternary operators to allow branching control flow only on the level of expressions. However, even in this case branching code can still be problematic as it interferes with optimization strategies such as vectorization and adds complexity to the optimization tasks due to effects of branch prediction. Therefore, for very complex branching code it might be more viable to delegate this part of the dynamics to the hosting environment instead.
IV.2 Eliminating interaction deadlocks
As discussed at the end of Sec. III.2 for particle dynamics composed of multiple (sub-)dynamics, it is necessary to add synchronization barriers between each sub-dynamics for each particle type. As interactions can affect more than one particle type, this, however, also synchronizes the simulation schedules of multiple particle types at certain points. So far, we implicitly assumed that an implementation will ensure that these synchronization points match up, i.e., that if an interaction between a particle type A and a particle type B is implied in the simulation schedule of type A there is a corresponding one found in the schedule for type B. This, however, brings with it the problem of deadlocks, i.e., multiple interactions blocking each other from progressing in a dependency cycle. For example, consider the following simulation:
particle A { /* ... / } particle B { / ... / } particle C { / ... */ }
pair interaction AB(a: A, b: B) {} pair interaction BC(b: B, c: C) {} pair interaction CA(c: C, a: A) {}
simulation { step { particle A { update AB; update CA; } particle B { update BC; update AB; } particle C { update CA; update BC; } } }
This simulation will stall indefinitely as each simulation schedule prescribes a different interaction to be performed first.
To detect the presence of deadlocks in a simulation, we can use a graph-based intermediate representation of the simulation schedule. The nodes of this graph represent each interaction in the simulation schedule as well as by two special nodes the start and end of each simulation step. Then, for each particle type a directed path is added beginning at the start node and connecting the interaction nodes in the order they appear in the simulation schedule for this particle type. Each of these paths is then terminated at the end node. Such an interaction graph is presented in Fig. 2 for the previous example.
Since interaction deadlocks are created by a cyclic dependency of interactions on one another, a simulation is deadlock-free if the interaction graph is acyclic. This can be verified by standard methods such as Tarjan’s algorithm [40].
IV.3 Examples
In this section, we provide examples of common simulations to illustrate the application of the previous results to real-world particle dynamics.
Newtonian dynamics
First, we present in Listing LABEL:code:velocityverlet an example for a particle system subject to Newtonian dynamics. We employ the commonly used velocity-Verlet scheme to solve these dynamics numerically. The particle system itself is composed of free moving particles using the Weeks-Chandler-Andersen (WCA) potential [41]
[TABLE]
to model particle interactions. Here, is the particle distance, is an energy scaling factor, and is the diameter of the particles.
In the implementation, we first make use of the global state to define the time-step size, the particle diameter , and the energy scaling factor of the WCA potential as constants of the simulation. This allows us to insert the concrete values for these constants from the hosting environment without having to change the domain-specific simulation code.
Next, we define a particle type representing a spherical particle without orientational degrees of freedom. The physical state of the particle is fully described by its position, velocity, and mass, the latter of which is considered to be constant. Additionally, we define particle quantities for the net force acting on the particle and the total potential energy associated with the particle. These two quantities are not necessary to describe the physical state of the particle, but rather store the physical quantities resulting from particle interactions.
Afterwards, we define a pairwise particle interaction between two particles of the previously defined particle type with the cutoff distance defined by the WCA potential. After defining a variable for the common term , we define two interaction quantities. The first quantity is the repulsion force defined by , which is written to the force quantity of each particle with a negation for one of the particles to account for Newton’s third law. The second quantity calculates the potential energy of the interaction evenly split between both particles. This allows to calculate the total potential energy of the system by summing over all particles.
Finally, we define a simulation schedule for the Spherical particle type in combination with the velocity-Verlet integration scheme. Here, the particle dynamics is composed of a particle-local dynamics mutating position and velocity, a pairwise interaction that mutates forces and potential energy, and then another particle-local dynamics mutating velocity.
Overall, this example shows that even though the domain-specific language has no in-built support for any concrete numerical scheme, it can still express real-world simulations relatively concisely. For example, the entire velocity-Verlet scheme only takes up four lines of code.
Harmonic valence angle potential
Next, we look at the example in Listing LABEL:code:angularbond which presents an implementation of a chemical bond between three atoms with a harmonic potential for the valence angle, i.e.,
[TABLE]
where is the valence angle, is the bond strength, and is the equilibrium valence angle. This is an important example as potentials depending on bond angles are a very common archetype in molecular dynamics.
In this example, we reuse the same particle type as in Listing LABEL:code:velocityverlet for spherical particles and use a similar style of passing the constants and to the simulation via global state quantities. After this we define a fixed-group interaction for triples of particles. This interaction defines a single interaction quantity that writes three different forces to each of the three particles involved in the interaction. For the sake of clarity, we use the formulation of Ref. [42] for calculating the forces on each atom. This is algebraically equivalent, but numerically less efficient than the typical formulation found in molecular dynamics software packages for (harmonic) angular bonds.
Particularly noteworthy is the distvec function in this example. Typically, one would calculate the distance vector between two particles by a simple vector difference of the position vectors of each particle. However, this method is problematic if periodic boundary conditions are used, in which case one has to decide the image of each particle that is used for the distance calculation. For this purpose, we propose the implementation to support the pseudo function distvec that takes two particle namespaces and returns the distance vector for the particles they represent. In the case of periodic boundary conditions, we can define the distance vector between particles as the vector between the position of the first particle and the position of the closest image of the second particle (minimum image convention [43]).
Overall, this example shows one of the more common applications of fixed-group interactions in the form of a potential depending on bond angles. It can be extended easily to larger particle groups, e.g., to implement potentials depending on dihedral angles.
Brownian dynamics
The final example is shown in Listing LABEL:code:eulermaruyama, which demonstrates how a simple Brownian dynamics can be implemented in form of the Euler-Maruyama scheme. Assuming spherical particles, the numerical scheme then reads
[TABLE]
where is the position of the -th particle at time , is the time-step size, is the friction coefficient, is the total force acting on the -th particle at time , is the Boltzmann constant, is the system temperature, and is a Gaussian white noise vector with zero mean and a variance of one.
Again, the implementation begins with a declaration of the global state quantities, which in this case consist of , , and . Additionally, we use the global-state quantities to pass as a symbolic value for the sake of brevity in this example. The particle type is notably shorter compared to the earlier examples since particle mass and instantaneous velocity are irrelevant in the case of Brownian dynamics141414More precisely, inertial forces are considered negligible compared to friction forces.. Finally, in the definition of the simulation schedule, we find an almost identical expression to Eq. (18).
This example demonstrates how the model of particle dynamics we propose can be molded to the exact requirements of a numerical scheme. In particular, one can define particle types with just the degrees of freedom a particle has under a given numerical scheme thus saving resources. For example, in Brownian dynamics there is no need to store the velocity of each particle. Of course, this places some responsibility on the programmer to implement a numerically sound particle dynamics as the domain-specific language only prevents the user from formulating particle simulations that are unsound in the sense that they exhibit undefined behavior (e.g., data races).
V Conclusions
We have developed a formalization of the concepts of generic particle systems and based on this a very general definition of particle dynamics. This was done in the context of safe parallelism, i.e., the avoidance of data races, to find a safe abstraction for computational implementations of particle dynamics in the shared-memory model of parallel programming. For this we took inspiration from the design of the general purpose programming language Rust which is able to guarantee the absence of data races by enforcing a set of borrowing rules at compile time. We found that general particle dynamics are not algorithmically constrained enough to transfer these rules directly. Instead, we formulated a hierarchy of restrictions on particle dynamics that eventually allowed us to define concrete subclasses of particle dynamics which we can safely parallelize automatically. In particular, we identified two classes of particle dynamics that alter the number of particles in a particle system in such a way that they can be split into independent per-particle operations. For particle dynamics that do not add or remove particles, we found an equivalent of the borrowing rules of Rust in the context of particle systems. This concept was then concretized into multiple subclasses of particle dynamics that are common in practical applications.
After this, we designed a domain-specific programming language around our abstract formulation and classification of particle dynamics. Borrowing heavily from the syntax of existing languages, we formulated a minimalist grammar for this language and discussed where static analysis is required within this language to ensure data-race freedom. We found that unlike Rust, which utilizes an elaborate type system to ensure safety, for the language we designed it is sufficient to regulate symbol visibility and perform basic type checking. We also showed that by means of an appropriate intermediate representation we can not only guarantee data-race freedom, but also deadlock freedom. To prevent over-extending our language and still overcome the inherently limited expressiveness of a domain-specific language, we designed the language to be embedded in a general-purpose programming language and added mechanisms for these two contexts to cooperate.
Finally, we asserted the practicability of our model by expressing common operations from molecular dynamics in our domain-specific programming language. These examples also demonstrate the high degree of flexibility our model provides which we expect to be very useful for prototyping more exotic many-particle dynamics.
There are many opportunities for future extensions of our results presented in this work. For one, we have only formulated safe abstractions for particle dynamics that are local either to a single particle or to a small group of particles. There are, however, many physical systems that include long-range interactions (e.g., long-range electrostatic interactions) which often require more sophisticated algorithms such as Ewald summation [44], the particle-particle-particle-mesh method [45], or the Barnes-Hut method [46] for simulations. More research is required to find general, data-race-free abstractions for these kinds of methods.
Naturally implementing the language we designed here is another big task. A partial implementation of the model we designed here for multicore CPU-based systems can be found in Ref. [47]. Due to the high degree of parallelism we provide in our model of particle dynamics, the development of an implementation based on GPU hardware is another worthwhile avenue to continue this work.
There is also much opportunity for extending the syntax we gave in this article for real-world applications. Here, we presented a very minimal language model to ease the discussion, but this leaves much to be desired in terms of language features. Potential areas of improvement include extensions of the type system to support tensor quantities of a rank higher than one, support for sharing behavior between particle types to reduce redundant code, a module system to provide common primitives such as interactions for commonly used potentials, limited forms of conditional expressions and loops, and many more. Due to the already present possibility of integrating a general-purpose programming language into the model, strictly speaking these extensions do not add functionality, but will drastically increase the usability of any implementation of our model.
We expect that the model of particle dynamics and the language we developed for it will aid future research related to the numerical treatment of many-particle systems, and we hope that in return feedback from these applications will guide future refinement of abstractions for safe particle dynamics.
Acknowledgements.
We thank Cornelia Denz, Matthias Rüschenbaum, Stephan Bröker and Tobias Nitschke for helpful discussions. This work is funded by the DFG – Project-ID 433682494 - SFB 1459.
Appendix A Multiset notation
This article makes extensive use of multisets for which there is no generally accepted notation. In this section, we therefore formally define multisets, their notation, and all operations on them used in this work.
Definition 31**.**
A multiset over a set is defined as a tuple composed of the underlying set and a multiplicity function . is an element of of multiplicity (written as or if the multiplicity is irrelevant) if and only if . We represent a multiset by an enumeration of its elements, each repeated according to its multiplicity, in square brackets, e.g., the multiset over the underlying set is equivalent to . is the empty multiset which does not contain any elements.
Definition 32**.**
We define as the set of all multisets over .
Definition 33**.**
Let and be multisets. and are equal (written as ) if .
Definition 34**.**
The cardinality of a multiset is defined as .
Definition 35**.**
The sum of two multisets and over a common underlying set is defined as the multiset with .
Definition 36**.**
Let be a multiset. Another multiset is called a submultiset of (written as ) iff . We define the powerset as the sets of all submultisets of , i.e., .
Definition 37**.**
Let be a multiset over a set and a function. We define the map operation as . This operation creates a new multiset over the set from the result of mapping each element of to another element via the function while preserving multiplicity.
Definition 38**.**
Let be a multiset over a set and be a Boolean predicate function. We then define the select operation as
[TABLE]
This operation determines the largest submultiset of which only contains elements fulfilling the predicate .
Appendix B Modified EBNF notation
In this article we use a slight modification of the classical Extended Backus-Naur form (EBNF) syntax [48] to express the productions of a context-free grammar. In particular, we do not use the comma as a concatenation operator and instead simply use whitespace to separate the elements of a sequence. We also omit the terminating semicolon and indicate the end of a rule by indentation instead. Furthermore we do not use brackets and curly parentheses to denote repetition and instead use a similar notation to regular expressions, i.e., a question mark indicates that the preceding terminal or non-terminal is optional, a star indicates that it can be repeated an arbitrary time including zero, and a plus sign indicates arbitrary repetition with at least one occurrence. For the sake of brevity, we also introduce the double-star operator ** which indicates that the previous terminal or non-terminal can be repeated arbitrary many times but each repetition is separated by the following terminal or non-terminal. As an example, the expression "A" ** "," can produce the empty word, the word A, the word A,A, the word A,A,A, etc.. Finally, we omit explicit whitespace terminals as they contribute little to the actual language design. In an implementation of the grammar one must therefore take care to introduce mandatory whitespace as is necessary to eliminate ambiguity, e.g., in places where an identifier can collide with a keyword.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Fischer and Ciesla [2014] R. A. Fischer and F. J. Ciesla, Dynamics of the terrestrial planets from a large number of N-body simulations, Earth and Planetary Science Letters 392 , 28 (2014).
- 2Reynolds [1987] C. W. Reynolds, Flocks, herds and schools: A distributed behavioral model, SIGGRAPH Computer Graphics 21 , 25 (1987).
- 3Zhang et al. [2010] H. P. Zhang, A. Be’er, E.-L. Florin, and H. L. Swinney, Collective motion and density fluctuations in bacterial colonies, Proceedings of the National Academy of Sciences U.S.A. 107 , 13626 (2010) . · doi ↗
- 4Denz et al. [2021] C. Denz, A. Jurado, M. Rueschenbaum, J. Hallekamp, J. Jeggle, and R. Wittkowski, Light-driven microrobots: light fuels motion, in Complex Light and Optical Forces XV , edited by D. L. Andrews, E. J. Galvez, and H. Rubinsztein-Dunlop (SPIE, Bellingham, WA, USA, 2021). · doi ↗
- 5Rahman [1964] A. Rahman, Correlations in the motion of atoms in liquid argon, Physical Review 136 , A 405 (1964).
- 6Kong [2011] L. T. Kong, Phonon dispersion measured directly from molecular dynamics simulations, Computer Physics Communications 182 , 2201 (2011).
- 7Ristow [1992] G. H. Ristow, Simulating granular flow with molecular dynamics, Journal de Physique I 2 , 649 (1992).
- 8Van Gunsteren and Berendsen [1990] W. F. Van Gunsteren and H. J. Berendsen, Computer simulation of molecular dynamics: methodology, applications, and perspectives in chemistry, Angewandte Chemie International Edition 29 , 992 (1990).
