A General Overview of Formal Languages for Individual-Based Modelling of Ecosystems

Mauricio Toro

arXiv:1901.10820·cs.LO·January 31, 2019


TL;DR

This paper reviews various formal languages used for individual-based ecological modeling, focusing on process calculi and other approaches, highlighting their differences, advantages, disadvantages, and future research directions.

Contribution

It provides a comprehensive overview of existing formal languages for ecological modeling, emphasizing process calculi and comparing them with other methods.

Findings

1. Process calculi are widely used in ecological modeling.

2. Different languages offer distinct advantages and limitations.

3. Future research should explore hybrid approaches and new formal frameworks.

Abstract

Various formal languages have been proposed in the literature for the individual-based modelling of ecological systems. These languages differ in their treatment of time and space. Each modelling language offers a distinct view and techniques for analyzing systems. Most of the languages are based on process calculi or P systems. In this article, we present a general overview of the existing modelling languages based on process calculi. We also discuss, briefly, other approaches such as P systems, cellular automata and Petri nets. Finally, we show advantages and disadvantages of these modelling languages and we propose some future research directions.


Universidad Eafit

[email protected]


1 Introduction

The collective evolution of a group of individuals is important in many fields, for instance in systems biology, ecology and epidemiology. When modelling such systems, we want to know the emergent behavior of the whole population given a description of the low-level interactions of the individuals in the system. As an example, in eco-epidemiology the focus is on how the number of infected individuals in a population evolves over time and on how a small number of infected individuals may lead to an epidemic.

Eco-epidemiology can be seen as a particular case of population ecology. The main aim of population ecology is to gain a better understanding of population dynamics and to make predictions about how populations will evolve and how they will respond to specific management schemes. In epidemiology, such management schemes can be a cure for a disease, mechanisms to prevent a disease, such as vaccines, or mechanisms to prevent the vector (a species infected with a disease) from spreading it. To achieve these goals, scientists construct models of ecosystems and management schemes (e.g., [108]).

Luisa Vissat et al. argue that we can use these models to predict the system behaviour and use this predictive power to support decision-making [45]. As an example, in the case of the spread of diseases, models are used to find the optimal control strategies for containing the spread or to predict the results of a population reduction, as a key way to control disease in livestock and wildlife [63].

Various formalisms have been proposed in the literature for the individual-based modelling of ecological systems. These formalisms differ in their treatment of time and space, and each modelling language offers a distinct view and distinct techniques for analyzing systems. Surprisingly, many formalisms developed for population modelling have also been used to model music interaction in real-life performance; for instance, process calculi have been used to model real-time performance in music [105], computer music improvisation [85] and interactive music scores [102]. In fact, process calculi have been extensively applied to the modeling of interactive music systems [95, 94, 92, 88, 93, 87, 101, 90, 91, 86, 89, 85, 100, 106, 84, 105, 1, 102, 96, 55, 82, 79, 81, 83, 2, 99, 80, 97, 98, 78] and ecological systems [103, 60, 104, 61]. In addition, research on algorithms [57, 53, 70, 67, 69] and software engineering [74, 46] also contributes to this field.

In what follows, we explain, in Section 2, classic process calculi, the extensions of classic process calculi used to model population systems, and some applications in ecology; in Section 3, P systems and some extensions used to model population systems; and, in Section 4, other approaches used for the formal modelling of population systems. Finally, in Section 5, we discuss the advantages and disadvantages of these formalisms and we recommend some future research directions in this field. Note that the formalisms explained in this article are not limited to applications in ecology. Furthermore, this paper focuses on process calculi and P systems, but we also present other approaches for the formal modelling of population systems, such as cellular automata and Petri nets.

2 Process calculi

Process calculi (or process algebras) are a diverse family of related approaches for formally modelling concurrent systems. Process calculi provide a tool for the high-level description of interactions, communications, and synchronizations between a collection of independent processes. Process calculi also provide algebraic laws that allow process descriptions to be manipulated and analyzed, and allow formal reasoning about equivalences between processes.

In what follows, we first explain classic process calculi. After that, we explain, in more detail, different extensions of classic process calculi used to model population systems and some applications in ecology. Note that the process calculi explained in this section are not limited to applications in ecology.

2.1 Classic Process Calculi

CCS.

The calculus of communicating systems (ccs) [50] is a process calculus introduced by Robin Milner around 1980. The calculus includes primitives for describing parallel composition, choice between actions and scope restriction. ccs is useful for evaluating qualitative correctness properties of a system such as deadlock or livelock. The expressions of the language are interpreted as a labeled transition system, and bisimilarity is used as the semantic equivalence between two labeled transition systems. In fact, there is a tool for checking bisimulation, as well as for the simulation and model checking of ccs and sccs (http://homepages.inf.ed.ac.uk/perdita/cwb/doc/).

SCCS.

Soon after the creation of ccs, Milner developed an extension of ccs. In synchronous ccs (sccs) [51], concurrent processes proceed, in contrast to ccs, simultaneously in lockstep. This means that at every step the concurrent processes in a composite system perform a single action.
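To make the contrast with ccs concrete, the following minimal Python sketch compares the two execution styles for two independent processes given as fixed action sequences. It is an illustration only: the function names are hypothetical, synchronization between complementary actions is ignored, and the lockstep version simply pairs the k-th actions of both processes.

```python
def interleavings(p, q):
    """All orders in which two action sequences can be interleaved (asynchronous style)."""
    if not p:
        return [list(q)]
    if not q:
        return [list(p)]
    # Either the first process moves first, or the second one does.
    first_from_p = [[p[0]] + rest for rest in interleavings(p[1:], q)]
    first_from_q = [[q[0]] + rest for rest in interleavings(p, q[1:])]
    return first_from_p + first_from_q


def lockstep(p, q):
    """Single synchronous run: at every step both processes perform one action."""
    return list(zip(p, q))
```

For two processes with two actions each, the asynchronous reading yields six possible runs, whereas the lockstep reading yields a single run of two composite steps.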

CSP.

Communicating sequential processes (csp) [38] is a process calculus first described in 1978 by C. A. R. Hoare. csp differs from ccs in two ways. First, in csp, processes composed in parallel cooperate in a multi-party synchronization, whereas in ccs parallel processes interact by means of a two-party synchronization. Second, synchronization in csp is made by symmetrical actions, whereas in ccs there are actions and co-actions. Soon after the invention of ccs and csp, the need to send channel names along the channels themselves motivated the invention of the pi-calculus.

Pi-calculus.

The pi-calculus (or $\pi$-calculus) [52] is a process calculus developed by Robin Milner, Joachim Parrow and David Walker in 1992, as a continuation of Milner’s work on ccs. The pi-calculus allows channel names to be communicated along the channels themselves. Using this feature, one is able to describe concurrent computations whose network configuration may change during the computation.

CCP.

Concurrent constraint programming (ccp) was developed by Vijay Saraswat in 1991. In ccp, a system is modeled in terms of processes adding to a common constraint store partial information on the value of variables. Concurrent processes synchronize by blocking until a piece of information can be deduced from the store. ccp is based on the idea of a constraint system (cs). A constraint system is composed of a set of (basic) constraints and an entailment relation specifying constraints that can be deduced from others.
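The interplay between adding information and blocking on entailment can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not an implementation of ccp: the only constraints are lower bounds of the form "x >= c", entailment is a comparison of bounds, and the names `ConstraintStore`, `tell` and `ask` are chosen here for the example.

```python
class ConstraintStore:
    """Toy constraint store: keeps, per variable, the tightest known lower bound."""

    def __init__(self):
        self._lower = {}

    def tell(self, var, bound):
        # Adding information is monotonic: a stored bound never gets weaker.
        self._lower[var] = max(bound, self._lower.get(var, float("-inf")))

    def entails(self, var, bound):
        # "var >= bound" can be deduced once the stored bound is at least as tight.
        return self._lower.get(var, float("-inf")) >= bound


def ask(store, var, bound, continuation):
    """Run the continuation if the store entails var >= bound; otherwise stay blocked (None)."""
    if store.entails(var, bound):
        return continuation()
    return None
```

A process asking for "x >= 5" stays blocked while the store only knows "x >= 3" and proceeds once another process tells "x >= 7".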

2.2 Stochastic process calculi

Stochastic process algebras aim to combine two successful approaches to modelling: labeled transition systems and continuous-time Markov chains [31]. Labeled transition systems are argued in [31] to be a very convenient framework for providing compositional semantics of concurrent languages and for proving qualitative properties. Markov chains, instead, have been used for performance evaluation and for the analysis of quantitative properties. The common feature of most stochastic process calculi is that their actions are enriched with rates of exponentially distributed random variables characterizing their mean duration.
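A common way to read such rates, assumed here for illustration rather than taken from the text, is as a race: every enabled action samples an exponentially distributed delay and the fastest one is executed. The Python sketch below uses only the standard library; the function names are hypothetical.

```python
import random


def race(rates, rng):
    """Sample one step: each enabled action draws an exponential delay; the fastest wins."""
    delays = {action: rng.expovariate(rate) for action, rate in rates.items()}
    winner = min(delays, key=delays.get)
    return winner, delays[winner]


def winning_probability(rates, action):
    """Probability that an action wins the race: its rate over the sum of all rates."""
    return rates[action] / sum(rates.values())
```

With rates 1.0 and 3.0 for two competing actions, the second one wins three times out of four on average, and the mean time until some action fires is 1/(1.0 + 3.0).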

Continuous Pi-Calculus.

Ever since the invention of the pi-calculus, there has been an interest in using it to model ecosystems. A successful development in this direction is an extension of the pi-calculus with continuous time. The continuous pi-calculus [43] is an extension of the pi-calculus in which the processes evolve in continuous time. It can be seen as a modular way to generate ordinary differential equations for the behavior of a system over time. The continuous pi-calculus has been used to model biochemical systems [43].

Stochastic ccs.

In stochastic ccs (stoccs), output actions are equipped with a parameter characterizing a random variable with a negative exponential distribution, modelling the duration of the action [31]. Input actions are denoted with a weight, a positive integer used to determine the probability that the specific input is selected when a complementary action is executed [31]. To our knowledge, there are no tools available for its simulation or verification. This calculus has not been used to model ecosystems, but the definition of rate-based transition systems has inspired many other modelling languages [31].

As argued by Cardelli et al. in [23], one cannot guarantee associativity of the parallel composition operator up to stochastic bisimilarity when the synchronization paradigm of ccs is used in combination with a synchronization rate computation based on apparent rates. This is a problem, especially in the presence of dynamic process creation.

Stochastic pi-calculus.

The stochastic pi-calculus is an extension of the pi-calculus with rates associated to the actions, developed by Corrado Priami in 1995. Recently, Cardelli et al. developed a new semantics with replication and fresh name quantification [23]. Cardelli et al. argued that, with previous semantics, parallel composition failed to be associative up to stochastic bisimilarity. With the new semantics it is now possible to capture associativity, which will lead to new applications and simulation tools. In fact, there is already a simulation tool for the stochastic pi-calculus named the stochastic pi machine (Spim) (http://research.microsoft.com/en-us/projects/spim/); Spim uses the probabilistic model checker prism. Another approach for model checking has been developed by Norman et al. [54].

The stochastic pi-calculus has been used in systems biology to model the regulation of gene expression by positive feedback [62].

PEPA.

The properties that may be checked for concurrent systems modeled by an algebraic description include freedom from deadlock and algebraic equivalence under observation [35]. Hillston et al. argue that there are further properties of interest for process calculi models such as steady-state probabilities and rewards for performance measures. Performance evaluation process algebra (pepa) [35] was defined by Hillston et al., in 1994, to allow the specification and verification of such properties. Every activity in pepa has an associated duration which is a random variable with an exponential distribution, determined by a single real-number parameter. A pepa process can be translated into a continuous-time Markov process [35].

pepa has been used in systems biology to model the influence of the raf kinase inhibitor protein on the extracellular signal regulated kinase signalling pathway [19], and in epidemiology to model measles epidemics in the UK from 1944 to 1964 [11].

Bio-pepa.

Bio-pepa [27] is a process calculus for modelling and analyzing biochemical networks. Bio-pepa is a modification of pepa that adds features such as support for stoichiometry and general kinetic laws. Kinetic laws are functions that allow us to derive the rate of a reaction from varying parameters such as the rate coefficients and the concentrations of the reactants [56]. According to Hillston et al., the main difficulty with pepa is the definition of stoichiometric coefficients [27]; these coefficients express the quantitative relationships between the reactants and products of a biochemical reaction. A major feature of Bio-pepa is the possibility of representing explicitly some features of biochemical models, such as stoichiometry and the role of species in a given reaction. Bio-pepa is also enriched with some notions of equivalence, such as isomorphism and bisimulation, extended from pepa. Bio-pepa has been used in epidemiology to model the H5N1 avian influenza [28].
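As an illustration of what a kinetic law looks like, consider the familiar mass-action law. The reaction, the symbols $k$, $a$, $b$ and the species names below are assumptions chosen for the example and are not taken from [27] or [56]. For a reaction $aA + bB \rightarrow C$ with rate coefficient $k$ and stoichiometric coefficients $a$ and $b$, the mass-action rate is

$$v = k \, [A]^{a} \, [B]^{b}$$

where $[A]$ and $[B]$ denote the concentrations of the reactants. For instance, with $a = 2$, $b = 1$, $k = 0.5$, $[A] = 2$ and $[B] = 3$, the rate is $v = 0.5 \cdot 2^{2} \cdot 3 = 6$. The stoichiometric coefficients therefore enter the rate directly, which suggests why a calculus without explicit stoichiometry makes such laws awkward to express.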

Stochastic CCP.

Stochastic ccp (sccp) was developed by Bortolussi [18], in 2006. sccp is an extension of ccp by adding a stochastic duration to all instructions interacting with the constraint store. In sccp, each instruction has an associated random variable, exponentially distributed [18]. Bortolussi et al. also proposed an approximation of sccp semantics into ordinary differential equations [16].

Bortolussi et al. have used sccp to model biochemical reactions such as an enzymatic reaction [18]. sccp has also been used in systems biology [17] and to model prey-predator dynamics [16].

2.3 Probabilistic Process Calculi

Process calculi depend on concepts of non-deterministic choice. Tofts argues that it is natural to consider extending such systems by adding a probabilistic quantification for non-determinism [77]. In terms of expressiveness of modelling, McCaig et al. argue that probabilistic choice is a more natural way to express individual behavior than stochastic rates of activities for epidemiological models [48]. Another alternative for the semantics of probabilistic calculi is to have both non-deterministic and probabilistic choice, as explained by Norman et al. [54].

Weighted SCCS.

A probabilistic calculus derived from Milner’s sccs [51] is weighted sccs (wsccs) [77]. This calculus was developed by Chris Tofts in 1994. wsccs is synchronous because the purpose is to quantify the relative frequency of free simultaneous choice. In an asynchronous system, choices may be resolved at arbitrary times, thus giving a choice which may not be between equally free objects. wsccs does not quantify choices directly with probabilities; instead, it uses weights, which are interpreted as probabilities via the concept of relative frequency.
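The relative-frequency reading of weights can be sketched as a simple normalization. The Python function below is a hypothetical helper written for this illustration, and the branch names are invented; it assumes positive integer weights.

```python
from fractions import Fraction


def weights_to_probabilities(weighted_choice):
    """Interpret the weights of one choice as probabilities via relative frequency."""
    total = sum(weighted_choice.values())
    return {branch: Fraction(weight, total) for branch, weight in weighted_choice.items()}
```

For a choice with weights 1 and 3, the branches are taken with probabilities 1/4 and 3/4, and multiplying all weights by the same factor leaves the probabilities unchanged.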

McCaig et al. presented a new semantics in terms of mean field equations in [48]. This semantics captures the average behavior of the system over time, without computing the entire state space; therefore, avoiding the state-space explosion problem. The new semantics is shown to be equivalent to the standard discrete-time Markov chain semantics of wsccs, as the number of processes tends to infinity. Using wsccs’s probabilistic workbench, one can analyze systems in wsccs [48]. There is also some work on population models for stochastic process algebras done by Hillston et al. [37, 107]. A related interesting work is the model checking of mean-field approximations of continuous-time Markov chains [15].

wsccs has been used to model human-population growth [48], and it has been employed in various ecological studies by its author and others [75, 47].

Probabilistic csp#.

Hierarchical probabilistic csp (pcsp#) [76] extends csp with probabilistic choice and data structures. pcsp# was developed by Jun Sun et al. in 2010. pcsp# combines sequential programs defined in a simple imperative language such as C# with high-level specifications such as parallel composition, choice and hiding, as well as probabilistic choice. Its underlying semantics is based on Markov decision processes.

Although pcsp#’s semantics are given on Markov decision processes, Sun et al. argue that existing probabilistic model checkers have been designed for simple systems without hierarchy [76]. For that reason, they extended an existing model checker to support probabilistic model checking of hierarchical complex systems. Probabilistic csp has been used to model sensor networks, security protocols and probabilistic algorithms. pcsp# has not been used to model ecosystems.

Probabilistic pi-calculus.

The probabilistic pi-calculus was developed by Herescu et al. in 2000 [36]. In the probabilistic pi-calculus, there are two types of choice operators: non-deterministic and probabilistic. The operational semantics for probabilistic extensions of the pi-calculus are typically expressed in terms of Markov decision processes. Existing semantics are concrete, which means that processes are encoded directly into Markov decision processes. Norman et al. defined a symbolic representation of the calculus semantics [54]. The main feature of Norman et al.’s symbolic semantics is that the input variable of input transitions is kept as a name variable and the communication rule that matches the input and the output channel is represented by a condition [54]. Norman et al. used this semantics for model checking of the calculus with the probabilistic model checker prism. Norman et al. also used this calculus to model and verify randomized security protocols [54]. The probabilistic pi-calculus has also been used for model checking models of systems biology [42].

Probabilistic CCP.

There is a timed, probabilistic, non-deterministic extension of ccp (pntcc) [58], developed by Pérez et al. in 2009. They defined the semantics as a discrete-time Markov chain in which non-determinism is unobservable because it is removed by a scheduler that must be defined along with the model. pntcc has not been used for models in systems biology or ecology.

2.4 Spatial Process Calculi

Spatial aspects and locations are required in ecosystems because such systems can be physically distributed in space and this distribution can affect the length of time required for an activity to occur [33]. Spatial process calculi are extensions of process calculi with different notions of space. Spatial features of biological systems can be studied in different ways [56]. Pardini argues that most computer-science formalisms focus on the abstract spatial feature which concerns the modelling of compartments; nonetheless, he also argues that there are other formalisms that focus on a more concrete notion of space (e.g., continuous 2D space) [56].

palps.

Process algebra with locations for population systems (palps) can be considered as an extension of ccs with probabilistic choice, and with locations and location attributes [61]. It shares a similar treatment of locations with process algebras developed for reasoning about mobile ad hoc networks; for instance, Galpin’s spatial extension to pepa and Kouzapas et al.’s calculus for dynamic networks [41]. Nonetheless, such approaches are stochastic. palps shares some similarities with sccs although it does not provide synchronous parallel composition nor probabilistic choice based on weights. palps considers a two-dimensional space where locations and their interconnections are modeled as a graph upon which individuals may move as computation proceeds. Compartments are static in palps. In this matter, palps is related to the spatial calculi previously described such as the calculus of wrapped compartments or to cellular automata. palps was used to model prey-predator dynamics [61].

palps was also extended with process ordering because simulations carried out by ecologists often impose an order on the events that may take place within a model [60]. As an example, if we consider mortality and reproduction within a single-species model, three cases exist: concurrent ordering, reproduction preceding mortality and reproduction following mortality. Process ordering in palps was used to model the reproduction of the parasitic varroa mite that attacks honey bees [60].

Synchronous palps (s-palps).

To alleviate the problem of the interleaving nature of parallel composition and the high level of non-determinism in palps, Toro et al. proposed a new semantics of palps, and an associated prism translation, that disassociates the number of modules from the maximum number of individuals, while capturing more faithfully the synchronous evolution of populations and removing as much unnecessary non-determinism as possible. The synchronous extension of palps is named synchronous palps (s-palps) [104]. This synchronous semantics of palps implements the concept of maximum parallelism: at any given time all individuals that may execute an action will do so simultaneously.

Toro et al. developed a mean-field semantics for s-palps [103]. This semantics allows an interpretation of the average behavior of a system as a set of recurrence equations. Recurrence equations are a useful approximation when dealing with a large number of individuals, as is the case in epidemiological studies. s-palps was used to model dengue transmission in Bello (Antioquia), Colombia, in particular, to analyze the factors involved in the epidemiology of dengue disease [103]. The model presented by Toro et al. in [103] is an individual-based version of the model presented in [108].
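To give a feel for what a set of recurrence equations over average population counts looks like, the following Python sketch iterates a generic discrete-time susceptible-infected-recovered recurrence. It is not the s-palps semantics of [103] nor the dengue model of [108]: the compartments, the parameter names `infection_prob` and `recovery_prob`, and the update rule are assumptions made purely for illustration.

```python
def mean_field_step(s, i, r, infection_prob, recovery_prob):
    """One step of a discrete-time recurrence over average S, I and R counts."""
    n = s + i + r
    new_infections = infection_prob * s * i / n  # expected new infections this step
    new_recoveries = recovery_prob * i           # expected recoveries this step
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries


def iterate(state, steps, infection_prob, recovery_prob):
    """Apply the recurrence repeatedly and return the whole trajectory."""
    trajectory = [state]
    for _ in range(steps):
        state = mean_field_step(*state, infection_prob, recovery_prob)
        trajectory.append(state)
    return trajectory
```

Each step costs a constant amount of work regardless of the population size, which is the practical appeal of such approximations compared with exploring the state space of every individual.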

Spatial pepa.

Galpin introduced spatial notions to the stochastic process algebra pepa, creating spatial pepa [33], in 2009. Galpin was motivated by both computer networks and epidemiology where locations of actions or processes may affect the time taken by an event. She defined a very general stochastic process algebra with locations, where locations are introduced to both actions and processes.

Galpin defined a weighted hypergraph structure to represent locations. The locations may or may not have structure. Nodes in a directed graph may represent computers in a network or locations in n-dimensional space. The graph edges could represent simple network connectivity, or a tree structure representing nesting of locations. As another example of its generality, the weights in the hypergraph structure could be obtained from a distance function or from some other source.

Galpin used Spatial pepa to model the traversal of a single packet through a network [33]. Spatial pepa has not been used to model ecosystems.

BioAmbients.

The BioAmbients calculus [68] was developed by Cardelli et al. in 2004. It was motivated by the Ambient calculus [72], originally developed by Cardelli for the specification of process location and movement through computational domains [22]. Bio-molecular systems, composed of networks of proteins, underlie the major functions of living cells. According to Cardelli et al., compartments are the key to the organization of such systems, and BioAmbients is inspired by such biological compartments. The novelties with respect to the Ambient calculus are the movement of molecules between compartments, the dynamic rearrangement of cellular compartments and the interaction between molecules in a compartmentalized setting. Cardelli et al. adapted the BioSpi simulation tool for their new calculus and used it to study a complex multi-cellular system [68].

Brane calculus.

A biological cellular membrane is an oriented closed surface that can perform various molecular functions. One constraint of such membranes is bi-tonality. This constraint requires nested membranes to have opposite orientations, so they can be coded with a coloring system of two tones. The organization of these membranes inspired the creation of the Brane calculus, in 2004, by Cardelli et al. [20]. In Brane, there is composition of systems, composition of membranes and replication. There is also a choice operation that can be added to membranes, and communication in ccs style. Cardelli argues that it is difficult to encode this calculus in BioAmbients, although they are closely related. The Brane calculus has been used to model the behavior of the Semliki Forest virus [49].

Calculus of wrapped compartments.

The calculus of wrapped compartments (cwc) [14] is also based on the notion of a compartment. cwc has no explicit structure modelling a spatial geometry; nonetheless, Bioglio et al. argue that the compartment labeling feature can be exploited to model various examples of spatial interactions in a natural way [14].

To model 2D space, cwc allows a compartment to be labeled with a 2D coordinate, denoted by a tuple of integers (row,column); in this way, a two-dimensional grid can be defined [14]. Modelling the space as a two-dimensional grid, this calculus was used to analyze a growth model for arbuscular mycorrhizal fungi [14]. Biochemical transformations are described via a set of stochastic reduction rules which characterize the behavior of the represented system. Stochastic simulation can be defined along the lines of Gillespie’s simulation. Bioglio et al. performed simulations of the growth model using the surface language (a language they developed) [14].

Spatial pi-calculus.

Most calculi have focused on population-based approaches or on how to place and move individuals relative to each other’s position in space. SpacePi [39] extends the pi-calculus with continuous time and space. Processes are embedded into a vector space and move individually. Only processes that are sufficiently close can communicate.

According to John et al., existing spatial individual-based approaches have so far focused on indirect, relative space [39]. For that reason, John et al. developed SpacePi to explicitly model the location and movement of individuals in absolute space. By combining an individual-based perception and absolute space, not only diffusion processes, but also active transportation inside membranes can be modeled realistically at different resolutions [39]. This makes their approach different from other pi-calculus extensions, such as the stochastic pi-calculus. It implies that existing simulation engines based on Gillespie’s algorithm cannot be used, and also that simulation will be less efficient than with such engines. This calculus has been used by John et al. to model the behavior of euglena, a micro-organism living in inland water whose motion is influenced by sun irradiation. John et al. ran simulations on a tool of their own [39].
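The proximity condition on communication in SpacePi can be sketched as a distance test between process positions. The Python code below is only an illustration: the Euclidean metric, the inclusive threshold, the `radius` parameter and the function names are assumptions, not details taken from [39].

```python
import math


def can_communicate(pos_a, pos_b, radius):
    """Two processes may interact only if their positions are within the given radius."""
    return math.dist(pos_a, pos_b) <= radius


def communication_pairs(positions, radius):
    """List every unordered pair of named processes that is close enough to communicate."""
    names = sorted(positions)
    return [
        (a, b)
        for k, a in enumerate(names)
        for b in names[k + 1:]
        if can_communicate(positions[a], positions[b], radius)
    ]
```

Because the set of possible communication partners changes as individuals move, it has to be recomputed as positions are updated during a simulation.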

Spatial ccp.

Barco et al. recently proposed a spatial extension of ccp [7]. In spatial ccp, each process has a computation space that can be shared with other processes. In addition, it is possible to have nested computation spaces. Spatial ccp has not been used to model ecosystems.

Modelling in Ecology with Location Attributes.

Modelling in Ecology with Location Attributes (mela) is a process calculus to formally describe ecological systems, with a focus on space abstraction and environmental description [45]. Like other stochastic process algebras, mela allows complementary analysis techniques, such as stochastic simulation and the numerical solution of ordinary differential equations. In mela, the modeller can choose between different spatial structures; for instance, a graph, a discrete line segment, a 2D grid, a 3D grid, or a nested spatial structure. mela has been used to model prey-predator dynamics (the Lotka-Volterra model) and the spread of a cholera epidemic [45]. A disadvantage of mela is the lack of tools to support the spatial and temporal analysis that could be possible using its semantics.

Process Algebra for Located Markovian Agents.

Process Algebra for Located Markovian Agents (paloma) [32] uses multi-class multi-message Markovian agents to model collective systems comprised of populations of agents which are spatially distributed. paloma is equipped with both discrete event and differential semantics. The discrete semantics provides the theoretical foundation for discrete event simulation and the differential semantics allows us to automatically derive the underlying mean-field model. paloma was used to model a simplified scenario of the 1918-1919 flu epidemic in central Canada and to investigate the effect of quarantine on the spread of the flu.

Other extensions of CCS.

Bartocci et al. proposed an extension of CCS to model 3D space. This extension is called the shape calculus [9, 8].

Other extensions of the Pi-calculus.

Pardini described several spatial extensions of the pi-calculus [56]: he mentioned Beta binders, the $\pi@$ calculus, the attributed $\pi$-calculus, the imperative pi-calculus and the $3\pi$ calculus. In Beta binders, the notion of compartments is represented by means of boxes which contain pi-calculus processes. In the $\pi@$ calculus, channels are associated with compartments. In the attributed pi-calculus, processes can have attributes, for instance, information about their location. In the imperative pi-calculus, there is a global store with information on the volume of each compartment. Finally, in the $3\pi$ calculus, processes are embedded in 3D space. It is worth noting that some of these extensions have also been extended with stochasticity, as is the case of the $\pi@$ calculus.

Beta binders has been used to model biological interactions [64], and the $\pi@$ calculus has been used to model the euglena’s phototaxis, gene regulation at the lambda switch and population models [40].

3 P systems

A different approach towards the modelling of ecological systems is that of P systems [65, 66]. P systems [65] are a class of distributed and parallel computing models described by Gheorghe Păun in 2001. A cell is considered, in an abstract way, as a hierarchy of compartments enclosed by membranes. Each membrane represents a region and contains a multiset of objects. Each compartment may include elementary objects as well as other compartments. Processes in a cell are viewed as sequences of discrete events.

Romero et al. argue that P systems started from the assumption that the processes taking place in the compartmental structure of a living cell can be interpreted as computations [71]. They also argue that a P system consists of cell-like membrane structures (known as compartments) in which one places multisets of objects that evolve according to given rewriting rules (i.e., evolution rules).

3.1 Stochastic P systems

Spicher et al. described a sequential strategy for executing P systems based on Gillespie’s stochastic simulation algorithm [73]. They defined a stochastic simulation of a new class of P systems, called stochastic P systems. In rewriting systems, the policy for deciding which rule(s) will be applied is called the application strategy [73]. They argued that, in case of conflicts, rules are selected non-deterministically. The maximal-parallel strategy is well suited for the modelling of discrete dynamic systems operating synchronously. This strategy means that the rules are applied simultaneously in a maximal way during each rewriting step. This strategy is less suited to capture events that occur asynchronously in continuous time.

Gillespie’s algorithm is an alternative rule application strategy for P systems. Gillespie’s algorithm leads to a sequential application strategy for these rules: only one rule is applied in each derivation (simulation) step [73].
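The sequential strategy can be sketched as follows. This is a minimal illustration and not Spicher et al.'s MGS implementation: the rule format, the rate constants and the names `propensity` and `gillespie_step` are assumptions introduced here, and mass-action propensities over a single region are assumed.

```python
import math
import random
from collections import Counter

# Hypothetical rules: (rate constant, left-hand multiset, right-hand multiset).
RULES = [
    (1.0, Counter({"prey": 1}), Counter({"prey": 2})),               # prey birth
    (0.005, Counter({"prey": 1, "pred": 1}), Counter({"pred": 2})),  # predation
    (0.6, Counter({"pred": 1}), Counter()),                          # predator death
]


def propensity(rate, lhs, state):
    """Mass-action propensity: rate times the number of ways to pick the reactants."""
    a = rate
    for species, n in lhs.items():
        a *= math.comb(state[species], n)
    return a


def gillespie_step(state, t, rng):
    """Apply exactly one rule and advance time; return None if no rule is enabled."""
    weighted = [(rule, propensity(rule[0], rule[1], state)) for rule in RULES]
    weighted = [(rule, a) for rule, a in weighted if a > 0]
    total = sum(a for _, a in weighted)
    if total == 0:
        return None
    dt = rng.expovariate(total)        # exponentially distributed waiting time
    threshold = rng.random() * total   # pick a rule proportionally to its propensity
    acc = 0.0
    for (_rate, lhs, rhs), a in weighted:
        acc += a
        if threshold < acc:
            break
    new_state = Counter(state)
    new_state.subtract(lhs)
    new_state.update(rhs)
    return new_state, t + dt
```

Each call rewrites the multiset with a single rule, in contrast with a maximal-parallel step, where all enabled rules would fire together.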

Spicher et al. implemented stochastic P systems using the spatially-explicit programming language MGS [73]. They used stochastic P systems to model the Lotka-Volterra auto-catalytic system and the life cycle of the Semliki Forest virus [73]. Romero et al. provide a translation of stochastic P systems to prism, transforming a P system specification into a prism input [71]. This translation allows model checking of stochastic P systems.

3.2 Probabilistic P systems

Probabilistic P systems have been applied to a wide range of situations in the field of ecology (e.g., [13]). The semantics of probabilistic P systems is closely related to that of s-palps: rules are usually applied with maximal parallelism, and several proposals have been considered for resolving the non-determinism that may arise when more than one combination of rules is possible, for instance, [59, 24]. Probabilistic P systems have been applied to model the population dynamics of various ecosystems [12, 10] as well as to study the invasiveness of eastern species in European water frog populations [4]. As an example, there is a model of mosquitoes in Italy that uses attributes such as temperature and humidity [10].

In probabilistic P systems, each rule carries a real number between zero and one: the probability that the rule is applied [3]. As expected, the probabilities of all rules with the same left-hand side must sum to one. The rules are applied in a maximal-consistent parallel way, as usual. The inherent randomness and uncertainty of ecosystems are captured by these probabilistic strategies.
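The normalisation condition and a parallel rewriting step can be sketched as below. This is a deliberately simplified illustration: the life-stage rules are hypothetical, each left-hand side is a single object rather than a multiset (so rules never compete for objects), and the names `check_normalised` and `step` are our own.

```python
import random
from collections import defaultdict

# Hypothetical rules: (left-hand side, right-hand side, probability).
RULES = [
    ("egg", "larva", 0.7),
    ("egg", "dead", 0.3),
    ("larva", "adult", 0.4),
    ("larva", "larva", 0.5),
    ("larva", "dead", 0.1),
]


def check_normalised(rules, tol=1e-9):
    """Raise ValueError unless probabilities sharing a left-hand side sum to one."""
    totals = defaultdict(float)
    for lhs, _rhs, p in rules:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for {lhs}: {p}")
        totals[lhs] += p
    bad = {lhs: s for lhs, s in totals.items() if abs(s - 1.0) > tol}
    if bad:
        raise ValueError(f"probabilities do not sum to one: {bad}")


def step(population, rules, rng):
    """One parallel step: every object with an applicable rule is rewritten once."""
    by_lhs = defaultdict(list)
    for lhs, rhs, p in rules:
        by_lhs[lhs].append((rhs, p))
    nxt = defaultdict(int)
    for obj, count in population.items():
        options = by_lhs.get(obj)
        if not options:
            nxt[obj] += count  # no rule applies: the objects are left unchanged
            continue
        outcomes, weights = zip(*options)
        for rhs in rng.choices(outcomes, weights=weights, k=count):
            nxt[rhs] += 1
    return dict(nxt)
```

Because every object is rewritten by exactly one rule (or left untouched), the total number of objects is preserved from one step to the next in this sketch.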

Another application of probabilistic P systems in ecology is to model an ecosystem of some scavenger birds [3]. Cardona et al. described a P system for modelling an ecosystem based on three vulture species and the prey species from which the vultures obtain most of their energy requirements [3]. They include rules to limit the maximum number of animals that can be supported by the ecosystem as well as the amount of grass available for the herbivorous species. They consider two regions: the skin and the inner membrane. The skin ensures that the density of each species does not exceed the threshold of the ecosystem. The skin is also where the animals reproduce, while the inner membrane is where they feed or die. Note that this model is not spatially explicit. Cardona et al. simulated their ecosystem model using P-Lingua, a specification language for P systems [3].

3.3 Other extensions of P systems

There are at least two extensions to probabilistic P systems: dynamical probabilistic P systems [25] and multi-environment P systems [56]. In both extensions, probabilities are associated with rules and their values may vary through time. A multi-environment P system model is composed of a set of environments, each containing a probabilistic P system. All the P systems inside the environments share some common features such as the alphabet of symbols, the membrane structure and the evolution rules. Multi-environment P systems have been used to model a simple (but realistic enough) ecosystem where a carnivore and several herbivorous species interact [29].

Another class of term rewriting formalism, similar to P systems, is the calculus of looping sequences (cls), which allows membranes to be represented explicitly in the syntax of the calculus. By means of rewriting rules, it is possible to create, dissolve and merge membranes [56]. Recently, Pardini et al. extended this calculus with spatial information [6].

4 Other approaches

There exists a variety of other proposals which introduce locations or compartments into formal frameworks. In what follows we explain some of them.

Processes in Space.

This modelling language [21] has been used to analyze the influence of network topologies on local and global dynamics of metapopulation systems [13].

Cellular Automata.

Cellular automata were proposed by Von Neumann in 1949. There are one-, two- and three-dimensional cellular automata. Cellular automata discretize space into a regular lattice, according to the spatial scale of the system studied. Each cell has some properties and takes a value from a finite set of states. The value is updated in discrete time steps by a rule, which is a function of the current state of the cell itself and of its neighboring cells. The rules can be deterministic, stochastic or empirical. There are several tools for the simulation of cellular automata. Chen et al. developed a rule-based stochastic cellular automata model to simulate the classical predator-prey Lotka-Volterra model [26].
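A stochastic cellular automaton of this kind can be sketched in a few lines. The rules and probabilities below are hypothetical and are not those of Chen et al.; the sketch assumes a two-dimensional lattice with periodic boundaries, a four-cell (von Neumann) neighborhood and synchronous updates.

```python
import random

EMPTY, PREY, PRED = 0, 1, 2


def neighbours(grid, i, j):
    """States of the four adjacent cells, wrapping around the lattice borders."""
    n, m = len(grid), len(grid[0])
    return [
        grid[(i - 1) % n][j],
        grid[(i + 1) % n][j],
        grid[i][(j - 1) % m],
        grid[i][(j + 1) % m],
    ]


def update_cell(state, neigh, rng, p_birth=0.3, p_hunt=0.5, p_death=0.2):
    """Stochastic local rule: depends only on the cell and its neighbours."""
    if state == EMPTY:
        # Prey may colonise an empty cell next to existing prey.
        return PREY if PREY in neigh and rng.random() < p_birth else EMPTY
    if state == PREY:
        # A neighbouring predator may eat the prey and take over the cell.
        return PRED if PRED in neigh and rng.random() < p_hunt else PREY
    # Predators die with a fixed probability.
    return EMPTY if rng.random() < p_death else PRED


def step(grid, rng):
    """Synchronous update: the new lattice is computed from the old one only."""
    return [
        [update_cell(grid[i][j], neighbours(grid, i, j), rng)
         for j in range(len(grid[0]))]
        for i in range(len(grid))
    ]
```

Setting all three probabilities to zero or one would turn the same code into a deterministic automaton.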

Petri nets.

Heiner et al. described a Petri net-based framework for modelling and analyzing biochemical pathways [30]. The framework unifies qualitative, stochastic and continuous paradigms. According to Heiner et al., each perspective adds its contribution to the understanding of the system. Heiner et al. modeled an extracellular signal-regulated kinase (ERK) transduction pathway system [30]. A pathway describes a sequence of interactions among molecules [56]. In the particular case of a signal transduction pathway, there is a process, called the receptor, which receives an external stimulus. The receptor triggers a chain of reactions which ends in some internal result.

Heiner et al. focused on transient behavior analysis of the transduction system. The stochastic Petri net describes a system of stochastic reaction rate equations and the continuous Petri net is a structured description of ordinary differential equations. Note that arcs in Petri nets can be annotated with stoichiometric information. Qualitative Petri nets can be analyzed by a branching-time temporal logic. Stochastic Petri nets can be analyzed by a continuous stochastic logic. As usual, the continuous model replaces the discrete values of species with continuous values which represent the overall behavior via concentrations. This means that the concentration of a particular species in such a model will have the same value at each point in time for repeated experiments. Continuous models can also be analyzed with a linear-time temporal logic.
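The role of stoichiometric arc annotations can be illustrated with the qualitative (token-game) reading of a Petri net. The sketch below is not Heiner et al.'s framework: the `Transition` class, the functions `enabled` and `fire`, and the reaction A + 2B -> C are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    """A transition whose arcs carry weights (stoichiometric coefficients)."""
    name: str
    inputs: dict   # place -> tokens consumed
    outputs: dict  # place -> tokens produced


def enabled(marking, transition):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(p, 0) >= w for p, w in transition.inputs.items())


def fire(marking, transition):
    """Return the marking obtained by firing an enabled transition."""
    if not enabled(marking, transition):
        raise ValueError(f"transition {transition.name} is not enabled")
    new_marking = dict(marking)
    for place, weight in transition.inputs.items():
        new_marking[place] -= weight
    for place, weight in transition.outputs.items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# Hypothetical reaction A + 2B -> C encoded as one transition.
BIND = Transition("bind", inputs={"A": 1, "B": 2}, outputs={"C": 1})
```

In the stochastic and continuous readings, the same net structure is kept and only the interpretation of markings and firing changes (rates or concentrations instead of a plain token game).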

Grid Systems.

Grid Systems is a formalism to model population dynamics proposed by Barbuti et al. [5]. The formalism is inspired by concepts of P systems and by the spatial dynamics of cellular automata. In Grid Systems, environmental events that change population behaviour can be defined as rewrite rules. There is a simulation tool for Grid Systems developed in Java [5].

Grid Systems was used to model a population of a species of mosquitoes, Aedes albopictus, considering three types of external events: temperature change, rainfall, and desiccation [5]. The events change the behaviour of the species directly or indirectly. Each individual in the population can move around in the ecosystem.

5 Conclusions

In this article, we presented different formal languages for individual-based modelling of ecosystems. Some are based on process calculi, some on P systems and others on cellular automata or Petri nets. All these existing formal languages are complementary to each other and should not be seen as competition.

We presented several stochastic formalisms, for instance, stochastic process calculi and stochastic P systems. An advantage of stochastic formalisms is that they have semantics in terms of continuous-time Markov chains. It is well known that continuous-time Markov chains for large numbers of individuals can be approximated by ordinary differential equations (ode), and there are several computationally efficient numerical methods to simulate odes. The disadvantage of these formalisms is that, sometimes, it is more intuitive to model systems with discrete time because stochastic rates are not known and, instead, we only know the probabilities that an individual will change from one state to another.
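As a small worked illustration of this approximation, consider a hypothetical predator-prey model with three rules under mass-action kinetics: prey reproduce at rate $\alpha$, a predator consumes a prey and reproduces at rate $\beta$, and predators die at rate $\gamma$. The symbols $\alpha$, $\beta$, $\gamma$, $x$ and $y$ are assumptions introduced here. Writing $x(t)$ and $y(t)$ for the (large) numbers of prey and predators, the Markov chain is approximated by the classical Lotka-Volterra equations:

$$
\frac{dx}{dt} = \alpha x - \beta x y, \qquad \frac{dy}{dt} = \beta x y - \gamma y .
$$

Each term corresponds to one rule: the propensity of the rule multiplied by the net change it causes in the given species.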

We also presented several probabilistic formalisms, for instance, probabilistic process calculi and probabilistic P systems. As we argued before, an advantage of this type of formalism is that, sometimes, it is easier to model systems in discrete time by considering the probabilities that an individual will execute a certain action, instead of modelling stochastic rates. However, the semantics of these languages suffer from state-space explosion, since they have an exponential number of states. A solution for this problem is to represent the average behavior of the system using mean-field equations, as was done for wsccs and s-palps. Another disadvantage of these formalisms is that they usually consider both probabilistic and non-deterministic behavior. This semantics limits the properties that can be verified for these systems because non-deterministic behavior cannot be approximated by probabilities.

There is another interesting category of formalisms: spatial extensions of probabilistic and stochastic process calculi or P systems. The semantics of these modelling languages is either stochastic or probabilistic, but they offer simpler ways to represent space (either continuous or discrete), which facilitates the modelling of complex temporal properties of population systems.

Spatial, probabilistic and stochastic modelling languages have some drawbacks as modelling formalisms for ecosystems. Garavel argues that models based on process calculi are not widespread because there are many calculi and many variants of each calculus, which makes it difficult to choose the most appropriate one [34]. Garavel also argued that existing tools for process calculi are not user-friendly. We claim that these problems are also present in P systems and other formalisms to model population systems.

A research direction is how to develop user-friendly tools to simulate process calculi and P systems; for instance, to develop web applications or mobile applications that do not require extensive knowledge of Unix, LaTeX or programming to be used. This could help these formalisms spread worldwide.

Another research direction is to analyze and simulate larger case studies, studying how different evolutions can arise depending on different assumptions, such as spatial configurations, interaction rates and the role of the environment, and to support high-level reasoning about them. Since spatial dynamics and temporal evolution of ecological systems are of key interest, an interesting further step will be to develop a complementary spatio-temporal logic, suitable for formally describing and verifying spatio-temporal properties of the different formalisms. One line of research in that direction is the three-valued spatio-temporal logic developed by Luisa Vissat et al. [44].

Finally, we argue that research in formal modelling languages for population systems in ecology has produced a vast corpus of deep, valuable results; however, these results are only known, understood and applied by a small fraction of computer scientists. As a consequence, their practical impact is not as strong as it could be, in spite of successful attempts at using process calculi, P systems and other approaches to model and verify properties of ecological systems. We believe, however, that these formal modelling languages still have an important role to play in the future, but more interdisciplinary work is needed in collaboration with ecologists, software developers and policy makers.

Bibliography

[1] A. Allombert, M. Desainte-Catherine, and M. Toro. Modeling temporal constrains for a system of interactive score. In G. Assayag and C. Truchet, editors, Constraint Programming in Music, chapter 1, pages 1–23. Wiley, 2011.
[2] J. Aranda, G. Assayag, C. Olarte, J. A. Pérez, C. Rueda, M. Toro, and F. D. Valencia. An overview of FORCES: an INRIA project on declarative formalisms for emergent systems. In P. M. Hill and D. S. Warren, editors, Logic Programming, 25th International Conference, ICLP 2009, Pasadena, CA, USA, July 14-17, 2009. Proceedings, volume 5649 of Lecture Notes in Computer Science, pages 509–513. Springer, 2009.
[3] R. Barbuti, A. Bompadre, P. Bove, P. Milazzo, and G. Pardini. Attributed probabilistic P systems and their application to the modelling of social interactions in primates. In Proceedings of SEFM 2015, LNCS 9509, pages 176–191. Springer, 2015.
[4] R. Barbuti, P. Bove, A. Maggiolo-Schettini, P. Milazzo, and G. Pardini. A computational formal model of the invasiveness of eastern species in European water frog populations. In Proceedings of MOKMASD 2013, LNCS 8368, pages 207–218. Springer, 2013.
[5] R. Barbuti, A. Cerone, A. Maggiolo-Schettini, P. Milazzo, and S. Setiawan. Modelling population dynamics using grid systems. In Proceedings of SEFM 2012, LNCS 7991, pages 172–189. Springer, 2014.
[6] R. Barbuti, A. Maggiolo-Schettini, P. Milazzo, and G. Pardini. Spatial Calculus of Looping Sequences. Theoretical Computer Science, 412(43):5976–6001, 2011.
[7] A. Barco, S. Knight, and F. D. Valencia. K-Stores: A Spatial and Epistemic Concurrent Constraint Interpreter. In Proceedings of WFLP 2012. Informal proceedings, 2012.
[8] E. Bartocci, D. R. Cacciagrano, M. R. D. Berardini, E. Merelli, and L. Tesei. Timed Operational Semantics and Well-Formedness of Shape Calculus. Scientific Annals of Computer Science, 20:32–52, 2010.