Graph-Based Modeling and Simulation of Complex Systems
Jordan Jalving, Yankai Cao, Victor M. Zavala

TL;DR
This paper introduces graph-based abstractions for modeling complex cyber-physical systems, enabling scalable optimization and simulation across diverse computing architectures through a Julia package.
Contribution
It presents novel algebraic and computing graph abstractions that improve modeling, analysis, and simulation of complex systems in a scalable and flexible manner.
Findings
The algebraic graph abstraction aids in optimization problem decomposition.
The computing graph abstraction supports implementation of algorithms in distributed environments.
The Julia package Plasmo.jl enables practical application of these abstractions.
Abstract
We present graph-based modeling abstractions to represent cyber-physical dependencies arising in complex systems. Specifically, we propose an algebraic graph abstraction to capture physical connectivity in complex optimization models and a computing graph abstraction to capture communication connectivity in computing architectures. The proposed abstractions are scalable and are used as the backbone of a Julia -based software package that we call Plasmo.jl . We show how the algebraic graph abstraction facilitates the implementation, analysis, and decomposition of optimization problems and we show how the computing graph abstraction facilitates the implementation of optimization and control algorithms and their simulation in virtual environments that involve distributed, centralized, and hierarchical computing architectures.
| Variable | Description | Units |
|---|---|---|
| Time | ||
| Spatial dimension | ||
| Density | ||
| Velocity | ||
| Volumetric flow rate | ||
| Pressure | ||
| Mass flow rate | ||
| Pipe inlet flow | ||
| Pipe outlet flow | ||
| Supply flow | ||
| Demand flow | ||
| Node Pressure | ||
| Boost Pressure | ||
| Compressor Power | ||
| Line-pack |
| Parameter | Description | Units |
|---|---|---|
| , | Space and time discretization interval length | |
| Speed of sound in gas | ||
| Gas heat capacity for constant pressure and volume | ||
| Isentropic expansion coefficient and compressibility | ||
| Universal gas constant | ||
| Gas molar mass | ||
| Gas density at normal conditions | ||
| Gas temperature | ||
| Pipeline length, diameter, and cross sectional area | ||
| Friction factor and pipe rugosity | ||
| Scaling factor for flow | ||
| Scaling factor for pressure | ||
| Auxiliary constant | ||
| Auxiliary constant | ||
| Auxiliary constant | ||
| Auxiliary constant |
| Parameter | Value | Units | Parameter | Value | Units |
|---|---|---|---|---|---|
| 16.1475 | 1.0 | ||||
| 0.6291 | 1.0 | ||||
| 0.3593 | 0.5 | ||||
| 387.594 | 1000 | ||||
| 12.3137 | 4.2 | ||||
| 0.6102 | 0.98 | ||||
| 0.3760 | 359.1 | ||||
| 386.993 | 2769.44 | ||||
| 15.0 | 2500.0 | ||||
| 0.2928 | 6013.95 | ||||
| 0.67 | 7216.74 | ||||
| 387.01 | -167.4 | ||||
| 6.3778 | -139.5 | ||||
| 26.0601 | 5.0 | – | |||
| 6.8126 | 1.0 | – | |||
| 5.0382 | 0.5 | – | |||
| 56.7989 | |||||
| 5.0347 | |||||
| 63.1766 | |||||
| 69.9892 | |||||
| 12.6224 |
| State | Value | Units | Input | Value | Units |
|---|---|---|---|---|---|
| 25.4702 | 1.1866 | ||||
| 0.1428 | 29.0597 | ||||
| 0.7045 | 7.0263 | ||||
| 415.944 | 5.1067 | ||||
| 5.4703 | 11.6962 | ||||
| 0.3653 | 5.09834 | ||||
| 0.5307 | 12.8828 | ||||
| 399.303 | 19.9091 | ||||
| 15.0 | 8.0960 | ||||
| 0.1565 | |||||
| 0.67 | |||||
| 399.364 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDistributed and Parallel Computing Systems · Simulation Techniques and Applications · Scientific Computing and Data Management
Graph-Based Modeling and Simulation of Complex Systems
Jordan Jalving*†‡*
Yankai Cao*†* and Victor M. Zavala*†*
† Department of Chemical and Biological Engineering
University of Wisconsin-Madison, 1415 Engineering Dr, Madison, WI 53706, USA
‡ Decision and Infrastructure Sciences Division
Argonne National Laboratory, 9700 South Cass Ave, Lemont, IL 60439, USA
Abstract
We present graph-based modeling abstractions to represent cyber-physical dependencies arising in complex systems. Specifically, we propose an algebraic graph abstraction to capture physical connectivity in complex optimization models and a computing graph abstraction to capture communication connectivity in computing architectures. The proposed abstractions are scalable and are used as the backbone of a -based software package that we call . We show how the algebraic graph abstraction facilitates the implementation, analysis, and decomposition of optimization problems and we show how the computing graph abstraction facilitates the implementation of optimization and control algorithms and their simulation in virtual environments that involve distributed, centralized, and hierarchical computing architectures.
Keywords: graphs, cyber-physical, connectivity, algebraic, computing
1 Introduction
Complex systems are cyber-physical in nature, in the sense that a physical system is driven by decisions made by a cyber (computing) system [32]. For instance, a chemical process is a physical system that is driven by decisions made by a control system, which is in turn a cyber system comprised of a collection of devices (e.g., sensors, controllers, actuators) that execute diverse computing tasks (e.g., data processing and control action computation) and that exchange signals and data (e.g., measurements and actions) through a communication network. The devices executing the tasks of a control system form a computing architecture, similar in spirit to a parallel computing cluster in which computing processors are connected through a communication network. Other examples of cyber-physical systems include hierarchical architectures for coordination of supply chain, scheduling, and planning tasks for an enterprise or the control architecture for an infrastructure network such as a natural gas pipeline [11, 4].
Modeling and simulating the behavior of cyber-physical systems is becoming increasingly important but also increasingly challenging. In particular, emerging paradigms such as cloud computing and the internet-of-things are drastically changing the landscape of decision-making architectures. Such reconfiguration is driven by the need to process increasingly larger amounts of data in a distributed manner while making decisions faster and in a more scalable manner. Architectural reconfiguration needs to carefully balance diverse cyber-physical issues such as economic performance, safety, data privacy, as well as computing and communication latency and failures. For instance, failure of a computing device (e.g., a sensor) can lead to significant losses in performance or to full collapse of the physical system.
Capturing interdependencies between cyber and physical systems in a coherent manner is technically challenging. Specifically, the behavior of a physical system is expressed mathematically in the form of an algebraic model (i.e., a set of algebraic equations) while the behavior of a cyber system is expressed mathematically in the form of algorithms. Moreover, in modeling a cyber system, one must consider the fact that algorithms are executed under highly heterogeneous and dynamic computing architectures that exhibit complex computing and communication protocols and logic (e.g., synchronous and asynchronous) and associated time delays [31].
This work proposes graph-based abstractions to facilitate modeling and simulation of cyber-physical systems. The proposed graph-based abstractions seek to provide a coherent framework to capture modeling elements that arise in various engineering domains. In the context of modeling physical systems (specifically chemical processes), early sequential modular and equation-based flowsheeting tools used graph-theoretical insights to express equations in a modular form and to facilitate object-oriented software implementations, analysis, and algorithmic development [48]. Such developments are the basis of powerful simulation environments such as , , and [18, 41]. Graph-theoretical concepts have also been widely used for expressing and processing algebraic models in platforms such as , , , and [21, 15, 6, 3]. The Modelica [22] simulation platform also uses graph concepts such as modularity and inheritance to instantiate and simulate complex systems. Recent advances in algebraic modeling languages have been enabled by [26] and [16]. These open-source tools enable the user to implement models in modular form and to expose structure to facilitate the implementation of parallel solution algorithms. Graph-based modeling abstractions have recently been explored in convex optimization [25], infrastructure networks [27], and simulation of partial differential equations [1], but these abstractions are restricted in that the graph structure is directly tied to physical topology, thus limiting modeling flexibility.
In the context of modeling and simulating cyber systems, the most popular framework is [36]. Here, a system is expressed in terms of blocks of operators (computing tasks) which are connected by communication channels and input and output ports. This framework facilitates the simulation of complex control architectures. Agent-based modeling platforms can also be used to simulate cyber systems; under this abstraction, agents make decisions and communicate under channels [35]. Popular agent-based simulation tools include , , and [38, 13, 34, 37].
While the aforementioned physical and cyber modeling tools exploit graph concepts to facilitate implementation, they do not use a coherent abstraction (which is key to enable extensibility). To enable this, in this work, we introduce general abstractions for modeling cyber-physical systems. Specifically, we propose the concept of an algebraic graph to facilitate modeling of mathematical optimization problems and the concept of a computing graph to facilitate simulation of cyber systems. The graph abstractions exploit physical and communication topologies to facilitate model construction, data management, and analysis. The computing graph abstraction offers advantages over and agent-based tools in that it handles cyber features such as communication delays, latency, and synchronous/asynchronous computing and information exchange in a more coherent manner. This is done by using a state-space representation of the computing graph, which keeps track of task states and which manages actions and timings that trigger state changes in the form of action signals. We also demonstrate how the proposed abstractions can be combined to enable modeling of cyber-physical systems and we provide an implementation in the programming language [5] that we call .
The manuscript is structured as follows. Section 2 presents basic graph terminology for describing the proposed abstractions. Section 3 presents the algebraic graph abstraction for modeling optimization problems with a focus on facilitating data management and problem decomposition. Section 3 also introduces our own implementation of graph abstractions called with a simple example. Section 4 introduces the computing graph abstraction and Section 5 provides case studies for cyber-physical systems.
2 Basic Graph Terminology
A graph is a collection of nodes and edges . We highlight the fact that a set of nodes belong to a specific graph by using the syntax and we denote the node elements using index . Similarly, we highlight that a set of edges belong to by using syntax and we denote the edge elements using . We define the set of supporting nodes of an edge (nodes that the edge connects) as and the set of supporting edges for a node (edges connected to the node) as . In a standard graph, two nodes are connected by an edge . For example, nodes and are connected by the single edge in the left panel of Figure 1. In a hypergraph, multiple nodes can be connected by a single edge. Hypergraphs are useful for describing algebraic structures of systems. For example, the middle panel in Figure 1 is a hypergraph that contains an edge that connects all three nodes. In a multigraph, multiple edges can connect two nodes ( might not be a singleton). In a directed multigraph, multiple edges can connect two nodes and also have direction. This representation is useful when edges represent flows (e.g., physical or communication) between nodes. The set of incoming directed edges to a node is expressed as and the set of outgoing edges is . The right-most graph in Figure 1 contains a directed multigraph wherein the nodes are connected by directed edges.
Connectivity between nodes and edges is usually expressed in terms of the incidence matrix (where the notation for set denotes its cardinality). For an undirected standard graph we have if or . For a directed graph we have if and if . Moreover, for a standard directed graph we have that for all and for a standard undirected graph we have for all (because in a standard graph an edge only has two supporting nodes). For a standard graph, the degree of a node is the number of edges connected to the node and can be computed as . The set of nodes connected to node (without counting self-loops) is denoted as and for a standard graph we have that .
A graph can be constructed in a hierarchical manner by partitioning it into subgraphs (with their own local nodes and edges). The nodes in a subgraph are connected to those in other subgraphs by a set of global edges (see Figure 3). Consider, for instance, the subgraphs and the graph , which indicates that is a graph with nodes and edges and where, for every global edge , we have that and . In other words, the edges in only connect nodes across subgraphs and but not within subgraphs. Consequently, if we treat the elements of a node set as graphs, we can represent as a general graph of the form . This nesting procedure can be repeated over multiple levels to form a hierarchical (multi-level) graph.
The discussed terminology and constructs form the basis for a wide range of graph analysis techniques. Graphs also provide a coherent framework to represent and analyze a wide range of complex systems.
3 Algebraic Graphs
Here we present a graph-based modeling abstraction to enable the flexible representation of algebraic optimization models.
3.1 Representation
We define an algebraic model graph (that we simply refer to as a model graph) as a hypergraph wherein every node has an associated algebraic component model of the form:
[TABLE]
where is a vector of decision variables (of arbitrary dimension), is the feasible set with associated constraint vector mapping , and the mapping is a scalar objective function. For simplicity, we define the elements of the node as . We highlight that this general representation also includes weighted graphs, systems of algebraic equations ,and optimization problems with mixed-integer variables.
To capture coupling between the component models residing in the nodes, we introduce the concept of link constraints. For simplicity in the presentation, here we consider linear linking constraints of the form:
[TABLE]
where are the hyperedges of the model graph and is the set of nodes that support edge (nodes connected to the edge). The connectivity matrix corresponds to the coefficients of the linking constraint. For simplicity, we define the elements of the edge set as .
To enable compact notation, we can encapsulate all the link constraints as , where denotes the set of all decision variables in the model graph and where is a connectivity matrix of the form:
[TABLE]
Under this representation, we have that if . In other words, the block matrix of node is zero if the node does not support the edge (it is not connected to it). Consequently, the connectivity matrix captures connectivity over the entire model graph. In fact, the node-edge sparsity structure coincides with that of the incidence matrix.
We use the above notation to express the optimization problem over the entire graph as:
[TABLE]
We have assumed, for convenience, that the objectives of every node are added to form the graph objective. Other operators can be performed on the aggregate the objectives (e.g., product, expected value, variance, worst-case).
It is often useful to define row and column partitions of the connectivity matrix in order to convey structure to optimization solvers. If we partition by rows (or columns), we recover how each link connects the model graph nodes (or edges). This is illustrated in Figure 2 and gives rise to the connectivity matrices:
[TABLE]
3.2 Hierarchical Graphs
The concept of an algebraic graph facilitates the creation of hierarchical graphs, wherein a node can represent an algebriac graph (see Figure 3). This representation arises in many applications such as integrated planning, scheduling, hierarchical control, coupled power transmission and distribution networks, and multi-stage stochastic programming. For example, the two-level hierarchical model graph corresponding to Figure 3 can be represented as:
[TABLE]
Here, we have that and are children graphs (partitions or subgraphs) of the parent graph . Moreover, we have that and . In other words, the parent graph contains every node in every child subgraph but edges are local to their corresponding subgraphs. Localizing edges in this form facilitates the expression of hierarchy.
We can generalize formulation (3.10) to account for an arbitrary number of subgraphs by defining the set of subgraphs . This set contains elements that are derived by partitioning the parent graph . This gives the general two-level hierarchical graph problem:
[TABLE]
Every element of the subgraph can in turn be partitioned into a set of of subgraphs (with elements ) to create a three-level hierarchical graph of the form:
[TABLE]
The partitioning procedure can be repeated to create an arbitrary number of levels. As can be seen, the graph representation can be conveniently used to arrange complex models.
3.3 Decomposition Strategies
Graph models directly provide structural information that facilitates the implementation of decomposition algorithms. For example, formulation (3.8) has a partially separable structure, because eliminating the linking constraints (3.8c) results in a fully separable problem (i.e., each component model can be solved independently). Continuous problems with partially separable structure induce specialized linear algebra kernels that can be exploited inside solvers, while structures in mixed-integer problems can be exploited using techniques such as Lagrangian decomposition. For instance, a continuous variant of problem (3.8) with feasible sets of the form gives rise to the following Karush-Kuhn-Tucker (KKT) system:
[TABLE]
Upon linearization, this system of algebraic equations gives rise to the block-bordered system:
[TABLE]
where is the Lagrange function, is the primal-dual step and
[TABLE]
is a block matrix corresponding to node , where is the Hessian of the Lagrange function, is the constraint Jacobian, and is the objective gradient. It is well-known that the block-bordered structure can be exploited by using Schur decomposition or block preconditioning strategies [12, 29].
The algebraic graph abstraction facilitates the implementation of distributed algorithms such as Benders, ADMM, or Lagrangian decomposition[24, 14, 42]. For instance, in a Lagrangian decomposition scheme applied to (3.8), one solves the node subproblems:
[TABLE]
in parallel for all and for fixed values of the dual variables . A coordination step is then performed by updating the multipliers of the linking constraints as . In a Benders scheme, it is assumed that the graph has the structure of a tree (variables in a parent node are connected to those in the children nodes but no connectivity is present among children). In this scheme, multiplier information from the node subproblems is used to construct a master problem that updates the coupling variables.
It is also possible to apply different graph analysis techniques directly to the model graph topology to perform functions such as graph partitioning [30], community detection [20], and identification of spanning trees. These strategies can be used, for instance, to create subgraphs that share a minimum set of linking constraints. Such strategies have shown to improve the performance of decomposition algorithms [47].
3.4 Data Management
Besides decomposition, there are several other key advantages of using a model graph abstraction. In particular, component models are isolated from the graph topology. A benefit of this is that it is possible to apply automatic differentiation or convexification techniques (or other processing techniques) to each component model separately, which often results in computational savings. It is also possible to exchange components models in a graph without altering the core topology. Moreover, model syntax remains local to the node and it is thus possible to reuse a component model template in multiple nodes without having to alter syntax (this is not easy to do in algebraic modeling languages such as AMPL or GAMS). This feature enables modularity and re-usability and enables the implementation of models as parametric functions of data. To highlight this, we consider the following model graph representation:
[TABLE]
where is an input data (attribute) vector associated with the model of node . This modularization approach facilitates the implementation of warm-starting procedures, which is key in control or estimation applications [28]. We also note that the output data (solution attributes) structure inherit the structure of the graph model, thus facilitating analysis and post-processing.
3.5 Model Graph Implementation
We describe the basic elements of a software implementation of the algebraic graph abstraction in (https://github.com/zavalab/Plasmo.jl). is written in the Julia programming language and leverages basic algebraic modeling capabilities of JuMP [16] to express and process individual node models and the graph analysis capabilities of [44] to construct and manage graph structures.
A implementation corresponding to the system shown in Figure 2 is presented in Figure 4. The code snippet shows how to construct a graph object and how to populate it with nodes in lines 6 - 10. Notice that nodes do not require component models upon creation and remain separate from such models to facilitate modification of the component models (facilitating model swapping and warm-starts). Next, component models are created and added to each node in lines 12, 13, and 14. In this example, all nodes contain a local variable named which are linked together using a model graph link constraint on line 16. Note, again, that syntax of a component model remains private. Finally, the model graph object is mapped into an object that can be solved by an optimization solver (line 18). The solution of each component model is queried individually on lines 20 and 21 (the solution has the same structure as the graph).
4 Computing Graphs
Simulating cyber systems requires capturing dynamic and logical aspects that arise in real-time decision-making such as as time delays, computing/processing latency and failures, and asynchronicity. For example, we might be interested in predicting how a given control system will perform on an architecture that contains sensors and computing devices with limited processing capabilities (and thus ling delays) or how a distributed optimization algorithm will perform on a distributed-memory computing cluster compared to a single central processing unit (CPU). Communication aspects are particularly challenging to capture in cyber systems, as they often involve complex topologies and frequencies. This section presents the basic elements of a computing graph abstraction seeks to facilitate modeling of cyber systems.
4.1 Representation
In the proposed abstraction, a computing graph is a directed multigraph that we denote as and that contains a set of nodes (which perform computing tasks) and edges (which communicate attributes between nodes).
A node contains a set of attributes and computing tasks . The attributes represent node data and tasks are computational procedures that operate on and/or change attributes. In other words, a computing task maps attributes (a task takes attribute data and processes it to create other attribute data). This interpretation resembles that of a manufacturing process, which takes raw material to generate products. Each task in the computing graph requires a given execution time . An edge contains a set of attributes associated with its support nodes . Communicating attributes between nodes involves a communication delay .
The collection of computing and communication tasks comprises an algorithm (also known as a computing workflow in the computer science community). Consequently, a computing graph abstraction seeks to facilitate the creation and simulation of algorithms.
The computing graph contains a global clock and each node has an internal local clock . The clocks are used to manage and schedule computing tasks and communication. For any task executed at time , its attributes become updated at clock time . Likewise, for any edge that communicates its attribute at time , the destination attribute value is updated with the source attribute value at time . Under the proposed abstraction, computing and communication tasks can be synchronous (a task is not executed until all attributes are received) or asynchronous (a task is executed with current values). This enables capturing a wide range of behaviors seen in applications.
The computing and communication times and can represent true times (times required by the computing devices executing the tasks) or virtual times (times required by hypothetical devices executing tasks). In other words, the proposed abstraction allows us to simulate the behavior of algorithms on virtual (hypothetical) computing architectures. This is beneficial when one lacks access to an actual sophisticated architecture (such as a large-scale parallel computer or an industrial control system) but one seeks to predict how effective an algorithm will be when executed under such architecture. Moreover, this enables us to analyze the behavior of algorithms under extreme events that might involve communication or computation failures and with this test their resilience.
Both nodes and edges use the concept of state managers to manage task behavior (e.g., determining when a task has been completed) and to manage communication (e.g., determining when data is sent or received). This representation resembles that used in process scheduling and has interesting connections with automata theory and discrete event simulation [2]. These connections can be exploited to derive a coherent state-space representation. In Section 4.3 we provide details on how this can be used to facilitate implementation of the computing graph abstraction.
Figure 5 depicts an example computing graph containing three nodes and six edges. Each node contains a single task which takes local attribute values , , and as input and updates one of their values. For example, processes its attributes and updates the value of attribute . The nodes communicate attribute values with each other using the six edges. For instance, attribute is communicated to both nodes and which updates the value of on these respective nodes. The superscript indicates that the attribute may be updated after a given time (to capture computing and communication delays).
It is important to highlight the differences and synergies between a computing and an algebraic graph. In a computing graph a node contains a dynamic component (a computing task) while in a algebraic graph a node contains a static component (an algebraic model). Moreover, in a computing graph an edge connects attributes (dynamically) while in a model graph an edge connects algebraic variables (statically). Under a computing graph, the solution of an algebraic graph is considered a computing task. Consequently, a computing node might use an algebraic graph to perform a given task or a computing graph might be an algorithm for solving a given algebraic graph. For instance, for the former, we might create a computing graph that executes a control algorithm and use an algebraic graph to simulate the behavior of the physical system under the actions of the control system. For the later, we might create a computing graph that executes a solution algorithm (e.g., Bender decomposition) to solve an algebraic graph. These capabilities enable the simulation of complex cyber-physical systems. In Section 5, we provide examples on how this can be done. We also highlight that computing tasks are general and might involve procedures that go beyond the solution of algebraic graphs such as forecasting, data analysis, learning, solution of optimization problems (that are not expressed as graphs), and so on.
4.2 Computing Graph Implementation
Here, we present an implementation example in to highlight the features of computing graphs. The example corresponds to that of Figure 5 (this structure resembles that of a distributed optimization or control algorithm). The implementation is shown in Figure 6. Line 3 creates the computing graph, line 6 adds the first node and line 7 adds three attributes to the node (named , , and ). We also specify start = 0 on this line to provide an initial value for the data attributes. We add a task to node in line 10 which runs the task function and provide the keyword argument triggered_by = Received(x,z) to indicate that receiving attributes or will trigger this task. We also provide keyword argument compute_time = :walltime, which indicates that is the true compute time of the function (in the computing device executing it), and trigger_during_busy = :queue_task which indicates that triggering the task during computation will queue the task until its current computation is finished.
Next, we add the second computing node with the same attributes in lines 13 and 14. We add the task to the node in line 17 and provide similar keyword arguments. We add the last node with attributes in lines 20 and 21 add its task in line 24. For this node, we use the keyword argument triggered_by = Updated(z), which indicates that updating attribute z will trigger that task (i.e., if updates attribute , the task will run continuously). Here, we fix the computing time to two time units using the keyword argument compute_time = 2. This forces the computing time of the task to be a fixed value (as opposed to the actual value of the processor). This feature allows us to simulate virtual behavior or a hypothetical computing device.
Node is connected to nodes and in line 27 which communicates the value of attribute on node to attribute on nodes and . We provide the keyword argument delay = 2 which indicates that communication requires a fixed time of two time units (to simulate a fixed delay) and the argument send_on = Updated(n1[:y]), which indicates that communication occurs when the source attribute value n1[:y] is updated (attribute residing in node ). We also connect attribute in node to that in nodes and in line 30 and connect attribute in node to nodes and in line 33.
We provide an initial trigger to start each task at global time and execute the computing graph for twenty time units in lines 37 and 38.
The keyword arguments in can be used to manage how tasks and edges respond to different signals. This makes it possible to simulate diverse computation and communication behaviors. For instance, if each task in Figure 5 was to solve an optimization problem (e.g., an algebraic graph), the computing graph described in Figure 6 could simulate the behavior of a distributed optimization algorithm wherein nodes and run their optimization problems in response to receiving attribute values and would compute its optimization problem continuously (or asynchronously).
The computing behavior of Figure 6 could also be changed depending on the nature of the tasks and arguments provided. For instance, it would be possible to simulate a control system wherein one node could continuously compute a simulation task (e.g., a plant simulation) and its outgoing edges could communicate on a sampling interval (as opposed to communicating based on source updates). The other nodes could receive measurement attributes which would trigger their tasks to compute control actions and update their inject attributes which would be communicated back to the plant node.
4.3 State-Space Representation of Computing Graph
In the proposed abstraction, the computation and communication logic provided by the computing graph is driven by an underlying state manager abstraction wherein transitions in tasks states are triggered by input signals. This representation conveniently captures computing and communication latency and is flexible and extensible. In this section, we provide details on how state managers can facilitate the implementation of computing graphs. We highlight, however, that an actual user does not need to know these details to use Plasmo.jl.
Node and edge managers use states and signals to manage task computation and attribute communication. The use of managers is motivated by state machine abstractions from automata theory [9, 8], which are often used in control-logic applications [17]. State machines are also used to manage logical behaviors in Simulink [10] and agent-based simulation frameworks [40]. State machines can be used to represent actions that trigger transitions in task states. Figure 7 presents a simple state machine with three states, three signals, and five possible transitions between states.
In a computing graph, a node manager oversees the state of node tasks. Such tasks are specified by the user in the form of functions. Each node has an associated tuple () where is the set of states, is the set of action signals, is the set of broadcast (output) attributes, is the state transition mapping, and is the attribute update mapping. The node dynamic evolution is represented as a system of the form:
[TABLE]
Here, the next state is the result of the mapping from the current state and a given action signal . Every state transition also triggers a transition in the attribute values (i.e., tasks update attributes). Action signals can also be sent to the state managers of other nodes in the form attributes.
The proposed abstraction can incorporate an arbitrary number states and actions but here we provide an example of a basic set of states and actions that can be considered in an actual implementation:
[TABLE]
At a given point in time, a node manager can be in one of the states in . The set of signals recognized by the manager are and these will trigger (depending on the current state) a transition between states. Such signals include, for instance, or . The set of broadcast targets (that receive created signals) are the node itself and all of its outgoing edges . Hence, a node can send signals (in the form of attributes) to itself or its outgoing edges.
Using the sets defined in (4.36), we can define a transition mapping as:
[TABLE]
In (4.37) we can see, for instance, that a task transitions to the state when it receives the corresponding signal and it transitions to the state when it is executing a task and receives a signal to finalize such task. The signals to execute or finalize a task are generated by user-defined attributes. For instance, a user-defined attribute consisting of a flag such as or can generate a finalize task signal that in turn triggers a state transition.
An edge manager can be defined in the same way as a node manager with associated states and signals for communication. A possible implementation of an edge manager includes the following states and actions:
[TABLE]
An edge can send signals to itself or its supporting nodes in the form of its attributes. Using the sets defined in (4.38), we can define a transition mapping of the form:
[TABLE]
The mappings in (4.39) closely reflect the node transition mapping in (4.37). An edge transitions to the communicating state when it receives a signal and transitions to the state when it receives the signal (indicating that all sent attributes were received).
Figure 8 depicts the node and edge manager transition mappings (4.37) and (4.39) with additional transitions. This figure highlights that action signals can trigger self-transitions wherein nodes or edges loop back to their original state. For instance, a node can transition back to the state when it receives the signal and an edge can transition back to the state when it receives either or signals. Self-transitions allow attribute updates to occur during task execution or edge communication.
4.4 Task Scheduling and Timing
A computing graph implementation needs to capture timings and order of task execution and attribute communication. These timings can be managed using a discrete-event queue wherein items in the queue are evaluated in an order based on their scheduled time [19]. The computing graph specifically uses a signal-queue wherein action signals are evaluated based on their scheduled evaluation time.
Figure 9 illustrates an example execution of the computing graph from Figure 5. Node computes (which requires compute time ) after which the value of attribute is sent to the corresponding attribute on nodes and (which each requires communication time). The compute and communication times are expressed using signals. For instance, when completes its execution, it produces a signal with a delay to capture the computation time. Equivalently, when edge that connects node to node communicates attribute , it produces the signal with a delay .
5 Case Studies
In this section we illustrate how the proposed graph-based abstractions can be used to simulate complex cyber-physical systems. We first present a case study which models a pipeline gas network and show how the model graph facilitates partitioning of the network to enable a scalable solution. We then illustrate how the computing graph can be used to simulate the behavior of optimization and control algorithms under different computing architectures.
5.1 Optimal Control of Gas Pipeline Network
We consider a system of connected pipelines in series [49] shown in Figure 10. This linear network includes a gas supply at one end, a time-varying demand at the other end, and twelve compressor stations (giving a total of 14 gas junctions). The gas junctions connect thirteen pipelines. This forms a model graph with a linear topology. We formulate an optimal control problem to track a sudden demand withdrawal.
Under a model graph abstraction , all components (gas junctions and pipelines) are treated as nodes (there are a total of 27 nodes in the model graph). Note that a pipeline is treated as a node as well and has a corresponding component model (discretized partial differential equations that capture conservation of mass and momentum). Edges indicate algebraic connectivity between the nodes. The structure of the model graph is depicted in Figure 10. Here, we also show the optimal partitioning of the graph using a -way algorithm.
We solve the optimal control problem by: (i) aggregating all nodes and edges to produce a large optimization problem which ignores the graph structure, and (ii) by exploiting -way partitioning to enable a parallel Schur decomposition scheme. The code snippet shown in Figure 11 depicts how straightforward it is to implement both approaches using the model graph and highlights several features that are enabled by using a graph abstraction. Lines 6 through 31 create the model graph, add nodes, set component models, and connect nodes using link constraints. For approach (i), the graph is flattened into a traditional optimization model and solved with Ipopt (line 35). In approach (ii), a k-way partioning scheme [30] is applied to the model graph to produce thirteen partitions and the partition information is passed to the PIPS-NLP solver (line 39). The resolution of the spatial discretization mesh of the pipeline PDEs is gradually increased to produce optimization problems that span the range of 100,000 to 2 million variables. The partitioned problem can be solved with near-linear scaling whereas the solution time of the flattened formulation scales cubically.
We note that approach (i), although limited in computational efficiency, still benefits from having an algebraic graph representation. For instance, the same PDE component model is appended to different pipeline nodes and this enables the modular construction of complex network models. Moreover, approach (i) can also benefit from warm-starting and updating component models with different demands, operational constraints, or lower-fidelity PDE representations.
5.2 Benders Decomposition Algorithm
We now show how to use computing graphs to simulate the behavior of Benders decomposition under a parallel computing architecture with different numbers of CPUs. The problem under study is a resource allocation stochastic program (B.54). This stochastic program can be decomposed with Benders decomposition into a master problem (B.55) and a set of subproblems (B.56) defined for a set of sampled scenarios . This forms a computing graph with a two-level tree topology.
A synchronous variant of Benders decomposition using a master processor and set of worker CPUs that solve scenario subproblems is described in Algorithm 1. The algorithm uses computing tasks implemented in the functions , , and . The function computes a subproblem solution for a given scenario data sample using the current master solution . Here, the master solution , the scenario data , and the subproblem solution are data attributes that are communicated between the master and the worker nodes. The solve_subproblem function activates the receive_solution task on the master node, which stores subproblem solutions into the set , updates the set of cuts , and checks how many scenario subproblems have been solved. If all subproblems are completed, it triggers the execution of the solve_master task using the current set of solution data and cuts . Otherwise, it triggers the execution of the solve_subproblem task on a worker processor. When solve_master is executed, it takes subproblem solution attribute to update the master attribute and stops the computing graph if convergence is achieved. If not converged, it empties the solution set and activates the solve_subproblem tasks for each worker again, which will then obtain new subproblem attributes.
Algorithm 1 can be modeled and simulated using a computing graph. Under this abstraction, the parent node will solve the master problem and a set of children task nodes will be available to solve the set of scenario subproblems. Here, the parent node will allocate scenarios to the available children nodes dynamically (by keeping track of which ones are available and which ones are busy solving another scenario subproblem). The children will communicate their solutions and cutting plane information to the parent node once they are done solving their subproblem. The proposed graph abstraction allows us to simulate the effect of computing and communication delays on the performance of the Benders scheme. This allows us, for instance, to simulate the behavior of the algorithm on a hypothetical parallel computer executing the algorithm that might be subjected to a different number (and type) of CPUs or subjected to random computing loads from other jobs that increase computing latency in the CPUs. Figure 13 depicts a hypothetical parallel computing architecture with four computing nodes that execute the Benders algorithm. Here, CPU 4 solves the master problem while CPU 1, CPU 2, and CPU 3 each solve a subproblem after receiving the scenario data attribute (where corresponds to a CPU id) and the master solution attribute .
The simulation of the Benders algorithm (Algorithm 1) can be expressed in terms of nodes, tasks, and attributes following the setup provided in the Computing Graph 1. The master node contains attributes defined in line 2 which include the solution to the master problem and the scenario data which are communicated to available subnodes . The master node also contains tasks (line 3) analogous to the functions in Algorithm 1 to execute the master problem () and to receive solutions from subnodes (). The details of each task and what attributes it updates can be found in Appendix B. The solve_master task is triggered by the attribute shown in line 4. The task solves problem (B.55) using the master node attributes and updates the attributes for the master solution and scenarios . The updated attributes trigger edge communication to the subnodes (lines 12 and 13). The task is executed when any solution attribute is received from the connected subnode . This task determines whether the task is ready to run again and either updates the attribute (which triggers run_master) or updates the corresponding scenario attribute with a new scenario to send back to subnode . Each subnode computes its task when it receives its scenario attribute . The task updates , which triggers communication to the attribute on the master node (line 14).
The implementation of the computing graph in for the case of three subnodes is shown in the code snippet of Figure 15. Lines 2-11 create the graph, add the master node with its attributes and tasks ( and ). Lines 13-23 add subnodes to the graph, each with a task that is executed after receiving scenario data (line 17). Finally, communication edges are created between attributes (lines 20-22) and the graph is executed (25) until it terminates (i.e., a task calls such as shown in Figure 14). It is also possible to simulate to a pre-determined time by providing an argument to .
Figure 14 depicts how to define tasks in . One argument is typically needed: a reference to a node to retrieve and update attributes, but a reference to the computing graph can be provided for access to the graph clock or to terminate the computation. For example, line 3 retrieves the attribute value for the set of master problem cuts to solve the master problem, and updates the master solution in line 6. Also note that the and tasks are given default compute times as the true compute time of their execution (). We set a communication delay of 0.005 seconds from the master node to the subnodes (but it is also possible to make delay time a function of the attributes communicated or to experiment with different delays to evaluate the effect of communication overhead).
Figure 16 summarizes the simulation results of the Benders algorithm as we increase the number of CPUs available in the computing architecture (we consider cases with and CPUs). We can see that, with a communication delay of 0.005 seconds from the master to the subnodes, using one CPU has a shorter total solution time than using four CPU nodes (due to the communication overhead). Executing the algorithm on 8 and 16 CPU nodes, however, results in algorithm speed up (reduction in computing latency overcomes communication latency). This illustrates how the proposed framework can help predict trade-offs of computing and communication latency. For instance, our results predict that the proposed Benders scheme only benefits from parallelization when the number of CPUs is sufficiently large. We highlight that the parallel architectures evaluated are hypothetical (the actual simulation of the algorithm was executed on a single CPU). In other words, simulates the behavior of the Benders algorithm on a virtual computing environment.
5.3 Model Predictive Control Architectures
This case study demonstrates how a computing graph can be used to simulate the behavior of distributed control architectures when considering computation and communication delays and to simulate how such delays impact the actual behavior of a physical system. Specifically, we consider a reactor-separator system (see Figure 17) from [46], which is a standard application for evaluating distributed model predictive control (MPC) algorithms. The system consists of two reactors in series where the reaction takes place and a separator which produces a product stream and a recycle. The system is described by twelve states: the weight fractions of A and B in each unit, the unit heights, and the unit temperatures. The manipulated inputs are the flow rates , , , , , , and heat exchange rates , , and . The details of the system are given in Appendix C.
The behavior of the physical system (the plant) shown in Figure 17 is simulated under three different MPC architectures. We consider a centralized MPC architecture (Figure 18(a)) wherein every output is sent to a central MPC controller which computes all control actions for the plant. We also consider decentralized control architectures consisting of three MPC controllers (one for each unit) and simulate their behavior when they do not communicate (Figure 18(b)) and when they cooperate by communicating state and intended control actions (Figure 18(c)). Complex performance, computation, and communication trade offs arise under the three MPC architectures studied. In particular, the centralized scheme achieves best performance when the computing and communication delays are short (which might not be achievable in large systems). On the other hand, the performance of decentralized schemes might be lower but computing delays are expected to be shorter as well. Analyzing such trade offs is facilitated by the computing graph, since this captures communication and computing times while simultaneously advancing the plant simulation. The proposed framework also captures asynchronous behavior of the decentralized and cooperative schemes. In particular, the controllers might inject their control actions as soon as they complete their computing task (as opposed to waiting when all of them are done).
Computing Graph 2 specifies the setup for the cooperative MPC algorithm. The plant node contains the task which advances the state of the system from the current clock time to the time of the next action signal. The plant node includes attributes , , and (which are the control actions received from the MPC controllers) and (which are the plant states). The plant node is self-triggered by updating , which allows the task to run continuously (i.e., the task constantly updates the attribute ). The plant state is communicated to each MPC node at a constant sampling time. The edges that connect the plant state attribute to each MPC node will trigger when sending its source attribute (with a given waiting time ). The MPC nodes , , and each execute their task which computes a control action by solving an optimization problem using attributes from the rest of the MPC controllers and from the plant. If they have completed enough coordination iterations, they each update their attribute , which triggers communication into the plant. Otherwise, they each update their control trajectory and exchange their attributes. This triggers the MPC node task , which manages the MPC trajectory exchange. Additionally, we specify logic to handle the case when the attributes {} are received while a node is busy executing a task. When this occurs, the triggered task is queued and re-triggered when the current task being executed is completed.
Figure 19 demonstrates how the cooperative MPC algorithm (Computing Graph 2) is implemented in . A node is created for the plant and for each MPC controller. Tasks and attributes are added to each node and attributes for sensor measurements and control inputs are connected between nodes. The execution behaviors are once again modified using keyword arguments. Lines 2 to 26 create the graph, add a node for the plant simulation, and a node for each MPC controller. Task execution is triggered by receiving and updating attributes. Each MPC controller computes its control action when it samples (i.e., receives the output measurement attribute ) and starts performing iterations. The measurement attributes are communicated to each controller with a delay of 30 seconds at a sample period of 60 seconds (by specifying ), starting at 5 seconds (line 29). The controller inputs are connected to the corresponding plant attributes and are communicated when they are updated (line 31). Finally, lines 34-36 connect the MPC controllers to each other (create communication edges) so that they exchange information when they update their control actions.
Figure 20 presents the simulation results for each MPC algorithm. The centralized MPC communication pattern (20(a)) shows the communication delays between the plant and the controller (grey arrows), the time required to compute the control action (the purple bar), and highlights how the plant state advances continuously while computation and communication tasks execute. Despite the delays enforced for the controller, centralized MPC is able to drive the state to the set-point (Figure 20(b)). Decentralized MPC does not require communication between controllers and computing times are decreased (Figure 20(c)) but we observe that the set-point cannot be reached (Figure 20(d)). This is because this approach does not adequately capture multi-variable interactions. Finally, cooperative MPC has a more sophisticated communication strategy (Figure 20(e)) but we observe that this helps mimic the performance of centralized MPC (Figure 20(f)).
6 Conclusions and Future Work
We have presented graph-based abstractions that facilitate modeling complex cyber-physical systems. An algebraic model graph abstraction is used to facilitate the construction, solution, and analysis of complex optimization models while a computing graph abstraction is used to facilitate the construction, solution, and analysis of complex decision-making algorithms that involve diverse computing tasks and communication protocols. We implemented both abstractions in the Julia package and provided case studies that demonstrate the capabilites of the abstractions.
Future work will focus on developing and implementing more sophisticated graph analysis techniques for the model graph that facilitate decomposition strategies and visualization. We are also interested in extending the computing graph capabilties to allow for co-simulation [23], which would facilitate decomposition of the simulation framework. This is necessary to enable scalability to systems with many computing nodes such as swarm-based systems (which contain thousands of nodes) [33, 43] or control systems with adaptive communication [7]. We are also interested in using the developed capabilities to simulate more sophisticated hierarchical control architectures that involve combinations of centralized/decentralized subsystems [45].
Acknowledgements
This material is based on work supported by the U.S. Department of Energy (DOE), Office of Science, under Contract No. DE-AC02-06CH11357 as well as the DOE Office of Electricity Delivery and Energy Reliability’s Advanced Grid Research and Development program (AGR&D). This work was also partially supported by the U.S. Department of Energy grant DE-SC0014114.
Appendix A Gas Pipeline Study Model
We model gas pipeline networks as algebraic graphs wherein the nodes consist of models for the links (pipelines) in the network and the gas network junctions . By defining component models for each node in the graph, we express the physical system as a collection of models connected by algebraic link constraints.
A.1 Gas Pipeline Component Model
We assume isothermal flow through a horizontal pipeline segment . The isothermal mass and momentum conservation equations are presented in [39] and take the form:
[TABLE]
where notation and units are defined in Table 1. The euler equations can be approximated by dropping the the momentum term which has a negligible effect on dynamics, and the states and can be converted into mass flow rate and pressure to produce the following formulation
[TABLE]
where , and are defined constants in Table 2.
The flow and pressure variables can also be used to define boundary conditions for the link flows as
[TABLE]
Pipelines with compressors at their suction are denoted as active links . The total power consumed by each compressor as part of each active link is then given by
[TABLE]
where , , and are constant parameters given in Table 2. Pressure boundaries are defined by the following conditions:
[TABLE]
Each pipeline also has a a steady-state initial condition:
[TABLE]
and an operational constraint to return its linepack back to its starting inventory:
[TABLE]
Here, the amount of linepack in any pipeline segment can be computed as:
[TABLE]
Combining equations (A.41) through (A.48) results in the following component model for each pipeline:
[TABLE]
These PDEs are discretized in space-time by using finite differences.
A.2 Gas Junction Component Model
Each gas node model consists of a nodal pressure , a set of gas supplies , and a set of demands . Gas nodes impose the following operational constraints on the network
[TABLE]
where is the lower pressure bound for the node, is the upper pressure bound, is the target demand for demand on node and is the available gas generation from supply on node .
A.3 Linking Constraints
The linking constraints connect the pipeline and gas node models into a physical gas network by enforcing boundary conditions and conservation constraints. We can thus define the boundary conditions for the link pressures with
[TABLE]
and we can express the node balances around each gas node with the following link constraint
[TABLE]
We note that the supply flows and the demand flows are algebraic states of the system (if the pressures of the corresponding nodes are specified) [49].
Appendix B Benders Study Model
The model for the two-stage stochastic resource management optimization problem is:
[TABLE]
where is a set of realized scenarios, is a set of bases each containing resources, is a set of districts with resource demands, is a set of arcs connecting bases and districts, is the set of arcs between bases, and is the set of arcs between bases and districts. Parameter is the probability that scenario realizes at district and variable is the unmet demand at district after dispatch decisions are made for scenario . Variable is a first stage decision to move resources between bases, is a first stage decision to purchase resources at bases, and variable is a second stage decision to dispatch resources to districts after realizing district demands. Parameter is the initial amount of resources in base , is the amount of resources in base after making transfers, variable is the amount of resources in each base after dispatching to districts for scenario , and parameter is the resource demand of district for scenario .
Problem (B.54) is reformulated to conduct Benders decomposition by decomposing it into a master problem and subproblem for each scenario. The master problem is given by:
[TABLE]
where is a variable to enforce cutting planes and is the set of cutting planes added to the master problem. The subproblem is a function of the master solution variable and scenario and is given by (B.56).
[TABLE]
The master node in the computing graph contains two tasks described by Task 1 and Task 2. Each subnode in the computing graph consists of a single task described by Task 3. Task 1 runs the master problem (B.55) and updates the first stage solution (which contains all first-stage variables including ) and checks whether convergence has been satisfied. If not satisfied, it updates the master node scenario attributes with the first values from the scenario set . Task 2 runs when the master node receives an update to a solution attribute . The task checks whether every solution has returned, and if true, it updates the flag attribute , which indicates that the master problem is ready to be solved. If not all subproblem solutions have returned, the task updates the attribute with the next scenario in . Task 3 solves the subproblem (B.56) given a first stage solution and a scenario and updates the subnode solution attribute .
Appendix C MPC Study Model
The model for the plant is given by the following set of differential equations:
[TABLE]
where for we have:
[TABLE]
The recycle and weight fractions are given by:
[TABLE]
The target steady-state (set-point) is described by the parameters in Table 3 and the initial operating condition is given in Table 4.
For the 3 MPC controllers, we have the following outputs and inputs:
[TABLE]
Each MPC controller uses a quadratic cost function with weights:
[TABLE]
The differential equations are discretized using an Euler scheme with a time horizon of and a time step . The cooperative MPC computation tasks are defined in Tasks 4, 5, and 6. Task 4 simulates the plant forward in time from the current computing graph time to the time of the next signal in the computing graph queue and updates the attribute . Task 5 computes the open-loop control trajectory for MPC controller and updates its control injection if it has completed enough iterations and updates its control policy if it has not. Task 6 checks whether the MPC controller has recieved updates from the other MPC controllers and updates the flag indicator if it has received both inputs.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] S. Abhyankar, J. Brown, E. M. Constantinescu, D. Ghosh, B. F. Smith, and H. Zhang. Petsc/ts: A modern scalable ode/dae solver library. ar Xiv preprint ar Xiv:1806.01437 , 2018.
- 2[2] A. Agarwal and I. E. Grossmann. Linear coupled component automata for milp modeling of hybrid systems. Computers & Chemical Engineering , 33(1):162–175, 2009.
- 3[3] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl. Cas A Di – A software framework for nonlinear optimization and optimal control. Mathematical Programming Computation , In Press, 2018.
- 4[4] M. Baldea and P. Daoutidis. Dynamics and Nonlinear Control of Integrated Process Systems . Cambridge Series in Chemical Engineering. Cambridge University Press, 2012.
- 5[5] J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah. Julia: A Fresh Approach to Numerical Computing. SIAM Review , 59(1):65–98, 2017.
- 6[6] J. Bisschop. AIMMS - Optimization Modeling, 2006.
- 7[7] M. Bodson and S. Sastry. Adaptive Control: Stability, Convergence, and Robustness . Dover Publications, 2011.
- 8[8] B. Bollig, M. Fortin, and P. Gastin. Communicating Finite-State Machines and Two-Variable Logic. ar Xiv preprint ar Xiv:1709.09991 , 2017.
