A yeast cell cycle pulse generator model shows consistency with multiple   oscillatory and checkpoint mutant datasets

Julian Fox; Breschine Cummins; Robert C. Moseley; Marcio Gameiro; and; Steven B. Haase

arXiv:2302.12946·math.DS·February 28, 2023

A yeast cell cycle pulse generator model shows consistency with multiple oscillatory and checkpoint mutant datasets

Julian Fox, Breschine Cummins, Robert C. Moseley, Marcio Gameiro, and, Steven B. Haase

PDF

Open Access

TL;DR

This study presents a simplified yeast cell cycle model that accurately reproduces multiple experimental datasets, demonstrating its robustness and providing insights into the network components involved in cell cycle regulation and checkpoints.

Contribution

The paper introduces a minimal Boolean network model that captures key dynamical behaviors of the yeast cell cycle and checkpoint arrest, aligning well with diverse transcriptomic data.

Findings

01

Model reproduces observed transcriptomic oscillations

02

Identifies potential missing network components

03

Supports robustness of simplified network approach

Abstract

The regulatory mechanisms driving progression of the yeast cell cycle appears to be comprised of an interacting network of transcription factors (TFs), cyclin-dependent kinases (CDK) and ubiquitin ligases. From a systems perspective the controlling regulatory network must produce robust periodic behavior during proliferative phases, but have the capability to halt the cycle when unfavorable conditions trigger a checkpoint arrest. How the individual components of the network contribute to these dynamical phenotypes remains an open question. Here we evaluate the capability of a simplified network model hypothesized to contain key elements of the regulation of cell-cycle progression to reproduce observed transcriptomic behaviors. We match time-series data from both cycling and checkpoint arrested cells to the predictions of a relatively simple cell-cycle network model using an asynchronous…

Tables5

Table 1. Table 1: Proxy choices for the mini wavepool.

Mini wavepool proxy decisions
	Mini wavepool nodes
Proxies	SBF	Nrm1/Yox1	SFF	Clb2	Swi5
Swi5-Nrm1	Swi4	Nrm1	Ndd1	Clb2	Swi5
Swi5-Yox1	Swi4	Yox1	Ndd1	Clb2	Swi5
Ace2-Nrm1	Swi4	Nrm1	Ndd1	Clb2	Ace2
Ace2-Yox1	Swi4	Yox1	Ndd1	Clb2	Ace2

Table 2. Table 2: The mapping between dynamical and transcriptional phenotypes and their associated datasets [ 1 , 29 , 51 ] .

Dynamical phenotype

Transcriptional phenotypes

Datasets

I: WT cycling

WT

II: Clb2 mutant cycling

Clb2 ON, Clb2 OFF,

Clb2 INT-H & Clb2 INT-L

cdc20

Δ

, clb

Δ

,

cdc14-3 & cdc15-2

III: checkpoint arrest

SAC & DRC

cse4 & cdc8

Table 3. Table 3: The fixed points representing the SAC and DRC based on the comparison between the WT dataset and the cse4 and cdc8 datasets, respectively (see Fig 2 ).

Checkpoint	Proxies	Swi4	Nrm1/Yox1	Ndd1	Swi5/Ace2	Clb2
SAC	All proxies	low	low	high	high	intermediate/high
DRC	Yox1 proxies	low	low	high	high	intermediate/high
DRC	Nrm1 Proxies	low	high	high	high	intermediate/high

Table 4. Table 4: Dynamical phenotype I: wild-type cycling results. The raw number of matches together with proportions of phenotype-permissible parameters are shown in columns 1 and 2. Column 3 shows the number of distinct mini pulse generator (MPG) parameters within the WT cycling matches. These numbers are the normalization constants in Fig 4 .

DSGRN Phenotype I
proxies	matches	% of matches	MPG parameters
Swi5-Nrm1	7073137	4.3%	1541616
Ace2-Nrm1	7073137	4.3%	1541616
Swi5-Yox1	6071292	3.7%	1347895
Ace2-Yox1	6071292	3.7%	1347895

Table 5. Table 5: The order of the nodes for the checkpoint phenotype FPs is Swi4, which has activity levels in the range 0-3, Nrm1/Yox1, which has levels 0-1, Ndd1, which has levels 0-2, Clb2, which has levels 0-3, and Swi5/Ace2, which has levels 0-1. See Section 5.1 for an explanation of the differing state discretizations for each of the nodes.

Checkpoint FPs: FP(Swi4, Nrm1/Yox1, Ndd1, Swi5/Ace2, Clb2)

SAC FPs:

All proxies:

FP(0, 0, 2, 1, 1)

FP(0, 0, 2, 1, 2)

FP(0, 0, 2, 1, 3)

DRC FPs:

Yox1 proxies:

FP(0, 0, 2, 1, 1)

FP(0, 0, 2, 1, 2)

FP(0, 0, 2, 1, 3)

Nrm1 Proxies:

FP(0, 1, 2, 1, 1)

FP(0, 1, 2, 1, 2)

FP(0, 1, 2, 1, 3)

Equations59

\overset{x}{˙}_{i} = - γ_{i} x_{i} + λ_{i} (x), i = 1, ..., n,

\overset{x}{˙}_{i} = - γ_{i} x_{i} + λ_{i} (x), i = 1, ..., n,

i f j \to i t h e n σ_{i, j}^{+} (x_{j})

i f j \to i t h e n σ_{i, j}^{+} (x_{j})

i f j ⊣ i t h e n σ_{i, j}^{-} (x_{j})

l_{X, Y} l_{X, Z} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < h_{X, Y} h_{X, Z}

l_{X, Y} l_{X, Z} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < h_{X, Y} h_{X, Z}

l_{X, Y} l_{X, Z} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < h_{X, Y} h_{X, Z} < θ_{Y, X}

l_{X, Y} l_{X, Z} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < h_{X, Y} h_{X, Z} < θ_{Y, X}

l_{X, Y} l_{X, Z} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < θ_{Y, X} < h_{X, Y} h_{X, Z}

l_{X, Y} l_{X, Z} < l_{X, Y} h_{X, Z} < θ_{Y, X} < h_{X, Y} l_{X, Z} < h_{X, Y} h_{X, Z}

l_{X, Y} l_{X, Z} < h_{X, Y} l_{X, Z} < θ_{Y, X} < l_{X, Y} h_{X, Z} < h_{X, Y} h_{X, Z}

l_{X, Y} l_{X, Z} < θ_{Y, X} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < h_{X, Y} h_{X, Z}

θ_{Y, X} < l_{X, Y} l_{X, Z} < {l_{X, Y} h_{X, Z}, h_{X, Y} l_{X, Z}} < h_{X, Y} h_{X, Z}

1 :

1 :

2 :

3 :

4 :

5 :

6 :

3 :

3 :

10 :

i = 1 \prod n F G (i) = P G (N) .

i = 1 \prod n F G (i) = P G (N) .

Z : θ_{X, Z} < l_{Z, Y} < h_{Z, Y}

Z : θ_{X, Z} < l_{Z, Y} < h_{Z, Y}

Z : l_{Z, Y} < h_{Z, Y} < θ_{X, Z}

Z : l_{Z, Y} < h_{Z, Y} < θ_{X, Z}

D S G

D S G

X_{1} :

X_{2} :

D S G

D S G

X :

Y :

Z :

D S G

D S G

X :

Y :

Z :

D S G

D S G

X :

Y :

Z :

H_{y, x}^{+} (x) = (h_{y, x} - l_{y, x}) \frac{x ^{n}}{θ _{y, x}^{n} + x ^{n}} + l_{y, x}

H_{y, x}^{+} (x) = (h_{y, x} - l_{y, x}) \frac{x ^{n}}{θ _{y, x}^{n} + x ^{n}} + l_{y, x}

H_{y, x}^{-} (x) = (h_{y, x} - l_{y, x}) \frac{θ _{y, x}^{n}}{θ _{y, x}^{n} + x ^{n}} + l_{y, x}

H_{y, x}^{-} (x) = (h_{y, x} - l_{y, x}) \frac{θ _{y, x}^{n}}{θ _{y, x}^{n} + x ^{n}} + l_{y, x}

\dot{S}

\dot{S}

\dot{N}

\dot{D}

\dot{W}

\dot{C}

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGene Regulatory Network Analysis · Fungal and yeast genetics research · Microbial Metabolic Engineering and Bioproduction

Full text

A yeast cell cycle pulse generator model shows consistency with multiple oscillatory and checkpoint mutant datasets

Julian Fox Corresponding author: [email protected] Department of Mathematical Sciences, Montana State University, Bozeman, MT, USA

Breschine Cummins

Department of Mathematical Sciences, Montana State University, Bozeman, MT, USA

Robert C. Moseley

Department of Biology, Duke University, Durham, NC, USA

Marcio Gameiro

Department of Mathematics, Rutgers University, New Brunswick, NJ, USA

Steven B. Haase

Department of Biology, Duke University, Durham, NC, USA

Abstract

The regulatory mechanisms driving progression of the yeast cell cycle appears to be comprised of an interacting network of transcription factors (TFs), cyclin-dependent kinases (CDK) and ubiquitin ligases. From a systems perspective the controlling regulatory network must produce robust periodic behavior during proliferative phases, but have the capability to halt the cycle when unfavorable conditions trigger a checkpoint arrest. How the individual components of the network contribute to these dynamical phenotypes remains an open question. Here we evaluate the capability of a simplified network model hypothesized to contain key elements of the regulation of cell-cycle progression to reproduce observed transcriptomic behaviors. We match time-series data from both cycling and checkpoint arrested cells to the predictions of a relatively simple cell-cycle network model using an asynchronous multi-level Boolean approach. We show that this single network model, despite its simplicity, is capable of exhibiting dynamical behavior similar to the datasets in most cases, and where it does not, we identified hypotheses that suggest missing components of the network.

1 Introduction

The study of the cell cycle of simple organisms like the budding yeast Saccharomyces cerevisiae can provide insight into more complex systems such as the mammalian cell cycle due to the similarity of the cellular machinery across eukaryotic organisms [9, 14, 34]. Cell proliferation is a fundamental property of cells and understanding the regulation of the cell cycle is critical for explaining and controlling many biological processes from development to cancer progression.

Models for the mechanisms controlling cell cycle progression have evolved over time. Early biochemical studies in marine invertebrates identified cyclins and cyclin dependent kinases (CDKs) as key regulators of cell-cycle oscillations along with the anaphase promoting complex (a ubiquitin ligase) [23, 24, 25]. By forming a negative feedback loop, it was hypothesized that this simple cell-cycle network was capable of producing periodic behavior [19, 35]. Genetic studies in both S. cerevisiae and *S. pombe *also identified cyclins, CDKs, and APC complexes as critical regulators of cell cycle progression, indicating that machinery was highly conserved across phyla and that these components are the basic components of the cell-cycle oscillator [21, 22, 36, 37]. Cell-cycle models became more complex over time and mounting evidence indicated that oscillations of cell cycle events could continue when the CDK/APC motif was prevented from oscillating [20]. The advent of genome technologies identified large programs of dynamic gene expression associated with the cell cycle, and new models emerged suggesting the importance of transcriptional networks on producing cell-cycle oscillations [31, 38, 39, 45].

Although ODE models of the yeast cell-cycle that are focused primarily on biochemical interactions have been remarkably predictive of mutant phenotypes [10], there is a compelling argument that transcription factors (TFs), cyclin-dependent kinases (CDKs), and ubiquitin ligases all play key roles in regulating cell-cycle progression [12]. Multiple studies have demonstrated that temporally ordered, high-amplitude transcript dynamics were present in budding yeast with non-oscillating levels of CDK [8, 11, 12, 38, 44] indicating that the CDK/APC oscillator identified in early embryonic systems may not be the core motif driving periodic behavior during the yeast cell cycle.

Recent work suggested that a pulse-generating TF network containing an oscillatory mechanism was responsible for a transcriptional pulse that was thought to drive global phase-specific transcription [11]. The pulse generator seems to operate in a start-stop manner, where the network is first quiescent and then, after receiving a start signal, creates a wave of sequential transcription through the network which is hypothesized to be driven by the interaction of CDKs with a TF network [1, 29, 52]. The transcriptional pulse driving progression through the cell cycle operates consistently, meaning that the gene products express in a stereotyped order [52, 30], and the timing and robustness of this periodic transcription was affected when certain CDKs were knocked out or up-regulated, creating a weaker and less robust cycle [44]. It is important to note that periodic transcription, while weaker and less robust, was still present in some of the mutants. Therefore, the dynamics expressed by any hypothesized network model should exhibit oscillations under both wild-type (WT) and CDK mutant phenotypes.

Synchronous and autonomous Boolean models of yeast cell-cycle networks containing TFs and CDK have produced robust oscillatory behavior [38, 43], but more sophisticated dynamical models that match observed dynamics (including transcriptional dynamics) across wild-type and mutant strains have not been reported. Moreover, these models have not addressed the observation that certain environmental perturbations (e.g. DNA damage or spindle assembly defects) can reversibly arrest the cell cycle until damage is repaired.

Here we describe a previously developed modeling framework, Dynamic Signatures Generated by Regulatory Networks (DSGRN) [3, 5, 3, 33], which is a type of asynchronous multi-level Boolean approach [13] via the mechanism of switching systems [48, 50, 15, 17, 26]. The DSGRN software package can exhaustively compute all the dynamics a genetic regulatory network (GRN) can produce, allowing for a comprehensive description of potential network behaviors. It has been used to model and predict genetic network behavior in similar biological systems, such as the epithelial-to-mesenchymal transition in cancer [53] and the Rb-E2F mechanism of the mammalian cell cycle [46].

We investigated a simple integrated yeast cell-cycle network model that contains both TFs and CDKs, (Fig 1 (Right)). We observed that this network can match observed transcriptional dynamics from a variety of conditions where network components are mutated or checkpoint arrests are induced [1, 29, 51, 52] as well as wild-type data, indicating that it may encode important features of cell-cycle regulation.

2 Network modeling approach

In this section, we discuss the evidence-based network model that we check for consistency with multiple wild-type and mutant datasets. Each of these datasets exhibits a cellular phenotype, namely, the cell cycle is either progressing or arrested. We distinguish between these cellular phenotypes and transcriptional phenotypes based on wild type and mutant microarray and RNAseq time series datasets from [1, 29, 51, 52]. Transcriptional phenotypes consist of observed cycling or steady equilibrium behavior seen in the time series. We also define dynamical phenotypes based on DSGRN predictions of network model dynamics that allow us to determine model consistency with the data. Dynamical phenotypes are graphical structures that capture stable or unstable cycling behavior as well as equilibria. In order to be declared consistent with the observed data, a network model must be able to reproduce cyclic patterns in WT and mutant data and needs to support the arrest of cycling behavior during a triggered checkpoint.

2.1 Wavepool model

A network oscillator model [51] representing a collection of regulatory interactions hypothesized to be capable of exhibiting multiple cell cycle behaviors is visualized as a GRN in Fig 1 (Left). We will call this network oscillator the wavepool model and the transcriptional subnetwork involving the nodes Nrm1/Yox1, MBF/SBF, SFF, Hcm1, Swi5, and Sic1 will be referred to as the pulse generator [51]. The edges in the GRN reflect different regulatory mechanisms. A TF regulates another TF or CDK through transcriptional control of gene products. A CDK regulates another CDK or TF only when bound in a cyclin/CDK complex (post-transcriptional control). Once assembled, the complex is able to phosphorylate the target protein, which can have either an activating or inhibiting effect depending on the target. In Fig 1, black arrows indicate activation, red arrows indicate inhibition, solid arrows indicate transcriptional control, and dashed arrows post-transcriptional control.

Each node within the network in Fig 1 (Left) may not pertain to a specific TF or cyclin/CDK; in some cases the node refers to a complex of TFs or cyclins and CDKs. By convention, the name of a complex of proteins has all letters capitalized, the name of an individual protein has only the first letter capitalized, and the name of a gene or mutant gene has all letters lowercase and italicized.

While it is understood that the CDKs and TFs identified in Fig 1 (Left) are important for progression of the cell cycle, some research has focused on investigating the importance of smaller subnetworks [1, 52, 29]; for example, concentrating on the impact that Clb2 activity has on its targets and the progression of the transcriptional program. We continue along these lines by computationally investigating a small network derived from the wavepool model. We will refer to this network as the mini wavepool, seen in Fig 1 (Right). The mini wavepool was chosen to focus on the interactions between the single CDK, Clb2, and a cyclic arrangement of TFs that we refer to as the mini pulse generator. Clb2 is the B-type cyclin of choice because it is thought to play a role in the activation of multiple checkpoints and targets TFs [29] and because it acts broadly across cell cycle phases.

The size of the mini wavepool model was partially chosen for tractability of computation and many nodes and edges from the original hypothesized pulse generator are absorbed or disregarded. The self-edge on SBF comes from the length three path from MBF/SBF to itself through Cln1/2 and Whi5 in Fig 1. Since there is a double repression, this indirect self-regulation is activating. Likewise, the edge from Swi5 to SBF in the mini wavepool is a collapsed path through double repression in Cln3 and Whi5. The removal of Sic1 is justified by evidence that Sic1 may not be necessary for cell-cycle oscillations [41]. Post-transcriptional regulation is not explicitly modeled as such; this is a recent advance in DSGRN [2].

2.2 Transcriptional Phenotypes

We describe seven different transcriptional phenotypes expressed in seven datasets. The transcriptional phenotypes are called WT, Clb2 OFF, Clb2 ON, Clb2 intermediate-low (INT-L), Clb2 intermediate-high (INT-H), spindle assembly checkpoint (SAC), and DNA replication checkpoint (DRC).

The wild-type (WT) microarray dataset comes from [1] where genome-wide transcription was analyzed for S. cerevisiae. The WT dataset acts a baseline for the dynamics present within the cell cycle. Oscillations were seen not only within the transcription factor network but within CDKs as well.

We identify the Clb2 ON mutant with a cdc20 $\Delta$ knockout mutant [29]. Cdc20 acts to promote progression through the cell cycle in M phase from metaphase to anaphase. This knockout leads to cells arrested at the metaphase-to-anaphase border in mitosis containing high levels of non-oscillating Clb2 protein. However, transcriptional oscillations in the pulse generator persist. We identify the Clb2 OFF mutant with a clb $\Delta$ mutant in which clb1-6 are knocked out [1]. These cyclins (Clb1-6) act as the S-phase and mitotic cyclins; therefore this mutant was arrested at the G1/S border. It was seen that these mutants were not able to replicate DNA, separate spindle pole bodies, undergo isotropic bud growth, or complete nuclear division, indicating that these cells do not have active Clb-CDK complexes [1] and that therefore these mutants were void of any nontrivial Clb2 activity. Again, the data show that transcriptional oscillations in the pulse generator persist despite the cell cycle arrest.

The two mutant phenotypes with intermediate expression are identified with the cdc14-3 and cdc15-2 mutants from [51]. Instead of knockouts these mutants are temperature sensitive, thus when incubated at 37°C there is a drop off in the activity of these genes. Cdc14 and Cdc15 are key players in a cell’s exit from mitosis [30, 32]. There are two APC complexes responsible for lowering Clb activity, APC/Cdc20 and APC/Cdh1. Cdc15 and Cdc14 are important for the formation of the APC/Cdh1 complex, therefore the cdc14-3 and cdc15-2 mutants are thought to only have APC/Cdc20 present. It was observed that the temperature sensitive cdc14-3 and cdc15-2 mutants had moderate levels of non-oscillating Clb2 activity [51, 32]. Cdc14 acts to lower Clb2 activity directly through the phosphorylation of CDKs, while Cdc15 simply acts to re-initiate Cdc14. Thus, in the cdc14-3 mutant there is no degradation of Clb2 via Cdc14 activity and within the cdc15-2 mutant there is the initial degradation of Clb2 through Cdc14, but Cdc14 is never re-initiated by Cdc15. This indicates that the cdc14-3 mutant should exhibit higher levels of Clb2 activity than the cdc15-2 mutant, and this reasoning is supported by the experimental data. We name the corresponding phenotypes Clb2 INT-H and Clb2 INT-L accordingly. In these two mutant datasets, pulse generator transcriptional oscillations are present.

We identify the SAC phenotype with a cse4 mutant that contains a mutant allele of the kinetochore protein, which acts to disrupt the spindle assembly [29] due to improper orientation of the kinetochore. If this occurs, signals are sent from the SAC mechanism to initiate the formation of the mitotic checkpoint complex (MCC), which binds to APC and inhibits the formation of the APC/Cdc20 complex through competition [54, 47]. In this mutant, pulse generator transcriptional oscillations are silenced and the resulting steady state serves as the transcriptional SAC phenotype.

The DNA replication checkpoint (DRC) mutant is defined as the cdc8 mutant from [29] that contains a temperature sensitive allele of cdc8 that disrupts the thymidylate kinase responsible for DNA synthesis [42]. Given a disruption to the progression of the DNA replication fork, the thymidylate kinase cascade regulates timing of S-phase events, such as spindle elongation, allowing the cell to deal with insults to the DNA replication fork by arresting the cell cycle. The pulse generator is non-oscillatory in this dataset, with the resulting steady state defining a transcriptional DRC phenotype.

To investigate the mini wavepool it is necessary to ascertain qualitative transcript levels for all nodes in the network. As is clear from the presence of complexes and boxed TFs in Fig 1, multiple proxies are available for each node in the mini wavepool. The determination of qualitative transcript levels for the mini wavepool involved the choice of proxies for SBF and SFF complexes, which we chose to be Swi4 and Ndd1 respectively. The node Nrm1/Yox1 naturally has two proxies, Nrm1 and Yox1. Both Swi5 and Ace2 were explored as proxies to represent the Swi5/Ace2 node as pictured in Fig 1. We evaluate each of the transcriptional phenotypes for each of the proxy sets listed in Table 1.

2.3 Dynamical Phenotypes

In the traditional ODE modeling paradigm, a mechanistic model is constructed, often using Hill functions in the genetic network setting, and numerous parameters are either fit to data or drawn from the literature. Frequently, a sensitivity analysis is performed to check the variability of model output to small perturbations in parameters. This approach, while valuable, has limitations. First, the area of parameter space that can be explored is minuscule, and second, the large number of parameters can lead to overfitting. We offer an alternative framework in which large regions of parameter space are excluded as unable to produce the desired dynamical behavior using a multi-level Boolean modeling approach. After identifying a reduced parameter space, parameterized Hill models can be created to replicate the desired behavior.

The modeling framework Dynamical Signatures Generated by Regulatory Networks (DSGRN) [3, 5, 3, 33] is based on an ODE system with switching functions (Section 5.1) that leads to a fundamentally different approach. First, DSGRN provides a searchable database of coarse dynamics over the entirety of parameter space (Sections 5.2-5.4). This is possible because DSGRN divides parameter space into a finite number of regions, each called a DSGRN parameter node. The coarse dynamics exhibited by a network model are constant for all real-valued parameter sets sampled from the same DSGRN parameter node. Second, DSGRN uses only coarse information from a time series dataset, which avoids overfitting.

DSGRN output contains information on the number and type of fixed points (equilibria) and oscillations that the network can exhibit at a given DSGRN parameter node. The fixed points identified by DSGRN are not quantitative, they only indicate whether the associated gene product is predicted to have high, low, or intermediate concentrations. Likewise, cycling behavior is not modeled by a smooth trajectory, but rather a sequence of maxima and minima for each gene’s expression level.

Model consistency in cycling behavior is determined by what we call a pattern match (Sections 5.5-5.6) between the sequence of maxima and minima predicted by a network model and the observed sequence of maxima and minima in the data. Model consistency for a fixed point involves a judgment call on whether gene expression is best described as high, low, or intermediate at the time when steady behavior is observed. For both types of model consistency, the proportion of DSGRN parameter nodes that exhibit the observed behavior indicates how robustly the network model recapitulates the data.

Each DSGRN parameter can be decomposed into a collection of independent DSGRN factor parameters, one per node in a GRN. In the mini wavepool, we will distinguish between the DSGRN factor parameter for Clb2 (the Clb2 parameter, Section 5.3) and the collection of remaining DSGRN factor parameters, which we will call the mini pulse generator parameter. In particular, we explore the behavior of the mini wavepool as the Clb2 factor parameter changes, but the mini pulse generator parameter remains fixed, mimicking a control mechanism of Clb2 on the mini pulse generator.

We devise three different dynamical phenotypes based on DSGRN output that are intended to represent important features of the experimental data and associated transcriptional phenotypes; see the correspondence in Table 2. Dynamical phenotype I (WT cycling) is a pattern match to the wild-type data within a stable DSGRN cycle. This is roughly analogous to ensuring similar phase relationships between expression levels that are robust to small perturbations. We identify the DSGRN parameter nodes at which consistency with the WT cycling in the data occurs.

Dynamical phenotype II captures Clb2 mutant cycling in which only the expression levels in the mini pulse generator are stably oscillating. We identify those DSGRN parameter nodes that exhibit pattern matches to Clb2 mutant time series when Clb2 is fixed at an appropriate high, low, or intermediate level, depending on the dataset. In the mini wavepool model in Fig 1 (Right), Clb2 can attain four discrete states, low (0), intermediate-low (1), intermediate-high (2), and high (3), which is dictated by the number of regulatory targets of Clb2 (see Section 5.1). We can fix Clb2 at any state $i$ by restricting ourselves to an appropriate collection of DSGRN parameter nodes (see Section 5.3). For the Clb2 ON transcriptional phenotype, we fix the Clb2 state at 3, for Clb2 INT-H the fixed state is 2, for Clb2 INT-L, the state is 1, and for Clb2 OFF, the state is 0. For example, a pattern match between the mini wavepool model and the mutant dataset cdc20 $\Delta$ is only considered successful if it occurs at a DSGRN parameter node where the Clb2 state is fixed at 3; similarly for the other datasets listed in Table 2, row 2. We say that a DSGRN parameter is “phenotype-permissible” if it obeys the appropriate constraint for a given dataset.

Suppose two DSGRN parameter nodes differ only in Clb2, are identical in the mini pulse generator and there is WT cycling at one parameter node and there is mutant cycling at the other. This dynamical connection is made more rigorous in Section 5.3. In dynamical phenotype II, we seek such parameter pairs, and the implication of the existence of such a pair is that the mini pulse generator is predicted to operate independently of changes to Clb2.

In dynamical phenotype III (checkpoint arrest), we identified a DSGRN fixed point (FP) surrogate of the SAC equilibrium and of the DRC equilibrium. We examined the data in [29] for the cse4 mutant (SAC), and the cdc8 mutant (DRC). Qualitatively high, intermediate, and low values at equilibrium were assigned for each network node based on a comparison to the wild-type transcriptomics data. We remark that all seven datasets were normalized together to permit such comparisons (see Data Availability). The activity of each node was determined by comparing the transcript level at the end of the cycle in the cse4 mutant data or the cdc8 mutant data to the transcript level of the same node in the WT data, see Fig 2 and Table 3. Data points from the end of the time series were used as the best representation of the resulting steady state.

The transcript levels to determine the SAC FP were based upon a maximum level of transcript seen across the SAC mutant compared to the maximum transcript level seen in the WT time series. Low represents any transcripts judged to be substantially below 50% maximum transcript level and high represents any transcript judged to be substantially above 50% maximum transcript level. Intermediate levels are judged to be in-between. The SAC FP is given by Swi4 = low, Nrm1/Yox1 = low, Ndd1 = high, Clb2 = not low, and Swi5/Ace2 = high (see Table 3 first row). Notice that Clb2 expression levels are largely indeterminate since they have not achieved steady state by the end of the time series.

Similar analysis was done for the DRC mutant (Table 3 second row). For the Yox1 proxies, the SAC and DRC FPs are identical, indicating that the mini wavepool has an insufficient diversity of nodes to distinguish between these two checkpoints. For the Nrm1 proxies, the fixed points differ at the Nrm1 value, where the SAC exhibits low Nrm1 activity and the DRC exhibits high Nrm1 activity.

As in mutant cycling, we checked for the existence of DSGRN parameter node pairs where one showed WT cycling and the other a SAC/DRC FP with the same mini pulse generator parameter. Unlike in mutant cycling, the arrest at a checkpoint in phenotype III indicates different dynamical behavior in the mini pulse generator as compared to WT. Therefore, the existence of such a pair indicates that regulation through Clb2 alone is sufficient to control entry into a checkpoint within the mini wavepool model without additional external regulation at other network nodes. It is important to stress that this does not exclude the existence of other regulators in a larger network in the cell; it merely indicates that the mini wavepool model as constructed can replicate the mutant phenotype of interest.

3 Results

We assess the consistency of the predictions of the mini wavepool model with the datasets by checking for the existence of the DSGRN dynamical phenotypes at a collection of DSGRN parameters. The number of DSGRN parameter nodes for the mini wavepool is quite large: 275,466,240 total. We first divide DSGRN parameters into five distinct groups as described below (see Section 5.3 for technical details). This division of DSGRN parameter space is based solely on the Clb2 DSGRN factor parameter, and not on the mini pulse generator parameter.

A full 60% of DSGRN parameter space allows for changing levels of Clb2 and therefore has the capacity for exhibiting the wild-type cycling phenotype. The remaining 40% of DSGRN parameter space is composed of parameters that have a fixed value of Clb2, with 10% each at high, low, and intermediate-high/low. We enforce that the mini wavepool can only be consistent with the Clb2 mutant datasets at the appropriately assigned parameters for Clb2 ON, OFF, INT-H, and INT-L (Clb2 mutant cycling phenotypes). The checkpoint datasets do not provide any a priori parameter constraints and consistency is assessed over 100% of DSGRN parameter space (checkpoint arrest phenotypes). The subsets of DSGRN parameter nodes over which matches to given datasets are searched are called are called phenotype-permissible parameters. In particular, WT cycling phenotype-permissible parameters comprise 60% of parameter space, mutant cycling phenotype-permissible parameters comprise 10% of parameter space each, and checkpoint phenotype-permissible parameters are unconstrained.

Our results are reported as proportions over subsets of DSGRN parameter space. Dynamical phenotype I results are the percentages of WT cycling matches over 60% of parameter space. From this set of DSGRN parameters, the number of mini pulse generator (MPG) parameters was extracted and used as the normalization term for dynamical phenotypes II and III. This was justified because we sought to discover regions of parameter space where a change in Clb2 was the only impact on the network behavior; therefore, we looked for MPG parameters that showed both WT cycling and matched at least one other phenotype.

3.1 Dynamical Phenotype I: Wild-Type Cycling

The purpose of dynamical phenotype I is to show that the mini wavepool model can match the normal function of the cell cycle, namely the experimental WT dataset, by identifying the percent of phenotype-permissible DSGRN parameter nodes that can be pattern matched in a stable cycle. Table 4 shows that the proxy groups have WT pattern matches ranging from 3.7% up to 4.3% of phenotype-permissible DSGRN parameters. This demonstrates that every proxy group can recapitulate the oscillating sequence of maxima and minima seen in the WT data, and therefore the mini wavepool model cannot be excluded as a reasonable network model for controlling the yeast cell cycle.

In addition, matching WT cycling greatly reduced parameter space. We parameterized Hill models from this reduced region (see Section 5.7) to demonstrate the utility of DSGRN in locating useful regions of parameter space. The result of one such simulation is shown in Fig 3(b) with the comparable WT data shown in Fig 3(a). Both sets of curves are normalized between 0 and 1 to emphasize the order of maxima and minima in the time series, which is the behavior that DSGRN predicts. We see that in both cases, the peak of Swi4 precedes the peaks of Nrm1 and Ndd1, which themselves precede the peaks of Swi5 and Clb2. Swi5 and Clb2 have nearly indistinguishable peaks in both panels. The order of the Ndd1 and Nrm1 peaks are indistinguishable in the WT data in panel (a), but are ordered with Ndd1 first in the simulation in (b). Since Ndd1 has a faster rise in the WT data, the simulation ordering shows consistency with the WT data. We regard the recapitulation of this order of extrema as a successful model; however, we made no attempt to match either amplitude or period of the WT data.

We determined the number of unique mini pulse generator parameters in the set of WT cycling pattern matches, shown in Table 4, column 3 for each proxy. These numbers are the normalization factors for the dynamical phenotypes II and III results shown in Fig 4.

3.2 Dynamical Phenotype II: Clb2 mutant cycling

We checked for the existence of dynamical phenotype II (mutant cycling) at Clb2 ON, Clb2 OFF, Clb2 INT-H, and Clb2 INT-L phenotype-permissible parameters; i.e., we checked for the co-existence of mutant and WT cycles at a single MPG parameter, which would indicate that a change in Clb2 alone does not disrupt mini pulse generator oscillations in the mini wavepool network. The percentages of MPG parameters where this occurs are shown in Figure 4. The nonzero results indicate that a perturbation of Clb2 allows the mini wavepool to transition from WT cycling to every type of mutant cycling in three of the four proxy groups.

The four Clb2 mutant datasets are shown in panels (a)-(d) of Fig 5 normalized between 0 and 1. It can be seen that, given noise, they all roughly have the same order of maxima and minima. Therefore, we can demonstrate the ability of DSGRN to reduce parameter space for a Hill model. We chose to sample mini pulse generator parameters for the Swi5-Nrm1 proxy group that are predicted by DSGRN to be able to exhibit all five types of WT and mutant cycling behavior for varying Clb2 factor parameters (see Section 3.4). Fig 5 (e)-(f) show simulations for two different MPG parameters that match the expected order of maxima and minima. We remark that Clb2 expression also oscillated in these simulations (not shown). This case is not excluded by DSGRN, since DSGRN will not predict oscillations that are too small to impact downstream targets.

It may seem inconsistent that there are Hill models that can recapitulate all four mutant cycling datasets, and yet the percent of pattern matches for the Clb2 ON, OFF, INT-L, and INT-H in Fig 4 are not identical within each proxy group. This occurs because the DSGRN pattern matching methodology is sensitive to noise.

3.3 Dynamical Phenotype III: Checkpoint Arrest

In DSGRN phenotype III, we identified MPG parameters that were predicted to exhibit a WT pattern match at one Clb2 factor parameter and SAC or DRC FPs (Table 3) at others. In Fig 4, the “SAC” bars correspond to the percentages of MPG parameters exhibiting WT cycling and the SAC FP, which is identically the DRC FP for the Yox1 proxies seen by looking at the “DRC” bars. The fact that these percentages are nonzero indicates that the Yox1 proxies are consistent with both mutant datasets, which must be true given that the dynamical phenotype representation of the two datasets is identical.

On the other hand, the Nrm1 proxies are only consistent with the cse4 mutant dataset representing the SAC phenotype. The DRC FP for the Nrm1 proxies was absent not only in MPG parameters with WT cycling, but also absent across all of DSGRN parameter space. This indicates that the mini wavepool topology contains insufficient regulatory interactions to recapitulate the DRC phenotype when the Nrm1 time trace is considered.

An example of a simulation from a Hill model parameterized with an MPG predicted to show both WT cycling and a SAC FP is shown in Fig 6 for the Swi5-Nrm1 proxy group. It is easily seen by examination that the simulation traces are consistent with the SAC FP in Table 3.

3.4 Co-existing Phenotypes

We looked for mini pulse generator parameters that could support all four mutant cycling phenotypes in addition to WT cycling at phenotype-permissible parameters. There were none such, indicating that either an internal change in or an external regulation of the mini pulse generator occurs between mutant phenotypes. This result is dependent on the modeling choice of using only phenotype-permissible parameters associated to the Clb2 mutant cycling datasets. If this restriction is relaxed, the percentage of mini pulse generator parameters at which there is both WT cycling and a single mutant cycling dataset increases (compare Fig 7 to Fig 4). More precisely, relaxing the phenotype-permissible restriction means that mutant cycling phenotypes are searched over all values of the Clb2 factor parameter, not just the 10% of DSGRN parameter nodes originally specified.

However, relaxing the phenotype-permissible modeling restriction still does not allow a mini pulse generator parameter that can show matches to every dataset in all three dynamical phenotypes, including checkpoint arrest. The closest achiever was the Swi5-Nrm1 proxy set for which the mini wavepool model was able to exhibit the WT cycling phenotype, the four mutant cycling phenotypes, and the SAC phenotype at 3575 mini pulse generator parameters.

4 Discussion

We have demonstrated that a small network model (mini wavepool, Fig 1 (Right)) approximating the pulse generation capacity of the yeast cell cycle transcriptional oscillator (Fig 1 (Left)) is capable of matching multiple datasets with different experimental perturbations. This process provides further evidence for the validity of a pulse generator mechanism with CDK control in the yeast cell cycle and highlights the flexibility of dynamical behavior that a single network is capable of producing. We conclude that the mini wavepool network is controllable, in the sense that a large number of dynamical phenotypes are accessible via parameter perturbation, and that the region of wild-type behavior is relatively small within parameter space. In contrast, a network might instead be robust, wherein wild-type behavior is prevalent across parameter space, making it difficult to push network behavior into a different dynamical regime.

We devised three computational phenotypes using the software DSGRN [5, 33] that associate to seven different datasets (Table 2). Dynamical phenotype I corresponds to wild-type cell cycle behavior under standard conditions. Dynamical phenotype II encompasses mutant cycling—oscillations of the mini pulse generator under fixed values of the CDK Clb2 induced by various knockout experiments. In dynamical phenotype III, checkpoint behavior was evaluated by computationally seeking qualitative equilibrium values determined by each of two datasets, one representing the spindle assembly checkpoint and one representing the DNA replication checkpoint.

We interpret model consistency with the data in the following way. If the mini wavepool model is consistent with wild-type cycling, then the model faithfully captures the behavior of the undisrupted cell cycle. If it is consistent with mutant cycling, then the model is capable of reproducing observed oscillatory behavior in the mini pulse generator regardless of the state of Clb2. If it is consistent with checkpoint behavior, then it provides the hypothesis that regulation through Clb2 is sufficient to induce the checkpoint in the mini pulse generator, even if in reality other regulators are implicated in cell cycle control. If the mini wavepool model is consistent with all of these dynamical phenotypes, then it is a complete hypothesis for explaining the seven datasets, while the non-existence of any mutant phenotype indicates network model incompleteness.

We found that 3.7-4.3% of potential wild-type DSGRN parameters predicted dynamical behavior consistent with the WT dataset, indicating that the network model is capable of reproducing standard cellular conditions. Of these WT pattern-matched parameters, up to 20.0% were consistent with various mutant datasets at various proxy groups, although not all phenotypes were exhibited at every proxy group. There were no mini pulse generator parameters that could support WT cycling and all four types of mutant cycling within a single proxy group at phenotype-permissible DSGRN parameters, i.e. fixed Clb2 levels hypothesized to be associated to particular datasets. However, if this modeling choice is not enforced, mini pulse generator parameters were located that support WT and all four mutant cycling phenotypes, as well as (in some cases) SAC arrest. Relaxing phenotype-permissibility in the mutant cycling phenotype is akin to acknowledging that Clb2 expression in the datasets may not be sufficiently close to constant for our model assumptions to hold. If we allow for this possibility, then there exists a highly narrowed selection of mini pulse generator parameters that can recapitulate the dynamics seen in multiple datasets. One interesting and open problem is whether this selection of parameters is most likely to contain biologically reasonable parameterizations for the mini pulse generator, since only Clb2 parameter modifications are necessary to induce phenotypic changes. This is an area of future research.

We observed that when using Nrm1 proxy sets, the DRC phenotype was not supported at any DSGRN parameter in the network model. This indicates that the model lacks important regulatory elements that are necessary for the DRC when considering Nrm1 as a proxy in the network. We hypothesize that distinguishing between SBF and MBF in the mini wavepool may help rectify this issue. Current models like the network seen in [51] define MBF and SBF as the same node, yet results from [7, 49] indicate a mechanism for DRC arrest dependent solely on MBF activity. Given replication stress, the protein Rad53p inactivates the MBF co-repressor Nrm1. This activation of MBF induces up-regulation of G1/S genes within S-phase. As a result of this MBF pathway being activated, the DRC is initiated. We suggest that enlarging the mini wavepool to include MBF and Rad53p may permit consistency with the DRC phenotype.

There is another network enhancement that our work suggests. We showed that SAC arrest is supported in the network model by only modifying the parameterization of Clb2, suggesting the presence of a regulatory element controlled by the SAC mechanism that impinges solely on Clb2. Research has shown that one of the physiological goals of the SAC is to inhibit APC/Cdc20 activity which in turn inhibits Clb2 degradation [27]. In [40] it was seen that the rapid degradation of Cdc20 is necessary for SAC arrest. Using ordinary nonlinear differential equations and spatial simulations, the authors of [6] were able to identify a SAC mechanism that acts to fully sequester Cdc20 and inhibit APC activity. It was also suggested within [40] that there is potential for indirect regulation of Cdc20 through substrates of APC, such as Clb2 and Clb5. By expanding the mini wavepool to include feedback with APC/Cdc20, we can test the hypothesis that APC and Cdc20 interaction can initiate the Clb2 parameter change required to transition from WT cycling to SAC.

In its original implementation, DSGRN does not differentiate between transcriptional regulation vs protein modifications such as phosphorylation or ubiquitination, although these generalizations are newly available [2], as well as modeling transport across the nuclear membrane [18]. In this manuscript the original DSGRN package was used, and as a consequence transcriptional interactions and post-transcriptional modifications were not distinguished in the modeling of the mini wavepool in Fig 1 (Right). This means that we used mRNA expression levels of Clb2 as a substitute for protein levels and activity. Protein and RNA levels are certainly correlated at least under some conditions [28]. An obvious next step is to expand the mini wavepool to model post-transcriptional modifications, as well as to incorporate other regulators. We are currently in the midst of an effort to explore the endocycling phenomenon in a larger network using the new methodology from [2].

5 Methods

In this section, we discuss the basic properties of DSGRN [3, 5, 46, 33] that are used when interpreting WT and mutant transcriptional phenotypes as DSGRN dynamical phenotypes. We discuss in detail how the dynamical phenotypes are constructed and interpreted with respect to the data.

DSGRN comprehensively computes coarse features of the dynamical behavior of a genetic regulatory network over a combinatorial representation of parameter space that is finite [5]. These coarse features include oscillatory behavior with stereotyped orders of maximum and minimum concentrations of gene products, and the number and type of equilibria. DSGRN uses techniques from ordinary differential equations and graph theory to compute these behaviors.

Four kinds of graphs provide a framework for understanding DSGRN. A gene regulatory network (GRN) is the input to DSGRN and an undirected parameter graph (PG) is the basic structure of DSGRN. The output of DSGRN is a collection of directed state transition graphs (STGs) and their corresponding directed Morse graphs (MGs), one for each node in the parameter graph. As defined earlier a GRN is a system of interacting gene products that inhibit or activate one another. The GRN modeled in this paper is in Fig 1 (Right), which is built as a directed graph. As a running example for explaining DSGRN computations in this section we will be talking about the simpler networks in Fig 8. On the left, node $X_{1}$ inhibits node $X_{2}$ and $X_{2}$ inhibits $X_{1}$ , otherwise known as the “toggle switch” [46]. On the right, we see a three-node network that exhibits more complex dynamical behavior than the toggle switch, but remains amenable to manual construction of the various DSGRN graphs.

5.1 Switching systems and DSGRN parameters

DSGRN is, at its heart, a rigorous connection between Boolean modeling and ordinary differential equation (ODE) modeling. It can equivalently be viewed as an asynchronous multi-level Boolean modeling approach [13] or as a dynamical systems approach [5]. In this exposition, we will introduce DSGRN using the formalism of dynamical systems.

A GRN can be modeled using a system of discontinuous ODEs called a switching system [48, 50, 15, 17, 26], where each node in the network is modeled by an ODE of the form:

[TABLE]

where $x=(x_{1},x_{2},\dots,x_{n})$ . For every node in the network, $x_{i}$ is the concentration of species $i$ , $\gamma_{i}$ is the decay rate of species $i$ , and $\lambda_{i}(x)$ is a product of sums of step functions, one for each regulating edge $j$ on node $i$ . The step functions $\sigma^{\pm}_{i,j}(x_{i})$ are given by:

[TABLE]

representing activation and inhibition respectively, where $0<l_{i,j}<h_{i,j}$ are low and high constant values. For an activating step function $\sigma_{i,j}^{+}(x_{j})$ , an increasing concentration of $x_{j}$ causes an increasing rate of production of $x_{i}$ as $x_{j}$ crosses the threshold value $\theta_{i,j}$ . Similarly, a repressing step function $\sigma_{i,j}^{-}(x_{j})$ indicates a decreasing rate of production of $x_{i}$ as $x_{j}$ increases across the threshold $\theta_{i,j}$ . In the case of the toggle switch we only see a single inhibiting edge per node so both nodes are defined by a single inhibiting step function, i.e. $\lambda_{1}=\sigma^{-}(x_{2})$ and $\lambda_{2}=\sigma^{-}(x_{1})$ .

Suppose that node $j$ regulates both $i$ and $k$ . Then we require that $\theta_{k,j}\neq\theta_{i,j}$ so that the effect of $x_{j}$ on each of its downstream regulatory targets is distinct and therefore totally ordered. For example, the thresholds on the two out-edges from $Y$ in Fig 8 (Right) are required to be different; however there are no requirements on the relationship between any other pair of thresholds.

A DSGRN parameter is a collection of inequalities governing the relationship of the low, high, and threshold values for each node within the network. Each DSGRN parameter consists of two parts for each node in the network: a logic parameter and a order parameter [5]. A key observation is that the logic and order parameters for a node are independent of all other nodes in the network, and therefore may be chosen independently. An order parameter defines the order of the threshold values for a node. For example, the $Y$ node in Fig 8 (Right) has two threshold values due to its two out-edges, $\theta_{X,Y}$ and $\theta_{Z,Y}$ . There are then two order parameters for Y: $\theta_{X,Y}<\theta_{Z,Y}$ and $\theta_{X,Y}>\theta_{Z,Y}$ . All other nodes in our examples have a single out-edge and are trivially ordered.

The number of thresholds associated to each node determines the discretization of the corresponding gene product’s expression level. Considering the node $Y$ above, the two thresholds mean that $Y$ has three discrete expression levels, 0, 1, and 2 (low, intermediate, and high), where the integer indicates how many thresholds the value of $Y$ exceeds; i.e. $Y<\theta_{X,Y},\theta_{Z,Y}$ corresponds to 0, and so forth. The consequence is that nodes in a GRN can have, and generally do have, different discretization levels. In the mini wavepool, Swi4 and Clb2 each have 3 out-edges and therefore 4 states (low, intermediate low, intermediate high, and high); Ndd1 has 2 out-edges and therefore 3 states (low, intermediate, and high); and Nrm1/Yox1 and Swi5/Ace2 each have 1 out-edge and therefore the Boolean states 0 and 1 (low and high).

The logic parameter for each node within the network orders the input values to the node with respect to the output thresholds of the node. For example, the $X$ node in Fig 8 (Right) has two in-edges, one from X and one from Y. Therefore, it has four possible input values:

[TABLE]

that are partially ordered due to the constraint that $0<l<h$ . In particular, notice that the input values $l_{X,Y}h_{X,Z}$ and $h_{X,Y}l_{X,Z}$ cannot have a determined order until real values are assigned to $l$ and $h$ . The node $X$ has a single out-edge to node $Y$ , giving it one threshold $\theta_{Y,X}$ . A logic parameter is the insertion of this threshold into the partial order of inputs. For $X$ , there are six possible logic parameters, since there are six possible ways to insert the threshold into the partial order:

[TABLE]

A DSGRN parameter is a collection of inequalities: a choice of one order parameter and one logic parameter for each node in the network. We will call the inequalities for node $i$ a factor parameter for $i$ . We remark that the number of DSGRN parameters scales poorly with the number of edges in a network. As the number of in-edges to a node grows, the size of the partial order grows exponentially. As the number of out-edges from a node grows, the number of order parameters grows factorially. Additionally, the complexity of inserting multiple thresholds into the partial order causes a large increase in the number of logic parameters.

5.2 Parameter graph and remainder parameter

An important element of this work is the transition between WT and mutant dynamical phenotypes in parameter space. The following two sections explain the details of this transition.

The collection of all possible factor parameters for $i$ can be represented as an undirected graph, called the factor graph for node $i$ . These factor graphs have nodes representing each factor parameter and the edges between these nodes represent a single change in an inequality.

Each factor graph for a node $i$ can be written as the product of $M$ logic graphs, where the nodes of a logic graph are the collection of all logic parameters for $i$ and an edge exists between two nodes when there is a single change in a logic parameter inequality (for a fixed order parameter). The number of logic graphs $M$ is the number of order parameters for node $i$ . Connections between logic graphs exist whenever there is a single change in the order parameter while the logic parameter remains the same, provided that the two swapped thresholds in the order parameter have no intervening logic value. This is most easily seen by an example.

Let us now construct the factor graph for $Y$ from the three-node network in Fig 8 (Right). Node $Y$ has a single in-edge and two out-edges, meaning that two thresholds are inserted between the low and high production rates of $Y$ : $0<l_{Y,X}<h_{Y,X}$ . This results in six logic parameters for $Y$ :

[TABLE]

where $\theta_{1}<\theta_{2}$ is some ordering of the two thresholds of $Y$ , $\theta_{X,Y}$ and $\theta_{Z,Y}$ . These six inequalities are the nodes of the logic graph of $Y$ . Due to the two order parameters of $Y$ , there are two copies of this logic graph, one for each ordering, as shown in Fig 9. The numbering of the logic parameters above is associated to the logic graph left of the dashed line, which is associated to order parameter $\theta_{X,Y}<\theta_{Z,Y}$ . The isomorphic logic graph for order parameter $\theta_{Z,Y}<\theta_{X,Y}$ is on the right. Edges only exist between the two logic graphs when there is a single inequality change in the order parameter between thresholds that are adjacent in the logic parameter. An example of this type of edge can be seen in red in Fig 9.

For a GRN $N$ with nodes $i=1,\dots,n$ , the product of the factor graphs $FG(i)$ is the DSGRN parameter graph $PG(N)$ :

[TABLE]

The parameter graph contains all possible DSGRN parameters as nodes and encodes adjacency between real-valued parameter regions as edges [5].

As an example, the DSGRN parameter graph for the three-node network in Fig 8 (Right) is shown in Fig 10. Each choice of a node from the $X$ factor graph, a node from the $Y$ factor graph, and a node from the $Z$ factor graph is a DSGRN parameter node. Two such DSGRN parameter nodes are shown in red in the top and bottom panels respectively in Fig 10. By examining the factor graphs, we see that the DSGRN parameter graph for the three-node network has a size of $6\times 12\times 3=216$ .

Suppose a single factor parameter for a node $x_{i}$ is fixed, for example, representing a mutant phenotype such as a knock-out. The combination of the factor parameters for all of the remaining nodes form a remainder parameter. We demonstrate the idea of a remainder parameter using Fig 10 by allowing the parameter node of $Z$ to vary while those of $X$ and $Y$ remain fixed. In this case we have the remainder parameter composed of the $X$ and $Y$ factor parameters marked in red as $Z$ varies from its highest parameter to its lowest. The high and low $Z$ factor parameters could represent an up-regulation or knock-out of a gene in a biological scenario. In the first case, $Z$ is continuously expressed at its highest level while in the second, it is expressed at its lowest level. The lowest expression level may as well be taken to be zero, as in a knock-out, since $Z$ is never expressed highly enough to regulate its downstream target.

5.3 Application to the mini wavepool

Recall from Fig 1 (Right) that Clb2 has one in-edge and three out-edges. This means it has $3!=6$ order parameters and ten logic parameters (see Fig 11). Four of the ten logic parameters are taken to be representative of various Clb2 mutants (see the caption of Fig 11). We propose that the WT phenotype is associated to one of the logic parameters in black. Notice that each of the logic parameters that denote the state of a Clb2 mutant (inequalities in color) have both the low and high values $l_{Clb2,SFF}$ and $h_{Clb2,SFF}$ together between thresholds. In other words, the model of Clb2 activity implies that even if the molecular concentration isn’t perfectly constant, there will never be a sufficiently large change in concentration to trigger a change in regulation at downstream genes. In contrast, the logic parameters for Clb2 WT ensures that changes in Clb2 concentration will impact at least one downstream target. We do not require regulatory activity at all downstream nodes, because the only information provided by the data is that Clb2 has noticeable oscillations. The DSGRN parameters at which we see consistency with the experimental data can give us guidance in assessing which Clb2 interactions may be important in the WT phenotype.

Dynamical phenotypes II and III involve transitions in the Clb2 logic graph from WT to mutant parameters that exhibit consistency with the corresponding mutant time series when the remainder parameter (referred to as the mini pulse generator parameter) is constant. As an example, consider the Clb2 ON transcriptional phenotype. A match according to dynamical phenotype II is a single mini pulse generator parameter that exhibits (1) mutant cycling consistent with the cdc20 $\Delta$ dataset at the phenotype-permissible green node at the top of the Clb2 logic graph as well as (2) WT cycling at any one of the six black nodes in the Clb2 logic graph.

5.4 State transition graphs and Morse graphs

Each DSGRN parameter has a corresponding state transition graph (STG) that graphically represents the dynamics of the network at that DSGRN parameter. For ODE systems, the dynamics of a network are described by time-dependent trajectories in phase space in which all gene products associated to the network are changing concentration. Phase space is the $N$ -dimensional real-valued and positive space where each coordinate represents a node in the GRN. DSGRN does not concern itself with these trajectories but instead looks at directional flow across thresholds and considers paths through an STG.

An example phase space for the toggle switch is shown in Fig 12. We discretize phase space by dividing it up into rectangular boxes called domains using the collection of thresholds for the switching system; these are the dotted lines in Fig 12, showing the division of the positive plane into four domains. Each domain corresponds to a level of $X_{1}$ and $X_{2}$ where $1$ indicates above-threshold values and [math] indicates below-threshold values. As an example the domain labeled (01) corresponds to high $X_{2}$ and low $X_{1}$ . We say that $(01)$ is the state corresponding to the upper left domain.

We can define flow across the boundaries of the domains by specifying a DSGRN parameter. In Fig 12, we use

[TABLE]

to determine the arrows. For example, the top arrow pointing left and the right arrow pointing down are determined by considering the starting domain, where $X_{1}$ and $X_{2}$ have high concentrations. In this case, the concentrations of $X_{1}$ and $X_{2}$ are both decreasing since the nodes are effective in repressing each other. However, note that $X_{2}$ is always decreasing. This happens because of the DSGRN parameter for $X_{2}$ . Regardless of the presence or absence of $X_{1}$ , $X_{2}$ will never increase across its threshold $\theta_{1,2}$ since its high concentration is still below $\theta_{1,2}$ . There is also a DSGRN parameter where $X_{2}$ (or symmetrically $X_{1}$ ) is always increasing, indicating that one or both nodes are ineffective repressors.

The STG for the toggle switch is superposed on phase space in Fig 12. The nodes of the STG correspond to the states of the domains and the directed edges between states generally correspond to the flow across the intervening boundary. The one exception is that self-edges are added whenever it is not possible to leave that domain because all of the flow points inward; see the lower right domain in Fig 12.

The size of the STG grows with the size of the GRN, since the addition of a node to the GRN adds another dimension to phase space and the addition of an edge, or threshold value, adds another set of domains to an existing dimension of phase space. Both of these mechanisms increase the number of nodes in the STG, which rapidly becomes large and difficult to interpret. It is therefore useful for clarity to examine a summary of the STG called a Morse graph (MG). We build the MG from the recurrent components of the STG. A recurrent component, or Morse set, is a maximal set of nodes in the STG that contain a path from any domain $u$ to any other domain $v$ for all $u,v$ within the recurrent component. Each of these Morse sets are represented in the MG as Morse nodes, where an edge between Morse nodes indicates that there is a path between some domain $u$ in Morse set 1 to some domain $v$ in Morse set 2.

Each Morse node in the MG has an annotation indicating the dynamics in the associated Morse set. The annotations consist of the labels full cycle (FC), partial cycle (PC), and fixed point (FP). The FC annotation indicates a Morse set in which there is a looped path, or cycle, in the STG that crosses at least one threshold value for each $x_{i}$ . The PC annotation indicates a Morse set in which the looped path crosses thresholds for only a subset of the $x_{i}$ . The FP annotation indicates a Morse set consisting of a single state with a self-edge. The Morse graph for the toggle switch is the FP with domain coordinates (10), denoted FP(10); see Fig 12. The SAC and DRC FPs for the mini wavepool network that are shown qualitatively in Table 3 are represented by the domain coordinates shown in Table 5.

The Morse graph also encodes the stability of a Morse set. Stability in the sense of dynamical systems roughly means that trajectories close to a stable manifold in phase space will approach that manifold asymptotically over time. Stability of a Morse node in the Morse graph, whether a fixed point or a cycle, is identified with having no possible exit from a Morse set once a path enters the Morse set; i.e., a Morse node is stable if and only if it has no out-edges in the MG. Morse nodes with out-edges are unstable, since there are potential exit paths from the associated Morse set.

In the mini wavepool model, wild-type cycling is evaluated within stable FCs, mutant cycling is evaluated in stable PCs where Clb2 is in a fixed state and the mini pulse generator nodes are oscillating, and checkpoints are evaluated as FPs.

As an example, consider again the single recurrent component in Fig 12, labeled FP(10). By inspection of the STG we can see that the domain (10) has no out edges to other domains therefore it is a stable FP. One can see that the key feature of the STG, namely the single equilibrium, is more readily identifiable from the MG rather than the STG. In general, the simplicity of the MG makes it much easier to interpret the dynamical behavior of a GRN at a particular DSGRN parameter.

In order to understand the implications of the different dynamic phenomena described above, we will analyze by hand the example 3-node network seen on the right in Fig 8, which exhibits all possible annotations. We will be looking at three different DSGRN parameters that give rise to three different Morse graphs. Since this network has three nodes, phase space will have three dimensions where the states are represented in the order ( $XYZ$ ); for example, (010), (110), etc. The $X$ and $Z$ dimensions are divided into two domains each since they have only one out-edge apiece. The $Y$ dimension contains three domains since there are two out-edges. This means that instead of two states, 0 and 1, $Y$ has three states, 0, 1, and 2, representing the three possible positions with respect to two distinct thresholds.

The first example DSGRN parameter we will investigate gives rise to an MG containing a stable FC. The corresponding STG and inequalities can be seen in Fig 13. The nodes have been arranged in the same spatial order as the rectangular domains in phase space. The red arrows and nodes indicate the path that contains the FC. It can be seen through inspection of the STG that starting anywhere within the STG leads to this stable cycle. This cycle is considered an FC because the set of domains that are passed through encompass all three directions in phase space.

The second example DSGRN parameter contains both a stable and an unstable PC in the variables $X$ and $Y$ . The STG and DSGRN parameter inequalities are shown in Fig 14. Again the stable cycle is represented by the red arrows and nodes while the unstable cycle is represented in the blue arrows and nodes. Through inspection of the STG it can be seen why the red cycle is stable and the blue cycle is unstable. At any node within the blue cycle, one could keep traveling on the blue cycle or drop down to the red cycle. Once moving in the red cycle one can not make it back to the blue cycle which dictates whether a cycle is stable or unstable. Unlike the FC parameter we only see oscillations in a subset of gene product concentrations in the GRN, seen because the cycle only moves through two of the three dimensions. In particular, the stable cycle does not exist outside of the domain where Z is 0.

The last DSGRN parameter we will look at in this network exhibits an FP. The corresponding STG and inequalities can be seen in Fig 15. The FP for this DSGRN parameter can be seen as the single red node with a self-edge within the STG, FP(100). Similar to the last example there exists an unstable PC(X,Y) that is represented in the blue nodes and arrows. Through inspection it can be seen that the (100) domain has no out-edges and therefore must be an FP. Also notice that once the unstable cycle is left one can never return to that cycle and will eventually end up at the FP(100).

5.5 Time Series Discretization

To evaluate DSGRN network model consistency with a dataset exhibiting oscillations, we need to extract the sequence of maxima and minima (together called extrema) from a time series dataset. We briefly discuss the methodology from [16], in which for each extremum, a time interval is assigned representing experimental uncertainty in the timing of extrema. The idea is that at a specified noise level, an extremum can occur anywhere within the assigned time interval, perhaps due to sparse temporal sampling and/or measurement error. Since the time intervals assigned to different extrema may overlap, the ordering of some extrema with respect to others is indeterminate. However, within any single time series the order of extrema is known.

Fig 16 schematically shows the process for assigning a time interval for an extremum. The example shown is the gene expression level for Ndd1 in the WT dataset. The blue line is the collected data (with linear interpolation) and the orange and green lines are the original curve $\pm$ 10% of the difference between the global maximum and global minimum. This is called a 10% noise level of the Ndd1 data. The purpose of choosing a noise level is to smooth out small, spurious extrema and also account for imprecision in the timing of large extrema.

The example computation in Fig 16 assigns a time interval to the second minimum in the time series which occurs at 158 minutes, called an extremal interval, by locating a region of the graph around the minimum that is below the $+$ 10% curve and above the $-$ 10% curve. It was proven in [4] that any perturbation of the original data that is bounded within $\pm\epsilon$ curves for some noise level $\epsilon$ is guaranteed to have a minimum located within the extremal interval. That is, the existence of the minimum is robust within the extremal interval. The extremal interval for a maximum is recovered in a similar manner.

This procedure is repeated for every extremum in every time series, so that the end result is a collection of time intervals. The intervals across time series are compared to see if they are overlapping. When two intervals overlap, it is possible for the extrema associated to the intervals to occur in either order in time. In other words, the ordering of the extrema is indeterminate and the extrema are called incomparable. However, when intervals are non-overlapping, then the order between the two extrema is known, and they are called comparable.

A pattern diagram is a discrete, graphical representation of all the comparability relationships between extremal intervals. This is known mathematically as the Hasse diagram of a partially ordered set. In the most straightforward case, all extrema of a dataset are comparable and form a linear sequence of events in time. This is true for any single time series. However, for more than one time series this never happens, since the start of every time series co-occurs and the concentrations for each node are either at a maximum or a minimum.

Fig 17 shows the pattern diagram for the WT data (Fig 2b) for the nodes in the mini wavepool. The time intervals for each extremum were computed at 10% noise, as were all time series discretizations in this manuscript. Each node in the pattern diagram is a type of extremum, with nodes at the top occurring at the beginning of the time series. A directed arrow between a source node and target node means that the source extremum is known to occur earlier in time than the target extremum. If there is no path between two nodes, then the associated extrema are incomparable.

5.6 Pattern Matching

Pattern diagrams are the basis for testing DSGRN network model consistency with oscillating time series data. While the technical details differ [4], a straightforward demonstration of pattern matching can be done by labeling the edges of a DSGRN state transition graph with the extrema labels that occur in the nodes of the pattern diagram. We begin our example of pattern matching by discussing how to use the pattern diagram, and then proceed to the labeling of the STG.

The key idea is that of a linear extension of the pattern diagram. A linear extension is any sequence of all the nodes in a pattern diagram that does not contradict any arrows in the pattern diagram. For example, in Fig 17, the top two nodes Clb2 min and Swi5 max may occur in either order in a linear extension because they are incomparable. However, the arrow from Clb2 min to Clb2 max enforces that in any linear extension, Clb2 min occurs before Clb2 max. We claim that any of the linear extensions arising from a pattern diagram might be the “true” sequence of extrema in the biological system given the measurement error and sampling density of the data.

Continuing our example of the three-node network in Fig 8,we have constructed two example pattern diagrams that could have arisen from two different datasets of nodes $X$ and $Y$ (top row of Fig 18), with the idea of checking if a pattern match exists between either of these datasets and the unstable partial cycle in Fig 14 (blue nodes). Via inspection of the two example pattern diagrams it can be seen which extrema are comparable and incomparable. In the first pattern diagram on the left, the node $X_{min}$ is comparable with any other node in the pattern diagram, but $X_{max}$ and $Y_{max}$ are not comparable with each other. For the pattern diagram on the right, we see a shift in the ordering of the extrema where $X_{max}$ and $Y_{min}$ are incomparable. Each pattern diagram has two linear extensions (bottom row of Fig 18). To construct the linear extensions for the left pattern diagram a decision is made that defines the ordering of the two incomparable extrema $X_{max}$ and $Y_{max}$ . Similarly for the poset on the right, the decision is between the order of $Y_{min}$ and $X_{max}$ .

The nodes of an STG can be labeled with information about whether a node concentration is increasing ( $I$ ), decreasing ( $D$ ), or both ( $*$ ) within the corresponding domain in phase space (for details, see [4]). From this information, edge labels in the STG can be deduced that describe which extremum could occur between any two neighboring domains. To continue our example, Fig 19 contains the unstable PC from Fig 14 along with the $I$ , $D$ , and $*$ labels on each node in the order $(XYZ)$ . Consider the edge going from the node labeled $(DDD)$ to the node labeled $(*ID)$ . In the first domain, $Y$ is decreasing and in the second domain, it is increasing. Therefore $Y$ had to undergo a minimum on the edge between the two. Continuing to the node labeled $(I\!*\!D)$ , we note that we are moving along the $Y$ axis. $Y$ does not regulate itself, and therefore we know that starting from node $(*ID)$ means $Y$ must continue to increase, although it may decrease starting from another initial condition. Therefore, no extremum may occur in $Y$ on the arrow from $(*ID)$ to $(I\!*\!D)$ . However, on the next edge to $(*DD)$ , $Y$ transitions from increasing to decreasing, therefore must have achieved a maximum. Similar arguments allow the assignment of the other edge labels.

To continue with pattern matching, we compare the sequence of extrema in the STG to the linear extensions in Fig 18. It can be seen that within the STG $X_{min}$ is followed by $Y_{max}$ then $X_{max}$ and finally $Y_{min}$ . This is exactly the left-most linear extension in Fig 18, high-lighted in blue. On the other hand, looking at the rest of the linear extensions in Fig 18, it can be seen that none of them match the path through the STG. We conclude that the three-node network model in Fig 8 is consistent with the hypothetical dataset that generated the pattern diagram on the left, but is not consistent with the pattern diagram on the right.

This pattern matching procedure is conducted analogously for the WT and mutant datasets against the mini wavepool model in Section 3.

5.7 Hill models

To model an activating regulation $x\to y$ , we used a Hill function of the form:

[TABLE]

where $l$ , $h$ , and $\theta$ are the switching system parameters introduced in Section 5.1. Similarly, a repressing regulation $x\dashv y$ is given by:

[TABLE]

We created a Hill model of the mini wavepool network in Fig 1 of the following form:

[TABLE]

where $S$ (Swi4), $N$ (Nrm1), $D$ (Ndd1), $W$ (Swi5), and $C$ (Clb2) represent expression levels of the corresponding mRNA. The subscripts on the Hill functions have been suppressed for clarity.

The parameters for the simulations in Figs 3, 5, and 6 are given as supplementary data files. The Hill exponent $n$ was taken to be 10 in all simulations.

Data Availability

The figures and results within this paper were generated using the Github repository at https://github.com/julianfox8/2022-DSGRN-Phenotypes-Yeast.git. This repository includes the jointly normalized data used in this manuscript and the Jupyter notebooks and code used to jointly normalize the datasets, but does not include the original data. That is available by request from the Haase lab at Duke University.

Acknowledgements

JF was partially supported by a grant from the Undergraduate Scholars Program at Montana State University. BC and SBD were partially supported by NSF TRIPODS+X grant DMS-1839299 and NIH 5R01GM126555-01. RCM was partially supported by NIH 5R01GM126555-01. MG was partially supported by the National Science Foundation under awards DMS-1839294 and HDR TRIPODS award CCF-1934924, National Institutes of Health award R01 GM126555, FAPESP grant 2019/06249-7, and by CNPq grant 309073/2019-7.

Bibliography54

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] Orlando D. A., Lin C. Y., Bernard A., Wang J. Y., Socolar J. E., Iversen E. S., Hartemink A. J., and Haase S. B. Global control of cell-cycle transcription by coupled CDK and network oscillators. Nature , 453(7197):944–947, Jun 2008.
2[2] Cummins B., Gameiro M., Gedeon T., Kepley S., Mischaikow K., and Zhang L. Extending combinatorial regulatory network modeling to include activity control and decay modulation. SIAM Journal on Applied Dynamical Systems , To appear, 2022.
3[3] Cummins B., Gedeon T., Harker S., and Mischaikow K. DSGRN: Examining the Dynamics of Families of Logical Models. Front Physiol , 9:549, 2018.
4[4] Cummins B., Gedeon T., Harker S., and Mischaikow K. Model Rejection and Parameter Reduction via Time Series. SIAM J Appl Dyn Syst , 17(2):1589–1616, 2018.
5[5] Cummins B., Gedeon T., Harker S., Mischaikow K., and Mok K. Combinatorial representation of parameter space for switching networks. SIAM J Appl Dyn Syst , 15(4):2176–2212, 2016.
6[6] Ibrahim B. Spindle assembly checkpoint is sufficient for complete cdc 20 sequestering in mitotic control. Computational and Structural Biotechnology Journal , 13:320––328, 2015.
7[7] Francisco M Bastos de Oliveira, Michael R Harris, Pijus Brazauskas, Robertus A M de Bruin, and Marcus B Smolka. Linking dna replication checkpoint to mbf cell-cycle transcription reveals a distinct class of g 1/s genes. The EMBO Journal , 31(7):1798–1810, 2012.
8[8] S. L. Bristow, A. R. Leman, L. A. Simmons Kovacs, A. Deckard, J. Harer, and S. B. Haase. Checkpoints couple transcription network oscillator dynamics to cell-cycle progression. Genome Biol , 15(9):446, 2014.