Multiplicatively closed Markov models must form Lie algebras
Jeremy G Sumner

TL;DR
This paper establishes that continuous-time Markov models are multiplicatively closed if and only if their rate matrices form a Lie algebra, overcoming key obstacles in applying the Baker-Campbell-Hausdorff formula.
Contribution
It proves a necessary and sufficient condition linking multiplicative closure of Markov models to Lie algebra structures of rate matrices, addressing a significant mathematical challenge.
Findings
Markov models form a Lie algebra if multiplicatively closed
Overcomes obstacles in applying Baker-Campbell-Hausdorff formula
Provides a fundamental characterization of Markov model structure
Abstract
We prove that the probability substitution matrices obtained from a continuous-time Markov chain form a multiplicatively closed set if and only if the rate matrices associated to the chain form a linear space spanning a Lie algebra. The key original contribution we make is to overcome an obstruction, due to the presence of inequalities that are unavoidable in the probabilistic application, that prevents free manipulation of terms in the Baker-Campbell-Hausdorff formula.
University of Tasmania, Australia
Keywords: Lie algebras; continuous-time Markov chains; semigroups; phylogenetics
AMS classification codes: 17B45; 60J27
1 Background
In this note we prove a result which makes explicit the requirement that a multiplicatively-closed Markov model must form a Lie algebra (definitions will be provided). We consider continuous-time Markov chains and work under the general assumption that a model is determined by specifying a subset of rate matrices (or rate generators). These models are used in a wide array of scientific modelling problems and have been previously studied in the context of Lie group theory by [8, 9]. Although the results given here are general, we are motivated primarily by applications to phylogenetics.
Phylogenetics consists of the mathematical and statistical methods applied to reconstructing evolutionary history from observed molecular sequences such as DNA [3]. Recent theoretical work [4, 12] has discussed the relevance of Lie groups and algebras to this applied area. The importance of Lie algebras to robust phylogenetic modelling has been demonstrated using simulation in [13], as well as on a diverse set of biological data sets in [15]. The class of Markov models that form Lie algebras is discussed in the recent textbook on mathematical phylogenetics [11], and this approach also has important applications outside of phylogenetic modelling [7]. However, previous work on this topic has not established the necessity of a Lie algebra in the general setting. In Theorem 1, we establish that the Lie algebra property is a consequence of model assumptions which we claim are natural, easily understandable, and well justified in the applied setting.
Our results fit into the general theory of Lie semigroups and Lie semialgebras as developed by Hilgert and Hofmann [6]. However, the approach we follow here gives the most direct path explicitly tailored to the practical setting of Markov chains and does so with minimal abstract theory.
2 Main result
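Fixing notation, we denote by $\mathcal{L}_{GM}$ the set of real-valued $n \times n$ matrices with zero column sums, and by $\mathcal{L}_{GM}^+ \subset \mathcal{L}_{GM}$ the subset of matrices with non-negative off-diagonal entries. We then have the interpretation that $Q \in \mathcal{L}_{GM}$ corresponds to a valid rate matrix for a continuous-time Markov chain if and only if $Q \in \mathcal{L}_{GM}^+$. To distinguish them from general members of $\mathcal{L}_{GM}$, we refer to the members of $\mathcal{L}_{GM}^+$ as stochastic.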
The reader who prefers to use row sums for matrices associated with a Markov chain may simply modify the definitions above appropriately and read what follows without variation.
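As a minimal sketch of the two membership conditions just described (zero column sums in general, plus non-negative off-diagonal entries in the stochastic case), the following check may be helpful. The function names, the tolerance `tol`, and the two sample matrices are illustrative assumptions rather than anything taken from the paper.

```python
def column_sums(Q):
    """Return the list of column sums of a square matrix given as nested lists."""
    n = len(Q)
    return [sum(Q[i][j] for i in range(n)) for j in range(n)]


def has_zero_column_sums(Q, tol=1e-12):
    """True if every column of Q sums to zero (up to the tolerance)."""
    return all(abs(s) <= tol for s in column_sums(Q))


def is_stochastic_rate_matrix(Q, tol=1e-12):
    """True if Q has zero column sums and non-negative off-diagonal entries."""
    n = len(Q)
    off_diagonal_ok = all(
        Q[i][j] >= -tol for i in range(n) for j in range(n) if i != j
    )
    return has_zero_column_sums(Q, tol) and off_diagonal_ok


# A valid rate matrix: columns sum to zero, off-diagonal entries are rates.
Q_valid = [[-0.3, 0.1],
           [0.3, -0.1]]

# Zero column sums, but negative off-diagonal entries: not stochastic.
Q_signed = [[0.2, -0.5],
            [-0.2, 0.5]]

print(is_stochastic_rate_matrix(Q_valid), is_stochastic_rate_matrix(Q_signed))
```

The second matrix shows why the distinction matters later on: matrices of this kind can arise as logarithms of products of substitution matrices even though they are not rate matrices themselves.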
We assume that a given model is then specified as a subset $\mathcal{R}^+ \subseteq \mathcal{L}_{GM}^+$ of stochastic rate matrices, defined either using a parameterization or by giving some (polynomial) constraints on the matrix entries. In phylogenetics, the former situation is the norm and it is standard to use methods such as maximum likelihood to provide estimates of these model parameters. However, the former specification can usually be reinterpreted using the latter, which also plays a role in some formulations (such as the ‘group-based’ [10, Chap. 8] and ‘equivariant’ [2] model classes). An example of a popular phylogenetic model will be given in the next section. This motivates:
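Property 0.
A model $\mathcal{R}^+$ is expressible as an intersection $\mathcal{R}^+ = \mathcal{R} \cap \mathcal{L}_{GM}^+$, where $\mathcal{R} \subseteq \mathcal{L}_{GM}$ is determined by a finite set of polynomial constraints on the matrix entries of members of $\mathcal{L}_{GM}$. That is, there exist polynomials $f_1, \ldots, f_r$ on the variables $Q = (q_{ij})$ such that $\mathcal{R} = \{Q \in \mathcal{L}_{GM} : f_1(Q) = \cdots = f_r(Q) = 0\}$. Additionally, we demand that $\mathcal{R}$ is minimal in the sense that there is no similarly constructed set $\mathcal{R}' \subsetneq \mathcal{R}$ such that $\mathcal{R}^+ = \mathcal{R}' \cap \mathcal{L}_{GM}^+$ also.
Although Property 0 does not imply that $\mathcal{R}$ is necessarily unique, the minimality condition ensures that $\mathcal{R}$ does not contain any members superfluous to the determination of $\mathcal{R}^+$. A simple example gives a clear motivating precedent for this condition, as follows.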
Consider $\mathbb{R}^2$ and the line $y = x$ restricted to the positive orthant:
$$\{(x, y) \in \mathbb{R}^2 : y = x,\ x \geq 0,\ y \geq 0\}.$$
Clearly, we most simply obtain this set by taking the intersection of the positive orthant with the line $y = x$ (defined as the subset of $\mathbb{R}^2$ satisfying the polynomial constraint $x - y = 0$). However, we may also obtain this set by taking the intersection of the positive orthant with the pair of lines defined by the quadratic constraint $x^2 - y^2 = 0$. In this case, analogous application of Property 0 would ensure that we choose the former possibility.
Following general Markov chain theory in the time-homogeneous setting, given some amount of elapsed time $t \geq 0$, the probability substitution matrix associated with a rate matrix $Q \in \mathcal{R}^+$ is computed as the matrix exponential $e^{Qt}$ (using the power series $e^{Qt} = \sum_{k \geq 0} \frac{t^k}{k!} Q^k$). Since $t$ may take on any non-negative value, it is sensible to consider:
Property 1.
A model $\mathcal{R}^+$ is closed under non-negative scalar multiplication. That is, for all $Q \in \mathcal{R}^+$ and $\lambda \geq 0$, it follows that $\lambda Q \in \mathcal{R}^+$ also.
If Property 0 is assumed, Property 1 follows if and only if the polynomial constraints defining $\mathcal{R}$ are homogeneous. Up to conventions of exactly how models are parameterized (possibly obscured by conventions of overall scaling, and ‘normalisation’), as far as we are aware all phylogenetic models proposed in the literature have Property 1. When Property 1 is assumed we may simplify notation by writing $e^{Q}$ in place of $e^{Qt}$.
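To make the passage from rate matrices to substitution matrices concrete, here is a minimal sketch that evaluates the matrix exponential by a truncated power series. The helper names, the truncation length `terms`, and the two-state example are assumptions made for illustration. The closed form used as a sanity check holds for this particular two-state chain only.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(Q, t=1.0, terms=30):
    """Truncated power series for exp(Q t): sum of (Q t)^k / k! for k < terms."""
    n = len(Q)
    Qt = [[Q[i][j] * t for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term holds (Q t)^k / k! after this update.
        term = [[x / k for x in row] for row in mat_mul(term, Qt)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Two-state rate matrix with zero column sums.
Q = [[-0.3, 0.1],
     [0.3, -0.1]]

M = mat_exp(Q, t=2.0)
column_totals = [sum(M[i][j] for i in range(2)) for j in range(2)]

# For this chain exp(Q t) = P + exp(-0.4 t) (I - P), where every column of P
# equals the equilibrium vector (0.25, 0.75); compare the top-left entry at t = 2.
closed_form = 0.25 + 0.75 * math.exp(-0.8)

print(column_totals, M[0][0], closed_form)
```

Each column of the result sums to one and all entries are non-negative, which is what is meant by a probability substitution matrix in the column convention used here.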
We now place a third reasonable restriction on a model by imposing what we refer to as multiplicative closure. This property is relevant in any setting that generalises from the time-homogeneous to time-inhomogeneous formulation of continuous-time Markov chains. In rough terms, what we mean by this is that, if $Q_1, Q_2$ are in a model, then there exists another $\hat{Q}$ also in the model such that $e^{Q_1} e^{Q_2} = e^{\hat{Q}}$. This question calls for the BCH (Baker-Campbell-Hausdorff [1]) formula for all matrices $X, Y$:
$$\log(e^{X} e^{Y}) = X + Y + \tfrac{1}{2}[X, Y] + \tfrac{1}{12}[X, [X, Y]] - \tfrac{1}{12}[Y, [X, Y]] + \cdots$$
(where $\log$ is the matrix-log power series and $[X, Y] = XY - YX$ is the ‘Lie bracket’, or ‘commutator’). This naturally leads to a discussion of Lie algebras in the context of continuous-time Markov chains. Precisely how this arises is developed in the argument that follows. Careful attention to detail is required, however, since, for certain cases, it is possible that either (i) $\log(e^{Q_1} e^{Q_2})$ does not belong to $\mathcal{L}_{GM}^+$ or (ii) the BCH series does not converge. The obstruction we overcome in this note is that there is no immediate means available to isolate terms in the BCH series since, by construction, a model $\mathcal{R}^+$ does not form a linear space.
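The BCH formula can be checked numerically on small matrices. The sketch below compares the matrix logarithm of a product of two exponentials with the BCH series truncated at first, second and third order. It assumes the standard low-order BCH coefficients (1/2, 1/12, -1/12). The helper names, the truncation lengths, and the choice of two non-commuting two-state rate matrices are illustrative assumptions.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B, scale=1.0):
    """Return A + scale * B."""
    n = len(A)
    return [[A[i][j] + scale * B[i][j] for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(X, terms=30):
    """Truncated power series for exp(X)."""
    n = len(X)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, X)]
        result = mat_add(result, term)
    return result


def mat_log(M, terms=60):
    """Power series sum_{k>=1} (-1)^(k+1) (M - I)^k / k, valid for M near I."""
    n = len(M)
    D = mat_add(M, identity(n), scale=-1.0)
    result = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = mat_mul(power, D)
        result = mat_add(result, power, scale=(-1.0) ** (k + 1) / k)
    return result


def bracket(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return mat_add(mat_mul(X, Y), mat_mul(Y, X), scale=-1.0)


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Two small, non-commuting rate matrices (zero column sums).
t = 0.1
X = [[-t, 0.0], [t, 0.0]]
Y = [[0.0, 2 * t], [0.0, -2 * t]]

exact = mat_log(mat_mul(mat_exp(X), mat_exp(Y)))

XY = bracket(X, Y)
first_order = mat_add(X, Y)
second_order = mat_add(first_order, XY, scale=0.5)
third_order = mat_add(second_order, bracket(X, XY), scale=1.0 / 12.0)
third_order = mat_add(third_order, bracket(Y, XY), scale=-1.0 / 12.0)

errors = [max_abs_diff(exact, approx)
          for approx in (first_order, second_order, third_order)]
print(errors)
```

The error shrinks as more bracket terms are included, but only because the matrices are small enough for both the logarithm series and the BCH series to converge, which is the caveat raised in the text.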
The definitions and notation required to state and prove our main result are given in the following steps:
1. Let $\mathcal{S}$ be the semigroup generated by the set $\{e^{Qt} : Q \in \mathcal{R}^+,\ t \geq 0\}$. Equivalently, $\mathcal{S}$ is the intersection of all semigroups that contain this set. (Notice $\mathcal{S}$ includes the identity matrix, since $e^{Q \cdot 0} = I$, so it is technically a monoid.)
2. Let $\mathcal{A}$ be the set of (scaled) logarithms of the members of $\mathcal{S}$. Specifically, for what follows it is sufficient to take $\mathcal{A} = \{\lambda \log(M) : M \in \mathcal{S},\ \lambda \geq 0\}$, where $\log$ is the matrix-log power series (wherever it converges). Since $\log(e^{Qt}) = Qt$ for sufficiently small $t$, we see that $\mathcal{R}^+ \subseteq \mathcal{A}$. In general, this definition allows for the circumstance that $\mathcal{A}$ may contain rate matrices that are non-stochastic and/or not members of $\mathcal{R}$; the latter is our crucial observation.
We are now in a position to state our third proposed property for continuous-time Markov chains:
Property 2.
A model $\mathcal{R}^+$ satisfies $\mathcal{A} \subseteq \mathcal{R}$.
We claim that Property 2 is a very reasonable demand on a model since it is saying that all expressions of the form $\log(e^{Q_1} e^{Q_2} \cdots e^{Q_k})$, with $Q_1, \ldots, Q_k \in \mathcal{R}^+$, produce rate matrices which satisfy the same constraints as the matrices $Q_i$ (up to possible relaxation of the stochastic condition of membership in $\mathcal{L}_{GM}^+$). When this is the case, we say that the model is multiplicatively closed.
Theorem 1.
Suppose a model $\mathcal{R}^+$ satisfies Property 0. Then $\mathcal{R}^+$ satisfies Properties 1 and 2 if and only if $\mathcal{R} = \text{span}_{\mathbb{R}}(\mathcal{R}^+)$ and this space forms a real Lie algebra.
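Proof.
Assume throughout that $\mathcal{R}^+$ satisfies Property 0.
- Assume $\mathcal{R}^+$ satisfies Properties 1 and 2.
For $Q_1, Q_2 \in \mathcal{R}^+$ and $t \geq 0$ we have $tQ_1, tQ_2 \in \mathcal{R}^+$ also (by Property 1). Then $\log(e^{tQ_1} e^{tQ_2}) = t(Q_1 + Q_2) + \tfrac{t^2}{2}[Q_1, Q_2] + \cdots$ for some choice of $t$ small enough such that the series converges. By Property 2, we have $\log(e^{tQ_1} e^{tQ_2}) \in \mathcal{R}$ also. Choosing $t > 0$ and rescaling by $1/t$ (using Property 1) we have, in the limit $t \to 0$, $Q_1 + Q_2 \in \mathcal{R}$ (the limit stays in $\mathcal{R}$ because a set cut out by polynomial equations is closed).
Using Property 1, we observe that this generalizes to $\lambda_1 Q_1 + \lambda_2 Q_2 \in \mathcal{R}$ for all $\lambda_1, \lambda_2 \geq 0$. More specifically, since $\lambda_1 Q_1 + \lambda_2 Q_2 \in \mathcal{L}_{GM}^+$ and $\mathcal{R}^+ = \mathcal{R} \cap \mathcal{L}_{GM}^+$, we have $\lambda_1 Q_1 + \lambda_2 Q_2 \in \mathcal{R}^+$ for all $\lambda_1, \lambda_2 \geq 0$. Iterating this result shows that $\lambda_1 Q_1 + \cdots + \lambda_k Q_k \in \mathcal{R}^+$ for all choices $Q_1, \ldots, Q_k \in \mathcal{R}^+$ and $\lambda_1, \ldots, \lambda_k \geq 0$.
However, since the constraints defining $\mathcal{R}$ are polynomial, the containment $\lambda_1 Q_1 + \cdots + \lambda_k Q_k \in \mathcal{R}$ must hold more generally for all choices $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$. Thus:
$$\text{span}_{\mathbb{R}}(\mathcal{R}^+) \subseteq \mathcal{R},$$
which yields:
$$\mathcal{R}^+ \subseteq \text{span}_{\mathbb{R}}(\mathcal{R}^+) \cap \mathcal{L}_{GM}^+ \subseteq \mathcal{R} \cap \mathcal{L}_{GM}^+ = \mathcal{R}^+,$$
so equality follows, and the minimality condition demanded by Property 0 shows $\mathcal{R} = \text{span}_{\mathbb{R}}(\mathcal{R}^+)$.
Having established that $\mathcal{R}$ is a linear space, we can now freely isolate terms in the BCH formula and be guaranteed to stay in $\mathcal{R}$. In particular, taking $Q_1, Q_2 \in \mathcal{R}^+$ and $t > 0$ sufficiently small, we see that
$$\log(e^{tQ_1} e^{tQ_2}) - t(Q_1 + Q_2) = \tfrac{t^2}{2}[Q_1, Q_2] + \cdots \in \mathcal{R},$$
so, rescaling by $2/t^2$ and taking the limit $t \to 0$, we have $[Q_1, Q_2] \in \mathcal{R}$ also. Since $\mathcal{R}$ is spanned by $\mathcal{R}^+$ and the bracket is bilinear, $\mathcal{R}$ forms a real Lie algebra, as required.
- Assuming $\mathcal{R} = \text{span}_{\mathbb{R}}(\mathcal{R}^+)$ shows that the constraints defining $\mathcal{R}$ are linear, which implies Property 1 is satisfied. Further assuming $\mathcal{R}$ is a real Lie algebra and applying the BCH formula shows that each member of the set $\mathcal{A}$ of scaled logarithms is a member of $\mathcal{R}$. Hence Property 2 is satisfied.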
∎
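The Lie algebra condition in Theorem 1 can be tested mechanically for a model given by a finite spanning set of rate matrices: by bilinearity it is enough to check that the bracket of every pair of spanning matrices stays in their linear span. The sketch below does this with exact rational arithmetic. The function names and the two example spanning sets are hypothetical, chosen only to show one closed and one non-closed case.

```python
from fractions import Fraction
from itertools import combinations


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def flatten(A):
    return [x for row in A for x in row]


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rank_count = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank_count, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(len(rows)):
            if r != rank_count and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank_count][col]
                rows[r] = [a - factor * b
                           for a, b in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


def spans_lie_algebra(spanning_set):
    """True if the linear span of the matrices is closed under the bracket."""
    vectors = [flatten(M) for M in spanning_set]
    base_rank = rank(vectors)
    for A, B in combinations(spanning_set, 2):
        # The bracket lies in the span exactly when adding it keeps the rank.
        if rank(vectors + [flatten(bracket(A, B))]) > base_rank:
            return False
    return True


# All two-state rate matrices: the bracket of the generators is their difference.
two_state = [[[-1, 0], [1, 0]],
             [[0, 1], [0, -1]]]

# Three states with only the rates 1 -> 2 and 2 -> 3 (column convention):
# the bracket involves a direct 1 -> 3 term that is not in the span.
chain = [[[-1, 0, 0], [1, 0, 0], [0, 0, 0]],
         [[0, 0, 0], [0, -1, 0], [0, 1, 0]]]

print(spans_lie_algebra(two_state), spans_lie_algebra(chain))
```

This only tests the linear-space half of the theorem's conclusion on a given span; whether a model defined by nonlinear constraints coincides with its span is a separate question, addressed by Property 0 and the minimality argument in the proof.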
3 Example
We illustrate this process with the Hasegawa-Kishino-Yano (HKY) [5] model of DNA substitutions. This is an example of a time-reversible model [14], and is defined via the parameterization (rows and columns ordered as $A, G, C, T$):
$$Q = \begin{pmatrix} \ast & \kappa\pi_A & \pi_A & \pi_A \\ \kappa\pi_G & \ast & \pi_G & \pi_G \\ \pi_C & \pi_C & \ast & \kappa\pi_C \\ \pi_T & \pi_T & \kappa\pi_T & \ast \end{pmatrix},$$
where the missing entries $\ast$ are chosen to ensure zero column sums. The parameters $\pi_A, \pi_G, \pi_C, \pi_T$ are proportional to the equilibrium nucleotide frequencies of the Markov chain and $\kappa$ is included to accommodate the ‘transition/transversion’ ratio (distinguishing substitutions within both the ‘purines’ $\{A, G\}$ and the ‘pyrimidines’ $\{C, T\}$ from substitutions across these two groups).
Equivalently, we may express the HKY model as the subset of rate matrices
[TABLE]
(where the displayed constraints are sufficient to determine the model). We immediately see that the HKY model is not multiplicatively closed since the defining constraints are not linear. To illustrate the issue, we give a numerical example.
We chose via and respectively, and computed (using Mathematica):
[TABLE]
Attempting to find this matrix in the HKY model, we are immediately led to
[TABLE]
but no consistent solution for $\kappa$ is obtainable (in fact four different values are required). Therefore, this matrix is not a member of the HKY model (or indeed of its defining constraint set under relaxation of the stochastic conditions).
Following the definitions given in the previous section, the form obtained in this example does however suggest that all rate matrices in the closure are of the form
[TABLE]
That this is indeed the case is confirmed by two simple computations:
- (i) The linear span of the HKY rate matrices is exactly this set.
- (ii) This set forms a Lie algebra (in fact, this is Model 8.8 in the Lie-Markov hierarchy [4]).
Thus, the span of the HKY model forms a Lie algebra (without the additional need to take closure under Lie brackets).
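The failure of multiplicative closure for HKY can be reproduced in a short sketch. It assumes the standard HKY form in the column convention (the rate into state $i$ is $\pi_i$, multiplied by $\kappa$ for a transition) with states ordered A, G, C, T. The parameter values, the overall `scale`, and all helper names are hypothetical and are not the values used in the paper's numerical example.

```python
GROUP = (0, 0, 1, 1)  # A, G are purines; C, T are pyrimidines


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B, scale=1.0):
    n = len(A)
    return [[A[i][j] + scale * B[i][j] for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(X, terms=30):
    n = len(X)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, X)]
        result = mat_add(result, term)
    return result


def mat_log(M, terms=60):
    """Matrix-log power series in (M - I); converges for M close to I."""
    n = len(M)
    D = mat_add(M, identity(n), scale=-1.0)
    result, power = [[0.0] * n for _ in range(n)], identity(n)
    for k in range(1, terms + 1):
        power = mat_mul(power, D)
        result = mat_add(result, power, scale=(-1.0) ** (k + 1) / k)
    return result


def hky_rate_matrix(pi, kappa, scale):
    """HKY rate matrix with zero column sums (assumed standard form)."""
    Q = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i != j:
                factor = kappa if GROUP[i] == GROUP[j] else 1.0
                Q[i][j] = scale * pi[i] * factor
    for j in range(4):
        Q[j][j] = -sum(Q[i][j] for i in range(4) if i != j)
    return Q


def kappa_estimates(Q):
    """Per-row ratio of the transition entry to a transversion entry."""
    ratios = []
    for i in range(4):
        transition = next(Q[i][j] for j in range(4)
                          if j != i and GROUP[j] == GROUP[i])
        transversion = next(Q[i][j] for j in range(4) if GROUP[j] != GROUP[i])
        ratios.append(transition / transversion)
    return ratios


def transversion_gap(Q):
    """Largest difference between the two transversion entries in any row."""
    gaps = []
    for i in range(4):
        a, b = [Q[i][j] for j in range(4) if GROUP[j] != GROUP[i]]
        gaps.append(abs(a - b))
    return max(gaps)


Q1 = hky_rate_matrix(pi=(0.1, 0.2, 0.3, 0.4), kappa=2.0, scale=0.01)
Q2 = hky_rate_matrix(pi=(0.4, 0.3, 0.2, 0.1), kappa=5.0, scale=0.01)
Q_hat = mat_log(mat_mul(mat_exp(Q1), mat_exp(Q2)))

ratios = kappa_estimates(Q_hat)   # one implied kappa per row
gap = transversion_gap(Q_hat)     # zero up to rounding if Q_hat is in the span
print(ratios, gap)
```

In this sketch each row of the resulting generator implies a different value of $\kappa$, so it is not an HKY matrix, while the two transversion entries in each row still agree, which is consistent with membership in the eight-dimensional linear span described above.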
4 Discussion
The contribution of this work was to lay out conditions on a continuous-time Markov chain (Properties 0, 1 and 2) in order to show that multiplicative closure necessitates that the associated rate matrices are minimally contained inside a Lie algebra. The conditions need to be set up carefully in order to, firstly, be convincingly well-motivated to the applied setting and, secondly, allow for the relatively elementary proof of the main result (Theorem 1). This result provides a solid justification (albeit post-hoc) for recent work exploring the classification, enumeration, and application of this class of Markov models [4, 12].
We focussed on continuous-time models and hence, naturally, assumed a model is defined in terms of its rate matrices (cf. Property 0). This does however leave open the possibility that a Markov chain defined solely at the level of substitution matrices could be multiplicatively closed without necessitating the existence of an associated Lie algebra (constructed as the tangent space at the identity). We conjecture that this is not possible, but leave the details for future work.
Acknowledgement
This research was supported by Australian Research Council (ARC) Discovery Grant DP150100088. I would like to thank Michael Woodhams, Barbara Holland, and the anonymous reviewers for their helpful comments on this work.
- [1] J. E. Campbell. ‘On a law of combination of operators’. P. Lond. Math. Soc. 29 (1898), 14–32.
- [2] J. Draisma and J. Kuttler. ‘On the ideals of equivariant tree models’. Math. Ann. 344 (3) (2009), 619–644.
- [3] J. Felsenstein. Inferring Phylogenies (Sinauer Associates, Sunderland, 2004).
- [4] J. Fernández-Sánchez, J. G. Sumner, P. D. Jarvis, and M. D. Woodhams. ‘Lie Markov models with purine/pyrimidine symmetry’. J. Math. Biol. 70 (4) (2015), 855–891.
- [5] M. Hasegawa, H. Kishino, and T. Yano. ‘Dating of human-ape splitting by a molecular clock of mitochondrial DNA’. J. Mol. Evol. 22 (1985), 160–174.
- [6] J. Hilgert and K. H. Hofmann. ‘Semigroups in Lie groups, semialgebras in Lie algebras’. T. Am. Math. Soc. 288 (2) (1985), 481–504.
- [7] T. House. ‘Lie algebra solution of population models based on time-inhomogeneous Markov chains’. J. Appl. Probab. 49 (2) (2012), 472–481.
- [8] J. E. Johnson. ‘Markov-type Lie groups in $\text{GL}(n,r)$’. J. Math. Phys. 26 (2) (1985), 252–257.
