The matrix product approximation for the dynamic cavity method
Thomas Barthel

TL;DR
This paper introduces a matrix product edge message algorithm for the dynamic cavity method, enabling efficient and accurate analysis of stochastic dynamics on tree-like graphs with controllable precision.
Contribution
It presents a novel matrix product approximation technique for the dynamic cavity equations, improving computational efficiency and extending applicability to models with continuum time limits.
Findings
Computation costs grow linearly with time for Glauber-Ising dynamics.
The method outperforms Monte Carlo in error scaling, especially for low probability events.
Optimized truncation schemes enhance accuracy and efficiency.
Abstract
Stochastic dynamics of classical degrees of freedom, defined on vertices of locally tree-like graphs, can be studied in the framework of the dynamic cavity method which is exact for tree graphs. Such models correspond for example to spin-glass systems, Boolean networks, neural networks, and other technical, biological, and social networks. The central objects in the cavity method are edge messages -- conditional probabilities of two vertex variable trajectories. In this paper, we discuss a rather pedagogical derivation for the dynamic cavity method, give a detailed account of the novel matrix product edge message (MPEM) algorithm for the solution of the dynamic cavity equation as introduced in Phys. Rev. E 97, 010104(R) (2018), and present optimizations and extensions. Matrix product approximations of the edge messages are constructed recursively in an iteration over time. Computation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
The matrix product approximation for the dynamic cavity method
Thomas Barthel
Department of Physics, Duke University, Durham, NC 27708, USA
(November 11, 2019)
Abstract
Stochastic dynamics of classical degrees of freedom, defined on vertices of locally tree-like graphs, can be studied in the framework of the dynamic cavity method which is exact for tree graphs. Such models correspond for example to spin-glass systems, Boolean networks, neural networks, and other technical, biological, and social networks. The central objects in the cavity method are edge messages – conditional probabilities of two vertex variable trajectories. In this paper, we discuss a rather pedagogical derivation for the dynamic cavity method, give a detailed account of the novel matrix product edge message (MPEM) algorithm for the solution of the dynamic cavity equation as introduced in Phys. Rev. E 97, 010104(R) (2018), and present optimizations and extensions. Matrix product approximations of the edge messages are constructed recursively in an iteration over time. Computation costs and precision can be tuned by controlling the matrix dimensions of the MPEM in truncations. Without truncations, the dynamics is exact. Data for Glauber-Ising dynamics shows a linear growth of computation costs in time. In contrast to Monte Carlo simulations, the approach has a much better error scaling. Hence, it gives for example access to low probability events and decaying observables like temporal correlations. We discuss optimized truncation schemes and an extension that allows to capture models which have a continuum time limit.
Contents
-
II.5 Computing edge trajectory probabilities from edge messages
-
IV Algorithm for the time evolution of matrix product edge messages
I. Introduction
Statistical physics provides a comprehensive framework for the study of many-body systems in equilibrium with their environment. It is a foundation of modern physics, tracing back the laws of thermodynamics to characteristics of the microscopic degrees of freedom. Beyond physics, its formalism is successfully applied in many different fields such as information theory, computer science, economy, and sociology. In contrast, our methodological toolbox for dynamics in probabilistic systems is rather limited, which restricts our understanding of stochastic dynamical phenomena.
Stochastic dynamics, i.e., dynamics that are governed by probabilistic instead of deterministic rules, are ubiquitous in nature as well as in social and technological systems VanKampen2007 ; Freund2010 ; Capasso2004 ; AitSahalia2014 . Here, we focus in particular on stochastic dynamics in networks Barrat2008 ; Newman2010 . Some challenging examples are the thermal dynamics in spin glasses, avalanche dynamics, the dynamics of infectious diseases, the evolution of opinions in social networks, the stability of functionality under perturbations in technological networks, or synchronization phenomena. The probabilistic features can be intrinsic or due to our ignorance of certain details that are not essential for the observed (macroscopic) phenomena. A simple method for the simulation of a stochastic system with states on vertices of a network is to propagate the probability distribution in time. Unfortunately, the corresponding computation costs are exponential in the system size and it is also difficult to asses temporal correlations in this way.
For the investigation of network systems, a strong simplification can be achieved when cycles in the interaction graph are rare or sufficiently long. This is the case for locally tree-like graphs [see Fig. 1(a)] such as random regular graphs, Erdős-Rényi graphs, and Gilbert graphs. For such random graphs with vertices, almost all cycles have length such that their effect is negligible in the thermodynamic limit Mezard2009 . For static problems, this feature is used in the cavity method Mezard1986-1 ; Mezard2001-20 , where conditional nearest-neighbor probabilities are computed iteratively within the Bethe-Peierls approximation. This efficient method has been applied very successfully to study, for example, equilibrium properties of spin glasses Mezard2001-20 , computationally hard satisfiability problems Mezard2002-297 ; Mezard2002-66 , and random matrix ensembles Rogers2008-78 .
Subsequently, the cavity method has been generalized to dynamical problems, resulting in the dynamic cavity method or dynamic belief propagation Neri2009-08 ; Karrer2010-82 . The central objects in this approach are not the time-dependent probabilities for global system states, but conditional probabilities for state trajectories on neighboring sites and (edges ). These so-called edge messages are generated in an iteration over time. Consequently, the computational complexity is now linear instead of exponential in the system size . Unfortunately, the number of possible trajectories and, hence, the computational complexity increase exponentially in time. Applications have thus been restricted to either very short times Neri2009-08 ; Kanoria2011-21 , oriented graphs Neri2009-08 , a no-backtracking approximation Shrestha2015-92 ; Castellano2018-98 , or unidirectional dynamics with local absorbing states Karrer2010-82 ; Lokhov2015-91 ; Lokhov2014-90 ; Altarelli2014-112 ; Altarelli2014-10 ; Shrestha2014-89 ; Lokhov2017-114 . In the latter case, one can exploit that vertex trajectories can be parametrized by a few switching times.
For general stochastic network dynamics, it is a challenging endeavor to find good approximative solutions to the dynamic cavity equations with polynomial computation costs. A drastic simplification is to neglect temporal correlations completely as in the one-time approximation (a.k.a. time factorization) Neri2009-08 ; Aurell2011-04 ; Aurell2012-85 ; Zhang2012-148 or to retain only short-time correlations as in the one-step Markov ansatz DelFerraro2015-92 . While this can be expected to work well for stationary states at high temperatures, such approximations are usually severe for short to intermediate times or low temperatures. Other approximative approaches are the cluster variational method Pelizzola2013-86 ; Vazquez2017-3 ; Pelizzola2017-7 (applicable for short-range spatio-temporal correlations) or perturbative schemes Roudi2011 ; Aurell2012-85 ; BachschmidRomano2016-49 , generating functional analysis Hatchett2004-37 ; Mimura2009-42 ; Mozeika2011-106 ; Mozeika2012-92 ; Coolen2012-92 , and the generalized mean field approximation Mahmoudi2014-7 .
In Ref. Barthel2018-97 , we have introduced a novel efficient algorithm for precise solutions of the parallel dynamic cavity equations for locally tree-like graphs and general bidirectional dynamics. The approach is based on matrix-product approximations for the edge messages . The computational complexity is decreased from exponential to polynomial in the duration of the dynamical process. As we demonstrated for non-equilibrium Glauber dynamics in the kinetic Ising model, one can obtain quasi-exact results with relatively small matrix dimensions. Computation costs and accuracy can be tuned through controlled truncations of the matrix dimensions. Matrix-product approximations are widely used in condensed matter theory to encode spatial correlations of quantum states Accardi1981 ; Fannes1992-144 ; Rommer1997 ; Schollwoeck2011-326 . Here, we employ matrix products to encode temporal correlations in the edge messages.
This paper gives a pedagogical introduction to the dynamic cavity method in Sec. II. Sections III-V give a detailed account of the novel matrix product edge message (MPEM) algorithm Barthel2018-97 for the solution of the dynamic cavity equation. Section VI provides an improved MPEM truncation scheme that substantially reduces computation costs. For Glauber-Ising dynamics, we demonstrate a linear growth of computation costs in Sec. VII. In Sec. VIII, the MPEM method is generalized to systems where the transition probability for the state of a vertex does not only depend on the state of its neighbors in the interaction graph, but also on the current state of the vertex itself.
II. Stochastic dynamics on locally tree-like graphs
II II.1. Global equation of motion
Let us consider a graph and denote the state at vertex by . The state of the full system at time will be denoted by
[TABLE]
Given the state probabilities for time , the probabilities for the subsequent time step are obtained by applying the transition matrix , where the notation of the arguments indicates that is a conditional probability.
[TABLE]
In correspondence with the structure of the graph , we assume that, in every time step, the probability for only depends on the states of nearest-neighbors at the previous time step such that the global transition matrix is a product of local transition matrices ,
[TABLE]
where the vicinity of vertex , contains and its nearest neighbors. Equations (1) and (2) specify the parallel stochastic dynamics of the system.
II II.2. Probability for global trajectories
With the definitions above, the probability for a trajectory of states up to time takes the form
[TABLE]
For simplicity, let us assume initially uncorrelated states, i.e.,
[TABLE]
II II.3. Motivation and definition of edge messages
An exact solution of the global equation of motion (1) is usually not possible. In principle, one can employ a simple iteration over time. However, the corresponding memory costs (to keep track of ) and computation costs scale exponentially in the system size such that this approach is limited to very small systems.
A very useful technique is the Markov chain Monte Carlo method. In this approach, one repeatedly generates states according to the probability , and propagates them in time by choosing a state with probability for . While the technique can be used to study many interesting problems, it has several disadvantages. First of all, one cannot address the thermodynamic limit directly and needs to perform a costly finite-size scaling analysis. Secondly, the number of required samples can be very large if temporal correlations are non-trivial. Also, the error scaling is not very favorable. When increasing the number of samples , errors decrease slowly as . Especially if the absolute value of the desired observable is small, as is usually the case when studying temporal correlations, it is often impossible to achieve a sufficient precision.
To resolve these issues in our approach for locally tree-like graphs, we shift the focus from global-state probabilities to so-called edge messages. These are conditional probabilities for the trajectory of the variable at vertex ,
[TABLE]
and the trajectory on a neighboring site . If the graph is a tree, the exact evolution of the system can be formulated in terms of the edge messages and leads to the dynamic cavity equations Neri2009-08 ; Karrer2010-82 discussed in the following. If the graph contains some longer loops, i.e., is only locally tree-like, the dynamic cavity equations give an approximation to the exact dynamics with the accuracy depending on the number and lengths of loops.
When we remove an edge from a tree graph , it is decomposed into two parts and such that contains and the subgraph that is still connected to (and all subgraphs of that are disconnected from and ) while contains and the subgraph that is still connected to . See Fig. 1(a). Now, an edge message for edge is the probability of the trajectory on vertex for the dynamics being restricted to the subgraph under the condition that we impose the trajectory on vertex . Specifically, it is the product of all transition matrices and for all and , summed over all with , i.e.,
[TABLE]
A pictorial representation for a one-dimensional graph is given in Fig. 2.
II II.4. Simplification of edge messages
Due to the normalization constraint of the transition matrices, the expression (6) for the edge message can be simplified substantially by removing all and outside the causal cone of edge such that
[TABLE]
In this equation, we use the (recursively defined) subgraphs
[TABLE]
and the causal cone of edge which is a subset of ,
[TABLE]
See Figs. 3 and 4 for examples.
II II.5. Computing edge trajectory probabilities from edge messages
Before showing that edge messages can be generated in an iteration (Sec. II.6), note that they allow us to evaluate easily many observables of interest. In particular, the product of edge messages and yields the joint probability for the occurrence of trajectories and on vertices and ,
[TABLE]
See Fig. 5.
By marginalizing Eq. (10) over certain subsets of the variables and , one obtains the probabilities required for the evaluation of time-local observables or temporal correlation functions.
II II.6. Dynamic cavity equation
Inspection of the simplified expression (7) for edge message shows that it can be constructed from the time- edge messages according to
[TABLE]
with . This recursion relation for the edge messages, the dynamic cavity equation Neri2009-08 ; Karrer2010-82 , is very useful and central for the matrix product edge message time-evolution algorithm presented in Sec. IV. An illustration of Eq. (11) for the example of a Y-junction graph is given in Fig. 4.
III. Matrix product edge messages (MPEM)
This section and Sec. IV give a detailed account of an evolution algorithm that employs precise approximations to edge messages (7) in matrix product form Barthel2018-97 . Extensions will be described in Secs. VI and VIII.
III III.1. General idea and motivation
In the following let denote the number of different vertex states (). In Sec. II, we have seen that the stochastic dynamics on a tree graph can not only be simulated by sampling trajectories of the full system according to their weights (1) using a Monte Carlo algorithm, but also by an iterative construction of edge messages which give access to many observables of interest such as the evolution of time-local observables or temporal correlators. Importantly, the computation costs and memory requirements for evolving and storing an edge message () are independent of the system size. However, without approximations, the costs increase exponentially with time . So, an exact treatment is limited to short times.
To resolve this problem, we can employ an idea from the density matrix renormalization group (DMRG) White1992-11 ; White1993-10 ; Schollwoeck2005 which is a numerical algorithm for the simulation of strongly-correlated quantum many-body systems (predominantly for one-dimensional lattices) which has been used very successfully for a wide range of applications in condensed matter physics, the physics of ultracold atomic gases, and quantum chemistry. In principle, the encoding of a many-body state for a lattice of sites,
[TABLE]
label orthonormal basis states (), requires storage of the expansion coefficients . However, in ground states of typical quantum many-body systems, spatial correlations like decay quickly in the distance , exponentially or algebraically. Due to this fact and corresponding entanglement properties, one can approximate by much fewer effective degrees of freedom, given by the elements of matrices in the approximation Hastings2007-76 ; Brandao2013-9 ; Verstraete2005-5 ; Barthel2017_08unused
[TABLE]
The are called bond dimensions. In order for the matrix product to yield a scalar, we require dimensions at the system boundaries. The elements of matrices are effective degrees of freedom that encode correlations around site . The larger the bond dimensions are chosen, the more effective degrees of freedom are taken into account, the higher the computation costs, and the higher the precision of the approximation. In any case, the memory costs () and computation costs () are now linear in the system size . The right-hand side of Eq. (13) is called a matrix product state (MPS) Accardi1981 ; Fannes1992-144 ; Rommer1997 ; Schollwoeck2011-326 . More recently, MPS are also discussed in the more mathematical literature, sometimes under the name tensor train decomposition Oseledets2011-33 . DMRG is a class of algorithms operating on MPS – most importantly, to compute ground state approximations, thermal equilibrium states, or to study non-equilibrium phenomena. Observables can often be computed to machine precision.
The idea is now to similarly exploit that (connected) temporal correlations in edge messages often decay quickly in time and/or in the time difference and to approximate edge messages in the form of a matrix product.
III III.2. Canonical form
Let us define the canonical form of a matrix product edge message (MPEM) as
[TABLE]
The graphical representation is shown in Fig. 6. The particular choice of assigning state labels and to the matrices occurring in the matrix product (14) is advantageous for the implementation of the dynamic cavity equation (11) for MPEMs as will become clear in Sec. IV. In order for the matrix product (14) to yield a scalar, we set .
III III.3. Controlled truncations
When advancing an MPEM one step in time according to the dynamic cavity equation (11), we need to contract 111With the contraction of two tensors, we refer to their product, summed over certain sets of common indices. several MPEMs, with from the neighborhood of vertex , and local transition matrices and, subsequently, express the result again in MPEM form. In general, this generates an MPEM with increased bond dimension . In order to control the growth of the bond dimensions, and hence the computation costs, we would like to reduce the bond dimension of the new MPEM. This can indeed be done in a controlled fashion such that the resulting truncated MPEM is close to the original. Here, we discuss the most simple truncation scheme Barthel2018-97 and an optimized scheme is presented in Sec. VI. The iterative time evolution of MPEMs is described in Sec. IV.
Because the notation is a little simpler, let us demonstrate truncations using the example of an MPS for a quantum system of sites,
[TABLE]
Let us split the system into two parts , containing sites , and , containing sites . Let and be orthonormal bases for left and right parts, respectively, such that
[TABLE]
The objective is to find a good approximation of in a reduced vector space, where the 2-norm distance is used to quantify the accuracy. This can be done using a singular value decomposition (SVD) of . In matrix form, it reads , where and are unitary matrices and is the diagonal matrix containing descendingly ordered singular values such that
[TABLE]
The fact that this procedure yields the best rank- approximation of for the given bipartition corresponds to the Eckart-Young theorem Golub1996 ; Stewart1993-35 .
So, to truncate in this fashion bond dimensions of an MPS (and analogously for MPEM), we need to express it in suitable orthonormal bases for and . This can be achieved by exploiting the freedom to replace two subsequent matrices in the matrix product (15) by . The inserted non-singular matrices and clearly leave invariant as they cancel in the matrix product. Using this gauge freedom, we can bring the matrix product (15) into the form
[TABLE]
The left and right orthonormality constraints (18b) are imposed by sequences of singular value decompositions, sweeping from site 1 to site and from site to site with details discussed in Sec. IV.2. The resulting matrix product (18) is in fact of the form (16), with , and orthonormal basis states
[TABLE]
The orthonormality of these states follows from Eq. (18b):
[TABLE]
and similarly for the states . With a singular value decomposition of the matrix , we then obtain an optimally truncated state [Eq. (17b)] in MPS form.
IV. Algorithm for the time evolution of matrix product edge messages
Given MPEMs [Eq. (14)] for all edges at time , we want to do one time step according to the dynamic cavity equation (11) and obtain MPEM approximations
[TABLE]
for the edge messages at time .
Let us assume for now that vertex is not member of its neighborhood , i.e., that the local transition matrix is independent of . The more general case can also be handled as well but requires a slightly more complicated algorithm. The corresponding extension is described in Sec. VIII.
IV IV.1. One exact MPEM evolution step
First, we generate a non-canonical matrix product representation of the evolved edge message (11), in particular, choosing the non-canonical form
[TABLE]
To this purpose, the matrices for the bulk, , are obtained by contracting the transition matrix for vertex with the tensors from the time- MPEMs for edges with vertices , where is the degree of vertex . In this contraction, we sum over the common indices as illustrated in Fig. 7. The resulting matrices
[TABLE]
have left and right multi-indices of dimensions and , corresponding to the direct products and of the left and right indices of the matrices . For notational simplicity, we have assumed here that the bond dimensions are the same for all time- MPEMs of the edges with .
Specifically, for the case with , the contractions take for example the form
[TABLE]
At the extremal time slices and , the contractions differ slightly. They are specified graphically in Fig. 8.
At this point, it has become clear why the specific choice for the assignment of state indices and to tensors in the canonical MPEM form (14) is favorable. It allows for an entirely local construction of tensor from tensors , i.e., only -tensors of a single time-slice are involved.
IV IV.2. Truncating MPEMs and recovering the canonical form
The exact progression by one time step yields the non-canonical MPEM (22) with bond dimensions being increased from to . If we would proceed without any approximation, the computation cost would increase exponentially in time and, hence, restrict the simulation to very short times. Therefore, we are faced with two objectives: (a) reducing bond dimensions by a controlled truncation of the MPEM as described in Sec. III.3 and motivated physically in Sec. III.1, and (b) rearranging the assignment of physical state indices and to attain the canonical form (21) of the evolved MPEM. This can be achieved in different ways: A relatively simple method Barthel2018-97 is described in the following and an optimized scheme is introduced in Sec. VI.
If we want to truncate bond , i.e., reduce the bond dimension to something smaller, we need to take care of orthonormality as discussed in Sec. III.3. In particular, we need to express the edge message in orthonormal reduced bases for left block and right block . This can be done by imposing left and right orthonormality constraints (18b) for the -tensors. If we would not do so and truncate bond dimension , say, through an SVD of tensor without having the other tensors in orthonormal form, the approximation error would be uncontrolled. Truncating the smallest singular values of would then not correspond to the best rank- approximation.
In a first sweep from right () to left (), we can sequentially impose the right orthonormality constraint on all -tensors. As the original -tensors do not obey the orthonormality constraints, we can only truncate singular values that are (up to machine precision) zero. At the right boundary, we start with the singular value decomposition , continue for the bulk tensors with
[TABLE]
as shown in Fig. 9 and end at the left boundary with such that all -tensors except now obey the right orthonormality constraint. The computation cost for each such SVD is proportional to .
In a subsequent sweep from left to right, again based on singular value decompositions, we can now do the actual truncations to decrease bond dimensions (), e.g., by setting a threshold and truncating all singular values with . All -tensors except now obey the left orthonormality constraint and
[TABLE]
We now need to reorder the indices and of the vertex states. In a sweep from right to left, we go from the index assignment in the truncated and orthonormalized version of the matrix product in Eq. (25) to the assignment in the matrix product
[TABLE]
At the right boundary, we start with a singular value decomposition and controlled truncation (SVT) as described in footnote 222The resulting matrices and , defined as and are isometric in the sense that . These matrices are obtained from the singular value decomposition and subsequent truncation of small singular values , where . The isometric property of , corresponds to the right orthonormality constraint as defined in the second part of Eq. (18b)., continue for the bulk tensors with
[TABLE]
and end at the left boundary with .
To finally obtain the evolved edge message in canonical form (21), in a sweep from left to right, we go from the index assignment in Eq. (26) to the canonical assignment . At the left boundary, we start with , continue for the bulk tensors with
[TABLE]
and end at the left boundary with and . The matrix products for the different stages in the evolution and truncation of the edge message are illustrated in Fig. 10.
V. Evaluation of observables
As described in Sec. II.5, the joint probability of trajectories and for the vertices of an edge is given by the product of the two corresponding edge messages, . Given matrix product representations (approximations) of these edge messages in canonical form (14), time-local observables and temporal correlation functions can be evaluated efficiently. In order to evaluate, for example, the probability of the edge being in state at time , we simply contract all indices that occur in both MPEMs as depicted in Fig. 11. The contractions can be started at the left boundary () in such a way that the total computation cost scales as .
To this purpose we start at the left boundary, i.e., at time slice , with the multiplication E^{(0)}(\sigma_{i}^{0},\sigma_{j}^{0}):=A^{(0)}_{i\to j}(\sigma_{j}^{0})\big{[}A^{(0)}_{j\to i}(\sigma_{i}^{0})\big{]}^{\intercal}, continue for the bulk tensors with
[TABLE]
and finish with E^{(t)}:=\sum_{\sigma_{i}^{t-1}}A^{(t)}_{i\to j}(\sigma_{i}^{t-1})\big{(}\sum_{\sigma_{j}^{t-1}}E^{(t-1)}(\sigma_{i}^{t-1},\sigma_{j}^{t-1})\big{[}A^{(t)}_{j\to i}(\sigma_{j}^{t-1})\big{]}^{\intercal}\big{)} and the final step E^{(t+1)}(\sigma_{i}^{t},\sigma_{j}^{t}):=A^{(t+1)}_{i\to j}(\sigma_{i}^{t})\big{(}E^{(t)}\big{[}A^{(t+1)}_{j\to i}(\sigma_{j}^{t})\big{]}^{\intercal}\big{)}. Now, are scalars ( matrices because ) that yield the desired joint probability
[TABLE]
During the time-evolution of the MPEMs (Sec. IV) it is natural to normalize the edge messages according to the 2-norm. In that case, it can be necessary to add a renormalization in Eq. (29) in order to avoid the generation of very large matrix elements, e.g., by replacing in each step by .
VI. An improved truncation scheme
In order to recast the evolved state (22) into canonical form (21) and truncate bond dimensions, the first step in Sec. IV.2 was to reorthonormalize the tensors () in a sweep from right to left before doing truncations in a subsequent sweep from left to right (). The preparatory first sweep can be avoided as described in the following. This also reduces computation costs substantially from \mathcal{O}\big{(}M^{3z-3}\big{)} to , where denotes MPEM bond dimensions and denotes vertex degrees.
In extension of Sec. III.3, let us first discuss how controlled truncations (of Schmidt coefficients) can be done when the basis of the left part , containing sites , is orthonormal while the basis for the right part , containing sites , is not orthonormal. Recall that the purpose of the preparatory first sweep in Sec. IV.2 was to orthonormalize both bases. For notational simplicity, let us again use the example of a quantum many-body system with lattice sites. We are given a state
[TABLE]
with orthonormal states for the left part and arbitrary states for the right part. Now one can find a good approximation of (with respect to the 2-norm distance) by diagonalization of the reduced density matrix for the left part. This density matrix,
[TABLE]
is obtained by a partial trace over . With the overlap matrix we have which can be diagonalized according to
[TABLE]
where, as in Sec. III.3, is a unitary matrix and is the diagonal matrix containing descendingly ordered Schmidt coefficients (square roots of density matrix eigenvalues).
[TABLE]
The approximation after truncating all Schmidt coefficients with is
[TABLE]
The orthogonality and norms of the states follow as
[TABLE]
The truncation (34b) is suitable for an MPS (and analogously for MPEM) given in the form
[TABLE]
This is the form (31) with orthonormal basis states |a\rangle=\sum_{n_{1},\dotsc,n_{\ell}}\big{[}Y_{1}^{n_{1}}\dotsb Y_{\ell}^{n_{\ell}}\big{]}_{1,a}\,|n_{1}\dots n_{\ell}\rangle for [cf. Eq. (20)], non-orthogonal states |b\rangle=\sum_{n_{\ell+1},\dotsc,n_{L}}\big{[}A_{\ell+1}^{n_{\ell+1}}\dotsb A_{L}^{n_{L}}\big{]}_{b,1}\,|n_{\ell+1}\dots n_{L}\rangle for , and . The overlap matrix can be computed in an iteration with and for . With a diagonalization of , as in Eq. (33), we then obtain an optimally truncated state in MPS form.
With this type of truncation, the preparatory first orthonormalization sweep for the evolved MPEM (right to left, ), described in Sec. IV.2, can be avoided and computation costs can be reduced substantially. Going from right to left, one can first compute all overlap matrices for the evolved MPEM (22) with
[TABLE]
as illustrated in Fig. 12(a). With these, we can directly truncate the evolved MPEM, in a sweep from left to right. At the left boundary, we start with
[TABLE]
where we diagonalize and truncate (diagt) with the diagonalization as in Eq. (33) and the truncation as in Eq. (34b). The new tensor obeys the left orthonormality constraint [cf. Eq. (36)] and we need to include the matrix right of to keep the matrix product invariant such that in Eq. (22). For the bulk tensors , we continue with
[TABLE]
See Fig. 12(b). We end at the right boundary with such that all -tensors except now obey the left orthonormality constraint.
Truncating in this way, we arrive at Eq. (25) of the conceptually simpler truncation scheme of Sec. IV.2 and can continue in the same way as described there, rearranging vertex variables in the evolved MPEM to bring it into the canonical form (21). The major advantage of the more elaborate density matrix truncation scheme is the reduced computation cost. With bond dimensions of the MPEM (14) before the evolution step, the -tensors of the exactly evolved MPEM (22) have increased bond dimensions , where is the degree of vertex . The computationally most expensive step in the simple truncation scheme of Sec. IV.2 is the singular value decomposition (24) for the orthonormalization of the -tensors; the cost scales as \mathcal{O}\big{(}M^{3z-3}\big{)}. The most expensive steps in the more efficient density matrix truncation scheme are the computations of overlap matrices (37) and the reduced density matrices (39a). Exploiting the structure of the -tensors (23), these operations can be accomplished with a cost if we assume that the bond dimensions of tensors of the evolved MPEM after truncation are similar to that of the -tensors in the original MPEM (14), i.e., approximately . This is generally the case. When we fix a truncation threshold for the discarded 2-norm weight as discussed in Sec. IV.2, bond dimensions evolve smoothly in time.
VII. Computation costs in Glauber-Ising dynamics
As the number of possible trajectories on a vertex increases exponentially in time, the computation costs for exact dynamic belief propagation (11) also grow exponentially. As discussed in the introduction, this has, so far, substantially limited the applicability of the dynamic cavity method. The MPEM approach allows for a controlled reduction of the computational complexity, exploiting the decay of temporal correlations to truncate unimportant components of the edge messages.
Concerning computation costs, a decisive question is now, what bond dimensions in MPEMs are required to achieve a certain approximation accuracy in comparison to the exact evolution. Also, how does the required bond dimension evolve with time? Generally, it is to be expected that the maximum bond dimension, , will converge to a constant as a function of time if connected temporal correlations decay exponentially. In simulations, it is often favorable not to fix the truncation dimension, but to fix instead a threshold and to truncate all singular values with . The threshold controls the 2-norm loss in each truncation. Allowing bond dimensions to evolve accordingly, avoids wasting computation time and gives a measure for the information-theoretic complexity of the edge messages.
As an example, we consider Glauber dynamics of the kinetic Ising model Glauber1963-4 on random regular graphs with vertex degree in the thermodynamic limit. Specifically, Ising spins interact ferromagnetically with local transition matrices
[TABLE]
At time , all spins have magnetization , i.e., .
Figure 13 shows the evolution of maximum bond dimensions with time. They increase with decreasing truncation threshold . As expected, they become constant at longer times. The number of tensors in an MPEM increases linearly with , and we need to sweep a few times through the matrix product in each iteration. With converging bond dimensions, this implies that the computation cost per iteration grow only linearly in time, instead of exponentially. For fixed , the largest bond dimensions are needed for inverse temperature . This is so, because it is the closest to the phase transition in the Ising model. For this temperature, the bond dimensions do actually not yet show saturation on the time interval displayed in Fig. 13. They will converge at larger .
VIII. Models with vertex-state dependence
Except for the algorithms described in Secs. IV and VI our description was generic in the sense that the local transition matrices were allowed to depend on the time- state of vertex itself, in addition to the states on neighboring vertices. However, in Secs. IV and VI we considered the case where does not contain itself, i.e., that is independent of . Here, we generalize to MPEM algorithms that allow for the dependence on . This is important for many applications like stochastic models for the dynamics of infectious diseases Murray1989 ; Dangerfield2009-6 ; Karrer2010-82 or of opinions is social networks Castellano2009-81 . Naturally, transition matrices with dependence on also arise in time-discretized versions of stochastic continuous-time dynamics, i.e., all models that have a well-defined continuum-time limit as, for example, the Glauber dynamics of Ising spin systems Glauber1963-4 . In particular, when decreasing the time step , one should have
[TABLE]
Vertex is now contained in its neighborhood , where denotes the vertex degree. In this more general scenario, the evolved MPEM (22) cannot be constructed in an entirely local fashion anymore. Instead, the -tensors are constructed in a sweep from right () to left (), imposing at the same time the right orthonormality constraint, \sum_{\sigma,\sigma^{\prime}}\tilde{C}^{(s)}_{i\to j}(\sigma|\sigma^{\prime})\big{[}\tilde{C}^{(s)}_{i\to j}(\sigma|\sigma^{\prime})\big{]}^{\dagger}=\mathbbm{1} [cf. Eq. (18b)].
We start at the right boundary () by doing a singular value decomposition (SVD) of the tensor
[TABLE]
in order to obtain the tensor , where is the diagonal matrix of singular values, obeys the right orthonormality constraint, and, similarly, \sum_{\sigma}\big{[}U^{(t+1)}(\sigma)\big{]}^{\dagger}U^{(t+1)}(\sigma)=\mathbbm{1}. For time , the process continues with
[TABLE]
For the bulk tensors, , the corresponding equations read very similarly
[TABLE]
as illustrated in Fig. 14. The process ends at the left boundary with the assignment
[TABLE]
The remaining truncation and vertex variable reordering sweeps, done in order to transform the non-canonical form (22) into the canonical form (21), can be executed exactly as discussed in the somewhat simpler situation covered in Sec. IV.2.
The more efficient density matrix truncation scheme discussed in Sec. VI, similarly, can be adapted to the situation where transition matrix depends on . Like the unitaries in Eqs. (42)-(44), the -matrices in Eqs. (38) and (39) of the density matrix truncation scheme will then depend on .
Continuous-time stochastic dynamics on locally tree-like networks can be simulated with the described approach after discretization of the time axis using a small time step . It should also be possible to, alternatively, work with continuous matrix products Verstraete2010-104 ; Osborne2010-105 as introduced in the context of condensed matter physics . This however, entails some technical difficulties and one can expect that using discrete-time MPEMs and a small time step should be the best option. Similarly, quantum many-body systems in continuous real-space have been treated efficiently using MPS with a sufficiently fine space discretization Stoudenmire2012-109 ; Dolfi2012-109 . Other approximative schemes to simulate continuous-time stochastic dynamics are the dynamical replica analysis Mozeika2008-41 that captures macroscopic observables and the cavity master equation method Aurell2017-95 that operates on local conditional probabilities.
IX. Discussion
The described MPEM algorithm for the simulation of stochastic dynamics is based on the cavity method, applicable for locally tree-like interaction graphs, and on the matrix product approximation for edge messages which exploits the decay of (connected) temporal correlations. The MPEM method lifts restrictions of earlier approaches for the solution of dynamic cavity equations, mentioned in the introduction. It can also be used to simulate in the thermodynamic limit. Compared to Monte Carlo simulations, errors decrease much faster as a function of the invested computation time. This has been demonstrated for Glauber dynamics of the kinetic Ising model in Ref. Barthel2018-97 . Hence, the MPEM method is particularly suited for the precise analysis of temporal correlations, decay processes, and low-probability events. For the Glauber-Ising dynamics and fixed truncation thresholds, we have seen here that MPEM bond dimensions converge to a constant as a function of time. This should generally apply when connected temporal correlations decay exponentially. It implies that required computation costs per iteration grow only linearly instead of exponentially in time.
There are several ways in which the MPEM method can be developed further. Here, we have discussed a more efficient truncation scheme that reduces computation costs considerably from to , and we have generalized the method to models where the transition probabilities for a vertex depend explicitly on the vertex state at the previous time step, in addition to the states for its nearest neighbors.
Efficient codes for the MPEM algorithms, including the optimized truncation scheme of Sec. VI, are available from the author and under www.manyparticle.org/barthel/mpem. They were, for example, employed for the simulations in Ref. Barthel2018-97 . We are also happy to collaborate on specific applications.
I gratefully acknowledge discussions with Silvio Franz and Caterina De Bacco that initiated this work and support through US Department of Energy grant DE-SC0019449.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1(1) N. G. Van Kampen, Stochastic Processes in Physics and Chemistry , North-Holland personal library , 3rd ed. (Elsevier, Amsterdam, 2007).
- 2(2) J. A. Freund and T. Pöschel, Stochastic Processes in Physics, Chemistry, and Biology , Lecture Notes in Physics (Springer, Heidelberg, 2010).
- 3(3) V. Capasso and D. Bakstein, An Introduction to Continuous Time Stochastic Processes: Theory, Models, and Applications to Finance, biology, and medicine (Birkhäuser, Boston, 2004).
- 4(4) Y. Aït-Sahalia and J. Jacod, High-Frequency Financial Econometrics (Princeton University Press, Princeton, 2014).
- 5(5) A. Barrat, M. Barthélemy, and A. Vespignani, Dynamical Processes on Complex Networks (Cambridge University Press, New York, 2008).
- 6(6) M. E. J. Newman, Networks: An Introduction (Oxford University Press, Oxford, 2010).
- 7(7) M. Mézard and A. Montanari, Information, Physics, and Computation (Oxford University Press, New York, 2009).
- 8(8) M. Mézard, G. Parisi, and A. Virasoro, SK model: The replica solution without replicas , Europhys. Lett. 1 , 77 (1986) . · doi ↗
