Complex Langevin dynamics and zeroes of the fermion determinant
Gert Aarts, Erhard Seiler, Denes Sexty, Ion-Olimpiu Stamatescu

TL;DR
This paper investigates how zeroes of the fermion determinant affect the validity of complex Langevin dynamics in QCD, showing that with careful monitoring, the method remains justifiable despite meromorphic drift issues.
Contribution
It analyzes the impact of determinant zeroes on complex Langevin dynamics and demonstrates that the approach can be justified with proper oversight in QCD.
Findings
Poles in the drift influence the formal justification of complex Langevin.
Lessons from simple models inform the application to QCD.
The method remains justifiable with careful monitoring despite meromorphicity.
Abstract
QCD at nonzero baryon chemical potential suffers from the sign problem, due to the complex quark determinant. Complex Langevin dynamics can provide a solution, provided certain conditions are met. One of these conditions, holomorphicity of the Langevin drift, is absent in QCD since zeroes of the determinant result in a meromorphic drift. We first derive how poles in the drift affect the formal justification of the approach and then explore the various possibilities in simple models. The lessons from these are subsequently applied to both heavy dense QCD and full QCD, and we find that the results obtained show a consistent picture. We conclude that with careful monitoring, the method can be justified a posteriori, even in the presence of meromorphicity.
| complex Langevin | exact | ||
|---|---|---|---|
| 1 | |||
| 2 | |||
| 3 | |||
| 4 | |||
| 1 | |||
| 2 | |||
| 3 | |||
| 4 | |||
| 1 | |||
| 2 | |||
| 3 | |||
| 4 |
| exact | CL[] | exact[] | CL[] | exact[] | |
|---|---|---|---|---|---|
| 2.05781 | 1.9589(29) | 1.94847 | 5.9554(90) | 5.94936 | |
| 1.74691 | 1.87106(43) | 1.87036 | 2.6473(11) | 2.64655 | |
| 0.316378 | 0.33450(11) | 0.334309 | 0.32181(6) | 0.321777 | |
| 0.0702397 | 0.07013(9) | 0.0697774 | 0.08675(10) | 0.0866928 |
| 1.375 | () | |||||
| CL | 0.972 | 1.064 | 0.729 | 0.537 | .1+.6 | |
| CLD | 0.972 | 1.065 | 0.729 | 0.537 | .1+.6 | |
| CL | 0.972 | 1.064 | 0.730 | 0.538 | 1.+5.6 | |
| CL | 0.970 | 1.063 | 0.730 | 0.538 | 2.+22. | |
| CLD | 0.970 | 1.063 | 0.730 | 0.538 | 2.+22. | |
| CL | 0.972 | 1.063 | 0.731 | 0.538 | 2.+22. | |
| exact | 0.972 | 1.065 | 0.730 | 0.537 | ||
| ex. | 0.972 | 1.065 | 0.730 | 0.537 | ||
| ex. | 1.906 | 0.575 | 0.038 | 0.724 | ||
| ex.pq | 1.003 | 1.003 | 0.628 | 0.628 | ||
| 1.425 | () | |||||
| CL | 0.885 | 0.892 | 0.597 | 0.595 | .6+2.8 | |
| CLD | 1.069 | 1.073 | 0.648 | 0.640 | .6+2.8 | |
| CL | 1.069 | 1.072 | 0.649 | 0.640 | 1.+5.6 | |
| CL | 1.062 | 1.066 | 0.647 | 0.639 | 2.+22. | |
| CLD | 1.069 | 1.073 | 0.649 | 0.640 | 2.+22. | |
| CL | 1.069 | 1.073 | 0.649 | 0.640 | 2.+22. | |
| CL | 1.139 | 1.117 | 0.106 | 0.181 | 2.+22. | |
| exact | 1.069 | 1.073 | 0.649 | 0.640 | ||
| ex. | 1.069 | 1.073 | 0.645 | 0.640 | ||
| ex. | 0.798 | 1.606 | 0.211 | 0.070 | ||
| ex.pq | 1.071 | 1.071 | 0.644 | 0.644 |
| 1.375 | ||||
|---|---|---|---|---|
| CL | 0.6660.005 | 0.829+0.021 | 0.7790.037 | 0.490+0.032 |
| CLD | 0.6530.025 | 0.817+0.050 | 0.7740.105 | 0.489+0.098 |
| CL | 0.6700.005 | 0.832+0.021 | 0.7800.002 | 0.490+0.031 |
| exact | 0.660+0.011 | 0.823+0.014 | 0.7740.010 | 0.4880.004 |
| ex. | 0.659+0.011 | 0.823+0.014 | 0.7740.010 | 0.4890.004 |
| ex. | 2.170+0.294 | 0.309+0.298 | 0.184+0.159 | 0.8480.140 |
| ex.pq | 0.704+0.007 | 0.717+0.015 | 0.6250.011 | 0.6050.020 |
| 1.425 | ||||
| CL | 0.7270.039 | 0.748+0.044 | 0.6660.020 | 0.626+0.019 |
| CLD | 0.8050.010 | 0.825+0.016 | 0.6930.034 | 0.650+0.071 |
| CL | 0.8080.020 | 0.827+0.024 | 0.6920.060 | 0.653+0.059 |
| CL | 2.5 +1.1 | 2.51.1 | 0.4 +1.2 | 0.21.8 |
| exact | 0.794+0.007 | 0.809+0.012 | 0.6840.007 | 0.6540.001 |
| ex. | 0.795+0.007 | 0.810+0.012 | 0.6840.007 | 0.654+0.001 |
| ex. | 1.0210.219 | 1.385+0.219 | 0.111+0.097 | 0.0260.101 |
| ex.pq | 0.797+0.007 | 0.806+0.012 | 0.6780.007 | 0.660+0.001 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
aainstitutetext: Department of Physics, College of Science, Swansea University, Swansea, United Kingdombbinstitutetext: Max-Planck-Institut für Physik (Werner-Heisenberg-Institut), München, Germanyccinstitutetext: Bergische Universität Wuppertal, Wuppertal, Germanyddinstitutetext: Institut für Theoretische Physik, Universität Heidelberg and FEST, Heidelberg, Germany
Complex Langevin dynamics and zeroes of the fermion determinant
Gert Aarts b
Erhard Seiler c
Dénes Sexty d
Ion-Olimpiu Stamatescu
Abstract
QCD at nonzero baryon chemical potential suffers from the sign problem, due to the complex quark determinant. Complex Langevin dynamics can provide a solution, provided certain conditions are met. One of these conditions, holomorphicity of the Langevin drift, is absent in QCD since zeroes of the determinant result in a meromorphic drift. We first derive how poles in the drift affect the formal justification of the approach and then explore the various possibilities in simple models. The lessons from these are subsequently applied to both heavy dense QCD and full QCD, and we find that the results obtained show a consistent picture. We conclude that with careful monitoring, the method can be justified a posteriori, even in the presence of meromorphicity.
Keywords:
Lattice Quantum Field Theory, Phase Diagram of QCD
††arxiv: 1701.02322
1 Introduction
The determination of the QCD phase diagram in the plane of temperature and baryon chemical potential is one of the outstanding open questions in the theory of the strong interaction, as it is relevant for the early Universe, ongoing heavy-ion collision experiments at the Large Hadron Collider and the Relativistic Heavy Ion Collider, nuclear matter and compact objects such as neutron stars.
Ample progress has been made along (or close to) the temperature axis, where lattice QCD can be used to solve the theory numerically, and in recent years it has been possible to simulate QCD with flavours of light quarks using physical quark masses while taking the continuum limit Borsanyi:2013bia ; Bazavov:2014pvz . This is directly relevant for ultrahigh-energy heavy-ion collisions. The remainder of the phase diagram has not yet been established from first principles. As is well-known Barbour:1986jf ; deForcrand:2010ys , at nonzero baryon chemical potential, the quark determinant in the standard representation of the QCD partition function is complex, rather than real and positive, ruling out the immediate use of standard numerical methods based on importance sampling. This is generally referred to as the sign problem.
There are various proposals available to circumvent the sign problem, see e.g. the reviews deForcrand:2010ys ; Aarts:2013bla ; Gattringer:2014nxa ; Sexty:2014dxa ; Scorzato:2015qts and lecture notes Aarts:2015tyj . One approach which has generated substantial attention in the past years is the complex Langevin (CL) method, since it has so far proved to be quite successful in simulating systems with a complex action , or complex weight , from simple toy models to QCD Aarts:2008rr ; Aarts:2008wh ; Aarts:2010gr ; Aarts:2011zn ; Aarts:2012ft ; Seiler:2012wz ; Sexty:2013ica ; Langelage:2014vpa ; Aarts:2014bwa ; Aarts:2016qrv ; Sinclair:2015kva ; Sinclair:2016nbg . While the method was suggested already in the 1980s Parisi:1984cs ; Klauder:1983nn , recent progress has come in several ways: the theoretical justification has been provided Aarts:2009uq ; Aarts:2011ax (see also Refs. Salcedo:2015jxd ; Salcedo:2016kyy for related theoretical developments); numerical instabilities can be eliminated using adaptive stepsizes Aarts:2009dg ; explicit demonstrations that the sign problem can be solved in spin models and field theories have been given, even when it is severe Aarts:2011zn ; Aarts:2008wh ; Aarts:2009hn ; and finally, for nonabelian theories, controlling the dynamics via gauge cooling Seiler:2012wz , possibly adaptive Aarts:2013uxa (see also Ref. Nagata:2015uga ), has been shown to be necessary and effective, resulting in the first results for full QCD Sexty:2013ica ; Aarts:2014bwa ; Sinclair:2015kva ; Sinclair:2016nbg ; Nagata:2016mmh . Promising steps beyond gauge cooling have also been taken Attanasio:2016mhc .
There is, however, a serious conceptual problem that has to be faced. It is by now quite well established that when the weight is free from zeroes in the whole complexified configuration space, the only worry is the possibility of slow decay in imaginary directions Aarts:2009uq ; Aarts:2011ax , which will result in incorrect convergence. However, for theories which include fermions, such as QCD, integrating out the latter will yield a determinant which will always have zeroes for some complexified configurations. These zeroes lead to a meromorphic drift; the formal justification for the CL method Aarts:2009uq ; Aarts:2011ax requires holomorphicity, however (this will be reviewed below), and poles may cause convergence to wrong results. The relevance of this has first been pointed out by Mollgaard and Splittorff Mollgaard:2013qra ; Mollgaard:2014mga in the context of a random matrix model and has been further investigated in Refs. Greensite:2014cxa ; Nishimura:2015pba ; Nagata:2016vkn ; Ito:2016efb . Possible consequences for the behaviour of the spectrum of the Dirac operator Splittorff:2014zca have been studied in random matrix theory Ichihara:2016uld , as has the interplay with gauge cooling Nagata:2016alq .
This problem has both theoretical and practical aspects. Concerning the former, it requires a re-analysis of the derivation and justification of the method, given for the holomorphic case in Refs. Aarts:2009uq ; Aarts:2011ax . To do so is the first aim of this paper and is the topic of Sec. 2. In practice, it has been observed in a number of papers that a meromorphic drift will not necessarily cause convergence to wrong results – sometimes without this issue being explicitly flagged up (one example being when the meromorphicity is due to the Haar measure). However, this aspect is not yet properly understood; while there is a collection of results for a variety of models, an overall understanding is lacking. In Sec. 3 we address this issue using simple models, in which a detailed understanding can be obtained. Lessons from this analysis are summarised in Sec. 4. In Sec. 5 we then move to a more intricate SU(3) model and see how the lessons apply in that context. Finally, in Sec. 6 we turn to lattice QCD – heavy dense QCD and full QCD – and compare our findings with the understanding developed previously. A discussion of the results obtained in the various models is contained in Sec. 7. We conclude that an overall consistent picture can be extracted, applicable across all models considered, and give guidance on how to tackle this problem in future simulations. The Appendices contain some additional material, including proposals on how to handle poles in the drift in special cases. We note that partial results have already been presented in Refs. Seiler-sign ; Aarts:2016mso ; Aarts:2016bdr .
2 Formal justification in the presence of poles
We briefly recall the basic principles of the CL method, adapting the results for its justification Aarts:2009uq ; Aarts:2011ax to include a meromorphic drift, i.e. a drift with a pole.
Given a holomorphic action we denote by the (normalised) complex density
[TABLE]
on the original real field space. For simplicity we assume here a flat configuration space, i.e. . A complex drift is defined by analytic continuation as
[TABLE]
The CL equation, a stochastic differential equation in the complexified field space, with the drift given by the real and imaginary parts of ,
[TABLE]
leads to the Fokker-Planck equation describing the evolution of the (positive) probability density ,
[TABLE]
with
[TABLE]
where and . We used here ‘complex noise’ () for presentation purposes; below we specialise to real noise (), as advocated earlier Aarts:2009uq ; Aarts:2011ax .
Averaging over the noise, the evolution of holomorphic observables is governed by the equation
[TABLE]
with
[TABLE]
where in the last step we used the Cauchy-Riemann equations, i.e. holomorphy of , and .
The consistency of the complex Langevin method with the original problem hinges on the quantity
[TABLE]
which is supposed to interpolate between
[TABLE]
and
[TABLE]
where is the complex density evolved according to
[TABLE]
Here it is necessary to choose the initial density positive, typically a -function.
Correctness of the CL method requires that the two quantities and are equal, i.e. , at least as . To show this equality, in Ref. Aarts:2011ax it was argued that
[TABLE]
this required that formal integration by parts, without possible boundary terms, is correct.
For holomorphic actions, this requires care in the imaginary directions, . Slow decay, for instance power-law decay in polynomial models, does not allow partial integration to be carried out for all holomorphic observables without picking up contributions at the boundary. On the other hand, if the distribution is strictly localised in a strip in the complex configuration space, no boundary terms will appear and the results from the CL simulation can be justified, see for instance Ref. Aarts:2013uza for an explicit example.
In the case of a meromorphic drift, the topic of this paper, we have to introduce two boundaries: one at large and near the location(s) of the pole(s), which we denote generically as . Let us first reconsider Eq. (2.12): by definition we have
[TABLE]
We may consider a single trajectory starting at , which means choosing
[TABLE]
We then find
[TABLE]
This is well defined. Furthermore, provided , the time-evolved observable is holomorphic for . However, according to Eq. (2.8), we have to expect that has an essential singularity at , since formally
[TABLE]
and each term of the series in general will produce a pole of higher order. This is the first finding.
Now let us look at Eq. (2.14): For simplicity we assume that there is only a single pole at and consider a one-dimensional configuration space. Integration by parts can be used at first only for the domain
[TABLE]
in which the dynamics is nonsingular; later we have to take the limits and . The first integral in Eq. (2.14) is of the form
[TABLE]
where is the ‘probability current’
[TABLE]
with
[TABLE]
Using the divergence theorem (Gauss’s theorem) one finds that the first integral is equal to
[TABLE]
where stands for the line element of the boundary and denotes the outer normal. The boundary has 3 disconnected pieces: two straight lines at and a circle at . We assume now as usual that has sufficient decay so that the contributions from disappear for . Then the question remains if the circle around gives a nonvanishing or even divergent contribution.
Numerically it has been found that always
[TABLE]
and furthermore that vanishes at least linearly with the distance from , with some angular dependence. But the expected essential singularity of the evolved observable at could lead to a finite or even divergent contribution as . Numerically, however, we never found divergent behaviour, so presumably the boundary terms are finite. But they may be nonzero, spoiling the proof of correctness. This is the second finding, the appearance of boundary terms, similar to the ones that may appear at .
Let us apply integration by parts a second time to the bulk integral
[TABLE]
where denotes the ‘diffusive current’
[TABLE]
The integral above is then
[TABLE]
Green’s first identity (also a consequence of the divergence theorem) says that this is equal to
[TABLE]
The discussion of the new boundary terms is almost identical to the one above; again what happens depends on the detailed behavior of near .
In practice we found (numerically) no indication of any divergence caused by the existence of an essential singularity of .111There are exceptions to the claim that a meromorphic drift will cause an essential singularity in , but unfortunately they are nongeneric. One example is discussed in App. A. The reason for this seems to be that both and have nontrivial angular dependence. In Sec. 3.2 we discuss a probably typical situation in which vanishes identically in two opposite quadrants near the pole. So if shows strong growth only in those quadrants, the product may well be integrable, i.e. the boundary terms near the pole remain bounded.
To summarise, we find that the time-evolved observable will generically have an essential singularity at the pole, which, however, is counteracted by the vanishing distribution. Concerning the justification, partial integration at the boundaries now also includes integration around the pole, which requires the distribution to vanish rapidly enough for partial integration to be possible without picking up boundary terms. In the following section, we will study this first in simple models, focussing on the essential elements.
3 Poles: inside or outside the distribution
From the formal derivation in the previous section, it is clear that the essential question concerns the interplay between the pole (and observables evaluated close to the pole) and the equilibrium distribution. Logically there are three possibilities:
poles are outside the distribution; 2. 2.
poles are on the edge of the distribution; 3. 3.
poles are inside the distribution.
It can be expected that in the first case poles are not dangerous, as they are avoided in the Langevin process (possibly after thermalisation). What happens in the second and third possibility is not a priori clear. In this section we will discuss each of these cases using simple zero-dimensional models, with the aim of extracting insight that can be carried over to more complicated theories, including QCD. Some additional remarks on simple models with poles are given in App. A and B.
3.1 One-pole model
The simplest model of a system with a pole is given by the density on ,
[TABLE]
where we take real. When is real, the weight is real as well, but the model has a sign problem for odd , while for even the zero in the distribution may potentially lead to problems with ergodicity. When is complex, the weight is complex of course.
The complex drift appearing in the Langevin process is given by
[TABLE]
While the original weight vanishes at , the drift diverges and is hence meromorphic. We will refer to this model as the “one-pole model”. Special cases (with ) have been considered long ago Salcedo:1993tj ; Fujimura:1993cq , while recently this model has been studied again, in particular for a large range of values of Nishimura:2015pba . Our focus is somewhat different; we are mostly interested in the interplay between the location of the pole and the distribution and, for real , the difference between and .
This model captures the presence of a meromorphic drift in QCD in a very rudimentary way, as follows. Consider the QCD partition function for degenerate flavours,
[TABLE]
with
[TABLE]
where in the last expression we have written the fermion determinant in terms of the eigenvalues of the Dirac operator, , which depend on the gauge field configuration, as indicated with the dependence. The drift contributing to the update of link will now have a contribution from the fermion determinant as
[TABLE]
where denotes the derivative. When goes to zero (and the determinant vanishes), the drift has a pole. In the one-pole model, the complicated dependence of on is replaced by a simple pole located at , i.e.
[TABLE]
In QCD, the links are of course fluctuating and the dependence is considerably more complicated. The relation between the number of flavours () and the order of the zero () depends on details of the fermion determinant.
3.1.1 Strips in the complex plane
To continue, we allow the location of the pole in the drift to be complex in general and take real and positive. The drift has fixed points () at
[TABLE]
A fixed point is attractive (repulsive) if . We find
[TABLE]
and hence, for real or imaginary , this yields
- (a)
real both fixed points are real and attractive;
- (b)
imaginary, complex: both fixed points are attractive;
- (c)
imaginary, imaginary: the fixed point closer to the real axis is attractive, the other one repulsive.
In order to find where the pole is with respect to the equilibrium distribution , and be able to discuss the three cases above (pole is outside, on the edge or inside the distribution), we note the following. The drift in the imaginary direction is given by
[TABLE]
Without loss of generality we take . Hence it immediately follows that the drift is pointing downwards when and upwards when . In the case of real noise (which we use from now on), this implies that the equilibrium distribution will be nonzero only in the strip Aarts:2009uq ; Aarts:2013uza . Hence generically in this model the pole will be on the edge of the distribution. Moreover, since the distribution is strictly zero outside the strip, partial integration at is not a problem and therefore this aspect of the justification is under complete control.
Following the analysis of Refs. Aarts:2009uq ; Aarts:2013uza , we can in fact derive a stronger result. It follows from the FPE that the equilibrium distribution has to satisfy the condition
[TABLE]
Since , it follows that if has a definite sign as a function of for given , has to vanish for this value. Following exactly the same steps as in Sec. 4.2 of Ref. Aarts:2013uza , we find the following. As a function of , has an extremum at and the value at the extremum is given by
[TABLE]
The zeroes of , at
[TABLE]
determine the presence of additional boundaries at , provided they are real Aarts:2013uza . We find that
- a)
: no additional boundaries;
- b)
: additional boundaries at , when .
This situation is sketched in Fig. 1.
In the latter case, no conclusion from this argument can be drawn regarding the strips and . However, an additional analysis of the classical flow pattern shows that the strip is an attractor, while the strip can only be visited when the process starts at . The drift inside this strip is pointing mostly towards and hence this region will eventually be abandoned. It will therefore at most be present as a transient.
We conclude that in this model the pole is either on the edge of (case a) or outside (case b) the distribution. In the following we address each of these possibilities.
3.1.2 Ergodicity and bottlenecks for real poles
We first discuss the real case, with , since this allows us to introduce the concept of a ‘bottleneck’, which will turn out also be relevant for the complex case. In this case, the distribution is real, but with a sign problem for odd . As follows from the analysis above, both fixed points are attractive and the equilibrium distribution lies on the real axis. This is illustrated in Fig. 2 for and . We note that close to the pole, the drift is repulsive along the real direction and attractive along the imaginary direction; it is easy to see that this is true in general.
In the limit of continuous Langevin time, trajectories of a real Langevin process will not cross the poles naga . This leads to a ‘separation phenomenon’, a point made some time ago flower . In an actual simulation, because of the finite step size, crossing of the poles may happen (depending on the step size) Fujimura:1993cq . It is instructive to look at the corresponding stationary Fokker-Planck equation (on the real axis)
[TABLE]
Clearly is a solution, but wherever there is a sign problem, it cannot be the stationary probability distribution, since should be nonnegative. Instead we find two linearly independent, nonnegative solutions:
[TABLE]
any linear combination of and with nonnegative coefficients is likewise a possible long time average. Hence the Fokker-Planck Hamiltonian has two ground states.
If the simulation manages to slip through the barrier sufficiently easily, we expect to get
[TABLE]
i.e. the phase-quenched model, as already found in Ref. Fujimura:1993cq . We have verified this numerically for . One way to cross the bottleneck and facilitate tunneling through the pole is by adding a small amount of imaginary noise. However, the drift (3.2) is insensitive to sign changes in and the phase-quenched result is recovered. We conclude that the Langevin process cannot give correct results for odd .
For even , there is no sign problem, but the lack of ergodicity exists as well. In this case, because of the stronger repulsion away from the pole, our simulations typically do not cross the pole, and hence produce incorrect results when started on one side of the pole. In this case, adding a small imaginary noise term does facilitate the crossing and leads to correct results.222For the special case the symmetry allows one to start the process with equal probability on either side of the pole and obtain correct results as well.
In conclusion, we find that zeroes in the distribution lead to a bottleneck and hence ergodicity problems. Whether this zero is crossed depends on the order of the zero: the higher the order, the more difficult the crossing is. We will see that the same is true in the complex case, even though it is easier to go around the pole in the complex plane in that case.
3.1.3 Poles outside the distribution
We now consider the complex case and take () and three values: . The relevant parameter determining the distribution is , which takes the values 5/2, 5/4 and 5/6 respectively. Hence for the distribution is confined to the strip and the pole is outside the strip, while for the other values the distribution touches the pole and .
The classical flow diagrams are shown in Fig. 3, for and . It is easy to see from the flow patterns that the general conclusions apply. For completeness, the corresponding thimbles333In short, (stable) thimbles correspond to deformations of the original integral: they emerge from the classical fixed points and along the thimbles the imaginary part of the weight is constant Cristoforetti:2012su . Thimbles may end at singularities of the drift Aarts:2014nxa . are shown in Fig. 4. Here the full (blue) lines are the stable, contributing thimbles and the dashed lines are the unstable, noncontributing thimbles. We note that at the unstable thimble for the lower fixed point is the stable thimble for the upper fixed point. At the lower value the thimbles meet at the pole, while at the higher value the stable thimble avoids the pole, consistent with the Langevin analysis.
We first consider the case . The histogram for is shown in Fig. 5 and is confined between , as it should be. The results for the observables () from a complex Langevin simulation are shown in Table 1; we observe excellent agreement with the exact result. It is clear that this is in line with the formal derivation. We hence state the following
Proposition: If the drift is such that the equilibrium distribution is confined to a simply connected region not containing any poles of the drift, the complex Langevin process converges to the exact results.
3.1.4 Poles on the edge of the distribution
We now turn to and , with the pole at the edge of the distribution. The corresponding histograms for are shown in Fig. 6 and the results for are listed in Table 1. Here we note that the Langevin results for are wrong, while the results for appear to be correct (within the error). To understand this better we employ two methods.
First we note that the histograms look quite different. At the distribution is nonzero very close to the pole, which one expects yields boundary terms in the formal justification, which invalidate the outcome. On the other hand, at the distribution is peaked predominantly away from and the pole appears to be avoided. We make this more precise by computing the partially integrated distribution
[TABLE]
The results are shown in Fig. 7 on a linear scale (left) and on a logarithmic scale (right). On a linear scale it is easy to see that at , is nonzero up to and goes to zero linearly (at the pole the distribution is zero of course). Based on the formal justification, we conclude that this slow decay invalidates the applicability of the approach. On the other hand, at the distribution appears to drop exponentially in an extended interval , possibly with two exponentials. Hence expectation values of polynomials can be computed safely, as illustrated in Table 1.
For , it can be seen that there is a nonvanishing boundary term around the pole. Instead of a small circle surrounding the pole at we may consider a horizontal line approaching the pole for . Then the boundary term in Eq. 2.23 becomes (for )
[TABLE]
The smooth terms can be replaced by their values for and a boundary term arises because
[TABLE]
Next we try to elucidate in more detail what is causing success or failure. For this purpose let us remember that in Ref. Aarts:2011ax we established a criterion for correctness that went as follows: if the consistency conditions hold for ‘all’ observables and a bound of the form
[TABLE]
holds, then the process produces correct results. Since for the original complex integral such a bound obviously holds, it is a necessary condition for correctness. Now the consistency condition simply expresses the fact that we have reached convergence, so it should be satisfied; the bound Eq. (3.19), however, may fail. We can see from the CL simulation that Eq. (3.19) apparently fails for , but not for the other two values. In order to see this, define
[TABLE]
These functions are related to the expectation values of by
[TABLE]
In Fig. 8 we show the functions for integer values . In all three cases the shape of the functions seems to stabilise with growing , whereas there is approximately constant shift upwards with . This suggests the following asymptotic behaviour,
[TABLE]
with some constant , so
[TABLE]
How can this remain bounded for ? The only possibility is that the Fourier transform decays exponentially; this will be the case if is analytic in a strip . In particular has to be smooth. Looking at Fig. 8 one can see clearly that for is developing a kink, wheres in the other two cases it at least appears to be smooth and the effect of the pole appears to be negligible. Hence we may conclude that the incorrect convergence is due to the failure of the bound (3.19).
To summarise the findings in the one-pole model, we conclude that if close to the pole the distribution drops to zero fast enough, e.g. exponentially in the case considered here, the meromorphicity of the Langevin drift is not necessary an obstacle and correct results can still be obtained. When on the other hand the distribution is not falling rapidly at the pole, incorrect convergence is observed.
3.2 U(1) one-link model
In order to analyse what happens when poles are inside the distribution, we switch to the following U(1) integral with a complex weight,
[TABLE]
This model was introduced in Ref. Aarts:2008rr (for ) as a toy model for QCD, with a complex ‘fermion determinant’
[TABLE]
satisfying . Complex Langevin dynamics was studied extensively in Ref. Aarts:2008rr for , while problems for were first reported in Ref. Mollgaard:2013qra . Subsequently thimbles were analysed in Ref.Aarts:2014nxa .
When the weight is positive when , while for there is already a sign problem at . Concerning Langevin dynamics, we note that good results are obtained when (and not too large and negative), while problems emerge for and not too large Mollgaard:2013qra ; Aarts:2014nxa . It should be noted that in view of the later sections even values of can be physical as the QCD determinant has double zeroes when the Wilson fermion formulation is used.
The complex drift reads
[TABLE]
When there is an attractive fixed point at and repulsive fixed points at , with poles located at , where . When , poles are at , with . We start with a brief discussion of three sets of parameters, all with :
- (1)
: pole at ;
- (2)
: poles at ;
- (3)
: poles at .
Results of CL dynamics for the observables () are shown in Fig. 9. For set (1), we observe good results, except when is large and negative, . For those values, fluctuations are large and increasing the simulation time does not improve this, a sign of non or poor convergence. For set (2), excellent agreement with exact results is obtained. For set (3), we observe agreement for large and positive , but increasingly worse behaviour as is reduced. The results for have larger errors, but the values of the averages are robust as the Langevin time is increased, hence here we find incorrect convergence. Since for our choice of parameters the poles are located at , we note that exponentials with () will be less (more) sensitive to the presence of the poles, as a suppression (enhancement) with () arises naturally. This is indeed supported by the data.
In the following we focus on the case where and , since this is where complex Langevin dynamics converges, but possibly to an incorrect result. Moreover, we will compare and 4.
3.2.1 Poles inside the distribution
We consider parameter set (3), with and and 4. Classical flow diagrams are given in Fig. 10 for (note the periodicity in ). Besides the attractive ‘perturbative’ fixed point at , there is an additional attractive fixed point at . The other two fixed points are repulsive. It is clear to see from the flow diagrams, and can be confirmed following a similar analysis as above, that the equilibrium distribution will be contained in a horizontal strip between the two attractive fixed points. Finally, the pole is attractive in the imaginary direction and repulsive in the real direction (as always), making the pole an approximate bottleneck, just as in the real case considered in Sec. 3.1.2. Hence, as the attractive fixed points move closer together in the imaginary direction, the distribution gets narrower and narrower.
In Fig. 11 we show logarithmic contour plots of the equilibrium distribution sampled during the CL process in the complex plane for (left) and 2 (right). Note that the darker colours correspond to the most frequently visited regions. The position of the pole can clearly be identified as the place where the distribution is pinched, resulting in a bottleneck; this effect gets stronger with increasing . The distribution is strictly zero outside the strip set by the attractive fixed points.
To better understand this structure, we note that the approximately disconnected regions (i.e. the ‘head’ and the ‘ears’ in Fig. 11) are characterised by the sign of the real part of the determinant,
[TABLE]
and hence we will refer to them as ,
[TABLE]
with the ‘head’ and the ‘ears’. For , we observed frequent crossings between the two regions. For , the crossings are rarer but still frequent enough such that both regions are visited during long runs. This might, however, be due to the finite time step. In the continuous time limit it is possible that the two regions that are not connected by the process, i.e. the process might not be ergodic. Of course rare crossings make it hard to collect good statistics. For (not shown) no crossings were observed and the distribution only has support in .
To translate these findings to an observable easily accessible also in more complicated models and lattice theories, we consider the complex determinant. Logarithmic contour plots of are shown in Fig. 12. We observe a similar structure, with the zero of acting as the bottleneck. We will use this diagnostics in the more complicated models discussed below.
In view of the formal justification, see Sec. 2, it is important to know the rate at which the distribution goes to zero at the pole. This is shown in Fig. 13 for the partially integrated distribution (left) and the real part of (right). For we observe a linear decrease at the pole (recall that ), while for the decay is faster. For , the pole is not crossed and the entire dynamics takes places in . Since the pole does not negatively influence the dynamics in this case, we expect good agreements with the exact results, although there may be problems with ergodicity, similar to the real case. This is demonstrated in Fig. 14, where Re is shown on a logarithmic scale, for 10 values of . For we find approximate agreement, especially for close to 0. For , good agreement is seen for all values considered. This is consistent with the formal derivation: for the pole is avoided and only the region sufficiently far from is relevant.
Finally, we stress once more that further support for the validity of the formal arguments comes from the observed interplay of the observables and the pole: it is possible that for some observables good agreement is found, while for others it is not. This crucially depends on the region in configuration space most relevant for the observable under consideration, as exemplified in this model by the observables , with .
3.2.2 What does the CL simulation actually compute?
In order to further understand the relevance of the contributions from the nearly disconnected regions , we have analysed the results from Langevin simulations for by separating the trajectories based on the sign of the real part of . The results are summarised in Table 2 in the columns labeled CL[]. We note that the results obtained when restricted to are close to the exact results, listed in the first column, but not quite equal.
We can understand this as follows: first we shift the contour of integration of the original integrals to go through the zeroes of . For set (3) this means Im . Next we split the integration into two contributions coming from the two inequivalent paths connecting the zeroes, one living in and the other in , and define
[TABLE]
and similarly
[TABLE]
The exact results, restricted to , are shown in Table 2 in the columns labeled exact[]. The agreement between the restricted Langevin and exact results is convincing. This should not be surprising, since the formal proof of correctness provided earlier is directly applicable to the model restricted to or .
Since the exact values for the full model can be obtained as
[TABLE]
a way to obtain the correct results would be to combine the restricted simulation results with the weights
[TABLE]
Note that since , the deviation between full results and those restricted to is on the order of a few percent as well, as illustrated in Table 2. The problem with this prescription is of course that in realistic models the weights are not known. However, below we will see that typically is tiny and can be approximated by zero; here it is nonnegligible because we chose the rather extreme value .
In the case of , the process never crosses into , which indicates a lack of ergodicity, similar to what was found in Sec. 3.1.2. The process simulates a version of the original integral restricted to a path running between the zeroes, which is not quite equal to the full integral. This causes a tiny systematic error which is, however, not visible in the data since it is highly suppressed; for , .
4 Lessons from simple models
The following lessons can be learned from the simple one-variable models.
Lesson 1: It has been suggested Mollgaard:2013qra ; Mollgaard:2014mga that the winding of the Langevin paths around the pole is the source of the problem, because the pole corresponds to a logarithmic branch point in the action. However, in the one-pole model of Sec. 3.1 we have demonstrated explicitly that no such winding occurs, since the pole lies either on the edge or outside the distribution. Nevertheless wrong results can be encountered. Further indication that it is not the winding which matters has been given in Ref. Nishimura:2015pba , see also Sec. 6 for the case of full QCD.
Lesson 2: It has been said (see for instance Ref. Nishimura:2015pba ) that it is sufficient for correctness if the distribution is ‘practically zero’ at the pole. Using again evidence from the one-pole model, we note that this is not correct in general: for small wrong results are obtained, but vanishes at the pole. On the other hand, we have shown (and demonstrated numerically for ) that it is sufficient for to be nonzero only in a simply connected region whose closure does not contain the pole(s). The intermediate case seems to have at least a distribution vanishing at very high (maybe infinite) order at the pole, also leading to good results. All this can be understood in the light of the fact discussed in Sec. 2: the observables evolving according to Eq. (2.8) typically develop an essential singularity at the location of the pole of the drift.
Lesson 3: A strong attractive fixed point sufficiently far from any poles of the drift leads to correct results. This almost obvious fact has been observed already earlier, e.g. in QCD with static quarks Seiler:2012wz .
Lesson 4: The existence of a ‘bottleneck’ between two regions and , such as in the U(1) one-link model of Sec. 3.2, is a signal for potential trouble. The best variable to analyse this is the determinant (not raised to any power), because it can also be used in more complicated lattice models, as we will see below.
Lesson 5: It is possible that the relative weight of one of the two regions is suppressed, i.e. . Then a modification of the process which includes only trajectories with Re , i.e. those contained in , or using long runs such that the weight of runs in is naturally suppressed, seems to produce reasonably good results. On closer inspection, however, it only gives approximate results, since only one part of the original complex integral is represented, namely the part contained in . However, if indeed , this may give a numerically accurate approximation to the complete problem.
Lesson 6: The effect of increasing the strength of the pole by increasing is twofold: On the one hand the ‘pull’ in the imaginary directions towards the pole is increased, which is bad; on the other hand the ‘push’ in the real directions away from the pole is strengthened, which is good.
In the one-pole model, with the pole on the imaginary axis, the first effect dominates: hence increasing makes the situation worse. We have checked that for , to obtain correct results, a larger value of is needed than for .
For parameter set (3) in the model the second effect dominates: increasing makes the bottleneck between the two regions and narrower and inhibits transitions between the two regions; furthermore it reduces the relative weight of . For this bottleneck does not prevent the process from moving between the two regions; for , transitions are already rarer and it seems that each of the regions around the two attractive fixed points supports an invariant measure by itself; for no transitions are observed even for extremely long runs. It should be noted that in lattice QCD with flavours of Wilson fermions the degrees of freedom make at least .
Lesson 7: The interplay between an observable and the distribution determines how close the expectation value of the former is to the correct one: if the observable is naturally suppressed/enhanced near the pole, it is possible to obtain, within the numerical error, correct/manifestly incorrect results. This explains why one can encounter both apparently correctly and manifestly incorrectly determined expectation values in a single analysis.
We will now take these lessons and see how they apply to more realistic models.
5 Effective SU(3) one-link model
In the following section we investigate the role of the zeroes and the ensuing lessons in a system with more degrees of freedom, which is however still exactly solvable, namely an effective SU(3) one-link model. Versions of this model have been considered before, see e.g. Refs. Aarts:2008rr ; Aarts:2012ft . Here, the form of the model and the choice of parameters is motivated by QCD with heavy quarks (HDQCD), to be discussed in Sec. 6.
The starting point is QCD with flavours of Wilson fermions. At leading order in the hopping expansion, the fermion determinant can be expressed as a product of factors involving Polyakov loops at each spatial site, see Sec. 6 below,
[TABLE]
where the remaining determinant is in colour space only444Note that these expressions are valid for SU() and SL(). and are the (inverse) Polyakov loops,
[TABLE]
with the number of time slices in the temporal direction. The parameters arise from the hopping expansion and read
[TABLE]
Employing the temporal gauge we can see that the product of local factors is equivalent to having only one temporal link in each factor. Using standard relations, again valid for both SU() and SL(), the remaining determinants can be expressed in terms of the traced Polyakov loops,
[TABLE]
Explicitly, for this gives
[TABLE]
and for ,
[TABLE]
For larger the relations become more complicated but the determinant always includes a term which dominates at large (making the sign problem increasingly harmless toward the saturation regime). Notice that for SU(), . In the following we concentrate on the case.
5.1 Effective one-link model for HDQCD
To define an effective model for HDQCD in four dimensions we consider the resulting fermion determinant on a single spatial lattice site, such that and are the only degrees of freedom. Here is the remaining temporal link in the temporal gauge. To approximate the Yang-Mills integration of the lattice model we consider the temporal link surrounded by its neighbours, see Fig. 15, and replace the contributions from the staples connected to by a single matrix , such that
[TABLE]
For an ordered lattice , while for a disordered lattice GL(3,) in general.
There are various ways to proceed Aarts:2012ft . Here we diagonalise , with eigenvalues (, ). The group integral then includes the reduced Haar measure
[TABLE]
The complete one-link action to consider now takes the form
[TABLE]
where the diagonal elements of are represented by the ’s (where we took out a factor of ). The Langevin drift is determined by and complex Langevin dynamics can be implemented for all three ’s or after eliminating the constraint Aarts:2012ft . Zeroes in the Haar measure also lead to poles in the drift, but these generally do not lead to problems and, in fact, stabilise the dynamics. This has been discussed in Ref. Aarts:2012ft . More details concerning the distribution of zeroes of are given in App. C.
As observables we consider
[TABLE]
Exact results are obtained by numerically integrating over the angles . When the action is real, .
In order to determine reasonable parameter values, relevant for HDQCD, we write the fermion determinant as
[TABLE]
where
[TABLE]
with
[TABLE]
Notice that have maxima at and , respectively, with the same value independently on . While the behaviour of the model does not depend on how are parametrised, the interpretation in terms of physical lattice parameters does.
From Eq. (5.3) it follows that the interesting values of are around
[TABLE]
the critical chemical potential for onset at zero temperature, i.e. the chemical potential at which the density changes from zero to nonzero Aarts:2016qrv . This is illustrated in Fig. 16 (left), where the dependence of and is shown for given and . With increasing , decreases and the peaks shift to the left, while with increasing the peaks become narrower. We also note that the (anti-quark) contribution becomes increasingly irrelevant as increases, as becomes exponentially small. Hence we will usually neglect . In the following we use , unless stated otherwise, and . Since in the one-link model there is no transition as is varied at , see Fig. 16 (right), we choose to work at , but we have also studied larger values. We considered two values
[TABLE]
where is the corresponding critical value (5.16), corresponding to . The sign problem is (nearly) absent exactly at onset, where , , and in Eq. (5.13) is real ( is exponentially close to 1). This will explain some of the results below and has been noted before Seiler-unp ; Rindlisbacher:2015pea . The behaviour for the two values is rather similar therefore we shall only show the results for .
Finally, to study the effect of the neighbouring links, represented by , we consider two cases:
ordered lattice: ; 2. 2.
(strongly) disordered lattice: .
5.2 Ordered lattice
We first consider the ordered lattice (). Fig. 17 contains results for the observables (), averaged over 100 trajectories, using random starting points. The runs are relatively short: the total Langevin time is around 130, with 20% thermalisation. Note that and are typically rather close together. We see very good agreement, except around . The same behaviour is found at the larger . We hence focus on three values: 1.375 (below onset, CL fine), 1.425 (close to onset, CL problematic), 1.475 (above onset, CL fine).
In Fig. 18 we show results for each of those values, using 50 independent, relatively short, trajectories. The figures on the left show the observables against trajectory index. When CL is fine, all trajectories fluctuate around the exact result. However, when CL is problematic (middle figure), the trajectories appear to split in two groups, indicated by the red and blue symbols. We identify those using the minimal absolute value of the determinant on the trajectory,
[TABLE]
Trajectories with always appear to lead to correct results, while the trajectories having lead to a wrong result (for definiteness we take in the following).
To investigate these two types of trajectories further, we show in Fig. 18 (right) the corresponding scatter plots for the determinant. When CL works well, the points from all trajectories appear similarly distributed, even when gets very small. At the middle value of a different picture appears: the trajectories of the first group () give a similar picture as at the lower and higher values (the “red fish”), while the second group () yields a very peculiar structure (the “blue whiskers”). The appearance of two essentially disjoint contributions in the determinant is very similar to what was observed in the U(1) one-link model. A red/blue code for identifying the disjoint (“regular”/“deviant”) contributions is used in Figs. 18, 19, 22, and explained in the captions.
In the scatter plot we showed results for the determinant which enters in the determination of the drift. More information, however, is provided by the unsquared factors . In Fig. 19 (bottom left) we show the scatter plot of the unsquared factors at . The “blue whiskers” have and come from trajectories which approach the pole (zero of the determinant) with . As in the simple models, a bottleneck separates them from the region with . Moreover, the contributions have a very small weight. Depending on the starting configuration, trajectories may run for a while in the region with negative Re , before switching to the positive side. The contributions from the “whiskers” therefore practically fades out after enough thermalization: already after all configurations appear in the region , see Fig. 19 (top). Some of the results are summarised in Table 3.
Similar as in Sec. 3.2.2, we defined here partition functions and weights restricted to subsectors, namely
[TABLE]
where is the original complex distribution. Table 3 also contains an estimate of how much time is spent in the region with , which is denoted with . It should be noted that depends on the details of transient behaviour and crossings, and is hence not immediately related to .
The exact relative weight of the region is easily found and is . We find that the process, once arrived in the positive region, only rarely visits the negative region and typically only very briefly. Hence random starts with give that region an artificially large weight and a considerate choice of the start configuration will reduce the necessity of long thermalisation times. These findings suggest it might be useful to discard trajectories with and thus sample the region only, to ensure nearly correct convergence. We find that the value of does not need tuning, in the above case any value between is acceptable, see Fig. 20 (left). Alternatively one can keep all trajectories but drop contributions from configurations with , as not to lose statistics. This is demonstrated in Fig. 20 (right). Keeping only trajectories with leads to the similar results.
As already suggested by Fig. 18 the signal for deviant contributions also shows up in observables, as can be seen in Fig. 19 (right). In this long run and rather representative case only three trajectories out of fifty contain configurations with , which lead to outlying contributions in the observables. These few contributions falsify the averages by nearly 1%, while the average over the other trajectories reproduces the exact result within the error ( 0.1 %). The irregular signal is clear in the scatter plot of the observables, which also suggests that they are related to certain initial configurations, such that their weight will diminish in long runs.
The above effects become evident by plotting the correlation between the determinant and various other quantities. In Fig. 21 we show the histograms of the probability distributions of and of one selected observable, (top). The scatter plots (middle, bottom) show a clear correlation between and the unitarity norm , the drift, and .
5.3 Disordered lattice
Next we consider a disordered lattice, i.e. we take into account the nontrivial effect of the neighbours when the lattice theory is reduced to a one-link model, see Eq. (5.10). We take and choose the values given at the end of Sec. 5.1. Fig. 22 demonstrates the behaviour of the unsquared determinant and for , see also Table 4, which is very similar as for the ordered case. We find that trajectories starting with (“blue whiskers”) need much more time to switch to the region with (“red fish”). Nevertheless the weight of the former is only about , and discarding the contribution with decisively improves the results. The results, however, deteriorate with increasing lattice disorder, which may indicate the effect of large excursions in the noncompact directions, for which the adaptive stepsize and gauge cooling become essential. Since this was studied in previous papers, we do not analyse this problem any further here.
5.4 Expansion
Finally we study the possibility to ameliorate the dynamics using an expansion of the determinant, which is also discussed in App. D for the simple models. We restrict ourselves here to the ordered case. The fermionic part of the drift is of the form
[TABLE]
Neglecting the factor , which cancels in the drift, we write
[TABLE]
and similar for . The pole is at . We then write a Taylor expansion centred at the shifted point , such that
[TABLE]
and again similar for , with parameter . Notice that the used here differs by one unit from used in App. D. Since the pole is at , the expansion around with conveniently chosen has an increased radius of convergence compared to the expansion centred at , see Fig. 23.
The “regularisation parameters” can be chosen conveniently, and can also be adapted during the simulation (dynamical analytic continuation, see App. D). We note here that this procedure can also be used for inverting matrices , where a simple choice for the regularisation term is . In lattice QCD, this can e.g. be applied to the fermion matrix. Notice that this shift is an exact procedure and does not represent an approximation for which a subsequent extrapolation is needed. In practice the expansions are of course truncated and their effectiveness depends on the radius of convergence, which is however improved by the -shift.
In Fig. 24 results for this procedure are shown. We have tested real, imaginary and 0 (no regularisation). For the latter, the expansion shows runaways and does not converge. With imaginary , the convergence was found to be not very good. On the other hand, for real the convergence and results are excellent. In Fig. 24 (left) convergence of the expansion is shown at , i.e. the value where the “whiskers” affect the result. The expansion is truncated, , and the dependence on is shown. Excellent convergence is observed. In Fig. 24 (right), observables are shown as a function of , using a fixed , leading to good agreement with the exact results (cf. Fig. 17).
We find therefore that meaningfully regularised expansions converge to the correct results, even at values for which the standard procedure does not work. This can be understood in the following way: Choosing the shift such that the expansion point lies in the “fish” region of the scatter plots, e.g. by choosing , the trajectories explore this region and never enter the “blue whiskers” region, due to the bottleneck. Hence it has the same effect as avoiding the region altogether. A dynamical expansion which adapts when approaching the edge of the domain of convergence is easy to implement as well and will cover the full analyticity domain. In this case, however, it will also collect data from the “blue whiskers”, leading to the same wrong results as when all trajectories are used.
5.5 Discussion and tentative conclusions
As long as the parameters are below 1, the Langevin process is found always to converge to the correct results, within the error. Significant discrepancies appear for where .
Although the measure is , the relevant factor for the analysis is . The sign of appears to identify two separate regions with two different contributions to expectation values. This conspicuous situation appears close to onset. The determinant also becomes squeezed in the imaginary direction. These regions show up in the scatter plots as “red fish” () and “blue whiskers” (), the former producing good expectation values for the observable but the latter leading to strongly deviant contributions. This outlying behaviour is also visible in the scatter plots of the observables directly sensitive to the pole or the fermionic degrees of freedom, which may provide a practical test in realistic lattice simulations at no cost, as the gauge invariant observables are calculated anyway. The clear correlation between the region and the outlying contributions to various quantities is shown in Fig. 21, which also illustrates the small weight of these regions.
At least in the examples analysed here the two regions appear separated by a bottleneck which can only be crossed by trajectories approaching below a certain threshold. The approach to the pole can therefore signal the possible sampling of “deviant” contributions from the region. The latter has a significantly smaller weight, which may only appear as a quasi-transient whose contribution for very long Langevin trajectories is extremely small. This suggests that discarding the contributions from the region with will automatically lead to good results. Long enough thermalisation times, or adequate starting points in the region can help by inhibiting the development of these regions. The expansion, on the other hand, seems to do just that if we set the expansion centre in the region of , leading to good results via this approach. We note that the appearance of the bottleneck, separating the complex configuration space into two regions, with , is consistent with the findings in the U(1) model.
Larger and/or larger show a picture consistent with the above one, supporting these conclusions, see Fig. 25. Larger seems to increase the transient character of the “blue whiskers” region; for only one out of 50 trajectories enters this region and it yields only a very small contribution.
We conclude that the development of regions signals failure in the simulation: although the weight of these regions is small, their contributions deviate strongly from the ones with and hence they may affect the results by many standard deviations. Dropping, one way or the other, those contributions leads to results agreeing with the exact ones, at the level of the statistical errors (at the permille level in these runs). The fact that the process fails to account correctly for the region with at certain parameter values suggests, however, that we should attribute a possible systematic error proportional with the weight of this region, which is in the examples studied here.
We may try to quantify the relevance of the region by calculating the logarithm of its relative weight, , using the exact integral expressions. As can be seen in Fig. 26 (left), appears bounded from below, and even increasing with for some values far from . The bound is about lower on the disordered lattice but shows similar behaviour. With increasing number of fermion species, the difference in free energy scales approximately with the number of flavours, , see Fig. 26 (right). Since in HDQCD the lattice determinant is a product of factors of the form over the spatial lattice, we may ask how the spatial volume would manifest itself in . Fully correlated Polyakov loops would then behave as if in the presence of many flavours, but the general case is not trivial and will be discussed in the next section.
Extrapolating the lesson from this discussion to realistic QCD lattice calculations, we conclude that one has to monitor the appearance of disconnected regions with a bottleneck at . This can be done by monitoring various quantities such as some selected observables, the drift or the lowest determinant modes avoiding time consuming procedures. Dropping by hand the occasional contributions of regions of type “blue whiskers” () should already produce good results, while an estimate of the relative impact of such contributions would suggest a measure for possible systematic errors.
6 Lattice QCD
In the following section we aim to apply the lessons found above to the case of QCD at nonzero quark density, first in the case of heavy quarks (HDQCD) and then for full QCD, using the staggered fermion formulation.
In QCD the partition function is, after integrating out the quarks fields, written as
[TABLE]
where is the Yang-Mills action, are the gauge links, and is the fermion matrix. The Langevin update for the gauge links reads PhysRevD.32.2736 , in a first-order discretised scheme, with Langevin time ,
[TABLE]
where are the Gell-Mann matrices, normalised as , and the sum over , is not written explicitly. More details can be found in Refs. Aarts:2008rr ; Seiler:2012wz ; Sexty:2013ica . The drift is generated by the action and reads
[TABLE]
Hence the zeroes of the determinant show up as poles in the drift.
6.1 Heavy dense QCD
To assess the importance of these poles, we need to specify the fermion matrix. We consider first heavy dense QCD. This approximation to full QCD can be obtained by a systematic hopping-parameter expansion of the fermion determinant, preserving terms that cannot be ignored for large chemical potential, as well as the terms related by symmetry under . For Wilson quarks, this amounts to an expansion in terms of , keeping fixed and preserving terms that go as . A detailed discussion can be found in Refs. Bender:1992gn ; DePietri:2007ak ; Aarts:2014bwa , see also Refs. Fromm:2011qi ; Fromm:2012eb for combined hopping and strong-coupling expansions.
Here we consider the resulting theory at leading order, using degenerate quark flavours, for which the fermion determinant reads Aarts:2008rr (see also Sec. 5)
[TABLE]
The remaining determinant is in colour space only. The parameter arises from the hopping expansion ( the number of time slices in the temporal direction) and are the (inverse) Polyakov loops, see Eq. (5.1). Note that the gluon dynamics is included in Eq. (6.3) via the usual Wilson Yang-Mills lattice action, with gauge coupling .
In order to study the zeroes of the determinant, we identify the basic building block of determinant, defined such that the full determinant in the path integral weight is written as
[TABLE]
Here the local determinants are
[TABLE]
where and
[TABLE]
We study the zeroes of the local determinants rather than the full determinant, since this is what is closest to the analysis carried out above and will allow us to focus on individual factors getting small.
HDQCD has been used extensively to justify the results obtained with CL, e.g. via reweighting Seiler:2012wz ; Aarts:2013uxa ; Aarts:2016qrv . Since reweighting and CL have very different systematic uncertainties, the agreement of the results obtained by both methods is a strong argument for the correctness of either approach. In particular, since reweighting does not suffer from potential problems caused by zeroes of the determinant, agreement indicates that the latter do not cause problems for CL.
We note here that HDQCD has also been used to test and compare variations of the hopping parameter expansion to higher order Aarts:2014bwa , in particular with regard to the standard hopping expansion, obtained via Seiler:2015uwe
[TABLE]
for which zeroes of the determinant do not appear. An expansion in spatial hopping terms only, for which HDQCD is the leading-order term, has been discussed and assessed in Ref. Aarts:2014bwa .
We now discuss some results in HDQCD, focussing on potential zeroes of the determinant. We mostly use the following choice of parameters
[TABLE]
Note that the lattice spacing (or gauge coupling ) and the spatial volume () are fixed, but we consider two temperatures (). In this theory, the quark number at zero temperature changes from 0 below onset to saturation above onset, with , and the critical chemical potential is given by
[TABLE]
At nonzero temperature, this transition is smoothed and the critical chemical potential , eventually connecting to the thermal confinement-deconfinement transition line at higher temperature and lower chemical potential. A study of the phase diagram at fixed lattice spacing can be found in Ref. Aarts:2016qrv . In Fig. 27 we show the density, in units of saturation density, for the parameters in Eq. (6.9) on the lattice. The rapid rise around is indeed observed.
Writing the determinant as a product of its absolute value and phase,
[TABLE]
and using the symmetry
[TABLE]
we can extract the average phase factor in the full (i.e. not in the phase-quenched) theory via a computation of Aarts:2008rr
[TABLE]
which is accessible using CL dynamics. The result is shown in Fig. 27 as well. The sign problem is severe in the onset region. At the critical chemical potential, the fermions are at ‘half-filling’, i.e. half of the available fermionic states are filled, and the theory becomes particle-hole symmetric Rindlisbacher:2015pea . Exactly at , the sign problem becomes very mild, as the first dominating factor in Eq. (6.6) becomes real (). The sign problem due to the second factor is very small, since .
To investigate the zeroes of the measure in HDQCD, we have analysed , see Eq. (6.6). Note that the corresponding factor in the effective SU(3) model was discussed in Sec. 5. We find that the simulations largely avoid the zeroes of the determinant, except in the vicinity of the critical chemical potential. This is illustrated in Fig. 28 (top), where a histogram of the absolute value of the local determinant is shown, on a double-logarithmic scale for two lattice sizes. For small chemical potential, the absolute value of the determinant is close to 1, as . As is increased, the distribution widens and its maximum shifts towards larger values. However, we also observe that the distribution is nonzero for smaller values, with an apparent power decay towards zero. Based on these simulations, we find the following behaviour
[TABLE]
Values close to zero are more likely when the chemical potential is close to the critical value, but remain suppressed. This behaviour is seen for both lattice sizes, and , with a broader distribution on the larger lattice.
Also shown in Fig. 28 are the distributions for the phase angle , defined via
[TABLE]
Note that and that the distributions are symmetric around zero. Away from the critical chemical potential, the distribution drop to zero rapidly; around we observe a decay . The relation between this phase and the phase of the full determinant is not immediate. However, we note that in general it is expected that the latter will vary rapidly, as the full determinant is a product of local determinants.
To compare with the results presented in the previous sections, we show in Fig. 29 the probability density of the local determinants on a logarithmic scale, for two values of the chemical potential close to , namely (left) and 1.425 (right). Note the very different horizontal and vertical scales. These distributions look remarkably similar to those encountered in the simpler cases, although with only a very thin presence of the ‘whiskers’, if at all. We find that at the critical point the distribution is highly elongated in the positive real direction, and that it shrinks again for . Exactly at , the distribution shrinks in the imaginary direction, leading to a much smaller typical phase of the determinant, and thus a milder sign problem. This explains the milder sign problem at , as observed via in Fig. 27. There are some configurations where Re , but these appear very infrequent and do not carry substantial weight.
However, at lower temperatures a clear sign of the whiskers appears, which indicates the possibility of contamination from configurations with Re . This is illustrated in Fig. 30, where we show results at a fixed lattice spacing, on a and lattice, such that the temperature is twice as low on the lattice. Close to , the weight of the region with Re is approx. 0.005% on the lattice and 0.08% on a lattice, indicating a growing importance as the temperature is lowered. At the lower temperature, the “whiskers” are remarkably similar to those encountered in the SU(3) one-link model. In Fig. 31 we aim to reduce the lattice spacing, while keeping the physical volume and the temperature constant. Here we see that the power decay towards zero remains approximately the same and hence the role of configurations with a small absolute value of the local determinant does not change when going (somewhat) closer to the continuum limit. We also investigated whether changing influences the appearance of the whiskers. While using a larger is beneficial, configurations with determinants in the whiskers do appear at low temperatures. Finally we have studied the volume dependence of the weights , signifying the importance of the regions , as suggested by the analysis of the one-link models, but did not find a clear suppression of the region with .
We hence conclude that the zeroes of the determinant for HDQCD appear to have no effect except close to the critical chemical potential. Since for those chemical potentials the sign problem is quite mild, this observation is not related to the severeness of the sign problem but to details of the CL process. Although the zeroes might influence the results in this region, the relative weight of configurations with Re is quite small and their influence is therefore suppressed.
6.2 Full QCD
Finally, we consider full QCD, with dynamical fermions. We note that full QCD is numerically much more costly than HDQCD, as the inversion of the fermion matrix has to be carried out numerically for every update. Similarly, an assessment of the zeroes of the determinant is harder, since it requires the computation of the full determinant. It should be noted that the determinant itself is not required for the Langevin update, only evaluated on a fixed vector Sexty:2013ica .
Here we show results obtained using staggered fermions, with the (unimproved) staggered fermion matrix
[TABLE]
where are the staggered sign functions, , , , . Note that this formulation describes four tastes, due to the fermion doubling.
There are several indications for full QCD that, at least at high temperatures and for the quark masses considered, the CL simulations are unaffected by poles in the drift: these include comparisons to systematic hopping-parameter expansions, which have holomorphic actions Aarts:2014bwa , comparisons to reweighting, which does not depend on the action being holomorphic Fodor:2015doa , and by observing spectral properties of the fermion matrix Sexty:2014dxa (see also below). At low temperature, it is at present not known whether poles affect the CL results, partly because simulations are more expensive due to the larger values of required, or are hindered by the ineffectiveness of gauge cooling on coarse lattices.
In order to study the phase of the determinant in full QCD we start from an initial configuration on the SU(3) submanifold. After thermalisation we then follow the evolution of the phase. The results are shown in Fig. 32, for a and a lattice. On the smaller volume and for the smaller chemical potentials and 1.2, one observes very mild time dependence with no winding of the phase around the origin. For the larger chemical potentials and 3.2, however, we see frequent crossings of the negative real axis. This signals that there is a sign problem in the theory which is hard to counter with reweighting, as the average phase factor gets close to zero. As expected, this behaviour gets worse as the volume is increased.
The corresponding histograms are shown in Fig. 34, for three different spatial volumes, namely , with fixed . As expected, the distribution is localised on the smallest volume, but gets increasingly wider as the volume and/or the chemical potential are increased.
To relate this behaviour to possible zeroes of the determinant, we have computed the eigenvalue spectrum of the Dirac operator for typical configurations in the ensemble, using the same setup as in Fig. 32. Fig. 35 contains the spectra, while Fig. 34 contains histograms of the absolute values of the eigenvalues, obtained by averaging over 100 configurations. We note that in spite of the frequent circlings of the origin by the fermionic determinant there are typically no eigenvalues close to zero, suggesting that the probability density of configurations around the singularities of the drift is very small, as in the simpler models discussed above. The change of the total phase is given by the sum of the changes over all the eigenvalues, so in contrast to toy models the frequent crossing of the negative real axis does not suggest that the poles are affecting the CL dynamics. We note that the increase of the volume, from to , leaves the shape of the spectrum very similar, but increases the density of eigenvalues.
To summarize our findings, in full QCD at high temperatures the singularities of the drift appears to be outside the support of the probability density of configurations. We conclude therefore that in this situation complex Langevin dynamics provides correct results, in line with the formal arguments and the lessons from the simple models. What happens at lower temperatures remains an open question.
7 Discussion
In this section we summarise the key findings and discuss them in the context of the various models. The main objective was to understand the role of zeroes in the path integral measure after analytic continuation in the context of complex Langevin dynamics, in which the zeroes show up as poles in the Langevin drift. Since the derivation of the justification of the complex Langevin approach for complex measures relies on holomorphicity of the drift, the presence of poles makes a re-analysis of this derivation necessary.
We have shown that a crucial role is again played by the behaviour of the observables considered and the (real and nonnegative) distribution, which is a solution of the associated Fokker-Planck equation and effectively sampled during the Langevin process. While for holomorphic drifts it is the behaviour at large imaginary directions in the complex configuration space that matters, for meromorphic drifts we have shown that an additional constraint arises from the behaviour close to the poles: to justify the method it is necessary to be able to perform partial integration, without picking up finite boundary terms, both around the poles and for large imaginary directions. This condition gives a requirement of fast decay of the distribution in those regions. In simple cases it is possible to verify this requirement analytically, for instance when it can be shown that the distribution is strictly zero in those regions, but for most models this has to be verified a posteriori by diagnosing the output from numerical simulations.
Besides this, we also found that time-evolved observables typically have an essential singularity at a pole. However, this is counteracted by the distribution going to zero at this pole, with nontrivial angular behaviour. This ensures that the contribution to expectation values from this region is finite, but not that boundary terms are necessarily absent.
In order to further understand and support these analytical considerations, we have subsequently analysed a number of models and theories with increasing complexity, from the one-pole model with one degree of freedom to full QCD at nonzero baryon density, with the aim of extracting common features. Logically a pole can be outside, on the edge of, or inside the distribution, and we have given examples of each of these. As expected, when the pole is outside, it does not interfere with the Langevin process. The possibility of a pole on the edge is important, since it indicates that it is not the winding around the pole that matters, but the decay of the distribution towards the pole. Indeed, we have encountered both correct and incorrect convergence in this case, and this can be traced back to the fast decay of the distribution, or lack thereof.
As a side remark, we note that further support for our analytical understanding comes from the observed interplay between observables, drift and the distribution: if the observable is naturally suppressed (enhanced) near the pole, it is possible to obtain correct (incorrect) results. This explains why in one analysis one can encounter both correctly and incorrectly determined or nonconverging expectation values.
When the pole is inside the distribution, we have found that it typically leads to a bottleneck, i.e. a region in configuration space which is difficult to pass and effectively divides the configuration space in two, nearly disjoint regions. For QCD and QCD-like models, it is zeroes of the determinant that correspond to poles and determine the location of the bottleneck. Hence the complex-valued determinant, preferably not raised to any powers (such as ), or the real part of the determinant, provide useful diagnostic observables to analyse the dynamics. We have studied a number of models in this way, namely U(1) and SU(3) one-link models and heavy dense QCD (HDQCD) in four dimensions. In all of these, we indeed observed similar behaviour: the emergence of two regions, which were denoted with and are identified by the sign of the real part of the determinant (before raising it to a power). We found that it is typically the region with positive real part that dominates the dynamics, but that excursions to can upset expectation values, even when their relative weight is suppressed. In some cases the bottleneck is particularly difficult to pass, which may occur when the order of the zero is increased. It is possible to analyse each region separately. The error made by restricting the simulation to can then be estimated and was typically seen to be small, depending on the parameters used. In the SU(3) one-link model and HDQCD, the process is affected by zeroes close to the half filling point, where the sign problem is milder. In this case the region took the form of characteristic “whiskers”: even though the relative weight of these regions was small, the contribution to expectation values could be large. Hence an exclusion of this region improves the results, but with a systematic uncertainty. In HDQCD we found indications that the role of zeroes is unchanged when the lattice spacing is decreased, while keeping the physical volume approximately constant. Finally, we considered QCD with dynamical staggered quarks at high temperature and analysed the eigenvalues of the Dirac operator. Here we noted that there are typically no eigenvalues close to zero, which suggests that complex Langevin dynamics is applicable in this part of the phase diagram.
8 Summary and Outlook
We have given a detailed analysis of the role of poles in the Langevin drift, in the case of complex Langevin dynamics for theories with a sign problem. Since the standard derivation of the formal justification relies on holomorphicity of the drift, we have revisited the derivation and shown that, besides the requirement of a fast decay of the probability distribution at large imaginary directions, an additional requirement of fast decay near the pole(s) is present. The probability distribution is typically not known a priori, but its decay can be analysed a posteriori.
We then studied a number of models, from simple integrals to QCD at nonzero baryon chemical potential, and found support for the analytical considerations. In the cases when the simulation is affected by the pole(s), we found that typically the configuration space is divided into two regions, connected via a bottleneck. For theories with a complex fermion determinant, such as QCD, the bottleneck is determined by the zeroes of the determinant. In the simple models, and even for QCD in the presence of heavy (static) quarks, this understanding is sufficient to analyse the reliability of the Langevin simulation.
In full QCD, with dynamical quarks, ideally it requires knowledge of the (small) eigenvalues of the fermion matrix throughout the simulation, which is nontrivial. At high temperature, it was shown that the eigenvalues are typically not close to zero. Hence the most important outstanding question for QCD refers to the low-temperature region in the phase diagram. Here a number of hurdles remains to be taken. When eigenvalues become very small, the conjugate gradient algorithm used in the fermion matrix inversion becomes ineffective. This is common to many problems at nonzero density, even in the absence of a sign problem, and requires e.g. a regulator. A successful approach in this case has not yet been developed. Besides this, on coarse lattices gauge cooling, which stabilises the Langevin process, is ineffective. This situation is improved on finer lattices, which are however more expensive due to the larger values of required. A definite statement on the applicability of complex Langevin dynamics throughout the QCD phase diagram can only be made once further analysis of theses issues has been brought to a conclusion.
Acknowledgements
IOS would like to thank Prof. Reimer Kühn from King’s College London for interesting discussions. GA is supported by STFC (grant ST/L000369/1), the Royal Society and the Wolfson Foundation. DS, ES and IOS thankfully acknowledge support from the DFG via projects STA 283/16-2 (EHS and IOS) and SE 2466/1-1 (DS). The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time on the GCS Supercomputer SuperMUC at Leibniz Supercomputing Centre (LRZ, www.lrz.de), as well as for providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS share of the supercomputer JURECA and JUQUEEN juqueen at Jülich Supercomputing Centre (JSC). Some parts of the numerical calculation were done on the GPU cluster at Eötvös and Wuppertal Universities.
Appendix A Second-order poles: a solvable real example
In this appendix, we discuss the special case of a second-order pole in a solvable example. Statement: When there is a pole with residue 2 in the drift (corresponding to a second-order zero in the density) and the Laurent expansion around the pole has no constant term, then has no essential singularity, only a simple pole.
A simple example is the following: consider the action
[TABLE]
leading to
[TABLE]
A simple computation then gives
[TABLE]
so that for no higher singularities are produced by to . The more general statement follows easily by computation.
This model (which is a special case of the models considered in Sec. 2.1), has some interesting features, in particular the complete spectrum of and the evolution of holomorphic observables can be determined. Because the model is real we now write instead of and drop the tilde, i.e. we consider
[TABLE]
and its dual
[TABLE]
For one can actually find the complete spectrum of : because the model is real, a well-known fact Damgaard:1987rr is that is conjugate to a self-adjoint operator ,
[TABLE]
which is, up to constant, the Hamiltonian of a harmonic oscillator. It has apparently a negative eigenvalue; this is, however, deceptive: so far we have been sloppy about the boundary conditions at . The drift is strongly repulsive away from the origin along the real axis, the probability density is vanishing quadratically there and the Langevin process does not cross the origin.
This means that mathematically we have to consider with 0-Dirichlet boundary conditions at the origin. To avoid confusion, we call the corresponding Hamiltonian . All functions in the domain of definition of have to have at least a square integrable second derivative; this means that they are continuous and vanish at the origin. The ground state wave function of (ignoring the b.c. at 0), which superficially seems to belong to a negative eigenvalue of , is not in the domain of definition (neither are all even eigenfunctions of ). Actually, because there is no communication between the two half-lines, we may as well consider the problem only on one of the half-lines . From now on we choose .
The eigenfunctions of are thus the odd eigenfunctions of
[TABLE]
with eigenvalues and such that they are normalised on ; are the Hermite polynomials. The eigenfunctions of are then found as
[TABLE]
while those of are
[TABLE]
The ground state of is thus a constant and the ground state of is
[TABLE]
as it has to be.
The first excited state of is ; it belongs to the eigenvalue . We thus find
[TABLE]
which converges to the correct expectation value
[TABLE]
for . Similarly convergence to the correct value is found for all even functions.
How about the superficially unstable mode , see Eq. (A.3)? It corresponds to the ground state of on , which is not in the domain of definition of . The odd eigenfunctions of are complete in the subspace of odd functions; hence the odd eigenfunctions restricted to are complete in and so the apparent eigenvector of with a negative eigenvalue should be considered as a convergent series
[TABLE]
with
[TABLE]
Instead of considering as a self-adjoint operator on we may consider itself as a self-adjoint operator on the Hilbert space obtained as (the completion of) the set of functions on with the scalar product
[TABLE]
The eigenfunctions , are orthogonal with respect to this scalar product. So we can write equivalently
[TABLE]
where the convergence is now to be understood in the sense of . is in , but not in the domain of definition of .
Appendix B Solutions to the sign problem for real models with poles
In this appendix we consider real models with a zero in the density and hence a pole in the Langevin drift, with a weight of the form motivated by the U(1) one-link model in Sec. 3, i.e.
[TABLE]
B.1 One pole, odd
Since the models are real, one might attempt to treat them by the real Langevin method. But a simple consideration shows that for a real model with a sign problem this cannot produce correct results. The reason is that the real Langevin equation will have a positive equilibrium measure on the real axis and thus cannot reproduce all the averages which would be obtained with a signed measure. When we modify the process to allow it move out into the complex plane, the story changes: it seems then a priori not impossible that a positive measure on reproduces correctly the averages of holomorphic observables with a signed measure on and there are examples that bear this out. A simple example is given by
[TABLE]
It is easy to verify that the correct expectation values are reproduced by the positive density in ,
[TABLE]
with
[TABLE]
This simple solution is, however, unrelated to the CL method.
B.2 One pole, even
For and even, there is no sign problem, but the lack of ergodicity exists as well. In this case, because of the stronger repulsion away from the pole, our simulations typically do not cross the pole, and so produces incorrect results when started on one side of the pole. (If there is a symmetry , this defect can easily be remedied by starting the process with equal probability on either side of the pole). Another way to facilitate the crossing and achieve correct results is by adding a small imaginary noise term.
A simple cure consists in the reweighting with the sign factors, as follows. Replace the observable by
[TABLE]
and compute by real or complex Langevin
[TABLE]
where the symbol stands for the ordinary real or complex Langevin long time average. But this cure, like any reweighting method, while it works for one-variable models, it is not very useful for lattice models. So we will not pursue it any further.
B.3 A cure for compact real models
The final cure we consider is for a real but nonpositive weight . Let be a constant such that
[TABLE]
and define
[TABLE]
Then for any observable satisfying we can rewrite as
[TABLE]
because
[TABLE]
So a correct procedure is to run real Langevin with the drift derived from the positive density and correct the normalisation as shown above.
We take the U(1) model with , such that
[TABLE]
The drift for the modified Langevin process is now
[TABLE]
A full set of observables satisfying the condition are the exponentials . For the normalisation factor we have
[TABLE]
This procedure works very well, as shown in Fig. 36. Note that in the figure, the horizontal axis is
[TABLE]
which increases as is increased ( is the original process). The observables are Re/Im with .
As expected, the numerical results start agreeing with the exact results as soon as is large enough to make nonnegative. Therefore a plot like this can also serve to determine the minimal for which correct results are obtained and a priori knowledge about the zeroes of the density is not required.
The advantage of this cure is that it can be easily generalised to the complex case ; it turns out that it does work reasonably well, but not perfectly, provided is not very large. But again, since the cure involves some reweighting, it is not very useful for lattice systems.
Appendix C Zeroes of the HDQCD determinant
In this Appendix, we further consider the HDQCD determinant for gauge group SU(3) or SL(3,), reduced to a single link . We will demonstrate that zeroes of the determinant do not come as isolated points.
As discussed in Sec. 5, the determinant contains the factors
[TABLE]
such that . For a discussion of its zeroes we may look at each factor separately. Let us first consider . The eigenvalues of can be parametrised as ; in terms of these
[TABLE]
In , parametrised by , the determinant vanishes on the three submanifolds given by
[TABLE]
But there is a different way to think about this: define
[TABLE]
and are algebraically independent and can be used instead of and to parametrise conjugacy classes of SL(3, ). The map from to is not one-to-one, since interchanging and will leave unchanged. The inverse map from to will have branch points where any two eigenvalues coincide.
The fact that ignore permutations of the eigenvalues is actually an advantage because too remains unchanged. The zeroes of are then determined by
[TABLE]
The three manifolds (1,2,3) in Eq. (C.3) are thus mapped into a single manifold given by Eq. (C.5). This manifold is an affine complex plane in the space (affine means that it does not go through the origin).
does not have to be computed by taking the inverse of ; one can instead use the identity, valid for SL(3, ),
[TABLE]
So we get, using ,
[TABLE]
which vanishes on a complex parabola.
The determinant factor is quite similar; proceeding as before we find
[TABLE]
so its zeroes are again described either by a complex affine plane in or a complex parabola in . The main point is that they are real manifolds of codimension two, so there are no isolated zeroes.
Appendix D Expansion methods
One possibility to deal with the pole is to use power series expansions in order to approximate the meromorphic drift by polynomials. In QCD the pole in the drift is always due to a zero of the determinant. In the one-pole model the role of the determinant is played by the factor .
D.1 One-pole model
We explain the approach in the one-pole model. We study two basic procedures:
- (1)
Fixed expansion: Let be the ‘determinant’ causing problems due to its zeroes. Consider the drift caused by ,
[TABLE]
In order to obtain a holomorphic approximation to , we choose a point not too far from the peak of the distribution but far enough from the pole(s). We then expand around this point to order as follows: let , then replace by
[TABLE]
Since this is a polynomial in , there is no indeterminacy when . The difference between the exact value and is , which converges to 0 if and only if . There could be problems if the process goes outside the region of convergence, but experience shows that typically the drift will tend to keep the process inside the region of convergence. An illustrative example is shown in Fig. 37. Because the expansion point is chosen once and for all, we call this the fixed expansion. In any case, by varying one can check whether this is the case. Numerical studies using this fixed expansion are presented below for the U(1) one-link model and for the SU(3) one-link model in Sec. 5.
- (2)
Dynamic expansion: one may choose different expansion points , with , in such a way that the domain of analyticity is covered, while always staying well inside the domain of convergence. The quality of the expansion can be fixed by changing to the actual configuration point whenever with some pre-chosen . If is chosen small enough we should not find any appreciable difference between the results using and . By studying various situations we find that, as expected, the dynamically expanded drift generally performs just like the unexpanded one: it works where the latter works and it fails where the latter fails.
D.2 U(1) one-link model
We now apply the expansion method to the U(1) one-link model and focus on the fixed expansion. Consider the factor
[TABLE]
appearing in . The drift caused by this,
[TABLE]
has two poles for . As described for the one-pole model we obtain a holomorphic approximation to by choosing a point somewhere near the center of the equilibrium distribution; here the natural choice is , i.e. . The Taylor expansion of around this point to order then looks as in Eq. (D.2). Again the drift from the expansion tends to keep the process inside the region of convergence, as shown in Fig. 38.
We have tested this approach numerically, using expansions to order 10 and 20, making sure that the centre of the expansion was chosen reasonably far away from the poles and near the maximum of the distribution in the region with positive , i.e. . We found the results to be more or less comparable to the ones obtained by restricting the process to , and not a great difference between orders 10 and 20.
Hence we conclude that the fixed expansion is a potential cure of the ills of meromorphic drift in the cases where restricting the process to works as well. It should be noted though that potentially significant, though much reduced deviations from the exact results remain.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1(1) S. Borsanyi, Z. Fodor, C. Hoelbling, S. D. Katz, S. Krieg and K. K. Szabo, Full result for the QCD equation of state with 2+1 flavors , Phys. Lett. B 730 (2014) 99–104 , [ 1309.5258 ]. · doi ↗
- 2(2) Hot QCD collaboration, A. Bazavov et al., Equation of state in (2 + + 1)-flavor QCD , Phys. Rev. D 90 (2014) 094503 , [ 1407.6387 ]. · doi ↗
- 3(3) I. Barbour, N.-E. Behilil, E. Dagotto, F. Karsch, A. Moreo, M. Stone et al., Problems with Finite Density Simulations of Lattice QCD , Nucl. Phys. B 275 (1986) 296–318 . · doi ↗
- 4(4) P. de Forcrand, Simulating QCD at finite density , Po S LAT 2009 (2009) 010, [ 1005.0539 ].
- 5(5) G. Aarts, Complex Langevin dynamics and other approaches at finite chemical potential , Po S LATTICE 2012 (2012) 017, [ 1302.3028 ].
- 6(6) C. Gattringer, New developments for dual methods in lattice field theory at non-zero density , Po S LATTICE 2013 (2014) 002, [ 1401.7788 ].
- 7(7) D. Sexty, New algorithms for finite density QCD , Po S LATTICE 2014 (2014) 016, [ 1410.8813 ].
- 8(8) L. Scorzato, The Lefschetz thimble and the sign problem , Po S LATTICE 2015 (2016) 016, [ 1512.08039 ].
