Decoding probabilistic syndrome measurement and the role of entropy

Jo\~ao F. Doriguello

arXiv:2302.11631·quant-ph·February 24, 2023

Decoding probabilistic syndrome measurement and the role of entropy

Jo\~ao F. Doriguello

PDF

Open Access

TL;DR

This paper investigates the performance of the toric code under probabilistic stabiliser measurements, demonstrating that high error thresholds are achievable with modified decoding methods, and explores the impact of entropy on decoding efficiency.

Contribution

It introduces a modified decoding approach for probabilistic stabiliser measurements and analyzes the role of entropy in improving decoder performance.

Findings

01

High error threshold of 1.69% with probabilistic measurements using edge-contraction decoding

02

Deterministic measurements achieve a higher threshold of 2.93%

03

Entropy factors have negligible advantage in fully continuous measurement scenarios

Abstract

In realistic stabiliser-based quantum error correction there are many ways in which real physical systems deviate from simple toy models of error. Stabiliser measurements may not always be deterministic or may suffer from erasure errors, such that they do not supply syndrome outcomes required for error correction. In this paper, we study the performance of the toric code under a model of probabilistic stabiliser measurement. We find that, even under a completely continuous model of syndrome extraction, the threshold can be maintained at reasonably high values of $1.69%$ by suitably modifying the decoder using the edge-contraction method of Stace and Barrett (Physical Review A 81, 022317 (2010)), compared to a value of $2.93%$ for deterministic stabiliser measurements. Finally, we study the role of entropic factors which account for degenerate error configurations for improving on the…

Equations45

m odd \sum n (m n) p_{Δ}^{m} (1 - p_{Δ})^{n - m} = \frac{1}{2} (1 - (1 - 2 p_{Δ})^{n}) .

m odd \sum n (m n) p_{Δ}^{m} (1 - p_{Δ})^{n - m} = \frac{1}{2} (1 - (1 - 2 p_{Δ})^{n}) .

p

p

p_{Δ}

\overset{p}{ˉ} (ω_{ij}) = \frac{1 - ( 1 - 2 p _{Δ} ) ^{ω_{ij} / s}}{2} = \frac{1 - ( 1 - 2 p ) ^{ω_{ij}}}{2} .

\overset{p}{ˉ} (ω_{ij}) = \frac{1 - ( 1 - 2 p _{Δ} ) ^{ω_{ij} / s}}{2} = \frac{1 - ( 1 - 2 p ) ^{ω_{ij}}}{2} .

P_{ij} = E \in E \sum P_{E} = P_{0} + P_{1} + P_{2} + \dots,

P_{ij} = E \in E \sum P_{E} = P_{0} + P_{1} + P_{2} + \dots,

P_{E} = e \in E \prod p_{e} e \in / E \prod (1 - p_{e})

P_{E} = e \in E \prod p_{e} e \in / E \prod (1 - p_{e})

ln P_{0} = E max ln P_{E} = ln C - E min e \in E \sum ln (\frac{1 - p _{e}}{p _{e}}),

ln P_{0} = E max ln P_{E} = ln C - E min e \in E \sum ln (\frac{1 - p _{e}}{p _{e}}),

P_{ij}

P_{ij}

\approx C E \in E \sum e \in E \prod \frac{p ω ( e )}{1 - p ω ( e )}

\approx C E \in E \sum e \in E \prod (p ω (e) + (p ω (e))^{2})

= C E \in E \sum (e \in E \prod p ω (e)) (e \in E \prod (1 + p ω (e)))

\approx C E \in E \sum p^{∣ E ∣} δ_{E},

P_{ij} \propto l \geq l_{0} \sum p^{l} E \in E_{l} \sum δ_{E},

P_{ij} \propto l \geq l_{0} \sum p^{l} E \in E_{l} \sum δ_{E},

Ω_{k} := E \in E_{l_{0} + k} \sum δ_{E} .

Ω_{k} := E \in E_{l_{0} + k} \sum δ_{E} .

w_{ij} = d_{C} (i, j) .

w_{ij} = d_{C} (i, j) .

w_{ij}

w_{ij}

w_{ij} = Δ (x_{i}, x_{j}) + Δ (y_{i}, y_{j}) + w_{time}^{BG} max (t_{i 1} - t_{j 2}, t_{j 1} - t_{i 2}, 0),

w_{ij} = Δ (x_{i}, x_{j}) + Δ (y_{i}, y_{j}) + w_{time}^{BG} max (t_{i 1} - t_{j 2}, t_{j 1} - t_{i 2}, 0),

w_{ij} = Δ (x_{i}, x_{j}) + Δ (y_{i}, y_{j}) + w_{time}^{AP} ∣ t_{i} - t_{j} ∣,

w_{ij} = Δ (x_{i}, x_{j}) + Δ (y_{i}, y_{j}) + w_{time}^{AP} ∣ t_{i} - t_{j} ∣,

R_{L} = \frac{N _{R}}{1/ s} = \frac{lo g ( 1 - s ^{'} )}{lo g ( 1 - s )} s

R_{L} = \frac{N _{R}}{1/ s} = \frac{lo g ( 1 - s ^{'} )}{lo g ( 1 - s )} s

s \to 0 lim N_{s} \cdot p_{Δ} = \frac{T}{2} s \to 0 lim \frac{1 - ( 1 - 2 p ) ^{s}}{s} = \frac{T}{2} ln (\frac{1}{1 - 2 p}) .

s \to 0 lim N_{s} \cdot p_{Δ} = \frac{T}{2} s \to 0 lim \frac{1 - ( 1 - 2 p ) ^{s}}{s} = \frac{T}{2} ln (\frac{1}{1 - 2 p}) .

Ω_{0} (i, j) = \frac{( Δ ( x _{i} , x _{j} ) + Δ ( y _{i} , y _{j} ) + ∣ t _{i} - t _{j} ∣ )!}{( Δ ( x _{i} , x _{j} ))! ( Δ ( y _{i} , y _{j} ))! ( ∣ t _{i} - t _{j} ∣ )!},

Ω_{0} (i, j) = \frac{( Δ ( x _{i} , x _{j} ) + Δ ( y _{i} , y _{j} ) + ∣ t _{i} - t _{j} ∣ )!}{( Δ ( x _{i} , x _{j} ))! ( Δ ( y _{i} , y _{j} ))! ( ∣ t _{i} - t _{j} ∣ )!},

w_{ij} = l_{0} (i, j) ln (\frac{1 - p}{p}) - τ ln Ω_{0} (i, j),

w_{ij} = l_{0} (i, j) ln (\frac{1 - p}{p}) - τ ln Ω_{0} (i, j),

Ω_{0} (i, j) = (min (\frac{t _{i 2}}{s}, \frac{t _{j 2}}{s}) - max (\frac{t _{i 1}}{s}, \frac{t _{j 1}}{s})) \times \frac{( Δ ( x _{i} , x _{j} ) + Δ ( y _{i} , y _{j} ))!}{( Δ ( x _{i} , x _{j} ))! ( Δ ( y _{i} , y _{j} ))!},

Ω_{0} (i, j) = (min (\frac{t _{i 2}}{s}, \frac{t _{j 2}}{s}) - max (\frac{t _{i 1}}{s}, \frac{t _{j 1}}{s})) \times \frac{( Δ ( x _{i} , x _{j} ) + Δ ( y _{i} , y _{j} ))!}{( Δ ( x _{i} , x _{j} ))! ( Δ ( y _{i} , y _{j} ))!},

w_{ij} = w_{time}^{BG} max (t_{i 1} - t_{j 2}, t_{j 1} - t_{i 2}, 0) + Δ (x_{i}, x_{j}) + Δ (y_{i}, y_{j}) - τ ln Ω_{0} (i, j) / ln (\frac{1 - p _{Δ}}{p _{Δ}}),

w_{ij} = w_{time}^{BG} max (t_{i 1} - t_{j 2}, t_{j 1} - t_{i 2}, 0) + Δ (x_{i}, x_{j}) + Δ (y_{i}, y_{j}) - τ ln Ω_{0} (i, j) / ln (\frac{1 - p _{Δ}}{p _{Δ}}),

\displaystyle w_{ij}=\begin{cases}l_{0}(i,j)\ln{p^{-1}}-\tau\ln\delta_{0},\\ l_{0}(i,j)\ln{p^{-1}}-\tau\ln\Omega_{0},\\ l_{0}(i,j)\ln{p^{-1}}-\tau\ln\big{(}\Omega_{0}+p\cdot\Omega_{1}\big{)}.\end{cases}

\displaystyle w_{ij}=\begin{cases}l_{0}(i,j)\ln{p^{-1}}-\tau\ln\delta_{0},\\ l_{0}(i,j)\ln{p^{-1}}-\tau\ln\Omega_{0},\\ l_{0}(i,j)\ln{p^{-1}}-\tau\ln\big{(}\Omega_{0}+p\cdot\Omega_{1}\big{)}.\end{cases}

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsQuantum Computing Algorithms and Architecture · Quantum Information and Cryptography · Quantum and electron transport phenomena

Full text

Decoding probabilistic syndrome measurement and the role of entropy

João F. Doriguello

Centre for Quantum Technologies, National University of Singapore, Singapore

School of Mathematics, University of Bristol, Bristol, United Kingdom

Quantum Engineering Centre for Doctoral Training, University of Bristol, Bristol, United Kingdom

Abstract

In realistic stabiliser-based quantum error correction there are many ways in which real physical systems deviate from simple toy models of error. Stabiliser measurements may not always be deterministic or may suffer from erasure errors, such that they do not supply syndrome outcomes required for error correction. In this paper, we study the performance of the toric code under a model of probabilistic stabiliser measurement. We find that, even under a completely continuous model of syndrome extraction, the threshold can be maintained at reasonably high values of $1.69\%$ by suitably modifying the decoder using the edge-contraction method of Stace and Barrett (Physical Review A 81, 022317 (2010)), compared to a value of $2.93\%$ for deterministic stabiliser measurements. Finally, we study the role of entropic factors which account for degenerate error configurations for improving on the performance of the decoder. We find that in the limit of completely continuous stabiliser measurement any advantage further provided by these factors becomes negligible in contrast to the case of deterministic measurements.

††preprint: APS/123-QED

I Introduction

To achieve scalable quantum computation, quantum error correction is required to address unavoidable noise on physical qubits. Quantum error-correcting codes [1, 2] can encode quantum information and, combined with a decoder, can enable fault-tolerant computation despite the existence of errors on physical qubits. There are useful benchmarking methods to analyse the performance of error-correcting codes, such as using simple toy error models which abstract away many of the details of a physical system that would actually be used to implement such a code. However, to make progress towards quantum error correction in practice, it is important to analyse the performance of codes and decoders when features of a hardware system are reintroduced. Here we isolate and analyse one such feature which deviates from a simple toy error model: asynchronous measurement.

To perform quantum error correction, parity check measurements are repeatedly made on the qubits of the code. A usual setting in topological codes is that these measurement are made deterministically in ‘rounds’, i.e., on demand (not necessarily error-free), such that in each round every qubit of the code is involved in one parity check. However, when any of the operations that are used to perform a parity check are non-deterministic, this assumption does not apply. Parity checks may be inherently probabilistic, as is the case when they depend on ancillary states from non-deterministic entanglement generation or distillation procedures, e.g. in modular quantum-computing architectures [3, 4, 5]. In other systems, parity checks may be subject to measurement erasure, where measurement outcomes are not always returned, e.g. in photonic quantum computing [6] where single-photon detectors suffer optical loss [7, 8, 9].

In this work we study a model of asynchronous parity check measurement in the toric code. In this model the stabiliser measurements are attempted at discrete times and each attempt provides a parity outcome with probability $s$ , called the synchronicity parameter. We push this to the limit $s\to 0$ where parity checks are performed continuously. For an independent and identical distributed (i.i.d.) error model and a minimum-weight perfect matching (MWPM) decoder [10, 11, 12], the toric code exhibits a threshold of $2.93\%$ when parity checks are entirely synchronous [13]. We show that, by marking unsuccessful parity checks as erased in the syndrome graph (the ‘history’ of stabiliser measurement outcomes), it is possible to contract erased edges in the syndrome graph into multi-edges following the method described by Stace and Barrett [14]. This gives a clear framework on how to properly include non-identical error probabilities arising from asynchronism into a MWPM decoder which, when appropriately modified, can maintain the threshold at a reasonably high value of $1.69\%$ in the completely continuous regime.

A secondary motivation for studying this model is to explore the role of degeneracy in the MWPM decoder under asynchronous measurements. It is known [15, 14] that accounting for degeneracy, i.e., the number of shortest paths that are consistent with a matching, can improve the usual MWPM decoder’s threshold: for an i.i.d. error model with faultless, fully synchronous ( $s=1$ ) stabiliser measurements, path counting boosts the MWPM decoder threshold from $10.3\%$ to $10.65\%$ [14]. Moreover, degeneracy has also been used to close the gap between minimum-weight perfect matching and optimal methods [16], as well as to compare different variants of the toric code with a comparable number of qubits [17]. Here we study how to introduce degeneracy into the MWPM decoder under asynchronism by considering multi-path counting on top of the edge-contraction method, and we observed a mild improvement from $1.69\%$ to $1.70\%$ on the decoder’s threshold. We argue, and provide numerical evidence, that the presence of asynchronism increases the predominance, i.e., the relative probability, of the most likely error configuration over all the others, thus diminishing the role of degeneracy on decoding.

Sec. II reviews the toric code and introduces our toy model of asynchronism. Sec. III discusses the approach to decoding and the way in which degeneracy appears. It also introduces our proposed decoding algorithms. Their performance is then benchmarked in Sec. IV. We further discuss our results and conclude in Sec. V.

II Asynchronism in the toric code

II.1 The toric code

The toric code [1] is a topological code defined on an $L\times L$ square lattice with periodic boundary condition, where a qubit is located on each edge of the lattice. There is an operator $X_{v}$ and $Z_{f}$ associated with each vertex $v$ and each face $f$ of the lattice, respectively. The code space is defined as the simultaneous ‘ $+1$ ’ eigenstate of the operators $X_{v}$ and $Z_{f}$ . $X_{v}$ is the product of the Pauli- $X$ matrices acting on edges incident to $v$ , i.e., $X_{v}=\prod_{e\ni v}X_{e}$ , while $Z_{f}=\prod_{e\in f}Z_{e}$ is the product of the Pauli- $Z$ s acting on all edges of the face $f$ . These operators, and any product of them, form the stabiliser group, $S$ . Logical operators are made up of $X$ and $Z$ operators acting on a string of qubits that span the lattice, giving rise to logical operators $\overline{X}_{1}$ and $\overline{Z}_{2}$ along one direction, and $\overline{X}_{2}$ and $\overline{Z}_{1}$ along the other direction. To achieve fault tolerance the stabiliser operators $X_{v}$ are measured. If there is an error $E_{Z}\in\{I,Z\}^{\otimes n}$ , any stabiliser $X_{v}$ that anticommutes with the error returns a ‘ $-1$ ’ outcome. To account for the fact that the stabiliser measurements themselves can also be subject to error, the stabilisers are measured multiple times, and parity check operators are defined as the product of two subsequent measurements of the same stabiliser generator. If no error occurs during both measurements, then the parity check will return a ‘ $+1$ ’ outcome. If a Pauli error occurs between the first and second measurement, or if there is a measurement error in one of the measurements, then the parity check will return a ‘ $-1$ ’ outcome, which can be seen as a quasi-particle and is called anyon. The subset of parity checks with ‘ $-1$ ’ measurement outcomes is called the syndrome $\sigma$ . Given a syndrome $\sigma$ , a decoder can then be applied to find a correction operator $\mathcal{C}(\sigma)$ such that $\mathcal{C}(\sigma)E_{Z}\in S$ . That is, if the correction operator is applied to the code, the error is corrected up to a stabiliser. We note that in quantum computation it is not necessary to physically apply any correction operator to the qubits, rather the correction can be thought of as a reference frame through which the measurement outcomes can be interpreted.

II.2 Asynchronous stabiliser measurement

We now introduce a model of asynchronous stabiliser measurement. This model is designed to isolate the effects of measurement asynchronicity while leaving all other features of the system the same. But it is worth highlighting that there could be many things about this model that could be changed depending on the physical system.

Consider a square toric code of size $L\times L$ . The toric code is subject to repeated measurements for a time $T$ . Each attempted stabiliser measurement provides a parity outcome with probability $s$ , a parameter we called the synchronicity of the system. Otherwise, with probability $1-s$ , no outcome is obtained, which is marked as a ‘[math]’ outcome, i.e., erased. Parity measurements are successfully recorded at a rate $1$ per unit time on average, meaning that $1/s$ measurements are attempted in one unit of time. We define two measures of errors on qubits: the simulation error $p_{\Delta}$ and the physical error $p$ . The simulation error is the probability that a qubit suffers an error between two consecutive parity check attempts. The physical error is the probability that a qubit suffers an error per unit time, i.e., after $1/s$ parity check attempts. The physical and simulation errors are related as follows: the probability that a qubit suffers an error after $n$ measurement rounds equals the probability that during these $n$ rounds its state is flipped an odd number of times (each with probability $p_{\Delta}$ ), i.e.,

[TABLE]

Since a time unit represents $1/s$ measurement rounds on average, both quantities $p$ and $p_{\Delta}$ are related via

[TABLE]

Finally, successful measurements are subject to measurement errors, which flip the outcome value with probability $q=p$ .

When $s=1$ , measurements are fully synchronous. When $s\to 0$ , measurements are completely continuous. By fixing the rate $p$ of physical errors and the rate of successful parity checks (set to $1$ ), we are able to probe the behavior of the code with respect to the parameter $s$ . We consider three distinct regimes, which are illustrated in Fig. 1:

Synchronous measurement ( $s=1$ ). This corresponds to the error model with fully synchronous parity checks. 2. 2.

Discrete asynchronous measurement ( $0<s<1$ ). Measurements are performed in discrete rounds, but are not deterministic and occur with probability $s$ . Measurement rounds are performed at a rate $1/s$ such that the overall rate of successful stabiliser measurement remains at $1$ per unit time. 3. 3.

Continuous measurement ( $s=0$ ). Measurements are not performed in rounds, but are received continuously at a rate $1$ . Similarly, Pauli errors are treated as continuous. The times of the successful measurements and Pauli errors are modelled as arising from a Poisson distribution, the resulting distribution obtained from the Binomial distribution in the limit $s\to 0$ (see Appendix A).

One can see from Fig. 1 the effect of the probabilistic nature of parity checks. Successful stabiliser measurements are separated in time, thus creating a block-like structure. Every stabiliser operator $X_{v}$ has an ordered list of measurement times for successful parity checks $(t^{v}_{1},t^{v}_{2},\dots)$ , where $t^{v}_{1}<t^{v}_{2}<\dots$ . Two consecutive measurement times define a parity block. More specifically, the $i$ th parity block associated with $v$ is defined by the pair of time coordinates $(t^{v}_{i-1},t^{v}_{i})$ . If the measurement outcomes differ from each other at consecutive times $t^{v}_{i-1}$ and $t^{v}_{i}$ , then we refer to this block as an anyon block. In the fully synchronous regime ( $s=1$ ), two consecutive measurements with differing outcomes lead to an anyon well defined in time. On the other hand, for $s<1$ , such anyons (now anyon blocks) are spread over time. Defining their time position is one of the main issues in constructing the decoding problem and correcting for errors.

II.3 Constructing the decoding problem

To analyse fault tolerance in this system we first want to formulate the error model and structure of the code as a syndrome graph. In the syndrome graph, vertices represent fault-tolerant parity checks and edges represent the potential errors in the system, e.g. physical and measurement errors. This representation is the most useful way to analyse the performance of decoding algorithms as it fully describes the system, capturing both space and time behavior.

Each edge in the syndrome graph is assigned a bit that indicates whether or not an error has occurred. Vertices are assigned a parity value which is computed as the parity of the values of all edges incident to that vertex. If there are no errors, all vertices will have an even parity. If an error occurs, the two vertices connected to the corresponding edge will have their values flipped.

For a fault-tolerance system there may be multiple possible syndrome graph representations that capture the same error model. We consider first the simple syndrome graph that is most naturally derived from the parity check structure. We then study the contracted syndrome graph.

II.3.1 Simple syndrome graph

When all parity measurements are performed synchronously, the syndrome graph has a cubic structure. Time-like edges represent the possible measurement errors on parity checks with error probability $q=p$ , while space-like errors represent potential Pauli errors on the physical qubits with error probability $p_{\Delta}=p$ . As previously mentioned, the set of all odd parity checks defines the syndrome and two consecutive parity checks with differing outcomes define an anyon in between both measurement times.

In our model of asynchronous measurement, we have to modify this representation since not all parity checks return an outcome. This is done by marking an edge of the graph as erased when there is a corresponding measurement erasure. The net result is that multiple sequential erasures in time lead to a ‘block’ of marked edges. The formulation of this system into a syndrome graph, named simple syndrome graph, is illustrated in Fig. 2. Space-like edges are still associated with error probability $p_{\Delta}$ and non-erased time-like edges with error probability $q=p$ . As previously mentioned, anyons are no longer well defined in time, as the ‘blocks’ of erased edges can now have variable time lengths. We note that the graph structure, i.e., its cubic structure, is the same in the every instance, the only difference being the position of the erased edges.

II.3.2 Contracted syndrome graph

Given a simple syndrome graph with a set of erased edges as shown in Fig. 2(b), we find an alternative representation without erased edges. When erasure is present, fault-tolerant parity checks are only complete for each cluster of erased edges [14]. In our case this means simply treating all the vertices between two successful measurements as one vertex, i.e., considering a parity block as a vertex. By contracting the graph around the erased edges, we arrive at the contracted syndrome graph. An example is shown in Fig. 2(c). The contraction resolves the problem of defining the anyons. These are now placed at contracted vertices that are between two consecutive parity checks with differing outcomes, or, in other words, at contracted vertices associated with anyon blocks.

Carrying out the contraction will often result in multi-edges in the graph, where two erased components were connected by multiple edges in the simple syndrome graph. These correspond to multiple possible errors that could cause the same syndrome. An equivalent representation that is more convenient for decoding is to instead represent these as a single edge with modified error probability. Such modified error probability is related to the physical error $p$ and the time overlap of erased components in the simple syndrome graph, as illustrated in Fig. 2, via the same reasoning that relates $p_{\Delta}$ and $p$ in Eq. (2). More specifically, let $\omega_{ij}=\min(t_{i},t_{j})-\max(t_{i-1},t_{j-1})$ be the time overlap between two adjacent parity blocks $(t_{i-1},t_{i})$ and $(t_{j-1},t_{j})$ . The number of merged edges is therefore just $\omega_{ij}/s$ . The probability $\bar{p}(\omega_{ij})$ of a Pauli error occurring on the merged edge of the contracted syndrome graph is equal to the probability of a Pauli error occurring an odd number of times on the corresponding edges from the simple syndrome graph, which is given by Eq. (1):

[TABLE]

A merged edge between two adjacent parity blocks with time overlap $\omega(e)$ has thus an associated error probability $\bar{p}(\omega(e))$ . Time-like (vertical) edges continue to represent possible measurement errors with probability $q=p$ . The resulting contracted syndrome graph thus offers a simple and compact framework in which decoding techniques can be straightforwardly used. We will show how to apply such decoding techniques in the following section.

II.3.3 Continuous stabiliser measurement

In the case of continuous stabiliser measurement when $s=0$ , there is no way to construct the simple syndrome graph. In this case we build the contracted syndrome graph directly by recording the parity check measurement times. Given the locations of successful parity checks, a vertex is identified with each parity block. An edge is then placed between vertices whose adjacent parity blocks overlap in time and has an associated probability according to Eq. (3).

III Decoding

The job of the decoder is to identify a correction for a given syndrome graph, in the form of a predicted set of flipped edges. An optimal decoder identifies corrections that minimise the chance of logical errors. For practical use however, efficient decoding algorithms are required that approximate optimal decoding while being computationally tractable [18, 15, 19]. In this section, we describe the decoding strategies for probabilistic syndrome measurement that will be analysed in Sec. IV.

III.1 Anyon-pairing decoders

A correction in the toric code can be expressed as a pairing of anyons (odd-parity check vertices). Any two error chains that produce the same syndrome but differ by trivial cycles have the same effect on the logical state. The task that should be performed by a decoder can be understood as matching anyons in a way that minimises the chance of a logical error. A large range of decoders can be defined as performing minimum-weight perfect matching (MWPM) [20] on matching graphs derived from syndrome graphs. A matching graph is specified for any syndrome graph as a complete graph where the vertices correspond to the anyons, and the edge weights between vertices correspond to distances in the syndrome graph. It is then necessary to properly weight the edges in the syndrome graph, which we now describe.

When constructing a matching graph, we would ideally like to compute each anyon pairing probability, i.e., the probability that any error created an observed anyon pair. Performing decoding on a matching graph with these probabilities then reveals the most likely pairing of anyons, and is independent of the type of syndrome graph (e.g. simple or contracted). In other words, we would like to compute the pairing probability between anyons $i$ and $j$ as given by

[TABLE]

where $E$ is an error chain (a set of odd parity outcomes) whose boundaries are the vertices $i$ and $j$ , $P_{E}$ is its probability, and $\mathbb{E}$ is the set of all such error chains. The error chains are indexed $E=0,1,2,\dots$ from most to least likely.

Consider now an error model where each edge $e$ in the syndrome graph represents an independent (but not necessarily identical) error occurring with probability $p_{e}$ . The probability for each error chain can be expressed as

[TABLE]

where $C=\prod_{\forall e}(1-p_{e})$ is a constant for a given syndrome graph. Commonly, a simplified MWPM decoder is used that identifies only most likely errors for the correction operator, corresponding to approximating $P_{ij}$ by $P_{0}$ for each anyon pairing in Eq. (4). Since

[TABLE]

the error chains that are used must minimise $\sum_{e\in E}\ln((1-p_{e})/p_{e})$ . The distance between any two pairs of anyons to be inputted into the matching graph can be found by using Dijkstra’s algorithm [21] on the syndrome graph with edges weighted by $\ln((1-p_{e})/p_{e})$ . The minimisation itself, i.e., the anyon pairing with overall minimum additive weight, can be found via Edmond’s minimum-weight, perfect-matching algorithm [10].

A MWPM decoder can be improved by considering more terms $P_{E}$ in the anyon pairing probability. In the usual fully synchronous ( $s=1$ ) i.i.d. error model with $q=p$ (equal physical and measurement error probabilities), $P_{E}=Cp^{|E|}$ , where $|E|$ is the length of the error chain. It is then possible that two or more paths have the same probability $P_{E}$ , meaning they are degenerate. The question is thus reduced to counting the number of paths with a given length between two anyons. The introduction of degeneracy for the shortest path into the MWPM decoder, i.e., considering all terms $P_{E}$ equal to $P_{0}$ in the pairing probability, was examined in [14]. When erasure is present and we work with the contracted syndrome graph, it is important to introduce a suitable notion of degeneracy for error chains when estimating anyon pairing probabilities. Recall that the probability $p_{e}$ of an edge $e$ in the contracted syndrome graph is given by $p_{e}=\frac{1}{2}(1-(1-2p)^{\omega(e)})$ , where $\omega(e)$ is the time overlap between the blocks defining $e$ (see Eq. (3)). By approximating $p_{e}\approx p\cdot\omega(e)$ and assuming $p_{e}$ is small, we find

[TABLE]

where in Eq. (6) we defined the quantity $\delta_{E}=\prod_{e\in E}\omega(e)$ (which is the product of the $\omega(e)$ values along the error chain), and we approximated $\prod_{e\in E}(1+p\omega(e))\approx 1$ . This last approximation can be justified as follows. Time overlaps $\omega(e)$ are expected to be smaller than $1$ on average (since parity blocks have length $1$ on average), so we can write $\prod_{e\in E}(1+p\omega(e))\lessapprox e^{p|E|}$ . For large chains, if the lattice size $L$ is sufficiently large so that $L$ is comparable to $p^{-1}$ , $e^{p|E|}\approx e^{pL}$ might be considerable, and the actual probability of large chains is underestimated. However, for small chains, the approximation $\prod_{e\in E}(1+p\omega(e))\approx 1$ is fairly accurate, and these are the ones that are relevant to the decoder since the most likely error configurations are typically composed of small error chains. The probability underestimation for large chains is thus ignored by the decoder.

By grouping terms for which error chains have the same number of edges, we obtain the following expression for the pairing probability:

[TABLE]

where $l_{0}$ denotes the length of the shortest error chain connecting $i$ and $j$ , and $\mathbb{E}_{l}$ is the set of error chains connecting $i$ and $j$ of length $l$ . We define the $(k+1)$ th-order degeneracy factor for each term in $p^{l}=p^{l_{0}+k}$ to be

[TABLE]

These factors are closely related to counting the number of paths with the same number of errors, i.e., edges. Indeed, for the i.i.d. error model with synchronicity $s=1$ , $\delta_{E}=1$ for all paths, and so $\Omega_{k}$ equals exactly the number of paths with $l_{0}+k$ edges (see more in Appendix B).

III.2 Decoding algorithms

In this section we propose several decoders based on different approximations for $P_{ij}$ .

III.2.1 Contracted Syndrome Graph decoders

We first consider a MWPM decoder on the contracted syndrome graph with the approximation $P_{ij}=P_{0}$ , which we name Contracted Graph (CG) decoder. The probability $P_{0}$ is given by Eq. (5) with $p_{e}=\bar{p}(\omega(e))=\frac{1}{2}(1-(1-2p)^{\omega(e)})$ . Finding the most likely error chain is equivalent to $\min_{E}\sum_{e\in E}\ln\left((1-p_{e})/p_{e}\right)$ and, hence, the CG decoder weights each edge by $\ln((1-p_{e})/p_{e})$ and proceeds to find the path with the minimum additive weight. In other words, this weight assignment defines a metric $d_{C}$ in the contracted syndrome graph. Therefore, the weight between two anyon blocks $i$ and $j$ is set as the shortest distance between them,

[TABLE]

The CG decoder can be enhanced by keeping more terms in Eq. (4). From Eq. (7) we can keep the first two groups of terms with shortest lengths ( $\mathbb{E}_{l_{0}}$ and $\mathbb{E}_{l_{0}+1}$ ) with their corresponding degeneracy terms $\Omega_{0}$ and $\Omega_{1}$ . Similarly to Eq. (5), the CG decoder should now optimise $\max_{E}\ln{P_{ij}}\propto\max_{E}\ln\big{(}p^{|E|}\Omega_{0}+p^{|E|+1}\Omega_{1}\big{)}=-\min_{E}\big{[}|E|\ln{p^{-1}}-\ln(\Omega_{0}+p\cdot\Omega_{1})\big{]}$ . Therefore, the weight assignment between a pair of anyons $i$ and $j$ is

[TABLE]

up to an additive constant, and where we included a parameter $\tau$ , named degeneracy parameter, that can be tuned in order to improve the decoder performance. Efficient computation of degeneracies $\Omega_{0}$ and $\Omega_{1}$ can be done via Dijkstra’s algorithm, as explained in Appendix C.

III.2.2 Approximated decoders

One of the drawbacks of the CG decoder is the lack of a closed-form expression for the distance between two anyons in the metric $d_{C}$ , since erased time-like edges are randomly distributed. This means that we must use Dijkstra’s algorithm to compute such distances, which can be too slow for the situation at hand. It is thus interesting to propose heuristic approximations to the CG decoder that do not require the use of Dijkstra’s algorithm and have a close-form expression for the distance between two anyon blocks given their coordinates. In order to do so, we work with the simple syndrome graph given its cubic structure.

For our first approximation, we treat the anyon blocks as defined anyons in a fully synchronous simple syndrome graph, thus ignoring erased edges. This means taking a Manhattan distance between two anyon blocks as their weight into the matching graph. More specifically, consider two anyon blocks with coordinates $(x_{i},y_{j},t_{i1},t_{i2})$ and $(x_{j},y_{j},t_{j1},t_{j2})$ . Their spacial distance is $\Delta(x_{i},x_{j})+\Delta(y_{i},y_{j})$ , where $\Delta(x_{i},x_{j})=\min(x_{i}-x_{j}~{}(\text{mod}~{}L),x_{j}-x_{i}~{}(\text{mod}~{}L))$ is the $x$ horizontal distance on the lattice (and similarly for the $y$ coordinate). Moreover, if both blocks overlap in time, then their time distance is zero, since there is an error chain with minimal length with no time-like edges connecting both blocks (see Fig. 3). If the blocks do not overlap in time, we take the average number of non-erased time edges between them. Suppose, e.g. that $t_{i1}>t_{j2}$ . There are $(t_{i1}-t_{j2})/s$ time-like edges between both blocks, out of which $t_{i1}-t_{j2}$ are non-erased on average. We thus can approximate their time distance as $\max(t_{i1}-t_{j2},t_{j1}-t_{i2},0)$ (note this is [math] when the blocks overlap in time, since $t_{i1}-t_{j2}$ and $t_{j1}-t_{i2}$ are negative). Given these considerations, we propose the Block Graph (BG) decoder which, instead of finding the actual distances within the contracted syndrome graph using Dijkstra’s algorithm, sets the distance (and thus the weight) between two anyon blocks with coordinates $(x_{i},y_{j},t_{i1},t_{i2})$ and $(x_{j},y_{j},t_{j1},t_{j2})$ as

[TABLE]

where we introduced a tunable parameter $w^{\rm BG}_{\rm time}$ . The weight between anyons in the matching graph becomes then a function of only their coordinates.

Our second approximation within the simple syndrome graph is to reduce the analysis back to a cubic syndrome graph by defining a specific time coordinate for each anyon. More specifically, each anyon is identified at a time location in the middle of its corresponding anyon block. For example, if an anyon block is defined by times $t_{i-1}$ and $t_{i}$ , then the corresponding anyon is given a location $(t_{i}+t_{i-1})/2$ . Our proposed Average Position (AP) decoder treats these anyons as existing in a cubic syndrome graph, and computes the weight between two anyons $i$ and $j$ with coordinates $(x_{i},y_{i},t_{i})$ and $(x_{j},y_{j},t_{j})$ using the Manhattan distance,

[TABLE]

where we introduced a tunable parameter $w^{\rm AP}_{\rm time}$ .

IV Results

IV.1 Simulation methods

To study the performance of each decoding algorithm we perform Monte Carlo simulations of the system where errors are sampled and the resulting system is decoded and analysed to determine whether or not an error is introduced. Since we consider a model of independent $X$ and $Z$ errors we directly simulate only phase-flip errors and $X$ -type parity checks, as by symmetry the performance will be the same for bit-flip errors. For each decoding algorithm we simulate its performance for a range of stabiliser synchronicity $s\in[0,1]$ . Our simulations capture both discrete probabilistic measurements, where stabiliser measurements are made at discrete time steps with varying success probability, and continuous measurement for which we sample errors and measurements over a continuous range. Here we briefly outline the simulation methods for both for these cases, and more details on simulation techniques can be found in Appendix A.

IV.1.1 Threshold performance

Discrete Measurement. To model discrete probabilistic stabiliser measurements we sample measurements and errors on the simple cubic syndrome graph. We define a time scale such that stabiliser measurements are obtained at a rate $1$ after $1/s$ time steps on average. In other words, each measurement round is performed after a time interval $s$ . At each time step each physical qubit (space-like edge) suffers a flip with probability $p_{\Delta}$ , the simulation error, where $p_{\Delta}$ is related to the physical error rate $p$ (error probability after $1/s$ time steps) via $p_{\Delta}=\frac{1}{2}\big{(}1-(1-2p)^{s}\big{)}$ (Eq. (2)). Each stabiliser measurement (time-like edge) is sampled and is successfully measured with probability $s$ . If the measurement does not succeed then the edge is marked as erased, otherwise, if it does succeed, then its value is flipped with probability $q$ , the measurement error. We take $q=p$ .

Continuous measurement. To model continuous stabiliser measurement ( $s=0$ ) we cannot directly sample stabiliser measurements as probabilistic events. Instead we sample error events and measurement events over a continuous time period, aiming to keep all the error parameters equivalent to the discrete measurement case. We set a time interval $T$ and a physical error $p$ per unit time. For each qubit we sample the number of bit-flips it suffers in the time interval $T$ from a Poisson distribution with parameter $\frac{T}{2}\ln(1/(1-2p))$ (see Appendix A for a justification). Given the number of events we then sample their time coordinate from a uniform distribution along the time interval $(0,T)$ . This gives us a set of space-time coordinates of qubit flip events. For each stabiliser site we sample the number of successful measurements from a Poisson distribution with parameter $T$ and distribute these measurements uniformly at random along the time interval $(0,T)$ . This gives us a set of space-time coordinates for stabiliser measurements. A parity check is done by counting the number of errors of the adjacent qubits prior to the measurement time. If the number is even (odd), the measurement outcome is $+1$ ( $-1$ ). For faulty measurements this outcome is flipped with probability $q=p$ .

Given the locations of parity check measurements we then directly construct the contracted syndrome graph by identifying a vertex with each successive pair of parity checks, and edges between neighboring check locations where parity blocks have a non-zero time overlap. Each edge has an associated error probability $p_{e}=\bar{p}(\omega(e))$ , where $\omega(e)$ is the time overlap of the parity blocks defining the edge $e$ (Eq. (3)).

Parameter optimisation. The AP decoder, the BG decoder and the CG decoder augmented with $\Omega_{0}$ and $\Omega_{1}$ have tunables parameters, to know, the time and degeneracy parameters $w^{\rm AP}_{\rm time}$ , $w^{\rm BG}_{\rm time}$ and $\tau$ , respectively. For a given value of $s$ , we probe their dependence on these parameters and pick the optimum value when comparing the threshold performance between different decoders. Their dependence on $w^{\rm AP}_{\rm time}$ , $w^{\rm BG}_{\rm time}$ and $\tau$ is explored in Appendix B.

IV.1.2 Analysing entropic contributions

In addition to gauging the decoders’ performance via their threshold, we want to understand how good an approximation is being made to the anyon pairing probability. To do this we compute the average magnitude $\langle P_{E}/P_{0}\rangle$ of the first few terms $P_{E}$ from Eq. (4) relative to the zeroeth-order term $P_{0}$ . If the higher-order terms have small values, then we expect the proposed decoders to perform well. To obtain these ratios we perform a further series of numerical experiments. We fix the synchronicity $s$ , a physical error $p$ and a measurement error $q=p$ , and, by sampling errors via Monte Carlo simulation as previously described, we obtain a syndrome and identify all pairs of anyons that are matched by the decoder (not any pair of anyon). The ratio $P_{E}/P_{0}$ is then computed for each such matched pair using Yen’s algorithm [22], which is a generalisation of Dijkstra’s algorithm for computing the $k$ -shortest loopless paths in a graph with non-negative edge cost. We average this value over all the matched anyon pairs and, finally, over other random contracted syndrome graphs.

IV.2 Threshold performance

Fig. 4 shows our main results, the threshold performance with synchronicity $s\in[0,1]$ for the three main different decoders introduced in the previous section. Fig. 4(a) compares all decoders. On the other hand, Fig. 4(b) specifically compares the CG decoder with and without the degeneracy terms $\Omega_{0}$ and $\Omega_{1}$ . At $s=1$ we have the usual MPMW decoder for the toric code with faulty measurements and i.i.d. error model [13], hence all decoders perform identically. As the synchronicity $s$ decreases, the performance of all decoders decreases, as expected. Nonetheless, even at the limit of continuous stabiliser measurement ( $s=0$ ), the threshold can be maintained at a reasonably high level, e.g. $1.688\%\pm 0.001\%$ for the CG decoder. On the other hand, the simplification of the syndrome graph structure by the BG and AP decoders, while speeding up the decoding procedure, leads to a decrease in threshold values, e.g. $1.20\%\pm 0.01\%$ (BG decoder) and $1.32\%\pm 0.01\%$ (AP decoder) at $s=0$ . Interestingly, the AP decoder, even though inferior to the BG decoder for high values of $s$ , outperforms it for high asynchronism, a fact for which we do not have an explanation. In Appendix B we show more information on the AP and BG decoders, e.g. their dependence on the time parameters $w_{\rm time}^{\rm AP}$ and $w_{\rm time}^{\rm BG}$ at $s=0$ and the optimal values for $w_{\rm time}^{\rm AP}$ and $w_{\rm time}^{\rm BG}$ as a function of asynchronicity. In addition, we also show how the inclusion of degeneracy terms like $\Omega_{0}$ and $\Omega_{1}$ into the BG decoder can lead to a substantial threshold increase.

Something that stands out from Fig. 4(b) is the fact that, while the introduction of high-order degeneracy like $\Omega_{1}$ does give higher threshold values compared to the base case of the CG decoder, this improvement becomes very small in the limit $s\to 0$ . Even by $s=0.9$ the reduction is significant. While at $s=1$ the threshold increases from $2.937\%$ to $3.050\%$ (an $\sim 0.11\%$ additive improvement), at $s=0$ it only increases from $1.688\%$ to $1.699\%$ (an $\sim 0.01\%$ additive improvement). This feature is not entirely surprising, given the following. If one assumes that the set of possible error probabilities $\{p_{e}\}_{e}$ on each edge is very diverse, e.g. consider the case of continuous asynchronism where $p_{e}=\bar{p}(\omega(e))$ and $\omega(e)$ can be any real number in $(0,T)$ , then it becomes very unlikely to have two degenerate error chains. Therefore, for completely different error probabilities $\{p_{e}\}_{e}$ , we expect most of the terms in Eq. (4) to be different from each other. This is in contrast to the fully synchronous regime ( $s=1$ ), where most first terms are equal (see more in Appendix B). Consequently, the leading term $P_{0}$ plays a more prominent role in the sum, and any truncation to it is less disruptive to its original value when $s=0$ compared to when $s=1$ .

In order to support the above claim, we shed some light on the relative size between the first $P_{E}$ terms and $P_{0}$ which underlies the decrease in threshold values. Fig. 5 explores how much smaller the first few terms in $\sum P_{E}$ are in comparison with $P_{0}$ both in the fully synchronous ( $s=1$ ) and continuous asynchronous ( $s=0$ ) regimes. The average ratio $\langle P_{E}/P_{0}\rangle$ is obtained for $E=1,\dots,10$ . One can see that most of high-order contribution is coming from $P_{1}$ on average, which is also where the discrepancy between high and low synchronicity regimes lies: $P_{1}$ ’s contribution is more than double in the regime $s=1$ compared to its contribution when $s=0$ . On the other hand, asynchronism has a much smaller impact on the average $\langle P_{E}/P_{0}\rangle$ for high-order terms $E>1$ . The relative importance of $P_{1}$ over other high-order terms $P_{E}$ is explicitly shown in Appendix B for the fully synchronous i.i.d. error model.

IV.3 Advantage in logical gate time

Direct handling of asynchronous stabiliser measurement in decoding can also provide an advantage in the time needed to execute logical gates. In Ref. [23], probabilistic stabiliser measurements arise in a scheme for quantum computing using networked ion traps. In this situation the physical errors occur only during successful stabiliser measurement, and so there is no penalty to the threshold for lower measurement probability. Ref. [23] handles the probabilistic nature of the measurements by waiting for as many attempts as necessary to get to $99\%$ success across all stabiliser sites, and abandoning the remaining $1\%$ , whose impact on the threshold is negligible. This essentially redefines a ‘round’ of stabiliser measurement to be made up of $N_{R}$ rounds, such that $1-s^{\prime}=(1-s)^{N_{R}}<0.01$ . Once a stabiliser site is successfully measured it idles and waits until the round is completed. The cost to this approach is in the time taken to execute logical operations. In the limit of small success probability many attempts must be taken to complete a renormalised round, and during this time many of the sites spend a significant time idling. We can define the logical gate overhead, $R_{L}$ , as the ratio between the number of rounds needed to complete renormalised round vs the number needed to measure a stabiliser on average as

[TABLE]

for $s<0.99$ , and $R_{L}=1$ for $s\geq 0.99$ . By using asynchronous decoding, each stabiliser is measured independently of the others, allowing stabiliser measurement information to be gathered at a faster rate, and giving $R_{L}=1$ . Fig. 6 shows that, for high asynchronism, the bundling method from [23] takes more than four times on average to perform a round compared to our asynchronous decoding approach. Moreover, their bundling method, which presupposes that physical errors occur due to successful stabiliser measurements, would probably not be viable in a more stringent error model like ours where physical errors can happen in between stabiliser measurements.

V Conclusions

We have shown how asynchronism can be incorporated into MWPM decoders while still maintaining a high threshold. We considered a simple error model where a stabiliser measurement outputs an outcome with probability $s$ called synchronicity. The limit $s\to 0$ represents a continuous asynchronous regime were stabilisers are measured completely at random in time. We tackled asynchronism by marking unsuccessful stabiliser measurements as erased in the simple syndrome graph, followed by contracting each cluster of erased edges using an edge-contraction method from Stace and Barrett [14]. The resulting graph was named contracted syndrome graph and, in opposition to the simple one, offers an easy framework to take non-identical error probabilities into consideration when decoding. We then proposed a MWPM-like decoder, named Contracted Graph (CG), using a properly weighted contracted syndrome graph.

We benchmarked the CG decoder via Monte Carlo and observed that the threshold values do decrease as the synchronicity tends to zero, but a significant level can be maintained even under a completely continuous model of syndrome extraction, e.g. the CG decoder holds thresholds of $1.69\%$ at $s=0$ . While our results were obtained with a simple error model, they show that erasure errors suffered by measurements can be efficiently handled by decoders. Studying the performance of the CG decoder under more realistic error models is a point to be considered in the future.

The CG decoder is relatively simple: by being a MPWM decoder, one only requires performing Dijkstra’s algorithm on a rightly weighted graph. However, even though being a polynomial-time algorithm, it could be too slow for practical applications. Indeed, running Dijkstra’s algorithm requires time $O(|E|+|V|\log{|E|})$ , where $|E|$ and $|V|$ are the number of edges and vertices of the syndrome graph, respectively. The syndrome graph is fairly sparse ( $|E|=O(|V|)$ ), meaning that each application of Dijkstra’s algorithm takes $\widetilde{O}(|V|)=\widetilde{O}(L^{3})$ time in the contracted syndrome graph (the notation $\widetilde{O}$ hides polylog factors). Since Dijkstra’s algorithm must be used once for each of the $O(pL^{3})$ anyons, this leads to the overall time complexity $\widetilde{O}(pL^{6})$ .

In order to remedy this, we proposed the Block Graph (BG) and Average Position (AP) decoders that skip any use of Dijkstra’s algorithm by approximating the distance between two anyons, which can be calculated in constant time. The overall time complexity improves to $O(p^{2}L^{6})$ . However, the price is a decrease in threshold value down to $1.32\%$ at $s=0$ , which is still reasonably high. Given the simple structure of these decoders, specially the AP, it might be possible to borrow previous techniques used to improve the basic MWPM decoder [11, 24] (some of these ideas could possibly be applied to the CG decoder as well). Finally, the AP and BG decoders allow for the introduction of auxiliary parameters like the time weights $w_{\rm time}^{\rm AP}$ and $w_{\rm time}^{\rm BG}$ , which must be tweaked depending on the error model. Understanding their performance as a function of $w_{\rm time}^{\rm AP}$ and $w_{\rm time}^{\rm BG}$ with more mathematical rigour is something that we did not tackle and should be considered in the future.

Another aspect we explored was the role of degeneracy terms under asynchronous measurements and how they could be included into the decoder. More specifically, we study the inclusion of the first and second-order degeneracy terms $\Omega_{0}$ and $\Omega_{1}$ into the CG decoder. Such inclusion only produced a mild improvement in threshold, from $1.69\%$ to $1.70\%$ in the limit $s\to 0$ , which hints to the fact that the role of degeneracy becomes less important in the continuous asynchronous regime and considering only the lowest weight error configuration becomes an increasingly better approximation. This was further backed up by our numerical results on the relative size between the most likely error configurations. We showed that, as the synchronicity decreases, the probability of the most likely error configuration becomes relatively higher than the probability of the subsequent ones. In might be interesting to understand this behaviour in a more qualitative manner, although it might be a hard task given the similarity to the problem of counting trails.

Acknowledgments

We would like to specially thank Hugo Cable and Naomi Nickerson for the initial project proposal, many ideas and discussions throughout the project and initial contributions to the manuscript. We also thank Noah Linden, Ryan Mann, Ashley Montanaro, and Ronald de Wolf for useful discussions and helpful comments on the manuscript. This work was supported by the National Research Foundation, Singapore and A*STAR under the CQT Bridging Grant and the Quantum Engineering Programme Award number NRF2021-QEP2-02-P05, and by the Bristol Quantum Engineering Centre for Doctoral Training, EPSRC Grant No. EP/L015730/1, while at the University of Bristol where most of this project was conducted. This work was carried out using the computational facilities of the Advanced Computing Research Centre, University of Bristol — http://www.bris.ac.uk/acrc/ — and the computational facilities of the National University of Singapore.

Appendix A Simulation methods

The simulations were all carried out in an $L\times L$ periodic lattice with $L\in\{10,12,14\}$ and repeated a number of $10^{5}$ to $10^{6}$ times. The simulations used C++ and were carried out using the computational facilities of the Advanced Computing Research Centre, University of Bristol, and the computational facilities of the National University of Singapore. The nodes had 16 and 64 cores. Each single threshold value was computed by plotting the decoding success probability for a range of $16$ values of the physical error $p$ , and took between $15$ hours for our simple decoders (BG and AP decoders) to 48 hours for our more complex decoders (CG decoder). All codes and raw data can be found in [25].

A.1 Discrete measurement

We perform $N_{s}$ measurement rounds and generate a simple syndrome graph of size $L\times L\times N_{s}$ with $N_{s}=\lceil 2/s\rfloor L$ , where $\lceil x\rfloor$ denotes the closest integer to $x$ . We assume periodic boundaries in space, and open boundaries in time, corresponding to initialisation and destructive measurement of a toric code. The last measurement round is taken to be perfect in order to guarantee the existence of a perfect matching of the anyons.

At each measurement round we flip the value of each qubit with probability $p_{\Delta}$ , the simulation error, after which we perform the stabiliser measurements, each with probability $s$ . The stabiliser outcomes, if successful, are flipped with probability $q$ . The time scale is defined such that a stabiliser outcome per qubit is obtained at an average rate of $1$ , i.e., after $1/s$ measurement rounds, meaning that consecutive measurement rounds are separated by a time interval of $s$ . The physical error $p$ is related to $p_{\Delta}$ according to Eq. (2). We always fix $q=p$ .

The resulting simple syndrome graph with error configuration is then processed to construct the matching graph. Depending on the decoder this may first involve performing edge contraction of erased edges.

A.2 Continuous measurement

In the limit $s\to 0$ we cannot sample discretely and instead generate the contracted syndrome graph directly. We first note that the number of bit-flips that a qubit suffers in the discrete measurement regime is a Binomial distribution $B(N_{s},p_{\Delta})$ . Thus, in the limit $s\to 0$ and $N_{s}\to\infty$ , such Binomial distribution converges towards a Poisson distribution with parameter $\lim_{s\to 0}N_{s}\cdot p_{\Delta}$ according to the Poisson limit theorem. Using Eq. (2b) and that $N_{s}=T/s$ , where $T$ is time corresponding to the last measurement round, we obtain

[TABLE]

A similar reasoning applies to stabiliser measurements: the number of successful stabiliser measurements in the limit $s\to 0$ has a Poisson distribution with parameter $\lim_{s\to 0}s\cdot N_{s}=T$ . The simulation for the continuous asynchronous regime ( $s=0$ ) is then performed by first setting a time interval $T=2L$ and a physical error $p$ and then, for each qubit, sampling the number of bit-flips it suffers from a Poisson distribution with parameter $\frac{T}{2}\ln(1/(1-2p))$ and distributing these bit-flips uniformly at random along the time interval $(0,T)$ . The same is done with the measurements: for each stabiliser operator, we sample a number of successful faulty measurements from a Poisson distribution with parameter $T$ and uniformly distribute these measurements along the time interval $(0,T)$ . We also include perfect measurements at time $T$ to guarantee the existence of a perfect matching.

A.3 Estimating the $P_{E}$ terms

We also computed the average relative size between the first few $P_{E}$ terms and $P_{0}$ from Eq. (4). The ratios were obtained by averaging over matched pairs of anyons by the CG decoder in a typical simulation as follows: given an $L\times L$ lattice, we first set a synchronicity $s$ , a physical error $p$ and a measurement error $q=p$ . The value of $p$ was chosen as the threshold of the CG decoder from Fig. 4 at the given $s$ . The number of measurement rounds was set as $N_{s}=\left\lceil 2/s\right\rfloor L$ for $s\in(0,1]$ , and time interval $T=2L$ for $s=0$ . We then applied the usual procedure just described above of introducing physical errors and measuring the stabilisers to obtain a random contracted syndrome graph and a set of anyons. The ratio $P_{E}/P_{0}$ is averaged over all the anyon pairs that would be matched by the CG decoder. We stress that we do not average over all anyon pairs, since anyons far away are disregarded by the decoder. The final result was averaged over other random contracted syndrome graphs. The number of samples over random contracted syndrome graphs was set to between $1000$ and $25000$ , depending on the lattice size. Computing $P_{E}$ for each pair of anyons was performed via Yen’s algorithm [22], which is a generalisation of Dijkstra’s algorithm for computing the $k$ -shortest loopless paths in a graph with non-negative edge cost.***If the length of the $E$ -shortest path in the contracted syndrome graph is $l_{E}=\sum_{e}\ln((1-p_{e})/p_{e})$ , then $P_{E}\propto\exp(-l_{E})$ .

Appendix B Extra Results

B.1 Fully synchronous i.i.d. error model

In the fully synchronous ( $s=1$ ) regime with faulty measurements and i.i.d. error model, the first-order degeneracy term $\Omega_{0}$ from Eq. (8) between two anyons $i$ and $j$ with coordinates $(x_{i},y_{i},t_{i})$ and $(x_{j},y_{j},t_{j})$ can be explicitly calculated as

[TABLE]

which is the number of ways one can take $\Delta(x_{i},x_{j})$ steps in the $x$ -direction, $\Delta(y_{i},y_{j})$ steps in the $y$ -direction and $|t_{i}-t_{j}|$ steps in the $t$ -direction. The above expression was previously considered by [14, 26, 17].

The first-order degeneracy term can be included into the MWPM decoder through the weight assignment

[TABLE]

similarly to Eq. (10), where $\tau$ is a tunable degeneracy parameter. Fig. 7 shows the effect of the first-order degeneracy term $\Omega_{0}$ on the threshold values using the MWPM decoder as a function of $\tau$ . The threshold value at $\tau=0$ corresponds to the one of the usual MPMW decoder for the toric code, and was computed to be $2.937\%\pm 0.002\%$ , in agreement with past results [13, 27, 28, 4, 29]. Moreover, the dependence on $\tau$ is very similar to the one observed in [14, Figure 9] with perfect stabiliser measurements. In particular, the threshold does not peak at $\tau=1$ , as one would expect, but around $\tau=1.4$ , where the matching algorithm favours more degenerate paths, and by $\tau=2$ it has already dropped significantly. The threshold at $\tau=1$ is $3.050\%\pm 0.002\%$ .

Furthermore, Fig. 7 also shows the effect of considering only the second highest term $P_{1}$ from Eq. (4) on top of $P_{0}$ , instead of the first $\Omega_{0}$ highest terms (replace $\Omega_{0}$ with $1+P_{1}/P_{0}$ in Eq. (14)). We see that most of the first-order-degeneracy-term improvement on the threshold comes from $P_{1}$ alone, and is a consequence of Fig. 5.

B.2 Average Position decoder

The AP decoder from Sec. IV was optimised in terms of the time parameter $w_{\rm time}^{\rm AP}$ (see Eq. (12)). We explore its dependence on $w_{\rm time}^{\rm AP}$ in Figs. 8 and 9. More specifically, Fig. 8 shows the thresholds dependence on $w_{\rm time}^{\rm AP}$ in the continuous asynchronous regime ( $s=0$ ), while Fig. 9 shows the optimal value of the $w_{\rm time}^{\rm AP}$ parameter at which the threshold is maximum as a function of $s$ .

The overall shape of Fig. 8 is expected: both underestimation (large $w_{\rm time}^{\rm AP}$ ) and overestimation (small $w_{\rm time}^{\rm AP}$ ) of the time weight between anyons worsen the performance of the AP decoder. An optimal time weight $w_{\rm time}^{\rm AP}$ should be observed. Such point, in Fig. 8 at $w_{\rm time}^{\rm AP}\approx 0.56$ , suggests that placing anyons in the middle of parity blocks “shortens” the time distance, although we currently do not have deeper arguments explaining this fact. Regarding Fig. 9, we expect a smooth interpolation between $w_{\rm time}^{\rm AP}\approx 0.56$ at $s=0$ and $w_{\rm time}^{\rm AP}=1$ at $s=1$ .

B.3 Block Graph decoder

B.3.1 Time weight parameter

Similarly to the AP decoder, the BG decoder from Sec. IV was optimised in terms of the time parameter $w_{\rm time}^{\rm BG}$ (see Eq. (11)). We explore its dependence on $w_{\rm time}^{\rm BG}$ in Fig. 10. At $s=0$ , we observe an optimal time weight $w^{\rm BG}_{\rm time}\approx 1.28$ , which points to the fact that, in opposition to the AP decoder, the BG decoder “stretches” the time distance. Moreover, during the simulations we noticed that the threshold of the BG decoder is quite insensitive to changes on $w^{\rm BG}_{\rm time}$ . This explains the apparent non-smoothness interpolation from $s=0$ to $s=1$ : simulation inaccuracies hide the exact optimal point in a relatively large interval of possible values.

B.3.2 Degeneracy

The BG decoder uses the weight assignment from Eq. (11). The net effect is to disregard erased edges between two anyon blocks and view a pair of anyon blocks as embedded in a cubic syndrome graph: space-like edges have an associated $p_{\Delta}$ error, while time-like edges have an associated $p\cdot s\approx p_{\Delta}$ error (since a measurement error occurs with probability $q=p$ if the stabiliser measurement is successful). The scenario is thus similar to the synchronous regime with i.i.d. errors and we can introduce high-order degeneracy terms like $\Omega_{0}$ by counting paths similarly to what was done in the previous section. Consider two anyon blocks $(x_{i},y_{i},t_{i1},t_{i2})$ and $(x_{j},y_{j},t_{j1},t_{j2})$ . If they do not overlap in time, then $\Omega_{0}$ is the number of shortest paths between their closest points, given by Eq. (13) with $t_{i1}/s$ and $t_{j2}/s$ if $t_{i1}>t_{j2}$ and with $t_{i2}/s$ and $t_{j1}/s$ if $t_{j1}>t_{i2}$ (see Fig. 11). If the blocks overlap in time, then

[TABLE]

which is the number of paths with zero time-like edges connecting both blocks (remember that the time scale is such that measurement rounds happen at times proportional to $s$ ). Hence, similarly to Eq. (14), degeneracy can be introduced into the BG decoder via the weighting assignment

[TABLE]

where again $\tau$ is a tunable degeneracy parameter.

In Fig. 12 we compare the thresholds of the BG decoder enhanced by the degeneracy term $\Omega_{0}$ with its base case and also with the CG decoder. We can see that degeneracy can improve the BG decoder’s performance, specially for high asynchronism, albeit not sufficiently to reach CG decoder’s level. Moreover, such improvement by degeneracy does not apply in the limit $s\to 0$ , since the simple syndrome graph is not well defined and the naive method of counting paths is not possible anymore, therefore we limit the horizontal scale in Fig. 12 to $s\in[0.1,1]$ .

B.4 Degeneracy in the Contracted Graph decoder

The improvement provided by the degeneracy terms $\Omega_{0}$ and $\Omega_{1}$ is quite small, as evident by Fig. 4. We explore such improvement in more details for the continuous asynchronous regime. As mentioned in Eq. (10), the weight assignment in the CG decoder with degeneracy is $l_{0}\ln{p^{-1}}-\tau\ln\big{(}\Omega_{0}+p\cdot\Omega_{1}\big{)}$ . The usual CG decoder without degeneracy can be approximated by using $\tau\ln\delta_{0}$ instead of $\tau\ln\left(\Omega_{0}+p\cdot\Omega_{1}\right)$ , where $\delta_{0}=\prod_{e\in E_{0}}\omega(e)$ for the most likely error chain $E_{0}$ . Indeed, from Eq. (6), the CG decoder optimises $\max_{E}\ln(p^{|E|}\delta_{E})=-\min_{E}\big{[}|E|\ln{p^{-1}}-\ln{\delta_{E}}\big{]}$ . We can also define an intermediate instance where $\tau\ln\Omega_{0}$ is used instead of the full $\tau\ln\left(\Omega_{0}+p\cdot\Omega_{1}\right)$ . Thus the following weight assignments are possible:

[TABLE]

In Fig. 13 we depict the improvement provided by including only the degeneracy term $\Omega_{0}$ , and by including both $\Omega_{0}$ and $\Omega_{1}$ . As expected, the introduction of more degeneracy improves the CG decoder threshold values. Moreover, the decoders’ performances peak around $\tau=1$ , as expected given the discussion leading to Eq. (9): the probability $P_{0}$ considered by the CG decoder is best approximated by the case $\tau=1$ .

B.5 $P_{E}$ terms

Fig. 14 numerically explores the average ratio between some of the first $P_{E}$ terms and $P_{0}$ for different sizes of the toric code. More specifically, Fig. 14(a) computes the average $\langle P_{1}/P_{0}\rangle$ as a function of the synchronicity $s$ in an $L\times L$ lattice for different sizes $L$ . Fig. 14(b) shows $\langle P_{2}/P_{0}\rangle$ in a similar fashion. Turning to the results, in Fig. 14 we see that $P_{1}$ is between $4\%$ - $10\%$ of $P_{0}$ on average, while $P_{2}$ corresponds to only $1\%$ of $P_{0}$ on average. The contribution of $P_{1}$ diminishes fairly rapidly with $s$ , thus explaining the abrupt advantage drop from degeneracy observed in Fig. 4(b) between $s=1$ and $s=0.9$ . Moreover, Fig. 14(b) shows an interesting effect which we do not have an explanation for: $P_{2}$ has a sudden decrease with synchronicity, which further lessens the role of degeneracy in the CG decoder when moving away from $s=1$ , followed by a latter improvement with more asynchronism, albeit not enough to make up for $P_{1}$ .

Appendix C Dijkstra’s algorithm

The weight between two anyons in the CG decoder is obtained via the metric $d_{C}$ (Eq. (9)), which means that we are required to calculate shortest paths between two vertices in a graph. In order to do so efficiently, we used Dijkstra’s algorithm [21]. Its run time is $O(|V|^{2})$ , where $|V|$ is the number of vertices in the graph. It is possible to improve its complexity to $O(|E|+|V|\log{|V|})$ , where $|E|$ is the number of edges, by replacing the min-priority queue from the original algorithm by a Fibonacci heap min-priority queue [30]. Here we used a binary min-priority heap, which is a heap data structure that takes the form of a binary tree, and a common variant of Dijkstra’s algorithm that fixes a source vertex and calculates the shortest paths from it to all the other vertices in the graph, thus producing a shortest-path tree.

The algorithm works by initialising two values:

•

dist[], an array of distances from the source vertex to each vertex in the syndrome graph $S$ . Initially, $\texttt{dist[source]}=0$ and $\texttt{dist[}v\texttt{]}=\infty$ for all the other vertices $v$ . As the algorithm progresses, the distance from source to each vertex is updated.

•

$Q$ , a min-priority queue of unvisited vertices. Initially $Q$ contains all the vertices. At the end of the algorithm, $Q$ is empty.

A min-priority queue is an abstract data type that provides $3$ basic operations: add_with_priority(), decrease_priority() and extract_min(). The operation add_with_priority( $v$ , dist[ $v$ ]) adds $v$ based on the value dist[ $v$ ]. A min-priority queue will order its vertices $v$ based on the increasing value of dist[ $v$ ]. The decrease_priority( $v$ , dist[ $v$ ]) updates the ordering according to a new value dist[ $v$ ] of vertex $v$ . And the operation extract_min() extracts the vertex with the minimum distance (located at the root of a binary or Fibonacci min-priority heap).

In summary, the algorithm first initialises the array of distances and the min-priority queue. Then, at each step, the vertex $u$ with minimum dist[ $u$ ] is extracted from $Q$ and set as the current vertex. We consider all its children and calculate their tentative distances through current, i.e., $\texttt{dist[current]}+d_{S}(\texttt{current},\texttt{child})$ . If dist[child] is greater than this tentative distance, then the distance of the child vertex is updated to the tentative value. We repeat this process until $Q$ is empty.

We provide a pseudo-code for Dijkstra’s algorithm.

C.1 Degeneracy terms

It is possible to use Dijkstra’s algorithm in order to calculate the degeneracies $\Omega_{0}$ and $\Omega_{1}$ between the source vertex and all other vertices. This is done by using a simplified version of Dijkstra’s algorithm for unweighted graphs. Similarly to the array of distances dist[], up to two arrays of degeneracies are updated along the algorithm.

We initialise four values:

•

$l_{0}$ [], an array of distances from source to each vertex in the unweighted contracted syndrome graph. Initially, $l_{0}\texttt{[source]}=0$ and $l_{0}\texttt{[}v\texttt{]}=\infty$ for all the other vertices $v$ . As the algorithm progresses, the distance from source to each other vertex is updated.

•

$Q$ , a queue of vertices to be explored. Initially $Q$ contains only source. At the end of the algorithm, $Q$ is empty.

•

$\Omega_{0}$ [], an array of first-order degeneracies between source and each vertex in the contracted syndrome graph. Initially, $\Omega_{0}\texttt{[source]}=1$ . As the algorithm progresses, $\Omega_{0}$ [] between source and each other vertex is updated.

•

$\Omega_{1}$ [], an array of second-order degeneracies between source and each vertex in the contracted syndrome graph. Initially, $\Omega_{1}\texttt{[}v\texttt{]}=0$ for all vertices $v$ . As the algorithm progresses, $\Omega_{1}$ [] between source and each other vertex is updated.

Notice that $Q$ here is a normal queue, with only two operation: add_end() and extract_first_element(). The operation add_end( $v$ ) adds $v$ at the end of the queue, and the operation extract_first_element() extracts the first element in the queue.

The Dijkstra’s algorithm for unweighted graphs is slightly different. This is because we need to update the distance of a given vertex only once. In summary, the algorithm first initialises the array of distances, the two arrays of degeneracies and the queue. Then, at each step, the first vertex of $Q$ is extracted and set as the current vertex. We consider all its children. There are up to four cases we need to analyse:

$l_{0}\texttt{[child]}=\infty$ : child has not been visited, so its distance is updated to $l_{0}\texttt{[current]}+1$ . We update $\Omega_{0}\texttt{[child]}$ to $\Omega_{0}\texttt{[current]}\cdot\omega(\texttt{current},\texttt{child})$ , where $\omega(e)$ is the time overlap for edge $e$ . The vertex child is then added to the end of $Q$ . 2. 2.

$l_{0}\texttt{[child]}=l_{0}\texttt{[current]}+1$ : child has been visited before through a different shortest path. We update $\Omega_{0}\texttt{[child]}$ by adding $\Omega_{0}\texttt{[current]}\cdot\omega(\texttt{current},\texttt{child})$ to its old value. 3. 3.

$l_{0}\texttt{[child]}=l_{0}\texttt{[current]}$ : child and current are equidistant from source, and the edge (child, current) belongs to a second-shortest path. We update $\Omega_{1}\texttt{[child]}$ by adding $\Omega_{0}\texttt{[current]}\cdot\omega(\texttt{current},\texttt{child})$ to its old value. 4. 4.

$l_{0}\texttt{[child]}=l_{0}\texttt{[current]}-1$ : current may be visited through a second-shortest path. We update $\Omega_{1}\texttt{[current]}$ (and not $\Omega_{1}\texttt{[child]}$ ) by adding $\Omega_{1}\texttt{[child]}\cdot\omega(\texttt{current},\texttt{child})$ to its old value.

We repeat the above procedure until $Q$ is empty.

The update of $\Omega_{1}\texttt{[]}$ works by noticing that Dijkstra’s algorithm explores the vertices in layers. First all vertices with distance $1$ are queued and later explored, followed by all vertices with distance $2$ , and so on. A second-shortest path can only happen if it is a combination of a shortest path with an edge between two vertices from the same layer, i.e., with the same distance $l_{0}\texttt{[]}$ . Condition $3$ ( $l_{0}\texttt{[child]}=l_{0}\texttt{[current]}$ ) ensures that we go from a shortest path to a second-shortest path via an edge between vertices in the same layer. We then need to use the first-order degeneracy $\Omega_{0}\texttt{[]}$ to update the second-order degeneracy $\Omega_{1}\texttt{[]}$ . This works because at this point of the algorithm all values of $\Omega_{0}\texttt{[]}$ for the given layer were already calculated. On the other hand, condition $4$ ( $l_{0}\texttt{[child]}=l_{0}\texttt{[current]}-1$ ) ensures that we stay on a second-shortest path if there is one through child, as the transition between shortest and second-shortest paths happened in some previous layer. Hence the use of a second-order degeneracy $\Omega_{1}\texttt{[]}$ to also update a second-order degeneracy. This works since the values of $\Omega_{1}\texttt{[]}$ for a given layer are completely calculated once all its vertices are considered (differently from $\Omega_{0}\texttt{[]}$ , whose values are calculated when all the vertices from the previous layer have been considered). Moreover, if there is no second-shortest path through child to current, then $\Omega_{1}\texttt{[child]}=0$ and so $\Omega_{1}\texttt{[current]}$ is unchanged.

We provide a pseudo-code below for our adapted Dijkstra’s algorithm. If we are not required to compute the second-order degeneracy term $\Omega_{1}\texttt{[]}$ , then all the lines regarding it can be ignored (lines $4,18,19,20,21$ ).

Bibliography30

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Kitaev [2003] A. Y. Kitaev, Fault-tolerant quantum computation by anyons, Annals of Physics 303 , 2 (2003) .
2Dennis et al. [2002] E. Dennis, A. Kitaev, A. Landahl, and J. Preskill, Topological quantum memory, Journal of Mathematical Physics 43 , 4452 (2002) .
3Monroe et al. [2014] C. Monroe, R. Raussendorf, A. Ruthven, K. Brown, P. Maunz, L.-M. Duan, and J. Kim, Large-scale modular quantum-computer architecture with atomic memory and photonic interconnects, Physical Review A 89 , 022317 (2014) .
4Nickerson [2015] N. Nickerson, Practical fault-tolerant quantum computing , Ph.D. thesis , Imperial College London (2015).
5Kim et al. [2011] J. Kim, P. Maunz, T. Kim, J. Hussman, R. Noek, A. Mehta, and C. Monroe, Modular universal scalable ion-trap quantum computer (MUSIQC), in AIP Conference Proceedings , Vol. 1363 (American Institute of Physics, 2011) pp. 190–193.
6Knill et al. [2001] E. Knill, R. Laflamme, and G. J. Milburn, A scheme for efficient quantum computation with linear optics, Nature 409 , 46 (2001) .
7Gimeno-Segovia et al. [2015] M. Gimeno-Segovia, P. Shadbolt, D. E. Browne, and T. Rudolph, From three-photon Greenberger-Horne-Zeilinger states to ballistic universal quantum computation, Physical review letters 115 , 020502 (2015) .
8Morley-Short et al. [2019] S. Morley-Short, M. Gimeno-Segovia, T. Rudolph, and H. Cable, Loss-tolerant teleportation on large stabilizer states, Quantum Science and Technology 4 , 025014 (2019) . · doi ↗

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Taxonomy

Decoding probabilistic syndrome measurement and the role of entropy

Abstract

I Introduction

II Asynchronism in the toric code

II.1 The toric code

II.2 Asynchronous stabiliser measurement

II.3 Constructing the decoding problem

II.3.1 Simple syndrome graph

II.3.2 Contracted syndrome graph

II.3.3 Continuous stabiliser measurement

III Decoding

III.1 Anyon-pairing decoders

III.2 Decoding algorithms

III.2.1 Contracted Syndrome Graph decoders

III.2.2 Approximated decoders

IV Results

IV.1 Simulation methods

IV.1.1 Threshold performance

IV.1.2 Analysing entropic contributions

IV.2 Threshold performance

IV.3 Advantage in logical gate time

V Conclusions

Acknowledgments

Appendix A Simulation methods

A.1 Discrete measurement

A.2 Continuous measurement

A.3 Estimating the PEP_{E}PE​ terms

Appendix B Extra Results

B.1 Fully synchronous i.i.d. error model

B.2 Average Position decoder

B.3 Block Graph decoder

B.3.1 Time weight parameter

B.3.2 Degeneracy

B.4 Degeneracy in the Contracted Graph decoder

B.5 PEP_{E}PE​ terms

Appendix C Dijkstra’s algorithm

C.1 Degeneracy terms

A.3 Estimating the $P_{E}$ terms

B.5 $P_{E}$ terms