A stochastic variant of the abelian sandpile model
Seungki Kim, Yuntao Wang

TL;DR
This paper introduces a stochastic variant of the abelian sandpile model (SSP), exploring its mathematical properties, physical behavior, and potential as a simplified model for understanding the LLL algorithm in lattice cryptography.
Contribution
The paper develops a basic theoretical framework for SSP, a stochastic extension of ASM, and investigates its properties and potential applications in cryptography.
Findings
SSP shares mathematical properties with ASM
SSP exhibits different physical behavior and steady state shape
Numerical study of SSP's behavior supports its theoretical analysis
Abstract
We introduce a natural stochastic extension, called SSP, of the abelian sandpile model(ASM), which shares many mathematical properties with ASM, yet radically differs in its physical behavior, for example in terms of the shape of the steady state and of the avalanche size distribution. We establish a basic theory of SSP analogous to that of ASM, and present a brief numerical study of its behavior. Our original motivation for studying SSP stems from its connection to the LLL algorithm established in another work by the authors [5]. The importance of understanding how LLL works cannot be stressed more, especially from the point of view of lattice-based cryptography. We believe SSP serves as a tractable toy model of LLL that would help further our understanding of it.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
A stochastic variant of the abelian sandpile model
Seungki Kim and Yuntao Wang
Abstract.
We introduce a natural stochastic extension, called SSP, of the abelian sandpile model(ASM), which shares many mathematical properties with ASM, yet radically differs in its physical behavior, for example in terms of the shape of the steady state and of the avalanche size distribution. We establish a basic theory of SSP analogous to that of ASM, and present a brief numerical study of its behavior.
Our original motivation for studying SSP stems from its connection to the LLL algorithm established in another work by the authors [5]. The importance of understanding how LLL works cannot be stressed more, especially from the point of view of lattice-based cryptography. We believe SSP serves as a tractable toy model of LLL that would help further our understanding of it.
1. Introduction
1.1. Overview
Let us start by recalling ASM ([1], [3], also see [4]) on a one-dimensional lattice. Let be the cycle graph whose vertices consist of elements, say , and the (undirected) edges are defined by . We designate as the sink. Fix two positive integers and , preferably .
We temporarily refer to as the “toppling strength” and as the threshold. That is, toppling at site is to subtract grains from the pile on and add grains to each of its neighbors. A configuration on is called stable if and only if all non-sink vertices have grains. Fix another positive integer coprime to , preferably . One then considers the following Markov chain on the space of the stable configurations of ASM: given a stable configuration on , obtain the next stable configuration by first adding grains to any randomly chosen non-sink site, and then stabilizing the resulting configuration.111The only reason to impose the condition is to prevent the pile heights at each site from being concentrated on a select few congruence classes modulo ; it is not so much an essential condition as a cosmetic one.
On the other hand, consider the following simple stochastic variant of ASM, which we named SSP. SSP is exactly the same as ASM except for the toppling procedure: first one samples uniformly from (so that has mean ) and then subtract grains from site and add to each of the two neighboring sites. One can of course associate a Markov chain to SSP analogously to the above discussion: add grains to a random site and then stabilize.
It is rather surprising that this simple and natural extension of ASM has never been considered in the literature so far. In fact, our motivation for studying SSP does not come from physics, but from a study of the practical behavior of the LLL algorithm [8]. The LLL algorithm is one of the fundamental tools in computational mathematics used for lattice basis reduction, with numerous applications to number theory and cryptography — see the book [10] for an extensive treatment on the subject. Despite its celebrated status, much of its behavior in practice has been shrouded in complete mystery. For example, the most well-known problem concerns the quantity called the root Hermite factor(RHF), defined as
[TABLE]
where is the pile height at , in sandpile language. It is a theorem from [8] that the RHF of LLL has a sharp upper bound by , but empirically one observes RHF most of the time — see [9] for details. This phenomenon has been well-known since the birth of LLL in 1982, yet there has been not even a vague heuristic as to why this must happen.
Recently, we argued in [5] that LLL behaves like a sandpile model, and that this idea explains much of its practical behavior all at once, in particular this RHF problem. The remaining difficulty is that LLL as a sandpile model is nonabelian, which is difficult to study analytically. Hence we invented SSP, the closest abelian version of LLL, to be used as a toy model.
Figure 1 compares the empirical average “steady state density” of LLL and SSP. The resemblance therein should come as surprising, since the two algorithms originate from very different fields of study, lattice reduction and statistical physics, respectively. We hope that the present paper helps build a bridge between them, to the benefit of both areas.
1.2. In this paper
In the next section, we develop a basic mathematical theory of SSP. As in the case of ASM, we impose a monoid structure on the set of stable configurations, which we show is abelian (Theorem 4), and we prove the existence of the unique steady state (Theorem 5). Many of our results can be proved using the operator algebra method e.g. as in [12], but here we introduce a different idea, which is useful for picturing the “shape” of the steady state. With its help we are able to prove that the average of SSP over the steady state is strictly bounded away from the worst-case by a constant independent of system size (Proposition 8). This is precisely what one wants to show with LLL, and hopefully our idea generalizes to this case, the only obstacle being its non-abelian toppling rule.
In Section 3, we provide some preliminary numerical studies on the statistical behavior of SSP, concerning its avalanche size distribution and its steady state. Though we still need a more extensive investigation, it seems that SSP has much in common with other one-dimensional stochastic models, such as the abelian Manna model and the Oslo rice-pile model. From the LLL perspective, it is especially interesting that both the Oslo and the Manna models exhibit the same boundary phenomenon as can be seen in Figure 1 (see [6] and [12]).
1.3. Acknowledgment
We thank Deepak Dhar, Phong Nguyen, and Su-Chan Park for helpful comments and suggestions.
2. Mathematical properties of SSP
2.1. Setup
In the literature on sandpiles, there seem to exist two sets of notations, one that is physics-oriented and one that is mathematics-oriented — heights versus configurations, sites versus vertices, relaxations versus stabilizations, and so on. This paper mostly employs the latter, and there are several words that we made up ourselves. In any case, we provide the definition of every term that we use below.
Like ASM, SSP is played on a finite undirected graph , equipped with a designated vertex called sink, which we denote here by . Throughout this paper, we assume that the graph obtained from by removing is connected. In addition, let be a probability mass function supported on a finite subset of , and let be an integer satisfying , which we sometimes refer to as the threshold.
A pure configuration is a function . A (mixed) configuration is a formal finite sum of pure configurations of form , where are positive real numbers such that , and are pure configurations. The role the bracket notation here is to clarify that this is a formal sum. We define configurations this way, since toppling in SSP leads to multiple probabilistic results, as will be explained below.
A pure configuration is called stable if for all , and called nonnegative if for all . A configuration is called stable or nonnegative if it is a formal sum of stable or nonnegative pure configurations, respectively.
Toppling a pure configuration at a non-sink vertex goes as follows: i) sample from according to , ii) subtract from iii) for each add to , where is the number of edges connecting and . Denote the outcome by . If , we call this toppling legal. It is clear that, unless is supported on a singleton set, toppling a pure configuration results in a mixed configuration.
A choice algorithm is a map , such that . Fixing a choice algorithm , we can define what it means to topple a mixed configuration . For a pure configuration , define
[TABLE]
Then we define , the outcome of toppling once. The stabilization of is the outcome of toppling repeatedly until it becomes stable. Later we will show that is independent of .
Remark*.*
It is possible to extend the definition of a choice algorithm to a map into the set of probability mass functions on , so that only if , and
[TABLE]
All the results in this paper carry over to this generalization.
As in the case of ASM, we can define an operation on the set of nonnegative stable configurations. If and are pure stable configurations, we define , where is defined so that . In general, if and , then we define and accordingly .
Because we assumed that , for any nonnegative configurations and , is nonnegative as well. Later, we will show that is abelian and associative, that the set of nonnegative stable configurations form a semigroup under this operation, whose minimal ideal consists of a single element which is the steady state.
2.2. Stabilization is independent of choice algorithms
We start by defining what we temporarily call an increment stack. An increment stack consists of sequences, each of which is indexed by an element of , whose entries are chosen from the elements of ; thus it has the form , with each . The set of all increment stacks is given a topology and a measure as follows: the cylinder sets, i.e. sets of form , form the base for the closed sets, and the measure is defined by . This notion simulates the randomness in the amount being toppled during a stabilization process: SSP is equivalent to first sampling an increment stack according to , and if one must topple at for the -th time, remove unit of sand from and distribute it equally among the neighbors of (respecting the multiplicities).
For a function , a substack of an increment stack is a finite-length sequence of form . We define the following partial order on the set of substacks: if for . Given a pure configuration , if for every
[TABLE]
or equivalently, if toppling, legally or illegally, according to stabilizes — then is called a stabilizing substack of .
Lemma 1**.**
For any given increment stack and pure configuration , there exists a unique minimal stabilizing substack of .
Proof.
Suppose and are two distinct minimal stabilizing substacks. Define , and let be another substack; by assumption and . Then, for every non-sink ,
[TABLE]
Therefore is a stabilizing substack, contradicting the minimality of and . ∎
Given an increment stack and a pure configuration , a choice algorithm induces a stabilizing substack , where is the number of times would topple on until it stabilizes .
Proposition 2**.**
Any choice algorithm induces the minimal stabilizing substack.
Proof.
Write for the minimal stabilizing substack, and for the substack induced by a choice algorithm . Imagine toppling an input configuration , one vertex at a time, and consider a situation in which has just toppled on for times, being the first vertex at which this occurs. Observe that, at this point, the configuration is stabilized at . Hence cannot topple more on , at least not until it topples on some neighborhood of more than times.
Suppose is the second vertex at which topples on for times. By the same argument as above, cannot topple on more than times until it topples on some more than times. Repeating, we see that on no vertices can topple more than times. This proves that , but since is minimal we must have . ∎
An immediate consequence of Proposition 2 is that, for any configuration , pure or non-pure, its stabilization is independent of the choice algorithm . Indeed, is determined solely by the measure on the set of all increment stacks, and the minimal stabilizing substack of each increment stack. Therefore, the notion of a choice algorithm is somewhat superfluous from a theoretical viewpoint.
2.3. is abelian and associative
It is clear by definition that is abelian. We will now show that is associative as well. Since and , it suffices to show that
Proposition 3**.**
For nonnegative configurations and ,
[TABLE]
or equivalently,
[TABLE]
Proof.
Assume first that and are pure configurations. Fix an increment stack . Let and be the minimal stabilizing substacks for and , respectively. Note — this is where we use the nonnegativity of and .
Consider the left-hand side of (2), namely . A priori, we need two increment stacks for each of the two stabilizations: for the inner term and for the outer one. Write . It suffices to prove that
[TABLE]
or, in other words, that carrying out both stabilizations on the left-hand side of (2) using only one increment stack induces no distortion on the measure . But this is clear from the definition of .
It remains to consider the case in which and are not necessarily pure configurations. Write and . Then
[TABLE]
as desired. ∎
Remark*.*
If the nonnegativity assumption is not present, Proposition 3 fails. In fact, it fails for ASM as well.
Thanks to our results so far, we have the following
Theorem 4**.**
The set of all nonnegative stable configurations of a SSP forms a commutative monoid under .
2.4. The existence of unique steady state
Denote the monoid in Theorem 4 by . A steady state is an element such that
[TABLE]
It is clear from this definition that a steady state, if exists, is unique.
Theorem 5**.**
Assume the greatest common divisor of the elements of equals . Then has a steady state.
There is no loss of generality incurred by the assumption in the theorem, since if we divide all elements of by the g.c.d. then we obtain essentially the same model.
Our proof of Theorem 5 relies almost entirely on the following lemma.
Lemma 6**.**
Let such that , and such that the g.c.d. of the elements of equals . Consider the matrix
[TABLE]
with the first column filled with , the superdiagonal with , and [math] elsewhere. Then its (complex right) eigenvalue with the largest modulus is 1, which has multiplicity 1, with the eigenvector
[TABLE]
where .
Proof.
The characteristic equation of equals
[TABLE]
We first show that is an eigenvalue of multiplicity one. Certainly . Also since
[TABLE]
and , . It is straightforward to check that (3) is a solution to . This proves the latter two claims of the lemma.
If , then , so it cannot be a root of . Therefore it suffices to show that is the only complex eigenvalue with modulus . Suppose and = 0. Then for all , must hold. In other words, is an -th root of unity for all . But since the g.c.d. of the elements of equals 1 by assumption, is forced. ∎
An immediate consequence of Lemma 6 is that, for any , approaches a constant multiple of (3) as . This is how we use Lemma 6 in the argument below.
Proof of Theorem 5.
Write , and let be the smallest positive integer such that . Define to be the toppling operator at with increment (hence coincides with the toppling operator of the standard ASM on ), so that where is the toppling operator of SSP.
Define by , where the constant factor is set so that , that is,
[TABLE]
For any pure configuration , define
[TABLE]
One can think of as a distribution supported on points on the space of (pure) configurations, which turn out to form the shape of a parallelepiped; see Figure 2 for an illustration in case . The point on the upper-right corner corresponds to , and other points are obtained by toppling in various directions; see again Figure 2 for an illustration.
Write for the indicator function of the statement inside the parenthesis, i.e. if it is satisfied, , otherwise zero. From Lemma 6, it follows that
[TABLE]
This means that, if we “push” the parallelepiped-shaped distribution in the direction of by applying to the face of the parallelepiped corresponding to that direction, we obtain the same distribution on the parallepiped, except that the upper-right corner of the parallelepiped moves from to .
We choose to be the set of all recurrent configurations of ASM on , whose threshold on each vertex is set uniformly to , rather than the conventional . For convenience, let us temporarily write
[TABLE]
We will show that is the steady state.
By Proposition 3, it suffices to show that for all . By the theory of ASM, contains exactly one representative of each of the equivalence classes of configurations i.e. for each , there exists exactly one such that stabilizes to in ASM. By (4), , and therefore, by Proposition 3 again,
[TABLE]
as desired. ∎
The point of the following proposition is that, in a SSP, if we start from any configuration, and repeat sufficiently many times the Markov process of adding a sand grain at random — e.g. add — and re-stabilizing, then we obtain a configuration that is arbitrarily close to the steady state. In the context of LLL, in place of the Markov chain one takes a large input, which is why the proposition below is stated the way it is.
Proposition 7**.**
We continue with the notations of Theorem 5. In addition, below we interpret a pure configuration as an -dimensional vector with integer entries, and a mixed configuration as a finitely supported function on the set of pure configurations. Then we impose the typical Euclidean metric on the space of pure or mixed configurations accordingly.
For any , there exists such that the stabilization of any non-negative pure configuration whose distance from origin is greater than is within an distance of with respect to the uniform norm, where is the pure configuration that is recurrent and is equivalent to under the ASM on .
Proof.
If is far enough from the origin, then it must be toppled at each vertex arbitrarily many times in order for it to become stabilized. So without loss of generality, assume that can be legally toppled at every vertex. Write as earlier, and consider the configuration
[TABLE]
(5), like , may be thought of as a distribution on a parallelepiped illustrated in Figure 2. By Lemma 6, if we “push” the parallelepiped — in the manner similar to (4) — from every direction sufficiently many times, the weight on each and every point will be arbitrarily close to . Hence, if is sufficiently far enough from the origin, the stabilization of (5) will be arbitrarily close to . This completes the proof. ∎
The steady state alone forms the minimal ideal of the commutative monoid of the nonnegative stable configurations of SSP. One may think this is disappointing, since its counterpart in ASM has a very rich theory. However, the theory of the abelian sandpile group can be extended to SSP in a straightforward way, if phrased in terms of the cosets of the “translations” , since SSP respects this structure. Indeed, in our proof of Theorem 5, it can be seen that the underlying ASM theory plays a role. Our SSP theory is building up on top of the ASM theory, rather than replacing it.
2.5. The shape of the steady state on
Ideally, we would like to prove the various quantitative properties of the steady state that we observe in Figure 1: the middle values are almost identical to one another, but near the boundaries there is a sharp drop in pile height, et cetera. This can be interpreted as the following combinatorial problem. Consider the parallelepiped in Figure 2, and suppose is a stable configuration; it may help to think of as the maximal stable configuration, i.e. has height on all sites. Then some configurations on the parallelepiped are stable e.g. , but others may not be. Because is stable, we cannot legally push the entire face of the parallelepiped as we have done in the proofs of Theorem 5 and Proposition 7. Instead, we must “fold” the parts of the parallelepiped that are outside the set of stable configurations. Understanding the steady state then amounts to understanding the resulting distribution on the “folded” parallelepiped — they are identical. In some sense, this parallelepiped-folding replaces the role of the burning algorithm (see [3], also [2]) in the study of the steady state.
The difficulty lies in describing this folded distribution in a mathematically concise manner. The best we can prove at the moment is that there exists a gap, whose size is independent of the system size , between the average-case and the worst-case RHF (1) of SSP on . This is in analogy with the folklore observation that there exists a gap between the average-case and the worst-case RHF of the LLL algorithm.
Proposition 8**.**
Let be as in the proof of Theorem 5. Then for any and sufficiently large, the steady state of SSP on has the average RHF bounded from above by .
Remark*.*
Note that the maximum equals . For the SSP in the introduction i.e. (uniform distribution on ), so the proposition yields the bound , whereas experimentally we have .
Proof of Proposition 8.
As in the remark above, the greatest possible equals . For , we will estimate the number of pure stable configurations whose RHF is greater than . If satisfies such a condition, then writing , from (1) it follows that
[TABLE]
Moreover, this is an if-and-only-if condition. Hence it comes down to measuring the volume of the set of vectors , , such that (6) holds. This set forms a simplex, with one vertex at origin, and other vertices given by
[TABLE]
for each , whose only non-zero entry is the -th entry. The volume of this simplex equals
[TABLE]
which, by Stirling’s formula, is strictly bounded by for all sufficiently large . Furthermore, the maximum possible probability mass of the steady state is given by . Hence, for large, the portion of the steady state lying on the set is arbitrary small if
[TABLE]
This completes the proof. ∎
3. Behavior of SSP on one-dimensional lattices
In this section, we present a pilot numerical study on the behavior of SSP pertaining to quantitative properties of its steady state, whose existence is established in the previous section. We restrict our attention to SSP on one-dimensional grids for , with and the uniform distribution on , adding grains to a random site for each step in the associated Markov chain. We hope to be able to carry out more extensive experiments at a later time.
3.1. Avalanche size distribution
Figure 3 presents the log-log scale graph of the avalanche size — i.e. the number of sites on which at least one toppling occurred during a stabilization — distribution of SSP. For each system size , we made one million trials. The -axis represents of avalanche sizes, and the -axis represents of the number of occurences.
Figure 3 clearly suggests a power law, plus a delta distribution near . The data points for , excluding those at the tail, form a line of slope . Of course, more experiments are needed to precisely determine the exponent of SSP.
It may be amusing to recall the behavior of one-dimensional ASM, which is well-known to behave in the totally opposite way, i.e. (avalanche size) (frequency). On the other hand, several stochastic models in dimension 1 are known to demonstrate a power law ([6], [7], [12]).
3.2. Average heights of the piles in the steady state
Figure 4 presents the (empirical) average steady state density of SSP for system sizes 400, 4000, 40000, respectively. The values in the middle seem to stabilize at , and in the extremes at .
The obvious curiosity is the fall-off shape at the boundaries. Figure 5 puts together the three graphs in Figure 4 for a comparison. Here we can see that the boundary areas overlap, which indicate that the effect of a boundary is independent of system size. Also observe that, in Figure 5, the first half of the average output shape of overlaps with the other shapes; the latter half, of course, must be the mirror image of the first half. Indeed, the first half of the average output shape of is also found to overlap with that of .
At this point, we are able to only speculate as to why the average height diminishes near the boundaries, as follows. When a toppling occurs at one of the middle vertices, it is expected that some of the toppled sand comes back when the neighboring vertex topples. Some grains of sand may travel several steps away from where it started and then come back. However, when a toppling occurs at the left extreme, for example, the sand that moved to the left is lost forever.
Other stochastic sandpile models in dimension one, such as the abelian Manna model ([12]) and the Oslo rice-pile model ([6]), have also been observed to exhibit similar-looking shapes. [6] attributes it to the finite-size scaling theory — see equation (7) in [6]. However, in the SSP case, the same explanation seems to account for only the first 10 points or so away from the boundary.
It is also natural to ask whether the general features of the average steady state density are preserved if is replaced with something other than a uniform distribution. To this end, we ran several brief experiments with the Poisson distribution, and also the family of fat-tailed distributions of index i.e. , which is often used to produce unusual outcomes.222These distributions have infinite support, and thus they are not strictly covered by the theory developed here. However, the parallelepiped-pushing idea still goes through. In all cases, we still observe both the flatness of the middle values and the diminishing amount of sand near the boundaries, suggesting that these are general properties of one-dimensional stochastic sandpiles. For instance, see Figure 6, the outcome for the case of fat-tailed distributions of indices and , with mean and standard deviation respectively. It may be interesting to study how the average output shape of SSP and the representative values of are related; for example, it seems reasonable to conjecture that the mean of correlates with the pile height in the middle, and the standard deviation with the boundary effect.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] P. Bak, C. Tang, and K. Wieselfeld. Self-organized criticality: an explanation of 1 / f 1 𝑓 1/f noise. Phys. Rev. Lett. , 59:381-384, 1987.
- 2[2] Y. Chan, J.-F. Marckert, T. Selig. A natural stochastic extension of the sandpile model on a graph. J. Combin. Theory Ser. A , 120 (2013), no. 7, 1913-1928.
- 3[3] D. Dhar. Self-organized critical state of sandpile automaton models. Phys. Rev. Lett. , 64(14):1613-1616, 1990.
- 4[4] D. Dhar. Theoretical studies of self-organized criticality. Physica A , 369(2006) 29-70.
- 5[5] J. Ding, S. Kim, T. Takagi, and Y. Wang. LLL and stochastic sandpile models. Preprint. Available at https://sites.google.com/view/seungki/
- 6[6] P. Grassberger, D. Dhar, and P. K. Mohanty. Oslo model, hyperuniformity, and the quenched Edwards-Wilkinson model. Phys. Rev. E, 94, 042314 (2016).
- 7[7] H. N. Hyunh, G. Pruessner, L. Y. Chew. The Abelian Manna model on various lattices in one and two dimensions. J. Stat. Mech. , P 09024 (2011).
- 8[8] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász. Factoring polynomials with rational coefficients. Math. Ann. , 261(4):515-534, 1982.
