Automatic Generation of Moment-Based Invariants for Prob-Solvable Loops
Ezio Bartocci, Laura Kov\'acs, Miroslav Stankovi\v{c}

TL;DR
This paper introduces a method to automatically generate moment-based invariants for Prob-Solvable loops in probabilistic programs, enabling comprehensive analysis of loop behaviors beyond expected values.
Contribution
It presents a novel combination of symbolic summation and statistical methods to derive higher-order moment invariants for a specific class of probabilistic loops.
Findings
Successfully derived invariants for complex probabilistic loops
Automated computation of higher-order moments demonstrated
Enhanced analysis capabilities for probabilistic program behaviors
Abstract
One of the main challenges in the analysis of probabilistic programs is to compute invariant properties that summarise loop behaviours. Automation of invariant generation is still at its infancy and most of the times targets only expected values of the program variables, which is insufficient to recover the full probabilistic program behaviour. We present a method to automatically generate moment-based invariants of a subclass of probabilistic programs, called Prob-Solvable loops, with polynomial assignments over random variables and parametrised distributions. We combine methods from symbolic summation and statistics to derive invariants as valid properties over higher-order moments, such as expected values or variances, of program variables. We successfully evaluated our work on several examples where full automation for computing higher-order moments and invariants over program…
| Program | Moment | Runtime () | Computed Moment-Based Invariants |
| Coupon [18] | 1 | 0.37 | |
| 2 | 0.40 | ||
| 3 | 0.34 | ||
| Coupon4 [18] | 1 | 0.90 | |
| 2 | 1.1 | ||
| 3 | 1.3 | ||
| random_walk_1d_cts [18] | 1 | 0.12 | |
| 2 | 0.45 | ||
| 3 | 1.00 | ||
| sum_rnd_series [6] | 1 | 0.31 | |
| 2 | 2.89 | ||
| 3 | 17.7 | ||
| product_dep_var [6] | 1 | 0.65 | |
| 2 | 6.27 | ||
| 3 | 37.5 | ||
| random_walk_2d [4, 18] | 1 | 0.07 | |
| 2 | 0.26 | ||
| 3 | 0.49 | ||
| Binomial(””) [6, 8, 14] | 1 | 0.17 | |
| 2 | 0.47 | ||
| 3 | 1.6 | ||
| StutteringA – Fig. 1(A) | 1 | 0.44 | |
| 2 | 2.2 | ||
| 3 | 8.48 | ||
| StutteringB – Fig. 1(B) | 1 | 0.49 | |
| 2 | 2.03 | ||
| 3 | 7.43 | = | |
| StutteringC – Fig. 1(C) | 1 | 1.8 | |
| 2 | 72.5 | ||
| 3 | 2144 | ||
| StutteringD – Fig. 1(D) | 1 | 1.92 | |
| 2 | 46.3 | ||
| 3 | 2076 | ||
| StutteringP | 1 | 0.28 | |
| 2 | 1.68 | ||
| 3 | 6.05 | ||
| Square | 1 | 0.38 | |
| 2 | 2.46 | ||
| 3 | 8.70 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
11institutetext: TU Wien, Austria
Automatic Generation of Moment-Based
Invariants for Prob-Solvable Loops
Ezio Bartocci
Laura Kovács
Miroslav Stankovič
Abstract
One of the main challenges in the analysis of probabilistic programs is to compute invariant properties that summarise loop behaviours. Automation of invariant generation is still at its infancy and most of the times targets only expected values of the program variables, which is insufficient to recover the full probabilistic program behaviour. We present a method to automatically generate moment-based invariants of a subclass of probabilistic programs, called Prob-solvable loops, with polynomial assignments over random variables and parametrised distributions. We combine methods from symbolic summation and statistics to derive invariants as valid properties over higher-order moments, such as expected values or variances, of program variables. We successfully evaluated our work on several examples where full automation for computing higher-order moments and invariants over program variables was not yet possible.
1 Introduction
Probabilistic programs (PPs), originally employed in cryptographic/privacy protocols and randomised algorithms, are now gaining momentum due to the several emerging applications in the areas of machine learning and AI [10]. One of the main problems that arise from introducing randomness into the program is that we can no longer view variables as single values; we must think about them as distributions. Existing approaches, see e.g. [21, 14, 4] usually take into consideration only expected values, or upper and lower bounds over program variables. As argued by [23], such information is however insufficient to characterize the full value distributions of variables; (co-)variances and other higher-order moments of variables are also needed. Consider for example the PPs of Fig. 1(A) and Fig. 1(B): the expected value of variable at each loop iteration is the same in both PPs, while the variance of the value distribution of differs in general (a similar behaviour is also exploited by Fig. 1(C)-(D)). Thus, Fig. 1(A) and Fig. 1(B) do not have the same invariants over higher-order moments; yet, current approaches would fail identifying such differences and only compute expected values of variables.
One of the main challenges in analysing PPs and computing their higher-order moments comes with the presence of loops and the burden of computing so-called quantitative invariants [14]. Quantitative invariants are properties that are true before and after each loop iteration. Weakest pre-expectations [21, 14] can be used to compute quantitative invariants. This approach, supported for example in Prinsys [11], consists in annotating a loop with a template invariant and then solve constraints over the unknown coefficients of the template. Other methods [2, 18] use martingales that are expressions over program variables whose expectations remain invariant. The aforementioned approaches are however not fully automatic since they require user guidance for providing templates and hints. In addition, they are limited to invariants over only expected values: with the exception of [18], they do not compute higher-order moments describing the distribution generated by the PP (see Section 6 for more details).
In this paper we introduce a fully automated approach to compute invariant properties over higher-order moments of so-called Prob-solvable loops, to stand for probabilistic P-solvable loops. Prob-solvable loops are PPs that extend the imperative P-solvable loops described in [17] with probabilistic assignments over random variables and parametrised distributions. As such, variable updates are expressed by random polynomial, and not only affine, updates (see Section 3). Each program in Fig. 1 is Prob-solvable; moreover, Fig. 1(C)-(D) involve nonlinear updates over .
Our work uses statistical properties to eliminate probabilistic choices and turn random updates into recurrence relations over higher-order moments of program variables. We show that higher-order moments of Prob-solvable loops can be described by C-finite recurrences (Theorem 4.1). We further solve such recurrences to derive moment-based invariants of Prob-solvable loops (Section 4). A moment-based invariant is a property that holds at arbitrary loop iterations (hence, invariants), expressing closed form solutions of higher-order moments of program variables. To the best of our knowledge, no other method is able to derive higher-order moments of PPs in a fully automated approach. Our work hence allows to replace, for example, the required human guidance of [11, 19] for Prob-solvable loops. Unlike existing works, we support PPs with parametrised distributions (e.g., in Fig. 1(A)): instead of taking concrete instances of a given parametrised distribution, we automatically infer invariants of the entire class of PPs characterised by the considered parametrised distribution.
Our approach is both sound and terminating: given a Prob-solvable loops and an integer , we automatically infer the moment-based invariants over the th moments of our input loop (see Section 4). Unlike the approach of [17] for deriving polynomial invariants of non-probabilistic (P-solvable) loops, our work only computes closed form expressions over higher-order moments and does not employ Gröbner basis computation to eliminate loop counters from the derived closed forms. As such, our moment-based invariants are not restrictive to polynomial properties but are linear combinations of polynomial expressions and exponential sequences over the loop counter. Moreover, Prob-solvable are more expressive than P-solvable loops as they are not restricted to deterministic updates but allow random assignments over variables.
Contributions.
Our main contributions are: (1) we introduce the class of Prob-solvable loops with probabilistic assignments over random variables and distributions (Section 3); (2) we show that Prob-solvable loops can be modelled as C-finite recurrences over higher-order moments of variables (Theorem 4.1); (3) we provide a fully automated approach that derives moment-based invariants over arbitrary higher-order moments of Prob-solvable loops (Algorithm 1); (4) we implemented our work as an extension of the Aligator package [12] and evaluated over several challenging PPs (Section 5).
2 Preliminaries
We recall basic mathematical properties about recurrences and higher-order moments of variable values – for more details see [16, 20]. Throughout this paper, let denote the set of natural, integer and real numbers. We reserve capital letters to denote abstract random variables, e.g. , and use small letters to denote program variables, e.g. , all possibly with indices.
2.1 C-Finite Recurrences
While sequences and recurrences are defined over arbitrary fields of characteristic zero, in our work we only focus over sequences/recurrences over .
Definition 1 (Sequence)
A univariate sequence in is a function . A recurrence for a sequence is
[TABLE]
for some function , where is called the order of the recurrence.
For simplicity, we denote by both the recurrence of as well as the recurrence equation . When solving the recurrence , one is interested in computing a closed form solution of , expressing the value of as a function of for any . In our work we only consider the class of linear recurrences with constant coefficients, also called C-finite recurrences.
Definition 2 (C-finite recurrences)
A C-finite recurrence satisfies the linear homogeneous recurrence with constant coefficients:
[TABLE]
where is the order of the recurrence, and are constants with .
An example of a C-finite recurrence is the recurrence of Fibonacci numbers satisfying the recurrence , with initial values and . Unlike arbitrary recurrences, closed forms of C-finite recurrences always exist [16] and are defined as:
[TABLE]
where are the distinct roots of the characteristic polynomial of and are polynomials in . Closed forms of C-finite recurrences are called C-finite expressions. We note that, while the C-finite recurrence (1) is homogeneous, inhomogeneous C-finite recurrences can always be translated into homogeneous ones, as the inhomogeneous part of a C-finite recurrence is a C-finite expression.
In our work, we focus on the analysis of Prob-solvable loops and consider loop variables as sequences , where denotes the loop iteration counter. Thus, gives the value of the program variable at iteration .
2.2 Expected Values and Moments of Random Variables
Here we introduce the relevant notions from statistics that our work relies upon.
Definition 3 (Probability space)
A probability space is a triple consisting of a sample space denoting the set of outcomes, where , a -algebra with , denoting a set of events, a probability measure s.t. .
We now define random variables and their higher-order moments.
Definition 4 (Random variable)
A random variable is a measurable function from a set of possible outcomes to .
In the context of our Prob-solvable loops, for each loop variable , we consider elements of its corresponding sequence to be random variables. When working with a random variable , one is in general interested in expected values and other moments of .
Definition 5 (Expected value)
An expected value of a random variable defined on a probability space is the Lebesgue integral: In the special case when is discrete, that is the outcomes are with corresponding probabilities , we have The expected value of is often also referred to as the mean or of .
For program variables of Prob-solvable loops, our work computes the expected values of the corresponding sequences but also higher-order and mixed moments.
Definition 6 (Higher-Order Moments)
Let be a random variable, and . We write to denote the th moment about of , which is defined as:
[TABLE]
In this paper we will be almost solely interested in moments about [math] (called raw moments) and about the mean (called central moments). We note though that we can move to moments of with different centers using Proposition 1.
Proposition 1 (Transformation of center)
Let be a random variable, and . The th moment about of , can be calculated from moments about of by:
Similarly to higher-order moments, we also consider mixed moments, that is , where and are random variables. For arbitrary random variables and , we have the following basic properties about their expected values and other moments:
- •
for a constant ,
- •
expected value is linear, and ,
- •
expected value is not multiplicative, in general
- •
expected value is multiplicative for independent random variables.
As a consequence of the above, expected values of monomials over arbitrary random variables, e.g. , cannot be in general further simplified.
The moments of a random variable with bounded support fully characterise its value distribution. While computing all moments of is generally very hard, knowing only a few moments of gives useful information about its value distributions. The most common moments are variance, covariance, skewness, as defined below.
Definition 7 (Common moments)
Variance measures how spread the distribution is and is defined as the second central moment: .
Covariance is a mixed moment measuring variability of two distributions and is defined as: .
Skewness measures asymmetry of the distribution and is defined as the normalised third central moment: .
Basic results about variance and covariance state: , , and .
Definition 8 (Moment-Generating Function (MGF))
A moment generating function of a random variable is given by:
[TABLE]
whenever this expectation exists.
Moment-generating functions, as the name suggests, can be used to compute higher-order moments of a random variable . If we take the th derivative of the moment-generating function of , evaluated at [math], we get the th moment about [math] of , that is 111due to the series expansion and derivative w.r.t. . For many standard distributions, including Bernoulli, uniform and normal distributions, the moment-generating function exists and gives us a way to compute the moments for random variables drawing from these distributions. Thanks to these properties, we can use common distributions in our Prob-solvable programs.
3 Programming Model: Prob-solvable Programs
We now introduce our programming model of Prob-solvable programs, to stand for probabilistic P-solvable programs. P-solvable programs [17] are non-deterministic loops whose behaviour can be expressed by a system of C-finite recurrences over program variables. Prob-solvable programs extend P-solvable programs by allowing probabilistic assignments over random variables and distributions.
Prob-solvable Loops.
Let and denote real-valued program variables. We define Prob-solvable loops with variables as programs of the form:
[TABLE]
- •
is a sequence of initial assignments over . That is, is an assignments sequence , with representing a number drawn from a known distribution 222a known distribution is a distribution with known and computable moments - in particular, can be a real constant.
- •
is the loop body and is a sequence of random updates, each of the form:
[TABLE]
or, in case of a deterministic assignment,
[TABLE]
where are constants and are polynomials over program variables . Further, in (6) is the probability of updating to , whereas the probability to update to in (6) is .
The coefficients , and the coefficients of and in the variable assignments (6)-(7) of Prob-solvable loops can be drawn from a random distribution as long as the moments of this distribution are known and are independent from program variables . Hence, the variable updates of Prob-solvable loop can involve coefficients drawn from Bernoulli, uniform, normal, and other distributions. Moreover, Prob-solvable support parametrised distributions, for example one may have the random distribution with arbitrary symbolic constants. Similarly, rather than only considering concrete numeric values of , the probabilities in the probabilistic updates (6) of Prob-solvable loops can also be symbolic constants.
Example 1
The programs in Fig. 1 are Prob-solvable, using uniform distributions given by . Fig. 1(D) also uses normal distribution given by . Note that while the random distributions of Fig. 1(B,D) are defined in terms of concrete constants, Fig. 1(A,C) have a parametrised random distribution, defined in terms of .
Prob-solvable Loops and Moment-Based
Recurrences.
Let us now consider a Prob-solvable program with denoting the loop iteration counter. We show that variable updates of Prob-solvable programs yield special recurrences in , called moment-based recurrences. For this, we consider program variables as sequences allowing us to precisely describe relations between values of at different loop iterations. Using this notation, probabilistic updates (6) over turn into a random variable, yielding the relation (similarly, for deterministic updates (7)):
[TABLE]
The relation above could be treated as a recurrence equation over random variables provided the probabilistic behaviour depending on is encoded (as an extension) into a recurrence equation. To analyse such probabilistic updates of Prob-solvable loops, for each random variable we consider their expected values and create new recurrence variables from expected values of monomials over original program variables (e.g. a new variable ). We refer to these new recurrence variables as E-variables. We note that any program variable yields an E-variable, but not every E-variable corresponds to one single program variable as E-variables are expected values of monomials over program variables. We now formulate recurrence equations over E-variables rather than over program variables, yielding moment-based recurrences.
Definition 9 (Moment-Based Recurrences)
Let be a sequence of random variables. A moment-based recurrence for is a recurrence over E-variable :
[TABLE]
for some function , where is the order of the moment-based recurrence.
Similarly to [21], note that variable updates yield the relation:
[TABLE]
Thanks to this relation, probabilistic updates (6) are rewritten into the moment-based recurrence equation:
[TABLE]
In particular, we have for the deterministic assignments of (7) (that is, in (7)).
By using properties of expected values of expressions over random variables, we obtain the following simplification rules:
[TABLE]
where is a constant and is a known independent distribution.
Example 2
The moment-based recurrences of the Prob-solvable loop of Fig. 1(A) are:
[TABLE]
By using the simplification rules (10) on the above recurrences, we obtain the following simplified moment-based recurrences of Fig. 1(A):
[TABLE]
In Section 4 we show that Prob-solvable loops can further be rewritten into a system of C-finite recurrences over E-variables.
Prob-solvable Loops and Mutually Dependent
Updates.
Consider PP loops with mutually dependent affine updates:
[TABLE]
where are constants. While such assignments are not directly captured by updates (6) of Prob-solvable loops, this is not a restriction of our work. Variable updates given by (12) yield mutually dependent C-finite recurrences over E-variables. Using methods from [16], this coupled system of C-finite recurrences can be rewritten into an equivalent system of independent C-finite recurrences over E-variables, yielding an independent system of moment-based recurrences over which our invariant generation algorithm from Section 4 can be applied. Hence probabilistic loops with affine updates are special cases of Prob-solvable loops.
Multi-Path Prob-solvable Loops.
While (5) defines Prob-solvable programs as single-path loops, the following class of multi-path loops can naturally be modeled by Prob-solvable programs:
[TABLE]
is as in (5), is a boolean-valued random variable, and and are respectively sequences of deterministic updates and as in (7). PPs (13) can be rewritten to equivalent Prob-solvable loops, as follows. A pair of updates from and from is rewritten by the following sequence of updates:
[TABLE]
with fresh program variables. The resulting program is Prob-solvable and we can thus compute moment-based invariants of multi-path loops as in (13). The programs Coupon, Random_Walk_2D of Table 1 are Prob-solvable loops corresponding to such multi-path loops.
4 Moment-Based Invariants of Prob-solvable Loops
Thanks to probabilistic updates, the values of program variables of Prob-solvable loops after specific number of loop iterations are not a priory determined. The value distributions of program variables are therefore random variables. When analysing Prob-solvable loops, and in general probabilistic programs, one is therefore required to capture relevant properties over expected values and higher moments of the variables in order to precisely summarise the value distribution of program variables.
Moment-Based Invariants.
We are interested in automatically generating so-called moment-based invariants of Prob-solvable loops. Moment-based invariants are properties over expected values and higher moments of program variables such that these properties hold at arbitrary loop iterations (and hence are invariants).
Automated Generation of Moment-Based
Invariants of Prob-solvable Loops.
Our method for generating moment-based invariants of Prob-solvable loops is summarized in Algorithm 1. Algorithm 1 takes as input a Prob-solvable loop and a natural number and returns moment-based invariants over the th moments of the program variables . We denote by the loop counter of .
Theorem 4.1
Higher-order moments of variables in Prob-solvable loops can be modeled by C-finite recurrences over E-variables.
Proof (Sketch)
We want to show that can be expressed using recurrence equation. The idea is to express in terms of value of at -th iteration. Value of is with probability and with probability . From here we can derive that E[x_{i}^{\alpha_{i}}(n+1)]=E[p_{i}\cdot\big{(}a_{i}x_{i}+P_{i}(x_{1},\dots x_{i-1})\big{)}^{\alpha_{i}}+(1-p_{i})\cdot\big{(}b_{i}x_{i}+Q_{i}(x_{1},\dots x_{i-1})\big{)}^{\alpha_{i}}]. For arbitrary monomial we can express by substituting each as above. This process is captured by line 8 of Alg. 1. The new equations can be further simplified using properties of expected values and the simplification rules (10) to give recurrence equations over E-variables.
We now describe Algorithm 1. Our algorithm first rewrites into a set of moment-based recurrences, as described in Section 3. That is, program variables are turned into random variables and variable updates over become moment-based recurrences over E-variables by using the relation of (8) (lines 1-2) of Alg. 1).
The algorithm next proceeds with computing the moment-based recurrences of the th moments of . Recall that the th moment of is given by:
[TABLE]
Hence, the set of monomials yielding E-variables for which moment-based recurrences need to be solved is initialized to (line 3 of Alg. 1). Note that by considering the resulting E-variables and solving the moment-based recurrences of , we derive closed forms of the th moments of (line 23 of Alg. 1). To this end, Algorithm 1 recursively computes the moment-based recurrences of every E-variable arising from the moment-based recurrences of (lines 5-21 of Alg. 1), thus ultimately computing closed forms for . One can then use transformations described in Proposition 1 to compute closed forms for other moments, such as variance and covariance. In more detail,
- •
for each monomial from , we substitute in by its probabilistic behaviour. That is, the update of in the Prob-solvable loop is rewritten, according to (8), into the sum of its two probabilistic updates, weighted by their respective probabilities (lines 6-8 of Alg. 1). Rewriting in line 8 of Alg. 1 represents the most non-trivial step in our algorithm, combining non-deterministic nature of our program with polynomial properties. The resulting polynomial from is then reordered to be expressed as a sum of new monomials (line 13 of Alg. 1); such a sum always exists as involves only addition and multiplication over (recall that and are polynomials over ).
- •
By applying the simplification rules(10) of E-variables over the moment-based recurrence of , the recurrence of is obtained and added to the set . Here, denotes . As the recurrence of depends on , moment-based recurrences of need also be computed and hence is enlarged by (lines 14-20 of Alg. 1).
As a result, the set of moment-based recurrences of E-variables corresponding to are obtained. These recurrences are C-finite expressions over E-variables (see correctness argument of Theorem 4.3) and hence their closed form solutions exist. In particular, the closed forms of is derived, turning into a inductive property that holds at arbitrary loop iterations and is hence a moment-based invariant of over the th moment of (line 23 of Alg. 1).
Theorem 4.2 (Soundness)
Consider a Prob-solvable loop with program variables and let be a non-negative integer with . Algorithm 1 generates moment-based invariants of over the th moments of .
Note when , Algorithm 1 computes the moment-based invariants as invariant relations over the closed form solutions of expected values of . In this case, our moment-based invariants are quantitative invariants as in [14].
Example 3
We illustrate Algorithm 1 for computing the second moments (i.e. ) of the Prob-solvable loop of Fig. 1(A). Our algorithm initializes and .
We next (arbitrarily) choose to be the monomial from . Thus, . Using the probabilistic update of , we replace by , that is by . As a result, and remains unchanged.
We next choose to be and set . We replace by its randomised behaviour, yielding E[M(n+1)]=E[x(n+1)^{2}]=E[\big{(}x(n)+f(n+1)\cdot{\tt rand(1-d,1+d)}\big{)}^{2}]. By the simplification rules (10) over E-variables, we obtain:
[TABLE]
as is independent from and We add the recurrence (15) to and keep unchanged as the E-variables have their recurrences already in .
We next set to and change . Similarly to , we get:
[TABLE]
by using that is independent from and . We add the recurrence (16) to and keep unchanged.
We set to , yielding . We extend with the recurrence:
[TABLE]
and add to . We therefore consider to be and set . We obtain:
[TABLE]
by using that and . We add the recurrence of to and keep .
As a result, we proceed to solve the moment-based recurrences of . We focus first on the recurrences over expected values:
[TABLE]
Note that the above recurrences are C-finite recurrences over E-variables. For computing closed forms, we respectively substitute by its closed form in and , yielding closed forms for and , and hence for . By also using the the initial values of Fig. 1, we derive the closed forms:
[TABLE]
We next similarly derive the closed forms for higher-order and mixed moments:
[TABLE]
yielding hence the moment-based invariants over the second moments of variables of Fig. 1. Using Proposition 1 and Definition 7, we derive the variance of as ∎
Let us finally note that the termination of Algorithm 1 depends whether for every monomial (from the set , line 5 of Alg. 1) the moment-based recurrence equation over the corresponding E-variable can be computed as C-finite expression over E-variables. To prove this, one can use transfinite induction over monomials . That is, we order monomials and show that the recurrence of depends only on smaller monomials, for which we can compute C-finite closed form expressions. Thus we have an inhomogeneous C-finite recurrence relation for , yielding a C-finite closed form expression.
Theorem 4.3 (Termination)
*For any non-negative integer with and any Prob-solvable loop with program variables , Algorithm 1 terminates.
Moreover, Algorithm 1 terminates in at most steps, where with denoting the degree of polynomials and of the variable updates (6).*
Proof
We associate every monomial with an ordinal number as follows:
[TABLE]
and order monomials such that if . Algorithm 1 terminates if for every monomial (from the set , line 5 of Alg. 1) the moment-based recurrence equation over the corresponding E-variable can be computed as C-finite expression over E-variables. We will show that this is indeed the case by transfinite induction over monomials.
Let be a monomial and assume that every smaller monomial has a closed form solution in form of a C-finite expression.
Let
[TABLE]
be the updates of our variables after removing the probabilistic choice as in line 6 of the algorithm. Then recurrence for is
[TABLE]
for some , constants and monomials all different than . By Lemma 1, we have an inhomogeneous C-finite recurrence relation , for some C-finite expression . Hence, the closed form of exists and is a C-finite expression. ∎
We finally prove our auxiliary lemma.
Lemma 1
* for all in (4).*
Proof
Let and have coming from
[TABLE]
Assume , i.e. , so we have . Note that in (19) only appears in factor . Considering the multiplicity, we get at most th power of , hence . Thus
So for we need from (c_{K}x_{K})^{\alpha_{K}}\cdot\prod_{i=1}^{K-1}\big{(}c_{i}x_{i}+P_{i}(x_{1},\cdots x_{i-1})\big{)}^{\alpha_{i}}.
Proceeding similarly for we get that for each we have , which contradicts the assumption, thus as needed.
Regarding the termination, let’s look at what monomials can possibly be added to . Let . Based on the algorithm and above it is clear that in case we have . For any the maximum value of is . Hence we have . thus we can count all possible monomials, hence the upper bound on the algorithm time complexity, as product of theses upper bounds. This yields as claimed. ∎
5 Implementation and Experiments
We implemented our work in the Julia language, using Aligator[12] for handling and solving recurrences. We evaluated our work on several challenging probabilistic programs with parametrised distributions, symbolic probabilities and/or both discrete and continuous random variables. All our experiments were run on MacBook Pro 2017 with 2.3 GHz Intel Core i5 and 8GB RAM. Our implementation and benchmarks are available at: github.com/miroslav21/aligator.
Benchmarks.
We evaluated our work on 13 probabilistic programs, as follows. We used 7 programs from works [6, 14, 4, 8, 18] on invariant generation. These examples are given in lines 1-7 of Table 1; we note though that Binomial(””) represents our generalisation of a binomial distribution example taken from [6, 8, 14] to a probabilistic program with parametrised probability . We further crafted 6 examples of our own, illustrating the distinctive features of our work. These examples are listed in lines 8-13 of Table 1: lines 8-11 correspond to the examples of Fig. 1; line 12 of Table 1 shows a variation of Fig. 1, with a parametrized distribution ; line 13 corresponds to a non-linear Prob-solvable loop computing squares. All our benchmarks are also available at the aforementioned url.
5.1 Coupon
Probabilistic model of Coupon Collector’s program for two coupons, taken from [18].
f := 0
c := 0
d := 0
while true:
f := 1 [1/2] 0
c := 1 - f + c*f
d := d + f - d*f
5.2 Coupon4
Probabilistic model of Coupon Collector’s program for four coupons, taken from [18].
f := 0
g := 0
a := 0
b := 0
c := 0
d := 0
while true:
f := 1 [1/2] 0
g := 1 [1/2] 0
a := a + (1-a)fg
b := b + (1-b)f(1-g)
c := c + (1-c)*(1-f)*g
d := d + (1-d)(1-f)(1-g)
5.3 Random_walk_1D_cts
A variation of random walk in one dimension with assignments from continuous distributions taken from [18].
v := 0
x := 0
while true:
v := u(0,1)
x := x + v [7/10] x - v
5.4 Sum_rnd_series
A program modeling Sum of Random Series game taken from [6].
n := 0
x := 0
while true:
n := n + 1
x := x + n [1/2] x
5.5 Product_dep_var
A program modeling Product of Dependent Random Variables game taken from [6].
f := 0
x := 0
y := 0
p := 0
while true:
f := 0 [1/2] 1
x := x + f
y := y + 1 - f
p := x*y
5.6 Random_walk_2D
A variation of random walk in two dimension as in [4, 18]. Each direction is chosen with equal probability.
h := 0
x := 0
y := 0
while true:
h := 1 [1/2] 0
x := x-h [1/2] x +h
y := y+(1-h) [1/2] y-(1-h)
5.7 Binomial
Another classic example, modeling binomial distribution. Appeared also in [6, 8, 14]. However, we consider the program to be parametric, computing invariants for arbitrary values of .
x := 0
while true:
x := x + 1 [p] x
5.8 StutteringA
Program 1(A) from Introduction.
5.9 StutteringB
Program 1(B) from Introduction.
5.10 StutteringC
Program 1(C) from Introduction.
5.11 StutteringD
Program 1(D) from Introduction.
5.12 StutteringP
A variation of program 1(A) from Introduction with , parametrized w.r.t. .
f := 0
x := -1
y := 1
s := 0
while true:
f := 1 [p] 0
x := x + f*u(0,2)
y := y + f*u(0,4)
s := x + y
5.13 Square
Our own program with polynomial assignments.
x := 0; y := 1
while true:
x := x+2 [1/2] x
y := x^2
Experimental Results with Moment-Based Invariants.
Results of our evaluation are presented in Table 1. While Algorithm 1 can compute invariants over arbitrary th higher-order moments, due to lack of space and readability, Table 1 lists only our moment-based invariants up to the third moment (i.e. ), that is for expected values, second- and third-order moments. The first column of Table 1 lists the benchmark name, whereas the second column gives the degree of the moments (i.e. ) for which we compute invariants. The third column reports the timings (in seconds) our implementation needed to derive invariants. The last column shows our moment-based invariants; for readability, we decided to omit intermediary invariants (up to for some programs) and only show the most relevant invariants.
We could not perform a fair practical comparison with other existing methods: to the best of our knowledge, existing works, such as [14, 11, 2, 18], require user guidance/templates/hints. Further, existing techniques do not support symbolic probabilities and/or parametrised distributions - which are, for example, required in the analysis of programs StutteringA, StutteringC, StutteringP of Table 1. We also note that examples Coupon, StutteringC, StutteringP involve non-linear probabilistic updates hindering automation in existing methods, while such updates can naturally be encoded as moment-based recurrences in our framework. We finally note that while second-order moments are computed only by [18], but with the help of user-provided templates, no existing approaches compute moments for . Our experiments show that inferring third-order moments are in general not expensive; yet, for examples StutteringA, StutteringC, StutteringP with parametrized distributions/probabilities more computation time is needed.
6 Related Work
Despite the impressive recent advancements, probabilistic model checking tools [1] (e.g., PRISM [19], STORM [7] and MRMC [15]) are not able to handle programs with unbounded and real variables. Model checking algorithms suffer from the state explosion problem and their performance in terms of time and memory consumption degrades as the number of reachable states to be considered increases. Furthermore, probabilistic model checking tools have no support for invariant generation. Our approach, based on symbolic summation over probabilistic expressions, can instead analyse probabilistic programs with a potentially infinite number of reachable states.
In [21], one of the first deductive frameworks to reason about probabilistic programs was proposed by annotating probabilistic programs with real-valued expressions over the expected values of program variables. Of particular interest are the annotations as quantitative invariants, summarising loop behaviors. The setting of [21] considers probabilistic programs where the stochastic inputs are restricted to discrete distributions with finite support and can deal also with demonic non-deterministic choice. Although our approach does not yet support demonic non-determinism, we are not restricted to discrete input distributions as long as we know their moments (e.g., the Gaussian distribution is characterised only by two moments: the mean and the variance). Moreover, our work is not restricted to quantitative invariants as invariants over expected values of program variables. Rather, we generate moment-based invariants that precisely capture invariant properties of higher-order and mixed moments of program variables.
Katoen et al. provided in [14] the first semi-automatic and complete method synthesising the linear quantitative invariants needed by [21]. The work of [14], implemented in PRINSYS [11], consists in annotating a loop with a linear template invariants and uses a constraint solver to find the parameters for which the template yields an invariant. The works [8, 6] also synthesize non-linear quantitative invariants.
Another related line of research is given in [2], where martingales are used to compute invariants of probabilistic programs. The martingales generated by [2] however heavily depend on the user-provided hints and hence less generic hints yield less expressive/precise invariants. Moreover, of [2] mainly focuses on invariants over expected values and it remains unclear which extensions of martingales need to be considered to compute higher-order moments. The work of [18] addresses such generalizations of martingales for computing higher-order moments of program variables, with the overall goal of approximating runtimes of randomized programs. The approach in [18] is however again restricted to user-provided templates. Unlike the works of [14, 11, 8, 6, 2, 18], our work does not rely on a priori given templates/hints, but computes the most precise invariant expression over higher-order or mixed moments of program variables. To do so, we use symbolic summation to compute closed forms of higher-order moments. In addition, Prob-solvable loops support parametrized distributions and symbolic probabilities, which is not the case of [2, 18].
There are two orthogonal problems related to quantitative invariants generation: program termination [22, 9] and worst-case execution [3, 13, 5]. The first is to assess whether a probabilistic program terminates with probability 1 or if the expected time of termination is bounded. In principle, one can use our approach to solve this class of problems for Prob-solvable loops, but this is not the focus of this paper. The second class of problems is related to finding bounds over the expected values. In [3] the authors consider bounds also over higher-order moments for a specific class of probabilistic programs with probabilistic affine assignments. This approach can handle also nonlinear terms using interval arithmetics and fresh variables, at the price to produce very conservative bounds. On the contrary our approach supports natively probabilistic polynomial assignments (in the form of Prob-solvable loops) and provides a precise symbolic expression over higher-order moments.
7 Conclusion
We introduced a novel approach for automatically generating moment-based invariants of a subclass of probabilistic programs (PPs), called Prob-solvable loops, with polynomial assignments over random variables and parametrised distributions. We combine methods from symbolic summation and statistics to derive invariants over higher-order moments, such as expected values or variances, of program variables. To the best of our knowledge, our approach is the first method computing higher-order moments of PPs fully automatically and the first to handle PPs with parametrised distributions.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press (2008)
- 2[2] Barthe, G., Espitau, T., Fioriti, L.M.F., Hsu, J.: Synthesizing probabilistic invariants via Doob’s decomposition. In: CAV. LNCS, vol. 9779, pp. 43–61. Springer (2016)
- 3[3] Bouissou, O., Goubault, E., Putot, S., Chakarov, A., Sankaranarayanan, S.: Uncertainty propagation using probabilistic affine forms and concentration of measure inequalities. In: TACAS. LNCS, vol. 9636, pp. 225–243 (2016)
- 4[4] Chakarov, A., Sankaranarayanan, S.: Expectation invariants for probabilistic program loops as fixed points. In: SAS. LNCS, vol. 8723, pp. 85–100. Springer (2014)
- 5[5] Chatterjee, K., Fu, H., Goharshady, A.K., Goharshady, E.K.: Polynomial invariant generation for non-deterministic recursive programs. In: PLDI. p. to appear (2019)
- 6[6] Chen, Y., Hong, C., Wang, B., Zhang, L.: Counterexample-guided polynomial loop invariant generation by lagrange interpolation. In: CAV. LNCS, vol. 9206, pp. 658–674 (2015)
- 7[7] Dehnert, C., Junges, S., Katoen, J., Volk, M.: A storm is coming: A modern probabilistic model checker. In: CAV. LNCS, vol. 10427, pp. 592–600. Springer (2017)
- 8[8] Feng, Y., Zhang, L., Jansen, D.N., Zhan, N., Xia, B.: Finding polynomial loop invariants for probabilistic programs. In: ATVA. LNCS, vol. 10482, pp. 400–416. Springer (2017)
