Simple perfect samplers using monotone birth-and-death processes

Hiroyuki Masuyama

arXiv:1702.05720·math.PR·March 28, 2017

TL;DR

This paper introduces simple perfect sampling algorithms based on monotone birth-and-death processes that efficiently generate exact samples from arbitrary finite discrete distributions, with bounds on their expected running times.

Contribution

It constructs monotone birth-and-death processes matching the target distribution and derives bounds for the coalescence and running times of two perfect samplers, including a memory-efficient method.

Findings

1. Derived upper bounds for coalescence times of the processes.

2. Established bounds for the running times of Doubling and Read-once CFTP algorithms.

3. Demonstrated the effectiveness of the proposed sampler for unnormalized distributions.

Abstract

This paper proposes simple perfect samplers using monotone birth-and-death processes (BD-processes), which draw samples from an arbitrary finite discrete target distribution. We first construct a monotone BD-process whose stationary distribution is equal to the target distribution. We then derive upper bounds for the expected coalescence time of the copies of the monotone BD-process. We also establish upper bounds for the expected values and tail probabilities of the running times of two perfect samplers, which are Doubling CFTP and Read-once CFTP using our monotone BD-process. The latter sampler can draw samples exactly from unnormalized target distributions with little memory consumption.

Equations

$$\min_{i\in\mathbb{Z}_{N}}\pi(i)>0.$$

$$\mathsf{E}[T_{\rm C}]\leq\theta N,$$

$$p_{i}\leq\frac{1}{1+\gamma(i)}=\frac{\pi(i+1)}{\pi(i+1)+\pi(i)},$$

$$\gamma(0)\leq\gamma(1)\leq\dots\leq\gamma(N-1),$$

$$\gamma(0)\geq\gamma(1)\geq\dots\geq\gamma(N-1),$$

$$\frac{\widehat{\pi}(0)}{\pi(0)}\geq\frac{\widehat{\pi}(1)}{\pi(1)}\geq\dots\geq\frac{\widehat{\pi}(N)}{\pi(N)}.$$

$$\pi(i-1)\pi(i+1)\leq[\pi(i)]^{2},\quad i\in\mathbb{Z}_{[1,N-1]},$$

$$\phi(i,u)=\begin{cases}i+1,&u\in(1-p_{i},1),\\ i,&u\in[q_{i},1-p_{i}],\\ i-1,&u\in(0,q_{i}),\end{cases}$$

$$X_{n}^{(k)}=\begin{cases}k,&n=0,\\ \phi(X_{n-1}^{(k)},U_{n}),&n\in\mathbb{N}.\end{cases}$$

$$X_{n}^{(0)}\leq X_{n}^{(1)}\leq\cdots\leq X_{n}^{(N)}\quad\mbox{for all $n\in\mathbb{N}$}.$$

$$\phi(i+1,u)\geq\phi(i,u),\quad i\in\mathbb{Z}_{N-1},\ u\in(0,1).$$

$$T_{\rm C}=\inf\{n\in\mathbb{N}:X_{n}^{(0)}=X_{n}^{(1)}=\dots=X_{n}^{(N)}\},$$

$$T_{\rm C}=\inf\{n\in\mathbb{N}:X_{n}^{(0)}=X_{n}^{(N)}\}.$$

Taxonomy

Topics: Data Quality and Management · Statistical Methods and Bayesian Inference · Advanced Statistical Process Monitoring

Full text

Simple perfect samplers using monotone birth-and-death processes

This paper has been submitted for publication in a special issue on “Queueing Theory and Network Applications” in Annals of Operations Research.

Hiroyuki Masuyama

Department of Systems Science, Graduate School of Informatics, Kyoto University

Kyoto 606-8501, Japan

Abstract

[TABLE]

1 Introduction

Perfect sampling algorithms are based on “Coupling From The Past (CFTP)”, proposed by Propp and Wilson (1996). CFTP is a powerful technique that enables us to perform perfect sampling from the target distribution, i.e., to generate, in a finite time, samples that perfectly follow the target distribution. Basically, CFTP is time- and memory-consuming because we have to check whether or not the copies of a Markov chain used for CFTP coalesce at a single state every time we extend the sample paths of the copies to the past.

Propp and Wilson (1996) stated that CFTP is effectively achieved by a monotone Markov chain (see, e.g., Keilson and Kester 1977) constructed from the target distribution, which is called monotone CFTP or monotonic CFTP (MCFTP). As far as we know, MCFTP algorithms have been established for only a small number of examples, such as attractive spin systems (Propp and Wilson 1996), closed Jackson networks (Kijima and Matsui 2008a, b), discretized Dirichlet distributions (Matsui et al. 2010) and truncated Gaussian distributions (Philippe and Robert 2003). In particular, the algorithms proposed by Kijima and Matsui (2008a, b) and Matsui et al. (2010) are remarkably fast, though they are somewhat sophisticated.

The main purpose of this paper is to establish simple perfect samplers, which draw samples from an arbitrary target distribution $\{\pi(i);i\in\mathbb{S}\}$ on an arbitrary finite discrete set $\mathbb{S}$ . It should be noted that $\mathbb{S}$ can be mapped one-to-one onto a finite set of nonnegative integers. Thus, we assume, without loss of generality, that $\mathbb{S}=\{0,1,\dots,N\}=:\mathbb{Z}_{N}$ , where $N$ is a positive integer. We also assume that

$$\min_{i\in\mathbb{Z}_{N}}\pi(i)>0.\tag{1.1}$$

For later use, let $\mathbb{N}=\{1,2,3,\dots\}$ , $\mathbb{Z}_{+}=\{0,1,2,\dots\}$ , $\mathbb{Z}=\{0,\pm 1,\pm 2,\dots\}$ and $\mathbb{Z}_{n}=\{0,1,\dots,n\}$ for any $n\in\mathbb{Z}_{+}$ . For $n,m\in\mathbb{Z}$ such that $n\leq m$ , let $\mathbb{Z}_{[n,m]}=\{n,n+1,\dots,m-1,m\}$ . Let $x\vee y=\max(x,y)$ and $x\wedge y=\min(x,y)$ for $x,y\in(-\infty,\infty)$ . Furthermore, we use the notation $f(x)=O(g(x))$ to represent $\limsup_{x\to\infty}|f(x)|/|g(x)|<\infty$ .

In this paper, we first construct a monotone birth-and-death process (monotone BD-process or MBD for short) whose stationary distribution is equal to the target distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ . More specifically, we construct a monotone stochastic matrix $\bm{P}:=(P(i,j))_{i,j\in\mathbb{Z}_{N}}$ such that

[TABLE]

where

[TABLE]

By definition, $r_{i}=1-p_{i}-q_{i}$ for $i\in\mathbb{Z}_{N}$ and $q_{0}=p_{N}=0$ . We prove that $\bm{P}$ is an irreducible and monotone stochastic matrix whose stationary distribution is equal to the target distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ (see Theorem 2.1 below). We then discuss the first time when the copies of the MBD characterized by $\bm{P}$ coalesce at a single state, which is called the coalescence time and denoted by $T_{\rm C}$ . Utilizing the existing results on BD-processes, we derive the upper bound for the expected coalescence time:

$$\mathsf{E}[T_{\rm C}]\leq\theta N,\tag{1.16}$$

where $\theta\in(0,\infty)$ is a certain parameter (possibly depending on $N$ ).

Next we consider Doubling CFTP and Read-once CFTP (see, e.g., Huber 2016) using our MBD, which are referred to as the Doubling-MBD sampler and the Read-once-MBD sampler, respectively. Using (1.16), we obtain upper bounds for the expected values and tail probabilities of the running times of the Doubling-MBD and Read-once-MBD samplers. These upper bounds show that the expected running times of the two MBD samplers are $O(\theta N)$ , and thus they are slower than the sophisticated special-purpose algorithms mentioned above. However, the construction of our MBD is very simple and requires little memory. In general, Doubling MCFTP and Read-once MCFTP are easy to implement (for details, see, e.g., Huber 2016). Therefore, the Doubling-MBD and Read-once-MBD samplers are easily implementable, general-purpose perfect sampling algorithms. Furthermore, the Read-once-MBD sampler consumes little memory, though it is somewhat more time-consuming than the Doubling-MBD sampler. As a result, the Read-once-MBD sampler can draw samples from unnormalized target distributions with little memory consumption. This is a remarkable feature of the Read-once-MBD sampler.

The rest of this paper is divided into two sections. Section 2 discusses our MBD constructed from the target distribution. Section 3 considers the performance of the two perfect samplers using our MBD.

2 Monotone BD-process from the target distribution

This section consists of two subsections. Section 2.1 constructs a monotone BD-process (MBD) whose stationary distribution is equal to the target distribution. Section 2.2 derives some upper bounds for the expected coalescence time of the copies of the MBD.

2.1 Construction of a monotone BD-process from the target distribution

The following theorem is the fundamental result of this paper.

Theorem 2.1

The stochastic matrix $\bm{P}$ defined by (1.8) together with (1.11)–(1.15) is an irreducible and monotone one whose stationary distribution is equal to the target distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ .

Proof.

From (1.1) and (1.11)–(1.15), we have

[TABLE]

which show that $\bm{P}$ is an irreducible stochastic matrix and that the target distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ is a reversible measure and thus a unique stationary distribution of $\bm{P}$ . Therefore, it suffices to prove that

[TABLE]

where (2.2) is the condition for the monotonicity of $\bm{P}$ (see, e.g., Keilson and Kester 1977, Definition 1.2).

From (1.11) and (1.15), we have, for $i\in\mathbb{Z}_{N-1}$ ,

$$p_{i}\leq\frac{1}{1+\gamma(i)}=\frac{\pi(i+1)}{\pi(i+1)+\pi(i)},\tag{2.3}$$

and thus

[TABLE]

where the last inequality follows from (2.3) and the last equality follows from (2.1). Similarly, for $i\in\mathbb{Z}_{N-1}$ ,

[TABLE]

The proof is completed. ∎

The following corollary is immediate from Theorem 2.1.

Corollary 2.1

Suppose that the conditions of Theorem 2.1 are satisfied. We then have the following:

(i)

If $\{\gamma(i);i\in\mathbb{Z}_{N-1}\}$ is nondecreasing, i.e.,

$$\gamma(0)\leq\gamma(1)\leq\dots\leq\gamma(N-1),$$

then (1.11) and (1.14) are reduced to

[TABLE]

(ii)

If $\{\gamma(i);i\in\mathbb{Z}_{N-1}\}$ is nonincreasing, i.e.,

$$\gamma(0)\geq\gamma(1)\geq\dots\geq\gamma(N-1),$$

then (1.11) and (1.14) are reduced to

[TABLE]

Remark 2.1

Suppose that the conditions of the statement (i) of Corollary 2.1 are satisfied. Let $\{\widehat{Y}_{n};n\in\mathbb{Z}_{+}\}$ denote an MBD with state space $\mathbb{Z}_{N}$ and transition probability matrix $\bm{P}$ in (1.8) together with (2.4) and (2.5). Furthermore, suppose that the BD-process $\{\widehat{Y}_{n}\}$ starts with an initial distribution $\{\widehat{\pi}(i);i\in\mathbb{Z}_{N}\}$ such that

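$$\frac{\widehat{\pi}(0)}{\pi(0)}\geq\frac{\widehat{\pi}(1)}{\pi(1)}\geq\dots\geq\frac{\widehat{\pi}(N)}{\pi(N)}.$$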

Note here that (2.6) yields

$$\pi(i-1)\pi(i+1)\leq[\pi(i)]^{2},\quad i\in\mathbb{Z}_{[1,N-1]},$$

which shows that the target distribution $\{\pi(i)\}$ is log-concave. Therefore, it follows from Fill and Kahn (2013, Proposition 3.2, Corollary 3.3(a) and Theorem 5.1) that the BD-process $\{\widehat{Y}_{n}\}$ mixes (i.e., converges to stationarity) faster in total variation distance than does an arbitrary MBD $\{\widehat{Z}_{n};n\in\mathbb{Z}_{+}\}$ that has the same state space $\mathbb{Z}_{N}$ , stationary distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ and initial distribution $\{\widehat{\pi}(i);i\in\mathbb{Z}_{N}\}$ as the BD-process $\{\widehat{Y}_{n};n\in\mathbb{Z}_{+}\}$ .

Next we describe a construction of the copies of the MBD with state space $\mathbb{Z}_{N}$ and transition probability matrix $\bm{P}$ , which can be used for MCFTP. To this end, we define $\{U_{m};m\in\mathbb{Z}\}$ as a sequence of independent and identically distributed (i.i.d.) uniform random variables in $(0,1)$ . We then have the following result.

Theorem 2.2

Suppose that the conditions of Theorem 2.1 are satisfied. Let $\phi:\mathbb{Z}_{N}\times(0,1)\to\mathbb{Z}_{N}$ denote a function such that, for $i\in\mathbb{Z}_{N}$ and $u\in(0,1)$ ,

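$$\phi(i,u)=\begin{cases}i+1,&u\in(1-p_{i},1),\\ i,&u\in[q_{i},1-p_{i}],\\ i-1,&u\in(0,q_{i}),\end{cases}\tag{2.7}$$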

where $p_{i}$ and $q_{i}$ are given in (1.11) and (1.14), respectively. Furthermore, for each $k\in\mathbb{Z}_{N}$ , let $\{X_{n}^{(k)};n\in\mathbb{Z}_{+}\}$ denote a sequence of random variables such that

$$X_{n}^{(k)}=\begin{cases}k,&n=0,\\ \phi(X_{n-1}^{(k)},U_{n}),&n\in\mathbb{N}.\end{cases}\tag{2.8}$$

Under these conditions, the stochastic processes $\{X_{n}^{(k)};n\in\mathbb{Z}_{+}\}$ ’s, $k\in\mathbb{Z}_{N}$ , are MBDs with transition probability matrix $\bm{P}$ , which satisfy

$$X_{n}^{(0)}\leq X_{n}^{(1)}\leq\cdots\leq X_{n}^{(N)}\quad\mbox{for all $n\in\mathbb{N}$}.\tag{2.9}$$

Proof.

It is clear that $\{X_{n}^{(k)};n\in\mathbb{Z}_{+}\}$ ’s, $k\in\mathbb{Z}_{N}$ , are MBDs with transition probability matrix $\bm{P}$ . Thus, we prove that (2.9) holds.

It follows from (2.7) that, for $i\in\mathbb{Z}_{N-1}$ ,

[TABLE]

It also follows from (2.2) that $q_{i+1}\leq 1-p_{i}$ for $i\in\mathbb{Z}_{N-1}$ . Therefore,

$$\phi(i+1,u)\geq\phi(i,u),\quad i\in\mathbb{Z}_{N-1},\ u\in(0,1).\tag{2.10}$$

Combining (2.8) and (2.10) yields (2.9). ∎

Theorem 2.2 shows that the function $\phi$ , together with the uniform random variables $U_{m}$ ’s, generates MBDs with transition probability matrix $\bm{P}$ . Thus, we refer to $\phi$ as a monotone update function for MBDs with $\bm{P}$ . Note here that $\{X_{n}^{(k)};n\in\mathbb{Z}_{+}\}$ ’s can be considered the copies of a generic BD-process driven by the monotone update function $\phi$ , which is denoted by $\{X_{n};n\in\mathbb{Z}_{+}\}$ . Especially, we refer to $\{X_{n}^{(N)}\}$ and $\{X_{n}^{(0)}\}$ as the upper-bounding and lower-bounding copies, respectively, of $\{X_{n}\}$ .
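The coupling described above can be sketched in a few lines of Python. The update function `phi` follows (2.7); the birth and death probabilities, however, come from a hypothetical helper `bd_rates` that uses a Metropolis-type choice. This choice is an assumption made for illustration and is not the paper's (1.11)–(1.14), which are not reproduced here. It satisfies detailed balance and $p_{i}+q_{i+1}\leq 1$, and it uses only ratios of the weights.

```python
import random

def bd_rates(weights):
    """Hypothetical Metropolis-type birth/death probabilities.

    Assumption for illustration only (NOT the paper's (1.11)-(1.14)):
    p_i = min(1, w[i+1]/w[i]) / 2 and q_{i+1} = min(1, w[i]/w[i+1]) / 2.
    Then w[i]*p_i == w[i+1]*q_{i+1} (detailed balance), and every p_i, q_i
    is at most 1/2, so p_i + q_i <= 1 and p_i + q_{i+1} <= 1.
    """
    n = len(weights) - 1
    p = [0.0] * (n + 1)   # p[n] = 0: no birth from the top state
    q = [0.0] * (n + 1)   # q[0] = 0: no death from the bottom state
    for i in range(n):
        p[i] = 0.5 * min(1.0, weights[i + 1] / weights[i])
        q[i + 1] = 0.5 * min(1.0, weights[i] / weights[i + 1])
    return p, q

def phi(i, u, p, q):
    """Monotone update function: up if u is large, down if u is small."""
    if u > 1.0 - p[i]:
        return i + 1
    if u < q[i]:
        return i - 1
    return i

def run_copies(p, q, steps, rng):
    """Drive the copies started from 0, 1, ..., N with the same uniforms."""
    states = list(range(len(p)))
    for _ in range(steps):
        u = rng.random()
        states = [phi(x, u, p, q) for x in states]
    return states

if __name__ == "__main__":
    weights = [0.5 ** i for i in range(6)]   # truncated geometric, unnormalized
    p, q = bd_rates(weights)
    print(run_copies(p, q, 20, random.Random(1)))  # stays sorted at every step
```

Because every copy sees the same uniform number at each step, the ordering of the copies is preserved, which is the property that the bounding copies rely on.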

2.2 Expected coalescence time of the copies of the monotone BD-process

Let $T_{\rm C}$ denote

$$T_{\rm C}=\inf\{n\in\mathbb{N}:X_{n}^{(0)}=X_{n}^{(1)}=\dots=X_{n}^{(N)}\},\tag{2.11}$$

which is the first time when all the copies $\{X_{n}^{(k)}\}$ ’s coalesce at a single state in the state space $\mathbb{Z}_{N}$ . Thus, we call $T_{\rm C}$ the coalescence time of the copies $\{X_{n}^{(k)}\}$ ’s of $\{X_{n}\}$ . It follows from (2.9) and (2.11) that

$$T_{\rm C}=\inf\{n\in\mathbb{N}:X_{n}^{(0)}=X_{n}^{(N)}\}.\tag{2.12}$$
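Since the copies stay ordered, all of them have coalesced exactly when the lower- and upper-bounding copies meet, so a simulation of $T_{\rm C}$ only needs to track those two. The following is a minimal sketch under the same illustrative assumption as before: `bd_rates` is a hypothetical Metropolis-type choice of $p_{i}$ and $q_{i}$, not the paper's construction.

```python
import random

def bd_rates(weights):
    """Hypothetical Metropolis-type rates (an assumption, not (1.11)-(1.14))."""
    n = len(weights) - 1
    p = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for i in range(n):
        p[i] = 0.5 * min(1.0, weights[i + 1] / weights[i])
        q[i + 1] = 0.5 * min(1.0, weights[i] / weights[i + 1])
    return p, q

def phi(i, u, p, q):
    """Monotone update function of the birth-and-death chain."""
    if u > 1.0 - p[i]:
        return i + 1
    if u < q[i]:
        return i - 1
    return i

def coalescence_time(p, q, rng):
    """First n >= 1 at which the copies started from 0 and N coincide."""
    lo, hi = 0, len(p) - 1
    n = 0
    while lo != hi:
        u = rng.random()              # one shared uniform per time step
        lo, hi = phi(lo, u, p, q), phi(hi, u, p, q)
        n += 1
    return n

if __name__ == "__main__":
    weights = [0.7 ** i for i in range(11)]
    p, q = bd_rates(weights)
    rng = random.Random(0)
    times = [coalescence_time(p, q, rng) for _ in range(1000)]
    print("estimated E[T_C]:", sum(times) / len(times))
```

Averaging many such runs gives a Monte Carlo estimate that can be compared against an analytical upper bound on the expected coalescence time.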

We now define $T_{i,j}$ , $(i,j)\in\mathbb{Z}_{N}^{2}$ , $i\neq j$ , as a generic random variable for the first passage time from state $i$ to state $j$ . We assume that $\{T_{0,1},T_{1,2},\dots,T_{N-1,N}\}$ are independent and so are $\{T_{N,N-1},T_{N-1,N-2},\dots,T_{1,0}\}$ , which does not lose generality due to the skip-free property of BD-processes. It then follows from (2.12) that

[TABLE]

where the symbol “ $\stackrel{{\scriptstyle{\rm d}}}{{=}}$ ” represents the equality in distribution. Therefore,

[TABLE]

We can readily obtain (see, e.g., Theorem 4.11 of Heyman and Sobel (2004), where continuous-time BD-processes are considered)

[TABLE]

Substituting (2.14) and (2.15) into (2.13) yields

[TABLE]

Using (2.16), we obtain the following result.

Theorem 2.3

If the conditions of Theorem 2.1 are satisfied, then

$$\mathsf{E}[T_{\rm C}]\leq\theta N,\tag{2.17}$$

where $\theta$ is a positive constant such that

[TABLE]

Proof.

Note that

[TABLE]

Substituting these inequalities into (2.16) leads to (2.17) with (2.18). ∎
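The expected first passage times $\mathsf{E}[T_{0,N}]$ and $\mathsf{E}[T_{N,0}]$ that enter this argument can be evaluated numerically. The sketch below uses the standard identities for a reversible discrete-time BD chain, $\mathsf{E}[T_{i,i+1}]=\sum_{\ell\leq i}\pi(\ell)/(\pi(i)p_{i})$ and $\mathsf{E}[T_{i+1,i}]=\sum_{\ell\geq i+1}\pi(\ell)/(\pi(i+1)q_{i+1})$, which are assumed here to play the role of the displayed formulas (those are not reproduced in this text). The helper `bd_rates` is again a hypothetical Metropolis-type choice, not the paper's construction.

```python
def bd_rates(weights):
    """Hypothetical Metropolis-type rates (an assumption, not (1.11)-(1.14))."""
    n = len(weights) - 1
    p = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for i in range(n):
        p[i] = 0.5 * min(1.0, weights[i + 1] / weights[i])
        q[i + 1] = 0.5 * min(1.0, weights[i] / weights[i + 1])
    return p, q

def expected_passage_times(weights, p, q):
    """Return (E[T_{0,N}], E[T_{N,0}]) for a reversible BD chain.

    Only ratios of weights appear, so unnormalized weights are fine.
    """
    n = len(weights) - 1
    up, cum = 0.0, 0.0
    for i in range(n):                    # sum of E[T_{i,i+1}], i = 0..N-1
        cum += weights[i]
        up += cum / (weights[i] * p[i])
    down, cum = 0.0, 0.0
    for i in range(n, 0, -1):             # sum of E[T_{i,i-1}], i = N..1
        cum += weights[i]
        down += cum / (weights[i] * q[i])
    return up, down

if __name__ == "__main__":
    weights = [1.0, 1.0, 1.0]             # uniform target on {0, 1, 2}
    p, q = bd_rates(weights)
    print(expected_passage_times(weights, p, q))
```

By monotonicity, the copies have coalesced as soon as the lower-bounding copy reaches $N$ or the upper-bounding copy reaches $0$, so either of these two expectations gives a quick sanity check on a simulated coalescence time.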

Under some additional conditions, we obtain simpler bounds for $\mathsf{E}[T_{\rm C}]$ .

Theorem 2.4

Suppose that the conditions of Theorem 2.1 are satisfied. We then have the following:

(i)

If there exists some $C\in(0,\infty)$ independent of $N$ such that

[TABLE]

then

[TABLE]

(ii)

If there exists some $C\in(0,\infty)$ independent of $N$ such that

[TABLE]

then

[TABLE]

Remark 2.2

If $\{\pi(i);i\in\mathbb{Z}_{N}\}$ is nonincreasing or nondecreasing, then (2.21) holds for $C=1$ and thus the statement (ii) of Theorem 2.4 yields

[TABLE]

Proof of Theorem 2.4. We first prove the statement (i). Applying (2.19) to (2.16) yields

[TABLE]

which shows that (2.20) holds. Next we prove the statement (ii). Combining (2.21) and (2.16), we have either of the following inequalities:

[TABLE]

Each of the two inequalities shows that (2.22) holds. ∎

Example 2.1 (Truncated geometric distribution)

Consider a truncated geometric distribution. To this end, fix

[TABLE]

where $0<\xi<1$ . Clearly, $\gamma(i)=\xi^{-1}$ for $i\in\mathbb{Z}_{N-1}$ , which satisfies the conditions of the statement (i) of Corollary 2.1. Thus, from (2.4) and (2.5), we have

[TABLE]

Note here that

[TABLE]

Combining these results and the statement (i) of Theorem 2.4 yields

[TABLE]

Example 2.2 (Zipf distribution)

Consider the following Zipf distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ :

[TABLE]

where $\alpha>1$ . We then have

[TABLE]

which is decreasing with $i$ . Therefore, according to the statement (ii) of Corollary 2.1, we fix

[TABLE]

Furthermore, since $\{\pi(i);i\in\mathbb{Z}_{N}\}$ is decreasing (see Remark 2.2), it follows from (2.23) and (2.24) that

[TABLE]

3 Perfect samplers using the monotone BD-process

In this section, we discuss the running times of Doubling CFTP and Read-once CFTP using the monotone update function $\phi$ , which are referred to as Doubling-MBD sampler and Read-once-MBD sampler, respectively.

To facilitate the subsequent discussion, we introduce some definitions. For $m\in\mathbb{Z}$ and $n\in\mathbb{Z}_{+}$ , let

[TABLE]

For convenience, let $\bm{U}_{m}^{(-n)}=\emptyset$ for all $m\in\mathbb{Z}$ and $n\in\mathbb{N}$ . In addition, for $s\in\mathbb{Z}$ , $n\in\mathbb{Z}_{+}$ and $x\in\mathbb{Z}_{N}$ , let $\varPhi_{s}^{s+n}(x,\bm{U}_{m}^{(n)})$ denote

[TABLE]

where $\phi$ is the monotone update function given in (2.7) and $\{U_{m};m\in\mathbb{Z}\}$ is a sequence of i.i.d. uniform random variables in $(0,1)$ . Note that, for any $t\in\mathbb{Z}_{+}$ , the two processes $\{\varPhi_{-t}^{-t+n}(N,\bm{U}_{-t}^{(n)});n\in\mathbb{Z}_{+}\}$ and $\{\varPhi_{-t}^{-t+n}(0,\bm{U}_{-t}^{(n)});n\in\mathbb{Z}_{+}\}$ are the upper- and lower-bounding copies of an MBD with transition probability matrix $\bm{P}$ , which run from time $-t$ to time $-t+n$ .

We first consider the Doubling-MBD sampler, which is described in Algorithm 1.
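Algorithm 1 itself is not reproduced in this text, so the following is only a sketch of the standard doubling scheme for monotone CFTP, written under stated assumptions: `bd_rates` is a hypothetical Metropolis-type choice of $p_{i}$ and $q_{i}$ (not the paper's (1.11)–(1.14)), and the indexing of the stored uniforms is a choice made here for illustration.

```python
import random

def bd_rates(weights):
    """Hypothetical Metropolis-type rates (an assumption, not (1.11)-(1.14))."""
    n = len(weights) - 1
    p = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for i in range(n):
        p[i] = 0.5 * min(1.0, weights[i + 1] / weights[i])
        q[i + 1] = 0.5 * min(1.0, weights[i] / weights[i + 1])
    return p, q

def phi(i, u, p, q):
    """Monotone update function of the birth-and-death chain."""
    if u > 1.0 - p[i]:
        return i + 1
    if u < q[i]:
        return i - 1
    return i

def doubling_cftp(p, q, rng):
    """Monotone CFTP with doubling; returns one exact sample."""
    past = []      # past[k] drives the step from time -(k+1) to time -k
    t = 1
    while True:
        while len(past) < t:          # extend further into the past,
            past.append(rng.random()) # always reusing the stored uniforms
        lo, hi = 0, len(p) - 1
        for k in range(t - 1, -1, -1):    # run from time -t up to time 0
            lo, hi = phi(lo, past[k], p, q), phi(hi, past[k], p, q)
        if lo == hi:
            return lo
        t *= 2

if __name__ == "__main__":
    p, q = bd_rates([1.0, 2.0, 3.0, 4.0])   # unnormalized target weights
    rng = random.Random(2017)
    print([doubling_cftp(p, q, rng) for _ in range(10)])
```

The list `past` grows until coalescence is detected, which is the memory cost attributed to the doubling approach; the number of stored uniforms at termination is the quantity that a running-time count of this kind would measure.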

Let $T_{\rm D}$ denote the number of the uniform random variables used by Algorithm 1, i.e., $T_{\rm D}$ is equal to a positive integer such that

[TABLE]

Following Huber (2008), we regard $T_{\rm D}$ as the running time of Algorithm 1. Using Huber (2008, Lemma 5.4), we obtain the following result.

Proposition 3.1 (Doubling-MBD sampler)

[TABLE]

where $\theta$ is the positive constant given in (2.18).

Proof.

It follows from Huber (2008, Lemma 5.4) that

[TABLE]

Combining these with Theorem 2.3 results in (3.1) and (3.2). ∎

Next we consider Read-once-MBD sampler, which is described in Algorithm 2 below.
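Algorithm 2 is likewise not reproduced in this text. The sketch below follows the usual read-once CFTP scheme: run blocks of `block_size` steps until one is coalescent, then keep running blocks forward and output the state held just before the next coalescent block. The helper `bd_rates` is a hypothetical Metropolis-type choice of $p_{i}$ and $q_{i}$, and the correspondence of the two loops to Steps (ii) and (iii) is an assumption.

```python
import random

def bd_rates(weights):
    """Hypothetical Metropolis-type rates (an assumption, not (1.11)-(1.14))."""
    n = len(weights) - 1
    p = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for i in range(n):
        p[i] = 0.5 * min(1.0, weights[i + 1] / weights[i])
        q[i + 1] = 0.5 * min(1.0, weights[i] / weights[i + 1])
    return p, q

def phi(i, u, p, q):
    """Monotone update function of the birth-and-death chain."""
    if u > 1.0 - p[i]:
        return i + 1
    if u < q[i]:
        return i - 1
    return i

def run_block(x, block_size, p, q, rng):
    """Advance x and both bounding copies through one fresh block."""
    lo, hi = 0, len(p) - 1
    for _ in range(block_size):
        u = rng.random()              # each uniform is used once, never stored
        x, lo, hi = phi(x, u, p, q), phi(lo, u, p, q), phi(hi, u, p, q)
    return x, lo, hi

def read_once_cftp(p, q, block_size, rng):
    """Read-once CFTP; returns one exact sample."""
    while True:                       # find the first coalescent block
        _, lo, hi = run_block(0, block_size, p, q, rng)
        if lo == hi:
            x = lo
            break
    while True:                       # run until the next coalescent block
        x_next, lo, hi = run_block(x, block_size, p, q, rng)
        if lo == hi:
            return x                  # state just before that block
        x = x_next

if __name__ == "__main__":
    p, q = bd_rates([1.0, 2.0, 3.0, 4.0])
    rng = random.Random(2017)
    print([read_once_cftp(p, q, 12, rng) for _ in range(10)])
```

Only three integers are kept in memory at any time, regardless of how many uniforms are consumed, which is the memory advantage discussed in this section.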

Remark 3.1

When Algorithm 2 stops, we have

[TABLE]

As with Algorithm 1, we define $T_{\rm R}$ as the number of the uniform random variables used by Algorithm 2, and then regard $T_{\rm R}$ as the running time of Algorithm 2. Let $L$ and $L^{\prime}$ denote the numbers of the iterations in Steps (ii) and (iii), respectively, of Algorithm 2. By definition, $L$ and $L^{\prime}$ are independent and

[TABLE]

In addition,

[TABLE]

Using Theorem 2.3 together with (3.3) and (3.4), and proceeding as in the proof of Huber (2008, Lemma 5.4), we obtain the following result.

Theorem 3.1 (Read-once-MBD sampler)

Fix $b\in\mathbb{N}$ such that $b>e$ , and fix the block size $B$ of Algorithm 2 such that $B=b\lceil\theta\rceil N$ , where $\theta$ is the positive constant given in (2.18). We then have

[TABLE]

where $\beta(b)=\exp\{1-b/e\}\in(0,1)$ . In addition, the integer $b>e$ that minimizes the right-hand side of (3.5) is six, or equivalently,

[TABLE]

Proof.

It follows from Markov’s inequality that, for any fixed $\alpha>1$ ,

[TABLE]

Note here that $\{\mathsf{P}(T_{\rm C}>x);x\geq 0\}$ is log-subadditive (see, e.g., Propp and Wilson 1996, Theorem 6). Thus, from (3.8), we have

[TABLE]

We now fix $\alpha=e$ to maximize $(\ln\alpha)/\alpha$ . It then follows from (3.9) that

[TABLE]
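As a quick check of the choice $\alpha=e$ made above, the following elementary calculation (a standard calculus argument, added here for illustration) confirms that $(\ln\alpha)/\alpha$ is maximized at $\alpha=e$ on $(1,\infty)$:

$$\frac{d}{d\alpha}\,\frac{\ln\alpha}{\alpha}=\frac{1-\ln\alpha}{\alpha^{2}}\ \begin{cases}>0,&1<\alpha<e,\\ =0,&\alpha=e,\\ <0,&\alpha>e,\end{cases}\qquad\mbox{so that}\qquad\max_{\alpha>1}\frac{\ln\alpha}{\alpha}=\frac{\ln e}{e}=\frac{1}{e}.$$

The maximum value $1/e$ is consistent with the factor $1/e$ appearing in the exponent of $\beta(b)=\exp\{1-b/e\}$ .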

From $B=b\lceil\theta\rceil N$ and Theorem 2.3, we also have $B\geq b\mathsf{E}[T_{\rm C}]$ . Using this and (3.10), we obtain

[TABLE]

Substituting (3.11) into (3.3) yields

[TABLE]

Therefore, there exist independent random variables $\overline{L}$ and $\overline{L}^{\prime}$ such that

[TABLE]

It follows from (3.13) that

[TABLE]

and

[TABLE]

Combining these results with (3.4) and (3.12), we have

[TABLE]

which imply that (3.5) and (3.6) hold due to $B=b\lceil\theta\rceil N$ .

In what follows, we prove (3.7), which is equivalent to

[TABLE]

where $F$ denotes a function on $(e,\infty)$ such that

[TABLE]

By definition, $F$ is convex and

[TABLE]

Let $G(x)$ , $x>e$ , denote the numerator of $F^{\prime}(x)$ in the above equation, i.e.,

[TABLE]

We then have

[TABLE]

which lead to $F^{\prime}(2e)<0$ and $F^{\prime}(2.5e)>0$ . Note here that $2e>5.4$ and $2.5e<7$ . Therefore, the convexity of $F$ yields $F^{\prime}(5)<0$ and $F^{\prime}(7)>0$ , which results in (3.14). ∎

The bounds in Proposition 3.1 and Theorem 3.1 suggest that the running time $T_{\rm D}$ of the Doubling-MBD sampler is shorter than the running time $T_{\rm R}$ of the Read-once-MBD sampler. However, the Doubling-MBD sampler has to store all the generated (uniform) random numbers until it outputs a sample following the target distribution. On the other hand, the Read-once-MBD sampler consumes little memory because it uses each generated random number only once.

We close this section by comparing our perfect samplers with inverse transform sampling (see, e.g., Fishman 1996). Inverse transform sampling for discrete target distributions is easy to implement and takes $O(N)$ running time to draw a sample from the target distribution. Therefore, inverse transform sampling is less time-consuming than our perfect samplers.
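For comparison, a minimal sketch of inverse transform sampling on $\mathbb{Z}_{N}$ is given below. The function name `make_inverse_transform_sampler` and the linear scan are illustrative choices; the sketch makes visible that the method needs the normalizing constant and a stored table of cumulative probabilities.

```python
import itertools
import random

def make_inverse_transform_sampler(weights):
    """Build a sampler for the distribution proportional to `weights`."""
    total = sum(weights)                          # normalizing constant
    cdf = [c / total for c in itertools.accumulate(weights)]  # O(N) memory

    def draw(rng):
        u = rng.random()
        for i, c in enumerate(cdf):               # linear scan: O(N) per draw
            if u < c:
                return i
        return len(cdf) - 1                       # guard against rounding

    return draw

if __name__ == "__main__":
    draw = make_inverse_transform_sampler([1.0, 2.0, 3.0, 4.0])
    rng = random.Random(1996)
    print([draw(rng) for _ in range(10)])
```

The whole list `cdf` has to be kept for the lifetime of the sampler, in contrast to the few integers that a read-once scheme keeps.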

To discuss this topic from a different perspective, we suppose that the target distribution $\{\pi(i);i\in\mathbb{Z}_{N}\}$ is not normalized, in other words, we have an unnormalized target distribution $\{\widehat{\pi}(i);i\in\mathbb{Z}_{N}\}$ such that $C_{\pi}:=\sum_{i=0}^{N}\widehat{\pi}(i)\neq 1$ and

$$\pi(i)=\frac{\widehat{\pi}(i)}{C_{\pi}},\qquad i\in\mathbb{Z}_{N}.\tag{3.15}$$

It then follows from (1.15) and (3.15) that

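$$\gamma(i)=\frac{\pi(i)}{\pi(i+1)}=\frac{\widehat{\pi}(i)}{\widehat{\pi}(i+1)},\qquad i\in\mathbb{Z}_{N-1}.$$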

Therefore, our two perfect samplers still work well by using the unnormalized target distribution $\{\widehat{\pi}(i)\}$ . On the other hand, the inverse transform sampling has a problem in the present situation because it needs the cumulative distribution $\{\sigma(i);i\in\mathbb{Z}_{N}\}$ , where $\sigma(i)=\sum_{\ell=0}^{i}\pi(\ell)$ for $i\in\mathbb{Z}_{N}$ . To obtain the cumulative distribution $\{\sigma(i)\}$ , we have to compute the normalizing constant $C_{\pi}$ by summing the unnormalized target distribution $\{\widehat{\pi}(i)\}$ over its support set $\mathbb{Z}_{N}$ .

It should be noted that the computed constant $C_{\pi}$ includes, at worst, an $O(N)$ rounding error. This rounding error can be reduced to $O(\ln N)$ if $C_{\pi}$ is computed by pairwise summation (see, e.g., Higham 1993). Furthermore, if $C_{\pi}$ is computed by the Kahan summation algorithm, then the rounding error can essentially be reduced to $O(1)$ , but the computational cost is four times that of naive summation (see, e.g., Higham 1993). Whichever of these options we take, we have to store all the information of the cumulative distribution $\{\sigma(i)\}$ . Such memory consumption is not necessary for our two perfect samplers.
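To make the summation options concrete, here is a minimal sketch of Kahan (compensated) summation next to a plain running sum. The function names are illustrative, and the test values are arbitrary; `math.fsum` is used only as an accurate reference.

```python
import math

def naive_sum(values):
    """Plain left-to-right floating-point summation."""
    total = 0.0
    for v in values:
        total += v
    return total

def kahan_sum(values):
    """Kahan compensated summation.

    `comp` carries the low-order bits lost when `y` is added to `total`,
    and feeds them back into the next term.
    """
    total = 0.0
    comp = 0.0
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

if __name__ == "__main__":
    weights = [0.1] * 1_000_000          # arbitrary unnormalized weights
    reference = math.fsum(weights)       # accurately rounded reference sum
    print("naive error:", abs(naive_sum(weights) - reference))
    print("Kahan error:", abs(kahan_sum(weights) - reference))
```

Each Kahan step performs four floating-point additions or subtractions instead of one, which matches the fourfold cost mentioned above.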

As a result, although our MBD perfect samplers may not be particularly superior in speed to other methods, they are easily implementable and can draw samples exactly from unnormalized target distributions. In particular, the Read-once-MBD sampler achieves such exact sampling with little memory consumption.

Acknowledgments

The author thanks Dr. Shuji Kijima for helpful comments that motivated this work.

Bibliography

Fill, J. A., Kahn, J., 2013. Comparison inequalities and fastest-mixing Markov chains. The Annals of Applied Probability 23 (5), 1778–1816.
Fishman, G., 1996. Monte Carlo: Concepts, Algorithms, and Applications. Springer, New York.
Heyman, D. P., Sobel, M. J., 2004. Stochastic Models in Operations Research, Vol. I: Stochastic Processes and Operating Characteristics, paperback Edition. Dover Publications, Mineola, New York.
Higham, N. J., 1993. The accuracy of floating point summation. SIAM Journal on Scientific Computing 14 (4), 783–799.
Huber, M., 2008. Perfect simulation with exponential tails on the running time. Random Structures and Algorithms 33 (1), 29–43.
Huber, M., 2016. Perfect Simulation. CRC Press, Boca Raton, FL.
Keilson, J., Kester, A., 1977. Monotone matrices and monotone Markov processes. Stochastic Processes and their Applications 5, 231–241.
Kijima, S., Matsui, T., 2008a. Approximation algorithm and perfect sampler for closed Jackson networks with single servers. SIAM Journal on Computing 38 (4), 1484–1503.
