Compositional Inference Metaprogramming with Convergence Guarantees

Shivam Handa; Vikash Mansinghka; Martin Rinard

arXiv:1907.05451·cs.PL·July 16, 2019

TL;DR

This paper introduces a formal framework for probabilistic inference metaprogramming that guarantees convergence of hybrid algorithms applying different MCMC methods to subproblems, advancing the theoretical understanding of probabilistic programming.

Contribution

It presents what the authors describe as, to the best of their knowledge, the first asymptotic convergence guarantees for hybrid probabilistic inference algorithms built with subproblem-based inference metaprogramming.

Findings

01. Proves asymptotic convergence for inference metaprogramming with hybrid algorithms.

02. Defines independent subproblem inference and its advantages.

03. Establishes a mathematical framework for analyzing convergence in probabilistic programming.

Abstract

Inference metaprogramming enables effective probabilistic programming by supporting the decomposition of executions of probabilistic programs into subproblems and the deployment of hybrid probabilistic inference algorithms that apply different probabilistic inference algorithms to different subproblems. We introduce the concept of independent subproblem inference (as opposed to entangled subproblem inference in which the subproblem inference algorithm operates over the full program trace) and present a mathematical framework for studying convergence properties of hybrid inference algorithms that apply different Markov-Chain Monte Carlo algorithms to different parts of the inference problem. We then use this formalism to prove asymptotic convergence results for probabilistic programs with inference metaprogramming. To the best of our knowledge this is the first asymptotic convergence…
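
The idea of a hybrid algorithm that applies different Markov-Chain Monte Carlo algorithms to different parts of a problem can be illustrated on a toy example. The following is a minimal sketch, not the paper's formalism: it assumes a hypothetical two-variable discrete target, uses a Gibbs update for `x` and a Metropolis-Hastings update for `y`, builds both transition kernels exactly, and checks numerically that their composition leaves the target invariant and that iterating it approaches the target. All names (`WEIGHTS`, `gibbs_x`, `mh_y`, `compose`, `push_forward`) are illustrative assumptions.

```python
# Hypothetical toy target: an unnormalised joint over (x, y) in {0,1} x {0,1,2}.
WEIGHTS = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 1.0,
           (1, 0): 3.0, (1, 1): 1.0, (1, 2): 2.0}
STATES = sorted(WEIGHTS)
Z = sum(WEIGHTS.values())
PI = {s: WEIGHTS[s] / Z for s in STATES}  # normalised target distribution


def gibbs_x(s, t):
    """Transition probability of a Gibbs update on x (y stays fixed)."""
    (x, y), (x2, y2) = s, t
    if y2 != y:
        return 0.0
    return WEIGHTS[(x2, y)] / sum(WEIGHTS[(v, y)] for v in (0, 1))


def mh_y(s, t):
    """Transition probability of a Metropolis-Hastings update on y (x stays fixed).

    The proposal is uniform over the three y values, hence symmetric.
    """
    (x, y), (x2, y2) = s, t
    if x2 != x:
        return 0.0

    def move(y_to):
        return (1.0 / 3.0) * min(1.0, WEIGHTS[(x, y_to)] / WEIGHTS[(x, y)])

    if y2 != y:
        return move(y2)
    # Probability of staying put: one minus the probability of moving elsewhere.
    return 1.0 - sum(move(v) for v in (0, 1, 2) if v != y)


def compose(k1, k2):
    """Kernel that applies k1 first and k2 second."""
    return lambda s, t: sum(k1(s, m) * k2(m, t) for m in STATES)


def push_forward(dist, kernel):
    """One chain step: dist'(t) = sum over s of dist(s) * kernel(s, t)."""
    return {t: sum(dist[s] * kernel(s, t) for s in STATES) for t in STATES}


hybrid = compose(gibbs_x, mh_y)

# Invariance: one hybrid sweep started from the target stays at the target.
after_one_sweep = push_forward(PI, hybrid)
max_drift = max(abs(after_one_sweep[s] - PI[s]) for s in STATES)

# Convergence on this toy chain: start from a point mass and iterate.
dist = {s: 1.0 if s == (0, 0) else 0.0 for s in STATES}
for _ in range(200):
    dist = push_forward(dist, hybrid)
max_error = max(abs(dist[s] - PI[s]) for s in STATES)
```

Working with exact transition probabilities rather than random samples keeps the check deterministic; it says nothing about the general setting the paper treats, where subproblems are parts of program traces.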

Equations

\begin{array}{rcl}
e_{v}\in E_{v} &:=& x \mid \lambda.x\ e_{v} \mid (e_{v}\ e^{\prime}_{v})\\
e,e_{1},e_{2}\in E &:=& x \mid \lambda.x\ e \mid \mathsf{Dist}(e) \mid (e_{1}\ e_{2})\\
s\in S &:=& \mathsf{assume}\ x=e \mid \mathsf{observe}(\mathsf{Dist}(e)=e_{v})\\
p\in P &:=& \emptyset \mid s;p
\end{array}

\mathsf{Dist}(e)[x/y]=\mathsf{Dist}^{\prime}(e[x/y])=\{e_{d}[x/y]\mid e_{d}\in\mathsf{Dist}(e[x/y])\}

\mathsf{FreeVariables}(\mathsf{Dist}(e))=\bigcup_{e_{d}\in\mathsf{Dist}(e)\cup\{e\}}\mathsf{FreeVariables}(e_{d})
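
The expression grammar, the substitution on `Dist` and the free-variable equation can be sketched together in code. This is a minimal illustration under stated assumptions, not the paper's definition: each `Dist` node carries a hypothetical finite `support` tuple standing in for the set of expressions the distribution can return, `e[x/y]` is read as replacing free occurrences of `y` by `x`, and `x` is assumed fresh so that no variable capture can occur. The class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Expr"


@dataclass(frozen=True)
class App:
    fn: "Expr"
    arg: "Expr"


@dataclass(frozen=True)
class Dist:
    arg: "Expr"
    # Assumption: the expressions Dist(e) may return, given as a finite tuple.
    support: Tuple["Expr", ...]


Expr = Union[Var, Lam, App, Dist]


def free_variables(e):
    """Free variables of an expression, following the grammar case by case."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, Lam):
        return free_variables(e.body) - {e.param}
    if isinstance(e, App):
        return free_variables(e.fn) | free_variables(e.arg)
    if isinstance(e, Dist):
        # Union over every e_d in the support of Dist(e), together with e itself.
        result = set(free_variables(e.arg))
        for e_d in e.support:
            result |= free_variables(e_d)
        return result
    raise TypeError(f"not an expression: {e!r}")


def rename(e, new, old):
    """e[new/old]: replace free occurrences of `old` by `new` (`new` assumed fresh)."""
    if isinstance(e, Var):
        return Var(new) if e.name == old else e
    if isinstance(e, Lam):
        if e.param == old:  # `old` is rebound here, nothing below is free
            return e
        return Lam(e.param, rename(e.body, new, old))
    if isinstance(e, App):
        return App(rename(e.fn, new, old), rename(e.arg, new, old))
    if isinstance(e, Dist):
        # Simplification: rename the argument and every stored outcome.
        return Dist(rename(e.arg, new, old),
                    tuple(rename(d, new, old) for d in e.support))
    raise TypeError(f"not an expression: {e!r}")


example = Dist(App(Var("f"), Var("y")), (Var("y"), Lam("z", Var("w"))))
renamed = rename(example, "x", "y")
```

Here `free_variables(example)` collects `f` and `y` from the argument and `w` from the second outcome, and `renamed` has `x` in place of `y` in both the argument and the support.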

\begin{array}{rcl}
v\in V &:=& x \mid \langle\lambda.x\ e,\sigma_{v},\sigma_{id}\rangle \mid (v_{1}\ v_{2})\\
aa\in aA &:=& \perp \mid x=ae\\
ae\in aE &:=& (x:x)\#id \mid (x(id^{\prime}):v)\#id\\
&\mid& (\lambda.x\ e:v)\#id \mid ((ae_{1}\ ae_{2})aa:v)\#id\\
&\mid& (\mathsf{Dist}(ae\#id^{\prime})=ae^{\prime}:v)\#id\\
as\in aS &:=& \mathsf{assume}\ x=ae \mid \mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v})\\
t\in T &:=& \emptyset \mid as;t
\end{array}

t\in\mathsf{Traces}(p)\iff\emptyset,\emptyset\vdash p\Rightarrow_{s}t

\begin{array}{c}
\dfrac{id\leftarrow\mathsf{Fresh\ ID}\quad y\notin\mathsf{dom}\ \sigma_{v}}{\sigma_{v},\sigma_{id}\vdash y\Rightarrow_{s}y,id,(y:y)\#id}\\[2ex]
\dfrac{id\leftarrow\mathsf{Fresh\ ID}\quad x\in\mathsf{dom}\ \sigma_{v}}{\sigma_{v},\sigma_{id}\vdash x\Rightarrow_{s}\sigma_{v}(x),id,(x(\sigma_{id}(x)):\sigma_{v}(x))\#id}\\[2ex]
\dfrac{\begin{array}{c}id\leftarrow\mathsf{Fresh\ ID}\\ \sigma^{\prime}_{v}=\mathsf{RestrictKeys}(\sigma_{v},\mathsf{FreeVariables}(\lambda.x\ e))\\ \sigma^{\prime}_{id}=\mathsf{RestrictKeys}(\sigma_{id},\mathsf{FreeVariables}(\lambda.x\ e))\\ v=\langle\lambda.x\ e,\sigma^{\prime}_{v},\sigma^{\prime}_{id}\rangle\end{array}}{\sigma_{v},\sigma_{id}\vdash\lambda.x\ e\Rightarrow_{s}v,id,(\lambda.x\ e:v)\#id}\\[2ex]
\dfrac{\begin{array}{c}id\leftarrow\mathsf{Fresh\ ID}\quad id^{\prime}\leftarrow\mathsf{Fresh\ ID}\\ \sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}v,id_{e},ae\\ e^{\prime}_{v}\in\mathsf{Dist}(v)\quad\sigma_{v},\sigma_{id}\vdash e^{\prime}_{v}\Rightarrow_{s}v,id_{v},ae_{v}\end{array}}{\sigma_{v},\sigma_{id}\vdash\mathsf{Dist}(e)\Rightarrow_{s}v,id,(\mathsf{Dist}(ae\#id^{\prime})=ae_{v}:v)\#id}\\[2ex]
\dfrac{\begin{array}{c}id\leftarrow\mathsf{Fresh\ ID}\quad x\leftarrow\mathsf{Fresh\ variable\ name}\\ \sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\langle\lambda.y\ e,\sigma^{\prime}_{v},\sigma^{\prime}_{id}\rangle,id_{1},ae_{1}\\ \sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}v^{\prime},id_{2},ae_{2}\\ \sigma_{v}^{\prime}[x\rightarrow v^{\prime}],\sigma_{id}^{\prime}[x\rightarrow id_{2}]\vdash e[x/y]\Rightarrow_{s}v,id_{e},ae_{e}\end{array}}{\sigma_{v},\sigma_{id}\vdash(e_{1}\ e_{2})\Rightarrow_{s}v,id,((ae_{1}\ ae_{2})x=ae_{e}:v)\#id}\\[2ex]
\dfrac{\begin{array}{c}id\leftarrow\mathsf{Fresh\ ID}\\ \sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}v_{1},id_{1},ae_{1}\quad\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}v_{2},id_{2},ae_{2}\\ v_{1}\neq\langle\lambda.x\ e,\sigma^{\prime}_{v},\sigma^{\prime}_{id}\rangle\quad v=(v_{1}\ v_{2})\end{array}}{\sigma_{v},\sigma_{id}\vdash(e_{1}\ e_{2})\Rightarrow_{s}v,id,((ae_{1}\ ae_{2})\perp:v)\#id}
\end{array}

\begin{array}{c}
\sigma_{v},\sigma_{id}\vdash\emptyset\Rightarrow_{s}\emptyset\\[2ex]
\dfrac{\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}v,id,ae\quad\sigma_{v}[x\rightarrow v],\sigma_{id}[x\rightarrow id]\vdash p\Rightarrow_{s}t}{\sigma_{v},\sigma_{id}\vdash\mathsf{assume}\ y=e;p\Rightarrow_{s}\mathsf{assume}\ x=ae;t}\\[2ex]
\dfrac{\begin{array}{c}id\leftarrow\mathsf{Fresh\ ID}\\ \sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}e^{\prime}_{v},id_{e},ae\quad\sigma_{v},\sigma_{id}\vdash p\Rightarrow_{s}t\end{array}}{\sigma_{v},\sigma_{id}\vdash\mathsf{observe}(\mathsf{Dist}(e)=e_{v});p\Rightarrow_{s}\mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v});t}
\end{array}

p=\mathsf{Program}(t)\iff t\Rightarrow_{r}p

\begin{array}{c}
(x:x)\#id\Rightarrow_{r}x\qquad(x(id^{\prime}):v)\#id\Rightarrow_{r}x\\[2ex]
(\lambda.x\ e:v)\#id\Rightarrow_{r}\lambda.x\ e\qquad\dfrac{ae_{1}\Rightarrow_{r}e_{1}\quad ae_{2}\Rightarrow_{r}e_{2}}{((ae_{1}\ ae_{2})aa:v)\#id\Rightarrow_{r}(e_{1}\ e_{2})}\\[2ex]
\dfrac{ae\Rightarrow_{r}e}{(\mathsf{Dist}(ae\#id^{\prime})=ae^{\prime}:v)\#id\Rightarrow_{r}\mathsf{Dist}(e)}
\end{array}

\begin{array}{c}
\emptyset\Rightarrow_{r}\emptyset\qquad\dfrac{ae\Rightarrow_{r}e\quad t\Rightarrow_{r}p}{\mathsf{assume}\ x=ae;t\Rightarrow_{r}\mathsf{assume}\ x=e;p}\\[2ex]
\dfrac{ae\Rightarrow_{r}e\quad t\Rightarrow_{r}p}{\mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v});t\Rightarrow_{r}\mathsf{observe}(\mathsf{Dist}(e)=e_{v});p}
\end{array}
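
The rollback relation that recovers a program from a trace amounts to stripping the augmentation (values, IDs, recorded samples and recorded application bodies) from every node. The following is a minimal sketch under assumptions: augmented expressions are hypothetical dataclasses mirroring the five `ae` forms, plain expressions are tagged tuples, values are opaque Python objects, and a trace is a list of `("assume", x, ae)` and `("observe", ae, id, e_v)` tuples. None of these encodings come from the paper.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class AVar:            # (x:x)#id, or (x(id'):v)#id when source_id is set
    name: str
    value: Any
    node_id: int
    source_id: Optional[int] = None


@dataclass(frozen=True)
class ALam:            # (lambda.x e : v)#id
    param: str
    body: tuple        # the un-augmented body e
    value: Any
    node_id: int


@dataclass(frozen=True)
class AApp:            # ((ae1 ae2) aa : v)#id
    fn: Any
    arg: Any
    body: Optional[Tuple[str, Any]]  # aa: None for bottom, or (x, ae)
    value: Any
    node_id: int


@dataclass(frozen=True)
class ADist:           # (Dist(ae#id') = ae' : v)#id
    arg: Any
    sample_id: int
    sampled: Any
    value: Any
    node_id: int


def rollback_expr(ae):
    """Strip values, IDs and recorded sub-traces from an augmented expression."""
    if isinstance(ae, AVar):
        return ("var", ae.name)
    if isinstance(ae, ALam):
        return ("lam", ae.param, ae.body)
    if isinstance(ae, AApp):
        # The recorded body aa is dropped; only operator and operand survive.
        return ("app", rollback_expr(ae.fn), rollback_expr(ae.arg))
    if isinstance(ae, ADist):
        # The recorded sample ae' is dropped; only the argument survives.
        return ("dist", rollback_expr(ae.arg))
    raise TypeError(f"not an augmented expression: {ae!r}")


def rollback_trace(trace):
    """Recover the program statements from a list of augmented statements."""
    program = []
    for stmt in trace:
        if stmt[0] == "assume":
            _, x, ae = stmt
            program.append(("assume", x, rollback_expr(ae)))
        elif stmt[0] == "observe":
            _, ae, _node_id, e_v = stmt
            program.append(("observe", rollback_expr(ae), e_v))
        else:
            raise ValueError(f"unknown statement: {stmt!r}")
    return program


# Hypothetical trace of: assume b = Dist(p); observe(Dist(b) = true)
trace = [
    ("assume", "b", ADist(AVar("p", "p", 1), 2, AVar("true", "true", 3), "true", 4)),
    ("observe", AVar("b", "true", 6, source_id=4), 5, ("var", "true")),
]
program = rollback_trace(trace)
```

Because the sampled choice and all IDs are discarded, two traces that differ only in those components roll back to the same program in this encoding.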

\begin{array}{c}
(x:x)\#id\Rightarrow_{g}id,\langle\{id\rightarrow\perp\},\emptyset,\emptyset\rangle\qquad(x(id^{\prime}):v)\#id\Rightarrow_{g}id,\langle\{id\rightarrow\perp\},\{\langle id^{\prime},id\rangle\},\emptyset\rangle\\[2ex]
(\lambda.x\ e:\langle\lambda.x\ e,\sigma_{v},\sigma_{id}\rangle)\#id\Rightarrow_{g}id,\langle\{id\rightarrow\perp\},\emptyset,\emptyset\rangle\\[2ex]
\dfrac{\begin{array}{c}ae_{1}\Rightarrow_{g}id_{1},\langle\mathcal{N}_{1},\mathcal{D}_{1},\mathcal{E}_{1}\rangle\quad ae_{2}\Rightarrow_{g}id_{2},\langle\mathcal{N}_{2},\mathcal{D}_{2},\mathcal{E}_{2}\rangle\quad ae\Rightarrow_{g}id_{e},\langle\mathcal{N}_{e},\mathcal{D}_{e},\mathcal{E}_{e}\rangle\\ \langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle=\langle\mathcal{N}_{1}\cup\mathcal{N}_{2}\cup\mathcal{N}_{e},\mathcal{D}_{1}\cup\mathcal{D}_{2}\cup\mathcal{D}_{e},\mathcal{E}_{1}\cup\mathcal{E}_{2}\cup\mathcal{E}_{e}\rangle\end{array}}{((ae_{1}\ ae_{2})x=ae:v)\#id\Rightarrow_{g}id,\langle\mathcal{N}[id\rightarrow\perp],\mathcal{D}\cup\{\langle id_{1},id\rangle,\langle id_{e},id\rangle\},\mathcal{E}\cup\{\langle id_{1},id_{n}\rangle\mid id_{n}\in\mathsf{dom}\ \mathcal{N}_{e}\}\rangle}\\[2ex]
\dfrac{ae_{1}\Rightarrow_{g}id_{1},\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle\quad ae_{2}\Rightarrow_{g}id_{2},\langle\mathcal{N}^{\prime},\mathcal{D}^{\prime},\mathcal{E}^{\prime}\rangle}{((ae_{1}\ ae_{2})\perp:v)\#id\Rightarrow_{g}id,\langle\mathcal{N}\cup\mathcal{N}^{\prime}[id\rightarrow\perp],\mathcal{D}\cup\mathcal{D}^{\prime}\cup\{\langle id_{1},id\rangle,\langle id_{2},id\rangle\},\mathcal{E}\cup\mathcal{E}^{\prime}\rangle}\\[2ex]
\dfrac{\begin{array}{c}ae\Rightarrow_{g}id_{e},\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle\quad ae^{\prime}\Rightarrow_{g}id^{\prime}_{e},\langle\mathcal{N}^{\prime},\mathcal{D}^{\prime},\mathcal{E}^{\prime}\rangle\\ \langle\mathcal{N}_{r},\mathcal{D}_{r},\mathcal{E}_{r}\rangle=\langle\mathcal{N}\cup\mathcal{N}^{\prime}[id^{\prime}\rightarrow\mathsf{Sample}],\mathcal{D}\cup\mathcal{D}^{\prime}\cup\{\langle id_{e},id^{\prime}\rangle\},\mathcal{E}\cup\mathcal{E}^{\prime}\cup\{\langle id^{\prime},id_{n}\rangle\mid id_{n}\in\mathsf{dom}\ \mathcal{N}^{\prime}\}\rangle\end{array}}{(\mathsf{Dist}(ae\#id^{\prime})=ae^{\prime}:v)\#id\Rightarrow_{g}id,\langle\mathcal{N}_{r}[id\rightarrow\perp],\mathcal{D}_{r}\cup\{\langle id^{\prime}_{e},id\rangle,\langle id^{\prime},id\rangle\},\mathcal{E}_{r}\rangle}
\end{array}

\begin{array}{c}
\dfrac{ae\Rightarrow_{g}id,\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle}{\mathsf{assume}\ x=ae\Rightarrow_{g}\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle}\\[2ex]
\dfrac{ae\Rightarrow_{g}id^{\prime},\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle}{\mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v})\Rightarrow_{g}\langle\mathcal{N}[id\rightarrow\mathsf{Sample}],\mathcal{D}\cup\{\langle id^{\prime},id\rangle\},\mathcal{E}\rangle}
\end{array}

\begin{array}{c}
\emptyset\Rightarrow_{g}\langle\emptyset,\emptyset,\emptyset\rangle\qquad\dfrac{as\Rightarrow_{g}\langle\mathcal{N}_{s},\mathcal{D}_{s},\mathcal{E}_{s}\rangle\quad t\Rightarrow_{g}\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle}{as;t\Rightarrow_{g}\langle\mathcal{N}\cup\mathcal{N}_{s},\mathcal{D}\cup\mathcal{D}_{s},\mathcal{E}\cup\mathcal{E}_{s}\rangle}
\end{array}
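
The graph rules build a triple of node labels, data-dependence edges and existential edges by recursion over the trace. The following is a minimal sketch of that recursion under assumptions: augmented expressions are encoded as hypothetical tagged tuples (`"var"`, `"ref"`, `"lam"`, `"app"`, `"dist"`), the bottom label is represented by `None`, and the rule for an application without a recorded body is read as adding the edges from both sub-expressions to the application node and uniting both existential edge sets.

```python
def graph_expr(ae):
    """Return (id, nodes, data_edges, existential_edges) for an augmented expression.

    Hypothetical encodings:
      ("var", x, id)                       unbound variable (x:x)#id
      ("ref", x, src_id, id)               bound variable (x(id'):v)#id
      ("lam", id)                          lambda node
      ("app", ae1, ae2, body, id)          body is None for bottom, else an ae
      ("dist", ae, sample_id, ae_s, id)    (Dist(ae#id') = ae' : v)#id
    """
    kind = ae[0]
    if kind == "var":
        _, _x, nid = ae
        return nid, {nid: None}, set(), set()
    if kind == "ref":
        _, _x, src, nid = ae
        return nid, {nid: None}, {(src, nid)}, set()
    if kind == "lam":
        _, nid = ae
        return nid, {nid: None}, set(), set()
    if kind == "app":
        _, ae1, ae2, body, nid = ae
        id1, n1, d1, e1 = graph_expr(ae1)
        _id2, n2, d2, e2 = graph_expr(ae2)
        nodes, data, exist = {**n1, **n2}, d1 | d2, e1 | e2
        if body is None:
            data |= {(id1, nid), (_id2, nid)}
        else:
            id_e, n_e, d_e, e_e = graph_expr(body)
            nodes.update(n_e)
            data |= d_e | {(id1, nid), (id_e, nid)}
            # Every node of the body exists only because of the operator value.
            exist |= e_e | {(id1, n) for n in n_e}
        nodes[nid] = None
        return nid, nodes, data, exist
    if kind == "dist":
        _, arg, sample_id, sampled, nid = ae
        id_e, n, d, e = graph_expr(arg)
        id_s, n_s, d_s, e_s = graph_expr(sampled)
        nodes = {**n, **n_s, sample_id: "Sample", nid: None}
        data = d | d_s | {(id_e, sample_id), (id_s, nid), (sample_id, nid)}
        exist = e | e_s | {(sample_id, m) for m in n_s}
        return nid, nodes, data, exist
    raise ValueError(f"unknown augmented expression: {ae!r}")


def graph_trace(trace):
    """Union the graphs of all statements; trace is a list of tagged tuples."""
    nodes, data, exist = {}, set(), set()
    for stmt in trace:
        if stmt[0] == "assume":          # ("assume", x, ae)
            _, _x, ae = stmt
            _, n, d, e = graph_expr(ae)
        elif stmt[0] == "observe":       # ("observe", ae, id)
            _, ae, nid = stmt
            src, n, d, e = graph_expr(ae)
            n = {**n, nid: "Sample"}
            d = d | {(src, nid)}
        else:
            raise ValueError(f"unknown statement: {stmt!r}")
        nodes.update(n)
        data |= d
        exist |= e
    return nodes, data, exist


# Hypothetical trace of: assume b = Dist(p); observe(Dist(b) = ...)
trace = [
    ("assume", "b", ("dist", ("var", "p", 1), 2, ("var", "true", 3), 4)),
    ("observe", ("ref", "b", 4, 6), 5),
]
nodes, data_edges, exist_edges = graph_trace(trace)
```

In this small trace the only existential edge runs from the sample node 2 to node 3, the node of the expression that was sampled, while the data edges chain the argument, the sample, the assumed variable and the observation.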

\dfrac{\begin{array}{c}\mathsf{SS}(t)=\mathcal{S}\quad t^{\prime}=\mathsf{IT}(t,\mathcal{S})\\ t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))\quad\mathcal{S}\vdash t\equiv t^{\prime}\end{array}}{\mathsf{infer}(\mathsf{SS},\mathsf{IT},t)\Rightarrow_{i}t^{\prime}}
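
The `infer` rule has four premises: a subproblem selection strategy `SS` picks a subproblem, an inference tactic `IT` produces a new trace, the new trace must be a trace of the same program, and the two traces must be equivalent with respect to the subproblem. A minimal sketch of that control flow follows. It assumes a heavily simplified trace model (a dictionary from node IDs to values) and takes the two side conditions as hypothetical caller-supplied predicates, since the trace semantics and the equivalence relation are not implemented here.

```python
def infer(select_subproblem, tactic, trace, same_program, equal_outside):
    """Apply one inference step and enforce the side conditions of the rule.

    select_subproblem : plays the role of SS, maps a trace to a subproblem S
    tactic            : plays the role of IT, maps (trace, S) to a new trace
    same_program      : hypothetical stand-in for t' in Traces(Program(t))
    equal_outside     : hypothetical stand-in for S |- t == t'
    """
    subproblem = select_subproblem(trace)
    new_trace = tactic(trace, subproblem)
    if not same_program(trace, new_trace):
        raise ValueError("tactic returned a trace of a different program")
    if not equal_outside(subproblem, trace, new_trace):
        raise ValueError("tactic changed the trace outside the subproblem")
    return new_trace


# Toy model (an assumption): a trace maps node IDs to binary values, the
# "program" is its set of node IDs, and a subproblem is a set of node IDs.
def toy_same_program(t, t2):
    return set(t) == set(t2)


def toy_equal_outside(subproblem, t, t2):
    return all(t[k] == t2[k] for k in t if k not in subproblem)


def flip_inside(t, subproblem):
    """Tactic that only touches nodes inside the subproblem."""
    return {k: (1 - v if k in subproblem else v) for k, v in t.items()}


def flip_everything(t, subproblem):
    """Ill-behaved tactic that ignores the subproblem boundary."""
    return {k: 1 - v for k, v in t.items()}


start = {"a": 0, "b": 1, "c": 0}
result = infer(lambda t: {"b"}, flip_inside, start,
               toy_same_program, toy_equal_outside)

try:
    infer(lambda t: {"b"}, flip_everything, start,
          toy_same_program, toy_equal_outside)
    rejected = False
except ValueError:
    rejected = True
```

Passing the checks in as parameters keeps the sketch independent of any particular trace representation; a tactic that modifies nodes outside the selected subproblem is rejected rather than silently accepted.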

\begin{array}{c}
\mathcal{S}\vdash(x:x)\#id\equiv(x:x)\#id^{\prime}\qquad\mathcal{S}\vdash(x(id_{v}):v)\#id\equiv(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}\\[2ex]
\mathcal{S}\vdash(\lambda.x\ e:v)\#id\equiv(\lambda.x\ e:v^{\prime})\#id^{\prime}\\[2ex]
\dfrac{\begin{array}{c}\mathsf{ID}(ae_{1})\notin\mathcal{S}\quad\mathcal{S}\vdash ae_{1}\equiv ae^{\prime}_{1}\\ \mathcal{S}\vdash ae^{\prime}_{2}\equiv ae^{\prime}_{2}\quad\mathcal{S}\vdash ae^{\prime}_{3}\equiv ae^{\prime}_{3}\end{array}}{\mathcal{S}\vdash((ae_{1}\ ae_{2})x=ae_{3}:v)\#id\equiv((ae^{\prime}_{1}\ ae^{\prime}_{2})x=ae^{\prime}_{3}:v^{\prime})\#id}\\[2ex]
\dfrac{\begin{array}{c}\mathsf{ID}(ae_{1})\notin\mathcal{S}\\ \mathcal{S}\vdash ae_{1}\equiv ae^{\prime}_{1}\quad\mathcal{S}\vdash ae^{\prime}_{2}\equiv ae^{\prime}_{2}\end{array}}{\mathcal{S}\vdash((ae_{1}\ ae_{2})\perp:v)\#id\equiv((ae^{\prime}_{1}\ ae^{\prime}_{2})\perp:v^{\prime})\#id^{\prime}}\\[2ex]
\dfrac{\begin{array}{c}\mathsf{ID}(ae_{1})\in\mathcal{S}\\ \mathcal{S}\vdash ae_{1}\equiv ae^{\prime}_{1}\quad\mathcal{S}\vdash ae^{\prime}_{2}\equiv ae^{\prime}_{2}\end{array}}{\mathcal{S}\vdash((ae_{1}\ ae_{2})aa:v)\#id\equiv((ae^{\prime}_{1}\ ae^{\prime}_{2})aa^{\prime}:v^{\prime})\#id^{\prime}}\\[2ex]
\dfrac{\begin{array}{c}id_{e}\notin\mathcal{S}\\ \mathcal{S}\vdash ae\equiv ae^{\prime}\quad\mathcal{S}\vdash ae_{v}\equiv ae^{\prime}_{v}\end{array}}{\mathcal{S}\vdash(\mathsf{Dist}(ae\#id_{e})=ae_{e}:v)\#id\equiv(\mathsf{Dist}(ae^{\prime}\#id_{e})=ae^{\prime}_{e}:v^{\prime})\#id^{\prime}}\\[2ex]
\dfrac{id_{e}\in\mathcal{S}\quad\mathcal{S}\vdash ae\equiv ae^{\prime}}{\mathcal{S}\vdash(\mathsf{Dist}(ae\#id_{e})=ae_{e}:v)\#id\equiv(\mathsf{Dist}(ae^{\prime}\#id^{\prime}_{e})=ae^{\prime}_{e}:v^{\prime})\#id^{\prime}}
\end{array}

\begin{array}{c}\frac{}{\mathcal{S}\vdash\emptyset\equiv\emptyset}\qquad\frac{\mathcal{S}\vdash ae\equiv ae^{\prime}\quad\mathcal{S}\vdash t\equiv t^{\prime}}{\mathcal{S}\vdash\mathsf{assume}\ x=ae;t\equiv\mathsf{assume}\ x=ae^{\prime};t^{\prime}}\\[1em]\frac{\mathcal{S}\vdash ae\equiv ae^{\prime}\quad\mathcal{S}\vdash t\equiv t^{\prime}}{\mathcal{S}\vdash\mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v});t\equiv\mathsf{observe}(\mathsf{Dist}(ae^{\prime}\#id)=e_{v});t^{\prime}}\end{array}

\begin{array}[]{c}\begin{array}[]{cc}\begin{array}[]{c}{\mathcal{}S}\vdash(x(id^{\prime}):v)\#id\Rightarrow_{ex}(x(id^{\prime}):v)\#id,\emptyset\end{array}\begin{array}[]{c}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash(x:x)\#id\Rightarrow_{ex}(x:x)\#id,\emptyset\end{array}\begin{array}[]{c}\end{array}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash(\lambda.x\leavevmode\nobreak\ e:v)\#id\Rightarrow_{ex}(\lambda.x\leavevmode\nobreak\ e:v)\#id,\emptyset\end{array}\begin{array}[]{c}\end{array}\\ \\ \begin{array}[]{c}\begin{array}[]{c}{\mathcal{}S}\vdash((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id\Rightarrow_{ex}((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})aa:v)\#id,t_{s};t^{\prime}_{s}\end{array}\begin{array}[]{c}\mathsf{ID}(ae_{1})\in{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{\prime}_{1},t_{s}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{\prime}_{2},t^{\prime}_{s}\\ \end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id,t_{p}\Rightarrow_{ex}((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})\perp:v)\#id,t_{s};t^{\prime}_{s}\end{array}\begin{array}[]{c}\mathsf{ID}(ae_{1})\notin{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{\prime}_{1},t_{s}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{\prime}_{2},t^{\prime}_{s}\\ \end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash((ae_{1}\leavevmode\nobreak\ ae_{2})y=ae_{3}:v)\#id,t_{p}\Rightarrow_{ex}ae^{\prime}_{3},t^{\prime\prime\prime}_{s}\end{array}\begin{array}[]{c}\mathsf{ID}(ae_{1})\notin{\mathcal{}S}\leavevmode\nobreak\ 
\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{\prime}_{1},t_{s}\\ {\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{\prime}_{2},t^{\prime}_{s}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{\prime}_{3},t^{\prime\prime}_{s}\\ x\leftarrow\mathsf{Fresh\leavevmode\nobreak\ variable\leavevmode\nobreak\ name}\\ t^{\prime\prime\prime}_{s}=t_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{\prime}_{1};t^{\prime}_{s};\mathsf{assume}\leavevmode\nobreak\ y=ae^{\prime}_{2};t^{\prime\prime}_{s}\\ \end{array}\end{array}\\ \\ \begin{array}[]{c}\begin{array}[]{c}{\mathcal{}S}\vdash(\mathsf{Dist}(ae\#id^{\prime})=ae_{v}:v)\#id\Rightarrow_{ex}ae^{\prime}_{v},t_{s};\mathsf{observe}(\mathsf{Dist}(ae^{\prime}\#id^{\prime})=e_{v});t^{\prime}_{s}\end{array}\begin{array}[]{c}id^{\prime}\notin{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae\Rightarrow_{ex}ae^{\prime},t_{s}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ ae_{v}\Rightarrow_{r}e_{v}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae_{v}\Rightarrow_{ex}ae^{\prime}_{v},t^{\prime}_{s}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash(\mathsf{Dist}(ae\#id^{\prime})=ae_{v}:v)\#id\Rightarrow_{ex}(\mathsf{Dist}(ae^{\prime}\#id^{\prime})=ae_{v}:v)\#id,t_{s}\end{array}\begin{array}[]{c}id^{\prime}\in{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae\Rightarrow_{ex}ae^{\prime},t_{s}\\ \end{array}\end{array}\end{array}

\begin{array}{c}\frac{}{\mathcal{S}\vdash\emptyset\Rightarrow_{ex}\emptyset}\qquad\frac{id=\mathsf{ID}(ae)\quad\mathcal{S}\vdash ae\Rightarrow_{ex}ae^{\prime},t_{s}\quad\mathcal{S}\vdash t\Rightarrow_{ex}t^{\prime}_{s}}{\mathcal{S}\vdash\mathsf{assume}\ x=ae;t\Rightarrow_{ex}t_{s};\mathsf{assume}\ x=ae^{\prime};t^{\prime}_{s}}\\[1em]\frac{\mathcal{S}\vdash ae\Rightarrow_{ex}ae^{\prime},t_{s}\quad\mathcal{S}\vdash t\Rightarrow_{ex}t^{\prime}_{s}}{\mathcal{S}\vdash\mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v});t\Rightarrow_{ex}t_{s};\mathsf{observe}(\mathsf{Dist}(ae^{\prime}\#id)=e_{v});t^{\prime}_{s}}\end{array}

\begin{array}[]{c}\begin{array}[]{c}{\mathcal{}S}\vdash(x:v)\#id,(x:v^{\prime})\#id^{\prime},\emptyset\Rightarrow_{st}(x:v^{\prime})\#id^{\prime}\end{array}\begin{array}[]{c}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash(x(id_{v}):v)\#id,(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime},\emptyset\Rightarrow_{st}(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}\end{array}\begin{array}[]{c}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash(\lambda.x\leavevmode\nobreak\ e:v)\#id^{\prime},(\lambda.x\leavevmode\nobreak\ e:v^{\prime})\#id,\emptyset\Rightarrow_{st}(\lambda.x\leavevmode\nobreak\ e:v^{\prime})\#id\end{array}\begin{array}[]{c}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})aa^{\prime}:v^{\prime})\#id^{\prime},((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id,t^{\prime}_{p};t^{\prime\prime}_{p}\Rightarrow_{st}((ae^{\prime\prime}_{1}\leavevmode\nobreak\ ae^{\prime\prime}_{2})aa:v)\#id\end{array}\begin{array}[]{c}\mathsf{ID}(ae^{\prime}_{1})\in{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime}_{2},ae_{2},t^{\prime\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{2}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime}_{1},ae_{1},t^{\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{1}\\ \end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})y=ae^{\prime}_{3}:v^{\prime})\#id,ae_{3},t_{p}\Rightarrow_{st}((ae^{\prime\prime}_{1}\leavevmode\nobreak\ ae^{\prime\prime}_{2})y=ae^{\prime\prime}_{3}:{\mathcal{}V}(ae^{\prime\prime}_{3}))\#id\end{array}\begin{array}[]{c}\mathsf{ID}(ae^{\prime}_{1})\notin{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash 
ae^{\prime}_{3},ae_{3},t^{\prime\prime\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{3}\\ {\mathcal{}S}\vdash ae^{\prime}_{2},ae_{2},t^{\prime\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{2}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime}_{1},ae_{1},t^{\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{1}\\ t_{p}=t^{\prime}_{p};\mathsf{assume}\leavevmode\nobreak\ x=ae_{1};t^{\prime\prime}_{p};\mathsf{assume}\leavevmode\nobreak\ y=ae_{2};t^{\prime\prime\prime}_{p}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})\perp:v^{\prime})\#id^{\prime},((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id,t^{\prime}_{p};t^{\prime\prime}_{p}\Rightarrow_{st}((ae^{\prime\prime}_{1}\leavevmode\nobreak\ ae^{\prime\prime}_{2})\perp:v)\#id\end{array}\begin{array}[]{c}\mathsf{ID}(ae^{\prime}_{1})\notin{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime}_{2},ae_{2},t^{\prime\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{2}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime}_{1},ae_{1},t^{\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{1}\end{array}\\ \\ \begin{array}[]{c}{\mathcal{}S}\vdash(\mathsf{Dist}(ae^{\prime}\#id^{\prime}_{v})=ae^{\prime}_{v}:v^{\prime})\#id^{\prime},(\mathsf{Dist}(ae\#id_{v})=ae_{v}:v)\#id,t_{p}\Rightarrow_{st}\\ (\mathsf{Dist}(ae\#id_{v})=ae_{v}:v)\#id\end{array}\begin{array}[]{c}id^{\prime}_{v}\in{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime},ae,t_{p}\Rightarrow_{st}ae^{\prime\prime}\end{array}\\ \\ 
\begin{array}[]{c}{\mathcal{}S}\vdash(\mathsf{Dist}(ae^{\prime}\#id_{v})=ae^{\prime}_{v}:v^{\prime})\#id^{\prime},ae_{v},t_{p}\Rightarrow_{st}(\mathsf{Dist}(ae\#id_{v})=ae^{\prime\prime}_{v}:{\mathcal{}V}(ae^{\prime\prime}_{v}))\#id\end{array}\begin{array}[]{c}id^{\prime}_{v}\notin{\mathcal{}S}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ {\mathcal{}S}\vdash ae^{\prime}_{v},ae_{v},t^{\prime\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}_{v}\\ {\mathcal{}S}\vdash ae^{\prime},ae,t^{\prime}_{p}\Rightarrow_{st}ae^{\prime\prime}\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ t_{p}=t^{\prime}_{p};\mathsf{observe}(\mathsf{Dist}(ae\#id^{\prime}_{v})=e_{v});t^{\prime\prime}_{p}\end{array}\end{array}

\begin{array}{c}\frac{}{\mathcal{S}\vdash\emptyset,\emptyset\Rightarrow_{st}\emptyset}\qquad\frac{\mathcal{S}\vdash ae,ae^{\prime},t_{p}\Rightarrow_{st}ae^{\prime\prime}\quad\mathcal{S}\vdash t,t^{\prime}_{p}\Rightarrow_{st}t^{\prime}_{s}}{\mathcal{S}\vdash\mathsf{assume}\ x=ae;t,t_{p};\mathsf{assume}\ x=ae^{\prime};t^{\prime}_{p}\Rightarrow_{st}\mathsf{assume}\ x=ae^{\prime\prime};t^{\prime}_{s}}\\[1em]\frac{\mathcal{S}\vdash ae,ae^{\prime},t_{p}\Rightarrow_{st}ae^{\prime\prime}\quad\mathcal{S}\vdash t,t^{\prime}_{p}\Rightarrow_{st}t^{\prime}_{s}}{\mathcal{S}\vdash\mathsf{observe}(\mathsf{Dist}(ae\#id)=e_{v});t,t_{p};\mathsf{observe}(\mathsf{Dist}(ae^{\prime}\#id^{\prime})=e_{v});t^{\prime}_{p}\Rightarrow_{st}\mathsf{observe}(\mathsf{Dist}(ae^{\prime\prime}\#id)=e_{v});t^{\prime}_{s}}\end{array}

\frac{\mathsf{SS}(t)=\mathcal{S}\quad t_{s}=\mathsf{ExtractTrace}(t,\mathcal{S})\quad t^{\prime}_{s}=\mathsf{IT}(t_{s})\quad t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))\quad t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},\mathcal{S})}{\mathsf{infer}(\mathsf{SS},\mathsf{IT}),t\Rightarrow_{i}t^{\prime}}

m\left(\bigcup_{i \in I} A_{i}\right) = \sum_{i \in I} m(A_{i})

(f_{*}(\pi))(B) = \pi(f^{-1}(B)) \text{ for } B \in \Sigma_{2}

\int_{A} I_{A_{i}} (t) m (d t) = m (A_{i} \cap A)

\int_{A} s(t)\, m(dt) = \sum_{i=1}^{N} a_{i}\, m(A_{i} \cap A)

\int_{A} f(t)\, m(dt) = \sup \left\{ \int_{A} s(t)\, m(dt) \;\middle|\; s \text{ is simple},\ 0 \leq s \leq f \right\}

\int_{A} f (t) m (d t) = \int_{A} f_{+} (t) m (d t) - \int_{A} f_{-} (t) m (d t)

K^{n} (t, A) > 0

\pi K = \pi

\lim_{n \to \infty} \left\| K^{n}(t, \cdot) - \pi \right\| = 0

Taxonomy

Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Formal Methods in Verification

Full text

Compositional Inference Metaprogramming with Convergence Guarantees

Shivam Handa, Massachusetts Institute of Technology

Vikash Mansinghka, Massachusetts Institute of Technology

Martin Rinard, Massachusetts Institute of Technology

Abstract.

Inference metaprogramming enables effective probabilistic programming by supporting the decomposition of executions of probabilistic programs into subproblems and the deployment of hybrid probabilistic inference algorithms that apply different probabilistic inference algorithms to different subproblems. We introduce the concept of independent subproblem inference (as opposed to entangled subproblem inference in which the subproblem inference algorithm operates over the full program trace) and present a mathematical framework for studying convergence properties of hybrid inference algorithms that apply different Markov-Chain Monte Carlo algorithms to different parts of the inference problem. We then use this formalism to prove asymptotic convergence results for probabilistic programs with inference metaprogramming. To the best of our knowledge this is the first asymptotic convergence result for hybrid probabilistic inference algorithms defined by (subproblem-based) inference metaprogramming.

Keywords: Probabilistic Programming, Metaprogramming, Inference Algorithms

PACMPL, Vol. 1, Article 1 (2019). Under review as a conference paper at POPL 2020.

1. Introduction

Probabilistic modeling and inference are now mainstream approaches deployed in many areas of computing and data analysis (Russell and Norvig, 2003; Thrun et al., 2005; Liu, 2008; Murphy, 2012; Gelman et al., 2014; Forsyth and Ponce, 2002). To better support these computations, researchers have developed probabilistic programming languages, which include constructs that directly support probabilistic modeling and inference within the language itself (Milch et al., 2007; Goodman et al., 2008; Goodman and Stuhlmueller, 2014; Mansinghka et al., 2014; Gordon et al., 2014a; Tristan et al., 2014; Gordon et al., 2014b; Tolpin et al., 2015; Carpenter et al., 2016; Labs, 2017). Probabilistic inference strategies provide the probabilistic reasoning required to implement these constructs.

It is well known that no single probabilistic inference strategy is appropriate for all probabilistic inference and modeling tasks (Russell and Norvig, 2003; Mansinghka et al., 2018). Indeed, effective inference often involves breaking an inference problem down into subproblems, then applying different inference strategies to different subproblems as appropriate (Russell and Norvig, 2003; Mansinghka et al., 2018). Applying this approach to probabilistic programs, by specifying subtask decompositions and the inference strategy to apply to each subtask, is called inference metaprogramming. Inference metaprogramming has been shown to dramatically improve the execution time and accuracy of probabilistic programs (in comparison with monolithic inference strategies that apply a single inference strategy to the entire program) (Mansinghka et al., 2018).

1.1. Probabilistic Inference and Convergence

Executions of probabilistic programs typically use probabilistic inference to generate samples from the underlying probability distribution that the program defines (Mansinghka et al., 2018). Many probabilistic inference algorithms are iterative, i.e., they perform multiple steps that bring samples closer to the specified probability distribution. A standard correctness property of such algorithms is asymptotic convergence, i.e., a guarantee that, in the limit as the number of iterations increases, the resulting sample will be drawn from the defined posterior distribution. Markov-Chain Monte-Carlo (MCMC) algorithms (which include Metropolis-Hastings (Chib and Greenberg, 1995) and Gibbs sampling (Meyn and Tweedie, 2012)) comprise a widely-used (Milch et al., 2007; Goodman et al., 2008; Goodman and Stuhlmueller, 2014; Mansinghka et al., 2014) class of probabilistic inference algorithms that often come with asymptotic convergence guarantees (Geyer, 1998).
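To make the notion of asymptotic convergence concrete, the following is a minimal sketch on a toy four-state target; it is an illustration only and not part of the formal development in this paper. It builds the exact Metropolis-Hastings transition matrix for a uniform proposal, checks that the target is stationary, and tracks the total variation distance between the $n$-step distribution and the target. The target weights, the proposal, and all function names (`mh_kernel`, `step`, `tv_distance`) are assumptions chosen for illustration.

```python
def mh_kernel(weights, proposal):
    """Exact Metropolis-Hastings transition matrix on a finite state space.

    weights[i]     : unnormalized target weight of state i (assumed > 0)
    proposal[i][j] : probability of proposing state j from state i
    """
    n = len(weights)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and proposal[i][j] > 0.0:
                ratio = (weights[j] * proposal[j][i]) / (weights[i] * proposal[i][j])
                K[i][j] = proposal[i][j] * min(1.0, ratio)
        # Rejected proposals (and self-proposals) leave the chain where it is.
        K[i][i] = 1.0 - sum(K[i])
    return K


def step(dist, K):
    """One step of the chain on distributions: returns the row vector dist * K."""
    n = len(dist)
    return [sum(dist[i] * K[i][j] for i in range(n)) for j in range(n)]


def tv_distance(p, q):
    """Total variation distance between two distributions on the same finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy target (hypothetical): pi proportional to (1, 2, 3, 4).
weights = [1.0, 2.0, 3.0, 4.0]
pi = [w / sum(weights) for w in weights]
uniform = [[0.25] * 4 for _ in range(4)]
K = mh_kernel(weights, uniform)

# Stationarity: applying K to pi returns pi (up to rounding).
stationarity_gap = tv_distance(step(pi, K), pi)

# Convergence: start from a point mass on state 0 and iterate.
dist = [1.0, 0.0, 0.0, 0.0]
distances = [tv_distance(dist, pi)]
for _ in range(50):
    dist = step(dist, K)
    distances.append(tv_distance(dist, pi))

print("stationarity gap:", stationarity_gap)
print("TV distance after 0, 10, 50 steps:", distances[0], distances[10], distances[50])
```

Working with the full transition matrix rather than with simulated samples keeps the check deterministic: the distance to the target shrinks geometrically here because every entry of the matrix is positive.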

Using inference metaprogramming to decompose inference problems into subprograms and solve them produces new hybrid probabilistic inference algorithms. Whether these new hybrid inference algorithms (as implemented in the inference metaprogramming language) also asymptotically converge is often a question of interest, because it directly relates to the compositional soundness of the inference metaprogram.
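As an illustration of this question, the sketch below composes two different MCMC kernels, each acting on a different part of a toy two-variable state: an exact Gibbs update of `x` given `y`, followed by a Metropolis update of `y` given `x`. It then checks numerically that the composed kernel keeps the joint target stationary, has strictly positive transition probabilities (which gives irreducibility and aperiodicity on a finite space), and converges in total variation. This is the classical setting of a fixed product space with a fixed update order, so it only illustrates the properties of interest; the weights, the state space, and all function names are assumptions, not constructs from this paper.

```python
# Toy joint target (hypothetical): unnormalized weights over x in {0, 1}, y in {0, 1, 2}.
weights = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 3.0,
           (1, 0): 4.0, (1, 1): 1.0, (1, 2): 2.0}
states = sorted(weights)
index = {s: i for i, s in enumerate(states)}
n = len(states)
total = sum(weights.values())
pi = [weights[s] / total for s in states]


def gibbs_x():
    """Kernel A: resample x exactly from its conditional given y; y is untouched."""
    A = [[0.0] * n for _ in range(n)]
    for (x, y) in states:
        norm = weights[(0, y)] + weights[(1, y)]
        for x_new in (0, 1):
            A[index[(x, y)]][index[(x_new, y)]] = weights[(x_new, y)] / norm
    return A


def metropolis_y():
    """Kernel B: Metropolis update of y with a uniform proposal; x is untouched."""
    B = [[0.0] * n for _ in range(n)]
    for (x, y) in states:
        i = index[(x, y)]
        for y_new in (0, 1, 2):
            if y_new != y:
                accept = min(1.0, weights[(x, y_new)] / weights[(x, y)])
                B[i][index[(x, y_new)]] = accept / 3.0
        B[i][i] = 1.0 - sum(B[i])  # rejections and self-proposals stay put
    return B


def matmul(P, Q):
    """Compose two kernels: apply P first, then Q."""
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def step(dist, K):
    """One step on distributions: returns the row vector dist * K."""
    return [sum(dist[i] * K[i][j] for i in range(n)) for j in range(n)]


def tv_distance(p, q):
    """Total variation distance between two distributions on the same finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


hybrid = matmul(gibbs_x(), metropolis_y())

# Stationarity of the composition: pi * hybrid equals pi up to rounding.
stationarity_gap = tv_distance(step(pi, hybrid), pi)

# Every entry is positive, so every state reaches every other state in one step.
all_positive = all(p > 0.0 for row in hybrid for p in row)

# Worst-case total variation distance to pi after 200 steps, over all start states.
worst_tv = 0.0
for start in range(n):
    dist = [1.0 if i == start else 0.0 for i in range(n)]
    for _ in range(200):
        dist = step(dist, hybrid)
    worst_tv = max(worst_tv, tv_distance(dist, pi))

print(stationarity_gap, all_positive, worst_tv)
```

Each component kernel alone leaves the target invariant but is not irreducible, since it never changes the other variable; only the composition reaches every state.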

Over the last several decades the field has developed many iterative probabilistic inference algorithms (Meyn and Tweedie, 2012) and proved convergence results for these algorithms (Berti et al., 2008). Many of these algorithms compose inference steps applied to different parts of the problem and would therefore seem to be promising candidates for proving convergence properties of probabilistic programs with inference metaprogramming. Unfortunately, these algorithms, and their associated convergence proofs, have several onerous restrictions. Specifically, they model the state of the system as a product space over a fixed set of random choices and work with policies whose selection of random choices to resample does not depend on the state of the system. The basic mathematical framework (and associated convergence proofs) is therefore not applicable to probabilistic programming with inference metaprogramming: in this new setting, the random choices in the subproblems (as defined by the random variables that the program defines and samples) may change over time, are potentially unbounded (e.g., Open Universe Probabilistic Models (Milch and Russell, 2010; Wu et al., 2016)), and may depend on the state of the system as realized in the current values of the random choices. For example, our framework supports programs that sample a stochastic choice, then compute the set of stochastic choices to include in a subproblem as a function of the sampled stochastic choice.

1.2. Our Result

We present the first asymptotic convergence result for hybrid probabilistic inference algorithms defined by inference metaprogramming. Given a probabilistic program with posterior distribution $\pi$ , we show that the hybrid algorithms applied to that program are $\pi$ -irreducible, aperiodic, and have $\pi$ as their stationary distribution. This result stands on two foundational new results:

•

Independent Subproblems: We consider executions of probabilistic programs that produce program traces (Mansinghka et al., 2018; Wingate et al., 2011). These traces record the random choices made during the execution. With inference metaprogramming, the subproblem inference algorithms must operate only over the random choices in the subproblem. Previous formulations of subproblem inference, however, define subproblem inference as operating over the entire trace (Mansinghka et al., 2018). This approach entangles the subproblem with the full program trace and complicates the analysis of the interaction between subproblems and inference metaprogramming.

We instead formalize subproblem inference using a new technique that extracts each subproblem from the original program trace into its own independent trace. Inference is then performed over the full extracted trace, with the newly generated trace then stitched back into the original trace to complete the subproblem inference. By detangling the subproblem from the full trace, this approach delivers the clean separation of subproblems and subproblem inference required to state and prove the new asymptotic convergence result.

•

Mathematical Framework: We present a new and more general mathematical framework for studying the composition of probabilistic inference algorithms applied to subproblems. A key aspect of this framework is that it supports state-dependent embeddings between program trace spaces defined by different probabilistic programs. The framework therefore enables us to model subproblem inference by embedding the original program trace into the space of program traces defined by the detangled subproblem, moving the embedded trace within this new space of program traces, then injecting the new trace back into the original trace space.

We note that this mathematical framework is not specifically tied to probabilistic programming. It supports compositional asymptotic convergence results for a range of probabilistic inference algorithms that operate over general probability spaces (even uncomputable ones) by mapping subspaces of the original space into new isolated probability spaces, iteratively applying MCMC inference algorithms to the isolated space, then mapping the results back into the original space. An important property is that the applied inference algorithms, the isolated probability spaces, and the mappings may all be state-dependent.

Building on these results, we prove a new asymptotic convergence result for inference metaprograms that apply asymptotically converging MCMC algorithms to appropriately defined subproblems. This result identifies two key restrictions on the subproblem selection strategies that the inference metaprogram uses to identify subproblems. These restrictions guarantee asymptotic convergence for inference metaprograms that apply a large class of asymptotically converging MCMC algorithms to the specified subproblems:

•

Reversibility: The subproblem selection strategy must be reversible, i.e., given $n$ traces $t_{1},t_{2},\ldots,t_{n}$ such that each $t_{i}$ can be transformed into trace $t_{i+1}$ by modifying parts of the trace $t_{i}$ selected by the subproblem selection strategy, it must be possible to transform trace $t_{n}$ into trace $t_{1}$ by modifying parts of the trace $t_{n}$ selected by the subproblem selection strategy. Intuitively, it must be possible to reverse any changes that can be made by applying the same subproblem selection strategy countably many times to a starting trace.

•

Connectivity: The combination of all of the subproblem selection strategies in the inference metaprogram must connect the entire probability space. Given two traces $t$ and $t^{\prime}$ , we say that the subproblem selection strategies connect $t$ and $t^{\prime}$ if it is possible to transform $t$ into $t^{\prime}$ by modifying the parts of $t$ selected by one of the subproblem selection strategies.

The subproblem selection strategies connect the probability space if there do not exist sets of traces $U$ and $V$ such that 1) the probability of $U\cup V$ sums to one and 2) there does not exist $t\in U$ and $t^{\prime}\in V$ such that the subproblem selection strategies connect $t$ and $t^{\prime}$ .

Conceptually, these two restrictions together ensure that the hybrid inference algorithm defined by the inference metaprogram does not become stuck in a subset of the positive probability space and therefore unable to sample some positive probability set.
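For a finite trace space, both restrictions can be checked mechanically. The following is a minimal sketch under strong simplifying assumptions that are not part of the paper: traces are tuples of bits, a subproblem selection strategy is a function returning the set of positions it selects, every trace has positive probability, and the names `neighbors`, `reachable`, `is_reversible`, and `is_connected` are hypothetical. Connectivity is tested through transitive reachability, which is adequate here only because the strategies involved are reversible.

```python
from itertools import product

def neighbors(trace, strategy):
    """Traces obtainable from `trace` by rewriting only the selected positions."""
    selected = sorted(strategy(trace))
    result = set()
    for values in product((0, 1), repeat=len(selected)):
        new = list(trace)
        for pos, val in zip(selected, values):
            new[pos] = val
        result.add(tuple(new))
    return result

def reachable(start, strategies):
    """All traces reachable from `start` by finitely many subproblem moves."""
    seen, frontier = {start}, [start]
    while frontier:
        t = frontier.pop()
        for strategy in strategies:
            for u in neighbors(t, strategy) - seen:
                seen.add(u)
                frontier.append(u)
    return seen

def is_reversible(traces, strategy):
    """Every trace reachable from t must be able to reach t again."""
    return all(t in reachable(u, [strategy])
               for t in traces for u in reachable(t, [strategy]))

def is_connected(traces, strategies):
    """With reversible strategies, one reachability sweep decides connectivity."""
    return reachable(traces[0], strategies) == set(traces)

traces = list(product((0, 1), repeat=2))
flip_first = lambda t: {0}                              # state-independent selection
second_if_set = lambda t: {1} if t[0] == 1 else set()   # state-dependent selection
one_way = lambda t: {0} if t[0] == 0 else set()         # a move that cannot be undone
```

In this toy setting `flip_first` and `second_if_set` are each reversible and together connect all four traces, whereas `one_way` violates reversibility and `flip_first` alone fails connectivity.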

Effective probabilistic programming requires subproblem identification and hybrid probabilistic inference algorithms applied to the identified subproblems. The results in this paper enable the sound and complete decomposition of otherwise intractable probabilistic inference problems into tractable subproblems solved by different inference algorithms. They also characterize properties that entail asymptotic convergence of the resulting hybrid probabilistic inference algorithms.

2. Language and Execution Model

Our treatment of subproblem selection, extraction, and stitching works with a core probabilistic programming language (Figure 1) based on the lambda calculus. A program in this language is a sequence of $\mathsf{assume}$ and $\mathsf{observe}$ statements. Expressions are derived from the untyped lambda calculus augmented with the $\mathsf{Dist}(e)$ expression, which allows the program to sample from a distribution $\mathsf{Dist}$ given parameter $e$ .

The core language supports computable distributions over computable expressions (including computable reals). We believe it is straightforward to generalize the language to include more general probability spaces (e.g., probability spaces including uncomputable reals) at the cost of a larger formalism. The mathematical framework we use to prove convergence (Section 4) works over general probability spaces including probability spaces with uncomputable objects.

$\mathsf{Dist}(e)$ can be seen as a set of probabilistic lambda calculus expressions $\{e_{d}|e_{d}\in\mathsf{Dist}(e)\subseteq E_{v}\}$ . Based on the parameter expression $e$ , $\mathsf{Dist}(e)$ makes a stochastic choice and returns an expression $e_{v}\in\mathsf{Dist}(e)$ . We define:

[TABLE]

Traces: When a program executes, it produces an execution trace (Figure 2). This trace records the executed sequence of assume and observe commands, including the value of each evaluated (sub)expression. It also assigns a unique identifier to each evaluated (sub)expression and stochastic choice. These identifiers will be later used to construct a dependence graph used to define the subproblem given a set of stochastic choices in the subproblem.

We define the execution, including the generation of valid traces $t$ , with the transition relation $\Rightarrow_{s}\subseteq\Sigma_{v}\times\Sigma_{id}\times P\rightarrow T$ (Figure 3). Conceptually, the transition relation executes program $p$ under the environment $\sigma_{v},\sigma_{id}$ to obtain a trace $t$ , where $\sigma_{v}:Vars\rightarrow V$ and $\sigma_{id}:Vars\rightarrow ID$ . $\sigma_{v}$ is a map from variable name to its corresponding assigned value, whereas $\sigma_{id}$ gives the $id$ of the expression which assigned this value to that variable. Because of the nondeterminism associated with stochastic choices, the execution strategy matters for the semantics of the language. We use call by value as the execution strategy and forbid the execution of expressions within a lambda.

Given a program $p$ , we define the set of all valid traces which can be obtained by executing $p$ as $T_{p}=\mathsf{Traces}(p)$ .

[TABLE]

Given a trace, we can drop the computed values and assigned $id$ s and reroll the augmented expressions to recover the underlying program. The transition relation $\Rightarrow_{r}\subseteq T\rightarrow P$ (Figure 4) formalizes this procedure. Given a trace $t$ , we define

[TABLE]

Note that $\forall\leavevmode\nobreak\ t,p.\leavevmode\nobreak\ t\in\mathsf{Traces}(p)\implies p=\mathsf{Program}(t)$ . The reverse may not be true as there are additional constraints that valid traces must satisfy. Two traces are equivalent if and only if they differ at most in the choice of unique identifiers selected for each augmented expression and stochastic choice.

Dependence Graphs: Given a trace $t$, we define the dependence graph $\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle=\mathsf{Graph}(t)$ as a 3-tuple $\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle$ where $\mathcal{N}:\mathsf{ID}\rightarrow\{\perp,\mathsf{Sample}\}$ is a map from $ID$ to either $\perp$ (when the corresponding augmented expression for an $id\in ID$ is a deterministic computation) or $\mathsf{Sample}$ (when the augmented expression for an $id\in ID$ makes a stochastic choice).

$\mathcal{D}\subseteq\mathsf{ID}\times\mathsf{ID}$ are data dependence edges. There is a data dependence edge $\langle id_{1},id_{2}\rangle\in\mathcal{D}$ if the value of the augmented expression $id_{2}$ directly depends on the augmented expression $id_{1}$. $\mathcal{E}\subseteq\mathsf{ID}\times\mathsf{ID}$ are existential edges. There is an existential edge $\langle id_{1},id_{2}\rangle\in\mathcal{E}$ if the value of the augmented expression $id_{1}$ controls whether or not the augmented expression $id_{2}$ is executed. For example, in a lambda application $(ae_{1}\ ae_{2})x=ae_{3}$, all augmented expressions in $ae_{3}$ were executed only because of the value of $ae_{1}$. Changing the value of $ae_{1}$ would require dropping the augmented expression $ae_{3}$ and recomputing another expression based on the new value of $ae_{1}$.

We formalize the dependence graph generation procedure as a transition relation $\Rightarrow_{g}\subseteq T\rightarrow\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle$ (Figure 5). We use the shorthand $\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle=\mathsf{Graph}(t)$ if $\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle$ is the dependence graph for trace $t$, i.e., $t\Rightarrow_{g}\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle$.

Valid Subproblems: Subproblem inference must 1) change only the identified subproblem and not the enclosing trace while 2) producing a valid trace for the full probabilistic program. Valid subproblems must therefore include all parts of the trace that may change if any part of the subproblem changes. We formalize this requirement as follows.

Given a trace $t$ with dependence graph $\langle\mathcal{N},\mathcal{D},\mathcal{E}\rangle=\mathsf{Graph}(t)$, a valid subproblem $\mathcal{S}\subseteq\mathsf{dom}\ \mathcal{N}$ must satisfy two properties: 1) there are no outgoing existential edges and 2) all outgoing data dependence edges must terminate at a stochastic choice ($\mathsf{Sample}$ node).

The first property ensures that parts of the trace which were executed due to values of expressions in the subproblem are also part of the subproblem. An example of this is lambda evaluation $(ae_{1}\leavevmode\nobreak\ ae_{2})x=ae_{3}$ . If the value of $ae_{1}$ can be changed by the subproblem inference, $ae_{3}$ may or may not exist. Hence $ae_{3}$ should be within the subproblem to ensure that the inference algorithm can change it if necessary.

The second property ensures that any change made by the subproblem inference can be absorbed by a stochastic choice. For example, when the internal parameter of a $\mathsf{Dist}$ changes, the change can be absorbed by changing the probability of the trace to account for the change in the probability of the value generated by the execution of the absorbing $\mathsf{Dist}$ node. The changes are absorbed by the stochastic choice and do not propagate further into the remaining parts of the trace outside the subproblem. We formalize these two properties as follows:

•

$\forall id\in\mathcal{S}.\ \langle id,id_{o}\rangle\in\mathcal{E}\implies id_{o}\in\mathcal{S}$

•

$\forall id\in\mathcal{S}.\ \langle id,id_{o}\rangle\in\mathcal{D}\wedge id_{o}\in\mathsf{dom}\ \mathcal{N}-\mathcal{S}\implies\mathcal{N}(id_{o})=\mathsf{Sample}$

The absorbing set $\mathcal{A}\subseteq\mathsf{dom}\ \mathcal{N}-\mathcal{S}$ of a subproblem $\mathcal{S}$ is the set of stochastic choices outside the subproblem whose value directly depends on the nodes in the subproblem, i.e., $\mathcal{A}=\{id_{a}\,|\,id_{a}\in\mathsf{dom}\ \mathcal{N}-\mathcal{S}\wedge\exists\ id_{i}\in\mathcal{S}.\ \langle id_{i},id_{a}\rangle\in\mathcal{D}\}$. The input boundary $\mathcal{B}\subseteq\mathsf{dom}\ \mathcal{N}-\mathcal{S}$ of a subproblem $\mathcal{S}$ is the set of nodes outside the subproblem on which the subproblem directly depends, i.e., $\mathcal{B}=\{id_{b}\,|\,id_{b}\in\mathsf{dom}\ \mathcal{N}-\mathcal{S}\wedge\exists\ id_{i}\in\mathcal{S}.\ \langle id_{b},id_{i}\rangle\in\mathcal{D}\}$.
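The validity conditions, the absorbing set, and the input boundary translate directly into set comprehensions over the dependence graph. The sketch below is illustrative only: the graph encoding (a dict `N` from ids to `"Sample"` or `None`, and edge sets `D` and `E` of id pairs) and all function names are assumptions made for this example, not notation from the paper.

```python
SAMPLE, DET = "Sample", None

def is_valid_subproblem(S, N, D, E):
    """Check the two validity properties of a candidate subproblem S."""
    # 1) no existential edge may leave the subproblem
    no_existential_escape = all(dst in S for (src, dst) in E if src in S)
    # 2) every data edge leaving the subproblem must end at a Sample node
    data_absorbed = all(N[dst] == SAMPLE
                        for (src, dst) in D if src in S and dst not in S)
    return no_existential_escape and data_absorbed

def absorbing_set(S, N, D):
    """Stochastic choices outside S that directly depend on a node in S."""
    return {dst for (src, dst) in D
            if src in S and dst not in S and N[dst] == SAMPLE}

def input_boundary(S, D):
    """Nodes outside S on which some node in S directly depends."""
    return {src for (src, dst) in D if dst in S and src not in S}

# Chain 1 -> 2 -> 3 -> 4 where node 3 is a deterministic computation.
N = {1: SAMPLE, 2: SAMPLE, 3: DET, 4: SAMPLE}
D = {(1, 2), (2, 3), (3, 4)}
E = set()
```

Here `{2}` is not a valid subproblem because its outgoing data edge ends at the deterministic node 3, while `{2, 3}` is valid with absorbing set `{4}` and input boundary `{1}`.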

Entangled Subproblem Inference: Following (Mansinghka et al., 2018), we define entangled subproblem inference using the $\mathsf{infer}$ procedure, which takes as parameters a subproblem selection strategy $\mathsf{SS}$, an inference tactic $\mathsf{IT}$, and an input trace $t$. The subproblem inference mutates $t$ to produce a new trace $t^{\prime}$.

[TABLE]

This formulation works with arbitrary subproblem selection strategies $\mathsf{SS}$. The requirement is that, given a trace $t$, $\mathsf{SS}$ must produce a valid subproblem $\mathcal{S}$ over $t$. In practice, one way to satisfy this requirement is to allow the programmer to specify a (potentially arbitrary) set of stochastic choices that must be in the subproblem, with the language implementation completing these choices into a valid subproblem (Mansinghka et al., 2018).
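One plausible way to complete a programmer-specified set of stochastic choices into a valid subproblem is a fixed-point closure over the dependence graph. The sketch below is an assumption made for illustration, not the completion procedure of Mansinghka et al. (2018): the function name `complete_subproblem` and the graph encoding (dict `N`, edge sets `D` and `E`) are hypothetical. It repeatedly adds every target of an outgoing existential edge and every deterministic target of an outgoing data edge until both validity properties hold.

```python
def complete_subproblem(seed, N, D, E):
    """Grow `seed` until no existential edge escapes and every escaping
    data edge ends at a Sample node (hypothetical closure procedure)."""
    S = set(seed)
    changed = True
    while changed:
        changed = False
        for src, dst in E:
            if src in S and dst not in S:
                S.add(dst)          # existential successors must be inside
                changed = True
        for src, dst in D:
            if src in S and dst not in S and N[dst] != "Sample":
                S.add(dst)          # deterministic successors cannot absorb
                changed = True
    return S

# Node 2 feeds the deterministic node 3, which feeds the stochastic node 4;
# node 2 also controls whether node 5 is executed at all.
N = {1: "Sample", 2: "Sample", 3: None, 4: "Sample", 5: "Sample"}
D = {(1, 2), (2, 3), (3, 4)}
E = {(2, 5)}
completed = complete_subproblem({2}, N, D, E)
```

The closure stops at node 4 because a stochastic choice can absorb the change, so `completed` is `{2, 3, 5}`.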

We also work with inference algorithms $\mathsf{IT}$ that take as input a full program trace $t$ and a valid subproblem $\mathcal{S}$ and return a mutated full program trace $t^{\prime}$. We require that the output trace $t^{\prime}$ 1) is from the same program as the trace $t$ and 2) differs from $t$ only in a) the stochastic choices from the subproblem $\mathcal{S}$ and b) the deterministic computations that depend on these stochastic choices. We formalize these constraints as:

•

$t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))$

•

$\mathcal{S}\vdash t\equiv t^{\prime}$

Figure 6 presents the definition of $\equiv$. Note that operating with entangled subproblems forces the inference tactic $\mathsf{IT}$ to take the full program trace $t$ as a parameter even though it may modify only the subproblem.

3. Independent Subproblem Inference

The basic idea of independent subproblem inference is to extract an independent subtrace $t_{s}$ from the original trace $t$ given a subproblem $\mathcal{S}$, perform inference over the extracted subtrace $t_{s}$ to obtain a new trace $t_{s}^{\prime}$, then stitch $t_{s}^{\prime}$ back into $t$ to obtain a new trace for the full program. Here, consistent with standard inference techniques for probabilistic programs (Wingate et al., 2011; Mansinghka et al., 2018), $t_{s}$ and $t_{s}^{\prime}$ are valid traces of the same program $p_{s}$ (the subprogram for the subtraces $t_{s}$ and $t_{s}^{\prime}$). The key challenge is converting the entangled subproblem (which is typically incomplete and therefore not a valid trace of any program) into a valid trace. Doing so requires transforming the subproblem to include external dependences and correctly scoping both internal and external dependences in the extracted trace, without giving the inference algorithm access to any external stochastic choices that it must not change (including latent choices nested inside certain lambda expressions, which would otherwise override choices outside the subproblem).

Extract Trace: We define the extraction procedure $t_{s}=\mathsf{ExtractTrace}(t,\mathcal{S})$ using the transition relation $\Rightarrow_{ex}\subseteq\mathcal{P}(ID)\times T\rightarrow T$ (Figure 7). The extraction procedure removes $\mathsf{Dist}(ae\#id)=ae_{v}$ augmented expressions which are not within the subproblem and converts them into $\mathsf{observe}$ statements. This transformation constrains the value of these stochastic choices to the values present in the original trace. It leaves the stochastic choices in the subproblem in place and therefore accessible to the inference algorithm.

For augmented expressions of the form $(ae_{1}\ ae_{2})y=ae_{3}$, when $ae_{1}$ is within the subproblem, its value can change and hence existential edges place the augmented expressions in $ae_{3}$ within the subproblem. When $ae_{1}$ is not within the subproblem, some stochastic choices in $ae_{3}$ may or may not be within the subproblem. If we keep the augmented expression as is, the inference algorithm may unroll $ae_{3}$ and execute it again, changing some stochastic choices that are in $ae_{3}$ but not in the subproblem. If we modify $ae_{3}$, the constraint of $ae_{3}$ being a valid lambda application breaks. We solve this problem by introducing $\mathsf{assume}$ statements and correctly scoping the resulting dependences.

Stitch Trace: Given a trace $t$, a valid subproblem $\mathcal{S}$ over the trace, and a subtrace $t_{s}$, the stitching procedure stitches the trace $t_{s}$ back into $t$ to get a new trace $t^{\prime}$. We define the stitching procedure $t^{\prime}=\mathsf{StitchTrace}(t,t_{s},\mathcal{S})$ using a transition relation $\Rightarrow_{st}\subseteq\mathcal{P}(ID)\times T\times T\rightarrow T$ (Figure 8), where $\mathcal{S}\vdash t,t_{s}\Rightarrow_{st}t^{\prime}\iff t^{\prime}=\mathsf{StitchTrace}(t,t_{s},\mathcal{S})$. Stitching is the dual of extraction. It uses the original trace to determine the structure of the resultant trace, then stitches back the expressions to get a new trace $t^{\prime}$.

Independent Inference: We define independent subproblem inference using the $\mathsf{infer}$ procedure. $\mathsf{infer}$ takes as input a subproblem selection strategy $\mathsf{SS}$, a trace $t$ and an inference tactic $\mathsf{IT}$. This differs from entangled inference in that the inference tactic $\mathsf{IT}$ takes only the extracted subtrace as input and not the entire program trace. This approach enables the use of standard inference algorithms which are designed to operate on complete traces (and not entangled subproblems).

The new inference procedure works as described below:

[TABLE]
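The extract, infer, stitch pipeline described above can be sketched on a deliberately flat toy model. Everything in this block is an assumption made for illustration: a trace is a plain dict from stochastic-choice ids to values, there are no deterministic computations or lambda scoping, and `extract_trace`, `stitch_trace`, `infer`, and `resample_bits` are hypothetical stand-ins for $\mathsf{ExtractTrace}$, $\mathsf{StitchTrace}$, $\mathsf{infer}$, and an inference tactic $\mathsf{IT}$.

```python
import random

def extract_trace(trace, S):
    """Keep choices in S free; freeze all other choices as observations."""
    free = {i: v for i, v in trace.items() if i in S}
    observed = {i: v for i, v in trace.items() if i not in S}
    return {"free": free, "observed": observed}

def stitch_trace(trace, subtrace, S):
    """Copy the (possibly changed) free choices back into the full trace."""
    new = dict(trace)
    for i in S:
        new[i] = subtrace["free"][i]
    return new

def resample_bits(subtrace, rng):
    """Toy tactic: redraw each free choice uniformly from {0, 1}.
    It only ever sees the extracted subtrace, never the full trace."""
    return {"free": {i: rng.randint(0, 1) for i in subtrace["free"]},
            "observed": dict(subtrace["observed"])}

def infer(select, tactic, trace, rng):
    """Select a subproblem, extract it, run the tactic, stitch the result."""
    S = select(trace)                    # may depend on the current trace
    t_s = extract_trace(trace, S)
    t_s_new = tactic(t_s, rng)
    return stitch_trace(trace, t_s_new, S)
```

For example, with `select = lambda t: {"a"} if t["c"] == 1 else {"b"}` and the trace `{"a": 0, "b": 1, "c": 1}`, a call to `infer` can change only the value of `"a"`, and the input trace itself is left unmodified.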

Soundness and Completeness: Given a trace $t$, a valid subproblem $\mathcal{S}$, an inferred trace $t^{\prime}$ and $t_{s}=\mathsf{ExtractTrace}(t,\mathcal{S})$, we prove that extraction and stitching are sound and complete. Soundness in this context means that for all possible mutated subtraces $t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))$, the stitched trace $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},\mathcal{S})$ is a valid inferred trace. Completeness means that for all possible inferred traces $t^{\prime}$, there exists a mutated subtrace $t^{\prime}_{s}$ such that $t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))$ and $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},\mathcal{S})$.

We summarize the comparison between the entangled subproblem inference and independent subproblem inference approaches in Figure 9. We present the theorems, lemmas and proofs of soundness and completeness in Appendix A.2 and A.3.

4. Convergence of Stochastic Alternating Class Kernels

We next introduce the concept of class functions and class kernels, which we use to prove the convergence of hybrid inference algorithms based on asymptotically converging MCMC algorithms. We present key definitions, theorems, and lemmas required to prove these results.

4.1. Preliminaries

We begin by introducing some key measure theory definitions (Geyer, 1998; Meyn and Tweedie, 2012; Tierney, 1994). Readers familiar with measure theory may wish to skip this subsection.

Definition 0 (Topology).

A Topology on set $T$ is a collection $\mathcal{T}$ of subsets of $T$ having the following properties:

(1) $\emptyset\in\mathcal{T}$ and $T\in\mathcal{T}$.

(2) $\mathcal{T}$ is closed under arbitrary unions, i.e., for any collection $\{A_{i}\}_{i\in I}$, if $A_{i}\in\mathcal{T}$ for all $i\in I$, then $\bigcup\limits_{i\in I}A_{i}\in\mathcal{T}$.

(3) $\mathcal{T}$ is closed under finitely many intersections, i.e., for any finite collection $\{A_{i}\}_{i\in I}$, if $A_{i}\in\mathcal{T}$ for all $i\in I$, then $\bigcap\limits_{i\in I}A_{i}\in\mathcal{T}$.

Given a set $T$ and a topology $\mathcal{T}$ defined on $T$, the pair $(T,\mathcal{T})$ is called a topological space. Given a topological space $(T,\mathcal{T})$, all sets $A\in\mathcal{T}$ are called open sets. From this point on, when we say $T$ is a topological space, we assume we are talking about any topology $\mathcal{T}$ on $T$.

Definition 0 ( $\sigma$ -algebra).

Let $T$ be a set. A collection $\Sigma$ of subsets of $T$ is a $\sigma$-field or a $\sigma$-algebra over $T$ if and only if $T\in\Sigma$ and $\Sigma$ is closed under countable unions, intersections and complements, i.e.,

(1) $T\in\Sigma$ and $\emptyset\in\Sigma$.

(2) $A\in\Sigma$ implies $A^{c}\in\Sigma$.

(3) $\Sigma$ is closed under countable unions, i.e., for any countable collection $\{A_{i}\}_{i\in I}$, if $A_{i}\in\Sigma$ for all $i\in I$, then $\bigcup\limits_{i\in I}A_{i}\in\Sigma$.

A measurable space is a pair $(T,\Sigma)$ such that $T$ is a set and $\Sigma$ is a $\sigma$-algebra over $T$.
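On a finite set the three $\sigma$-algebra conditions can be checked exhaustively, since countable unions reduce to finite unions, which in turn reduce to pairwise unions. The following sketch is only an illustration for finite $T$; the function name `is_sigma_algebra` is an assumption made here.

```python
from itertools import combinations

def is_sigma_algebra(T, sigma):
    """Check the sigma-algebra axioms for a finite collection of subsets of T."""
    T = frozenset(T)
    sigma = {frozenset(a) for a in sigma}
    # (1) the whole set and the empty set belong to the collection
    if T not in sigma or frozenset() not in sigma:
        return False
    # (2) closed under complements
    if any(T - a not in sigma for a in sigma):
        return False
    # (3) closed under unions (pairwise closure suffices for finite collections)
    return all(a | b in sigma for a, b in combinations(sigma, 2))

T = {1, 2, 3}
good = [set(), {1}, {2, 3}, {1, 2, 3}]
bad = [set(), {1}, {1, 2, 3}]        # missing the complement {2, 3}
```

With these inputs `is_sigma_algebra(T, good)` holds while `is_sigma_algebra(T, bad)` does not.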

Definition 0 (Measure).

$m:\Sigma\rightarrow\mathbb{R}\cup\{-\infty,\infty\}$ is a measure over a measurable space $(T,\Sigma)$ if

(1) $m(A)\geq m(\emptyset)=0$ for all $A\in\Sigma$,

(2) For all countable collections $\{A_{i}\}_{i\in I}$ of pairwise disjoint sets in $\Sigma$,

$m\left(\bigcup\limits_{i\in I}A_{i}\right)=\sum\limits_{i\in I}m(A_{i})$

Given a measurable space $(T,\Sigma)$, a measure $\pi$ on $(T,\Sigma)$ is called a probability measure if $\pi(T)=1$. We call the tuple $(T,\Sigma,\pi)$ a probability space if $\pi$ is a probability measure over the measurable space $(T,\Sigma)$. Given a set $T$ and a collection of subsets $A_{\alpha}\subseteq T$ (not necessarily countable), we denote the smallest $\sigma$-algebra $\Sigma$ such that $A_{\alpha}\in\Sigma$ for all $\alpha$ by $\sigma(\{A_{\alpha}\})$.

Definition 0 (Borel $\sigma$ -algebra).

Given a topological space $T$, a Borel $\sigma$-algebra $\mathcal{B}(T)$ is the smallest $\sigma$-algebra containing all open sets of $T$.

$(T,\mathcal{B}(T))$ is called a Borel Space when $T$ is a topological space and $\mathcal{B}(T)$ is a Borel $\sigma$-algebra over $T$.

Consider a topological space $(T,\mathcal{T})$. For any set $A\in\mathcal{T}$, $(A,\mathcal{T}_{A})$ is also a topological space (where $\mathcal{T}_{A}=\{E\cap A|E\in\mathcal{T}\}$). Given a measurable space $(T,\Sigma)$, for any set $A\in\Sigma$, $(A,\Sigma_{A})$ is also a measurable space (where $\Sigma_{A}=\{E\cap A|E\in\Sigma\}$).

Topology and $\sigma$-algebra over Reals: Consider the smallest topology over the real space $\mathbb{R}$ which contains all intervals $(a,\infty)\subseteq\mathbb{R}$ for all $-\infty<a<\infty$. To avoid confusion, we refer to the resulting topological space over $\mathbb{R}$ by the symbol $\mathcal{R}$. We can now use this topology to define a Borel $\sigma$-algebra $\mathcal{B}(\mathcal{R})$ over this topological space. Using the above defined topology and $\sigma$-algebra, we can define a topological space and a $\sigma$-algebra for any open or closed interval in $\mathbb{R}$.

Definition 0 (Measurable Function over Measurable Spaces).

Given measurable spaces $(T_{1},\Sigma_{1})$ and $(T_{2},\Sigma_{2})$ , a function $h:T_{1}\rightarrow T_{2}$ is a measurable function from $(T_{1},\Sigma_{1})$ to $(T_{2},\Sigma_{2})$ if $h^{-1}\{B\}\in\Sigma_{1}$ for all sets $B\in\Sigma_{2}$ , where $h^{-1}\{B\}=\{x:h(x)\in B\}$ .

The measurable function $h$ is also known as a Random Variable from measurable space $(T_{1},\Sigma_{1})$ to $(T_{2},\Sigma_{2})$ . If $h:T_{1}\rightarrow T_{2}$ is a measurable function from measurable space $(T_{1},\Sigma_{1})$ to a measurable space $(T_{2},\Sigma_{2})$ , and $\pi$ is a probability measure on $(T_{1},\Sigma_{1})$ , then $\pi_{h}:\Sigma_{2}\rightarrow[0,1]$ defined as $\pi_{h}(A)=\pi(h^{-1}(A))$ is a probability measure on $(T_{2},\Sigma_{2})$ .

Definition 0 (Pushforward measure).

Given a probability space $(T_{1},\Sigma_{1},\pi)$ and a measurable function $f$ from $(T_{1},\Sigma_{1})$ to a measurable space $(T_{2},\Sigma_{2})$, the pushforward measure of $\pi$ is defined as the probability measure $f_{*}(\pi):\Sigma_{2}\rightarrow[0,1]$ given by

$f_{*}(\pi)(B)=\pi(f^{-1}(B))$ for all $B\in\Sigma_{2}$
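For a finite probability space with point masses, the pushforward measure simply accumulates the mass of every point under its image. The sketch below assumes a discrete distribution stored as a dict from outcomes to exact `Fraction` probabilities; the function name `pushforward` and the die example are illustrative choices, not part of the paper.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(pi, f):
    """Return the distribution of f(t) when t is distributed according to pi."""
    out = defaultdict(Fraction)          # Fraction() is the exact zero
    for t, p in pi.items():
        out[f(t)] += p                   # mass of t lands on the image f(t)
    return dict(out)

# A fair six-sided die pushed forward through the parity map.
die = {face: Fraction(1, 6) for face in range(1, 7)}
parity = pushforward(die, lambda face: "even" if face % 2 == 0 else "odd")
```

The resulting measure assigns probability one half to each of `"even"` and `"odd"`, which matches $\pi(f^{-1}(B))$ computed set by set.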

Definition 0 (Measurable).

A function $f$ is measurable if $f$ is a measurable function from a measurable space $(T,\Sigma)$ to $(\mathbb{R},\mathcal{B}(\mathcal{R}))$.

$f$ is measurable if and only if $\forall a\in\mathbb{R}.\ \{x\in T|f(x)>a\}\in\Sigma$. Intuitively, for every real number $-\infty<a<\infty$, there exists a set $A\in\Sigma$ containing exactly the elements which $f$ maps to real numbers greater than $a$.

Definition 0 (Simple Function).

Given a measurable space $(T,\Sigma)$, $s:T\rightarrow[0,\infty)$ is a simple function if $s(t)=\sum_{i=1}^{N}a_{i}I_{A_{i}}(t)$, where $a_{i}\in[0,\infty)$, $A_{i}\in\Sigma$, $I_{A_{i}}(t)=1$ if $t\in A_{i}$ and $0$ otherwise, and the $A_{i}$ are disjoint.

Definition 0 (Lebesgue Integral).

Given a measurable space $(T,\Sigma)$ and a measure $m$ over $(T,\Sigma)$ , for each $A\in\Sigma$ and disjoint $A_{i}\in\Sigma$ , we define

[TABLE]

Hence

[TABLE]

Given a function $f:T\rightarrow[0,\infty)$ which is measurable, we define

[TABLE]

where $s\leq f$ if $\forall t.\ s(t)\leq f(t)$, as the Lebesgue Integral of function $f$ over a set $E$ in measurable space $(T,\Sigma)$ with measure $m$.

Given a function $f:T\rightarrow\mathbb{R}$ which is measurable, we define

[TABLE]

where $f_{+}(t)=\max(f(t),0)$ and $f_{-}(t)=\max(-f(t),0)$. The integral of a measurable function $f$ is thus the integral of its positive part minus the integral of its negative part. From this point on, $\int f(t)m(dt)$ denotes $\int_{T}f(t)m(dt)$ where $T$ is the set over which the measurable space and measure $m$ are defined.
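On a finite space with point masses every function is simple, so the integral reduces to weighted sums and the supremum over simple functions is attained. The sketch below rests on that assumption and uses the standard identity that the integral of an indicator $I_{A_{i}}$ over $A$ is $m(A_{i}\cap A)$; the names `integrate_simple` and `integrate` are hypothetical.

```python
from fractions import Fraction

def integrate_simple(terms, m, E):
    """Integral over E of s = sum_i a_i * I_{A_i}, given as (a_i, A_i) pairs,
    with respect to a measure m stored as point masses."""
    return sum(a * sum(m[t] for t in A & E) for a, A in terms)

def integrate(f, m, E):
    """Integral over E of a real-valued f, split into positive and negative parts."""
    positive = sum(max(f(t), 0) * m[t] for t in E)
    negative = sum(max(-f(t), 0) * m[t] for t in E)
    return positive - negative

T = {0, 1, 2, 3}
m = {t: Fraction(1, 4) for t in T}       # uniform probability measure on T
s = [(2, {0, 1}), (5, {2})]              # s = 2*I_{0,1} + 5*I_{2}
```

Integrating `s` over `{1, 2, 3}` gives $2\cdot\frac{1}{4}+5\cdot\frac{1}{4}=\frac{7}{4}$, and integrating $f(t)=t-2$ over all of $T$ gives $\frac{1}{4}-\frac{3}{4}=-\frac{1}{2}$.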

Definition 0 (Markov Transition Kernel).

Let $(T,\Sigma)$ be a measurable space. A Markov Transition kernel on $(T,\Sigma)$ is a map $K:T\times\Sigma\rightarrow[0,1]$ such that:

(1) for any fixed $A\in\Sigma$, the function $K(.,A)$ is a measurable function from $(T,\Sigma)$ to $[0,1]$.

(2) for any fixed $t\in T$, the function $K(t,.)$ is a probability measure on $(T,\Sigma)$.

Definition 0 ( $\pi$ -irreducible).

Given a probability space $(T,\tau,\pi)$ , a Markov Transition Kernel $K:T\times\tau\rightarrow[0,1]$ is $\pi$ -irreducible if for each $t\in T$ and each $A\in\tau$ , such that $\pi(A)>0$ , there exists an integer $n=n(t,A)\geq 1$ such that

$K^{n}(t,A)>0$

where $K^{n}(t,A)=\int_{T}K^{n-1}(t,dt^{\prime})K(t^{\prime},A)$ and $K^{1}(t,A)=K(t,A)$ .

Definition 0 (Stationary Distribution).

Given a probability space $(T,\tau,\pi)$ , $\pi$ is the stationary distribution of a $\pi$ -irreducible Markov Transition Kernel $K:T\times\tau\rightarrow[0,1]$ if

$\pi K=\pi$

where $(\pi K)(A)=\int K(t,A)\pi(dt)$ .
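For a finite state space a Markov Transition Kernel is a row-stochastic matrix, and both $\pi$-irreducibility and stationarity can be checked directly. The sketch below is an illustration under that assumption; `apply_left` and `is_irreducible` are hypothetical names, and the example kernel is chosen only for convenience.

```python
from fractions import Fraction as F

def apply_left(pi, K):
    """Compute (pi K)(j) = sum_i pi(i) * K(i, j)."""
    n = len(K)
    return [sum(pi[i] * K[i][j] for i in range(n)) for j in range(n)]

def is_irreducible(K, pi):
    """pi-irreducibility on a finite space: from every state, every state with
    positive pi-mass is reachable in at least one step."""
    n = len(K)
    for start in range(n):
        seen, frontier = set(), [start]
        while frontier:
            i = frontier.pop()
            for j in range(n):
                if K[i][j] > 0 and j not in seen:
                    seen.add(j)
                    frontier.append(j)
        if any(pi[j] > 0 and j not in seen for j in range(n)):
            return False
    return True

K = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]
pi = [F(1, 3), F(2, 3)]
identity = [[F(1), F(0)],
            [F(0), F(1)]]
```

Here `apply_left(pi, K)` returns `pi` again, so $\pi$ is stationary for `K`, and `K` is $\pi$-irreducible. The identity kernel also leaves every distribution stationary, but it is not irreducible, which shows why the two conditions are stated separately.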

Definition 0 (Aperiodicity).

Given a probability space $(T,\tau,\pi)$, a $\pi$-irreducible Markov Transition Kernel $K:T\times\tau\rightarrow[0,1]$ is periodic if there exist an integer $d\geq 2$, a sequence $\{E_{0},E_{1},\ldots,E_{d-1}\}$ of $d$ non-empty disjoint sets in $\tau$, and a set $N\in\tau$ such that, for all $i=0,1,\ldots,d-1$ and for all $t\in E_{i}$,

(1) $(\cup_{i=0}^{d-1}E_{i})\cup N=T$

(2) $K(t,E_{j})=1\text{ for }j=i+1\ (\mathsf{mod}\ d)$

(3) $\pi(N)=0$

Otherwise $K$ is aperiodic.

Definition 0 (Asymptotic convergence).

Given a probability space $(T,\tau,\pi)$ and sample $t\in T$ , a Markov Transition Kernel $K:T\times\tau\rightarrow[0,1]$ is said to asymptotically converge to $\pi$ if

$\mathsf{lim}_{n\rightarrow\infty}||K^{n}(t,.)-\pi||=0$

where $||.||$ refers to the total variation norm, which for a measure $\lambda$ over the measurable space $(T,\tau)$ is defined as

[TABLE]

Theorem 15.

Consider a probability space $(T,\tau,\pi)$ and a Markov Transition Kernel $K:T\times\tau\rightarrow[0,1]$. If $K$ is $\pi$-irreducible, aperiodic, and $\pi K=\pi$ holds, then for $\pi$-almost all $t$,

$\mathsf{lim}_{n\rightarrow\infty}||K^{n}(t,.)-\pi||=0$

i.e., $K$ converges to $\pi$ . $\pi$ -almost all $t$ means that there exists a set $D\subseteq T$ such that $\pi(D)=1$ and for all $t\in D$ , $\mathsf{lim}_{n\rightarrow\infty}||K^{n}(t,.)-\pi||=0$ .

Athreya, Doss, and Sethuraman proved this theorem (Athreya et al., 1996). All popular asymptotically converging Markov Chain algorithms, like variants of the Metropolis-Hastings and Gibbs algorithms, when parameterized over probability space $(T,\tau,\pi)$, are $\pi$-irreducible and aperiodic with stationary distribution $\pi$.
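The role of aperiodicity in the theorem can be seen numerically on two-state chains. The sketch below is illustrative only: it measures the distance between $K^{n}(t,.)$ and $\pi$ as half the $L_{1}$ distance between the probability vectors, which is one common convention for total variation and may differ from the norm used in the paper by a constant factor. The names `k_step_row` and `tv` are hypothetical.

```python
def k_step_row(K, start, n):
    """Return the distribution K^n(start, .) for a finite kernel matrix K."""
    size = len(K)
    row = [1.0 if j == start else 0.0 for j in range(size)]
    for _ in range(n):
        row = [sum(row[i] * K[i][j] for i in range(size)) for j in range(size)]
    return row

def tv(p, q):
    """Half the L1 distance between two probability vectors (assumed convention)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Irreducible and aperiodic kernel with stationary distribution (1/3, 2/3).
K = [[0.5, 0.5],
     [0.25, 0.75]]
pi = [1 / 3, 2 / 3]

# Irreducible but periodic kernel (d = 2); the uniform distribution is stationary.
flip = [[0.0, 1.0],
        [1.0, 0.0]]
uniform = [0.5, 0.5]
```

Starting from state 0, the distance for `K` shrinks by a factor of four per step, while for `flip` it stays at one half for every `n`, because the chain alternates deterministically between the two states.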

Definition 0 (Subalgebra).

$\mathcal{E}$ is a $\sigma$-subalgebra of a measurable space $(T,\tau)$ if $\mathcal{E}$ is a $\sigma$-algebra of some set $A\subseteq T$ and $\mathcal{E}\subseteq\tau$.

Definition 0 (Induced Probability space).

Given a probability space $(T,\tau,\pi)$ and $A\in\tau$ , define $\tau_{A}=\{B\cap A|B\in\tau\}$ . Note that since $\tau$ is a $\sigma$ -algebra, $\tau_{A}$ is a sub-algebra over set $A$ . $(A,\tau_{A})$ is a measurable space. If $\pi(A)>0$ , then function $\pi_{A}:\tau_{A}\rightarrow[0,1]$ , defined as $\pi_{A}(x)=\pi(x)/\pi(A)$ , is a probability measure over $(A,\tau_{A})$ . $(A,\tau_{A},\pi_{A})$ is the probability space induced by $A\in\tau$ .
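For a discrete distribution, the induced probability space amounts to restricting the point masses to $A$ and renormalizing by $\pi(A)$. The following sketch assumes a finite space with exact `Fraction` point masses; the function name `induced` is an illustrative choice.

```python
from fractions import Fraction

def induced(pi, A):
    """Return pi_A, the probability measure induced on A by pi."""
    mass = sum(pi[t] for t in A)
    if mass == 0:
        raise ValueError("pi(A) must be positive")
    return {t: pi[t] / mass for t in A}   # pi_A(x) = pi(x) / pi(A)

die = {face: Fraction(1, 6) for face in range(1, 7)}
evens = induced(die, {2, 4, 6})
```

Each even face receives probability one third under `evens`, and the induced masses again sum to one.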

Definition 0 (Regular Conditional Probability Measure over Product Space).

Given a product probability space $(T_{1}\times T_{2},\Sigma_{1}\otimes\Sigma_{2},\pi)$ , a regular conditional probability measure $v:T_{1}\times\Sigma_{2}\rightarrow[0,1]$ is a transition kernel such that

•

For all $A\in\Sigma_{2}$ , $v(.,A)$ is measurable.

•

For all $t\in T_{1}$ , $v(t,.)$ is a probability measure over $(T_{2},\Sigma_{2})$ .

•

For all $A\in\Sigma_{2}$ and $B\in\Sigma_{1}$ , $\pi(B\times A)=\int_{B}v(t,A)\pi(dt\times T_{2})$ .

4.2. Class Functions and Class Kernels

We next introduce class functions and class kernels, which formalize the concept of subproblem based inference metaprograms. Gibbs sampling is a special case of this framework. To aid the reader, we highlight how the framework is specialized to Gibbs sampling as we introduce the definitions, lemmas, and theorems.

Definition 0 (Two-way measurable function).

Given a measurable space $(T_{1},\Sigma_{1})$ and a measurable space $(T_{2},\Sigma_{2})$ , a measurable function $f$ from $(T_{1},\Sigma_{1})$ to $(T_{2},\Sigma_{2})$ is a two-way measurable function, if for all sets $A\in\Sigma_{1}$ there exists a set $B\in\Sigma_{2}$ , such that

$$f(A)=\{f(t)|t\in A\}=B$$

i.e., the map of any set $A\in\Sigma_{1}$ is a set in $\Sigma_{2}$ .

Given a two-way measurable function $f$ , we can extend $f$ to a function $g:\Sigma_{1}\rightarrow\Sigma_{2}$ from sets in $\Sigma_{1}$ to sets in $\Sigma_{2}$ , where $g(A)=\{f(t)|t\in A\}$ . Since $f$ is a measurable function, the preimage map $f^{-1}:\Sigma_{2}\rightarrow\Sigma_{1}$ , which maps sets in $\Sigma_{2}$ to sets in $\Sigma_{1}$ , is also defined.

Note: From this point on, given a two-way measurable function $f$ , we will simply use it to represent function $g$ defined above, mapping sets from $\Sigma_{1}$ to $\Sigma_{2}$ . We also use $f^{-1}$ as a reverse map from $\Sigma_{2}$ to $\Sigma_{1}$ defined above. Note that $f^{-1}$ may or may not be the inverse of the function $f$ .

Example 0.

(Gibbs) Given measurable spaces $(X,{\mathcal{}X})$ , $(Y,{\mathcal{}Y})$ and product space $(X\times Y,{\mathcal{}X}\otimes{\mathcal{}Y})$ , the projection functions $\mathsf{proj}_{x}$ and $\mathsf{proj}_{y}$ are two-way measurable functions from $(X\times Y,{\mathcal{}X}\otimes{\mathcal{}Y})$ to $(X,{\mathcal{}X})$ and $(Y,{\mathcal{}Y})$ , respectively.

Definition 0 (Generalized Product Space).

Given countable sets of disjoint measurable spaces $\{(X_{i},{\mathcal{}X}_{i})|i\in I\}$ and $\{(Y_{i},{\mathcal{}Y}_{i})|i\in I\}$ , we define the generalized product space $(C,{\mathcal{}C})$ as:

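$$(C,\mathcal{C})=\big(\bigcup\limits_{i\in I}X_{i}\times Y_{i},\ \sigma(\bigcup\limits_{i\in I}\mathcal{X}_{i}\otimes\mathcal{Y}_{i})\big)$$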

Given a probability measure $\pi:{\mathcal{}C}\rightarrow[0,1]$ on a generalized product space $(C,{\mathcal{}C})$ , we can define a conditional distribution $\pi_{i}$ over each product space $(X_{i}\times Y_{i},{\mathcal{}X}_{i}\otimes{\mathcal{}Y}_{i})$ such that

[TABLE]

In the following we require, for each product probability space $(X_{i}\times Y_{i},{\mathcal{}X}_{i}\otimes{\mathcal{}Y}_{i},\pi_{i})$ , that we can construct a regular conditional probability measure $v_{\pi_{i}}:X_{i}\times{\mathcal{}Y}_{i}\rightarrow[0,1]$ , such that for each $U\times V\in{\mathcal{}X}_{i}\otimes{\mathcal{}Y}_{i}$

[TABLE]

When constructing a generalized product space, we always prove the above assumption, i.e., a regular conditional probability measure exists.

Example 0.

(Gibbs) Given measurable spaces $(X,{\mathcal{}X})$ , $(Y,{\mathcal{}Y})$ , consider the product probability spaces $(X\times Y,{\mathcal{}X}\otimes{\mathcal{}Y},\pi)$ and $(Y\times X,{\mathcal{}Y}\otimes{\mathcal{}X},\pi^{\prime})$ , where

[TABLE]

A Gibbs sampler, sampling over $(X\times Y,\mathcal{X}\otimes\mathcal{Y},\pi)$ , requires regular conditional probabilities $v_{\pi}:X\times\mathcal{Y}\rightarrow[0,1]$ and $v_{\pi^{\prime}}:Y\times\mathcal{X}\rightarrow[0,1]$ for independence sampling. Therefore both $(X\times Y,\mathcal{X}\otimes\mathcal{Y})$ and $(Y\times X,\mathcal{Y}\otimes\mathcal{X})$ are generalized product spaces.

Definition 0 (Class Functions).

Given a measurable space $(T,\Sigma)$ and a generalized product space $(C,{\mathcal{}C})$ , a class function is a one-to-one two-way measurable function $f$ from $(T,\Sigma)$ to $(C,{\mathcal{}C})$ .

Given a class function $f$ and the target product space $(C,{\mathcal{}C})$ , the projection functions

[TABLE]

are also two-way measurable functions from space $(T,\Sigma)$ to projection spaces $\big{(}\bigcup\limits_{i\in I}X_{i},\sigma(\bigcup\limits_{i\in I}{\mathcal{}X}_{i})\big{)}$ and $\big{(}\bigcup\limits_{i\in I}Y_{i},\sigma(\bigcup\limits_{i\in I}{\mathcal{}Y}_{i})\big{)}$ respectively.

Section 5 uses class functions to model subproblem selection strategies — each class function produces a tuple $\langle x,y\rangle$ that identifies the parts of the trace that are outside ( $x$ ) and inside ( $y$ ) the selected subproblem. Because the class function may depend on the input trace $t$ , our framework supports subproblem selection strategies that depend on the input trace (and specifically on the values of stochastic choices in the trace).

Example 0.

(Gibbs) Given measurable spaces $(X,\mathcal{X})$ , $(Y,\mathcal{Y})$ and generalized product spaces $(X\times Y,\mathcal{X}\otimes\mathcal{Y})$ and $(Y\times X,\mathcal{Y}\otimes\mathcal{X})$ , the identity function $id:X\times Y\rightarrow X\times Y$ and the reverse function $re:X\times Y\rightarrow Y\times X$ (i.e. $re(\langle x,y\rangle)=\langle y,x\rangle$ ) are class functions from $(X\times Y,\mathcal{X}\otimes\mathcal{Y})$ to $(X\times Y,\mathcal{X}\otimes\mathcal{Y})$ and $(Y\times X,\mathcal{Y}\otimes\mathcal{X})$ , respectively. Note that $id_{x}=\mathsf{proj}_{x}$ and $re_{x}=\mathsf{proj}_{y}$ .

Given a probability space $(T,\Sigma,\pi)$ and a class function $f$ to a generalized product space $(C^{f},\mathcal{C}^{f})=\big(\bigcup\limits_{i\in I}X^{f}_{i}\times Y^{f}_{i},\sigma(\bigcup\limits_{i\in I}\mathcal{X}^{f}_{i}\otimes\mathcal{Y}^{f}_{i})\big)$ , $f$ maps each point $t\in T$ to some point $\langle x,y\rangle$ in $X^{f}_{i}\times Y^{f}_{i}$ for some $i$ .

For each $i\in I$ , we require a function $K_{i}:X^{f}_{i}\rightarrow(Y^{f}_{i}\times{\mathcal{}Y}^{f}_{i})\rightarrow[0,1]$ such that

•

$K_{i}(x):Y^{f}_{i}\times\mathcal{Y}^{f}_{i}\rightarrow[0,1]$ is a $v_{f_{*}(\pi)_{i}}(x,.)$ -irreducible, aperiodic Markov Transition Kernel with $v_{f_{*}(\pi)_{i}}(x,.)$ as its stationary distribution.

•

$K_{i}(.)(y,A)$ is measurable for all $y\in Y^{f}_{i}$ and all $A\in{\mathcal{}Y}^{f}_{i}$ .

When we apply the framework to prove the convergence of inference metaprograms (Section 5), each $K_{i}$ represents a function that, when given a parameter $x$ , returns a converging Markov kernel based on $x$ , where $x$ corresponds loosely to the parts of the program trace that lie outside the scope of the selected subproblem. These parameterized $K_{i}$ support sophisticated subproblem inference strategies that depend on how the subproblem decomposes the program trace. For example, the metaprogram may apply one inference strategy to subproblems with discrete random choices and another to subproblems with continuous choices.

Definition 0 (Class Kernels).

Given $(T,\Sigma,\pi),f,$ and $K_{i}$ as above, we define a class kernel $K_{f}:T\times\Sigma\rightarrow[0,1]$ as a Markov Transition Kernel defined as

$$K_{f}(t,A)=K_{i}(x)(y,V)\,I(x,U)$$

where $f(t)=\langle x,y\rangle\in X_{i}\times Y_{i}$ and $U\times V=f(A)\cap(X_{i}\times Y_{i})$ .

Example 0.

(Gibbs) Consider a probability space $(X\times Y,\mathcal{X}\otimes\mathcal{Y},\pi)$ and class functions $id$ and $re$ . The Markov transition kernels $K_{id}(\langle x,y\rangle,U\times V)=v_{\pi}(x,V)I(x,U)$ and $K_{re}(\langle x,y\rangle,V\times U)=v_{\pi^{\prime}}(y,U)I(y,V)$ , which represent independence samplers, are class kernels.
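The two Gibbs class kernels can be written out explicitly when $X$ and $Y$ are finite. The following sketch assumes a small hypothetical joint distribution `pi` and uses exact fractions; it checks that each kernel leaves `pi` invariant. All names and numbers are illustrative and are not taken from the paper.

```python
from fractions import Fraction as F
from itertools import product

X, Y = [0, 1], [0, 1, 2]
# Hypothetical joint distribution pi on X x Y (all entries positive).
pi = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}
states = list(product(X, Y))


def v_pi(x, y_new):
    """Conditional probability of y_new given x under pi."""
    return pi[(x, y_new)] / sum(pi[(x, y)] for y in Y)


def v_pi_prime(y, x_new):
    """Conditional probability of x_new given y under pi."""
    return pi[(x_new, y)] / sum(pi[(x, y)] for x in X)


def K_id(s, s_new):
    """Keep x fixed (the indicator I(x, U)) and resample y from v_pi(x, .)."""
    (x, _), (x_new, y_new) = s, s_new
    return v_pi(x, y_new) if x_new == x else F(0)


def K_re(s, s_new):
    """Keep y fixed and resample x from v_pi_prime(y, .)."""
    (_, y), (x_new, y_new) = s, s_new
    return v_pi_prime(y, x_new) if y_new == y else F(0)


def push(dist, K):
    """Apply a kernel to a distribution: (dist K)(s') = sum_s dist(s) K(s, s')."""
    return {s2: sum(dist[s1] * K(s1, s2) for s1 in states) for s2 in states}
```

Because the arithmetic is exact, `push(pi, K_id) == pi` and `push(pi, K_re) == pi` hold as equalities rather than approximately.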

4.3. Properties of Class Kernels

Consider a probability space $(T,\Sigma,\pi)$ and a class function $f$ to a generalized product space $(C^{f},{\mathcal{}C}^{f})=\big{(}\bigcup\limits_{i\in I}X^{f}_{i}\times Y^{f}_{i},\sigma(\bigcup\limits_{i\in I}{\mathcal{}X}^{f}_{i}\otimes{\mathcal{}Y}^{f}_{i})\big{)}$ . Below, we prove properties of a Class Kernel $K_{f}$ within this context.

Lemma 0.

For all $t\in T$ and $A\in\Sigma$ ,

[TABLE]

where $f(t)=\langle x,y\rangle\in X_{i}\times Y_{i}$ and $U\times V=f(A)\cap(X_{i}\times Y_{i})$ .

We present the proof of this lemma in Appendix A.1 (Lemma 1).

Lemma 0.

[TABLE]

We present the proof of this lemma in Appendix A.1 (Lemma 2).

Lemma 0.

$K_{f}$ is aperiodic if for at least one $x\in X_{i}$ for some $i\in I$ , $K_{i}(x):Y_{i}\times\mathcal{Y}_{i}\rightarrow[0,1]$ is aperiodic.

We present the proof of this lemma in Appendix A.1 (Lemma 3).

4.4. Connecting the probability space

We use a finite set of class functions $\mathcal{F}=\{f_{1},f_{2}\ldots f_{n}\}$ to model the inference steps of the hybrid inference metaprogram (Section 5). A critical concept here is that, together, the class functions in $\mathcal{F}$ must connect the underlying probability space: conceptually, starting at any positive probability set contained within the space, it must be possible to reach any other positive probability set by following a path of positive probability sets as defined by $\mathcal{F}$ . If $\mathcal{F}$ does not connect the space, it is possible for the inference metaprogram to become stuck within an isolated subspace, with some positive probability sets unreachable even in the limit.

Definition 0 (Connecting the space $(T,\Sigma,\pi)$ ).

Given a probability space $(T,\Sigma,\pi)$ , a finite set of class functions $\mathcal{F}=\{f_{1},f_{2}\ldots f_{n}\}$ connects the probability space $(T,\Sigma,\pi)$ , if for all sets $A\in\Sigma$ and any two functions $f,g\in\mathcal{F}$

[TABLE]

For standard two-component Gibbs sampling, there are only two class functions, specifically $id$ and $re$ :

Example 0 (Connected product space).

(Gibbs) Given a probability space $(X\times Y,\mathcal{X}\otimes\mathcal{Y},\pi)$ , the class functions $id$ and $re$ connect the space $(X\times Y,\mathcal{X}\otimes\mathcal{Y},\pi)$ if for all sets $U\times V\in\mathcal{X}\otimes\mathcal{Y}$ ,

[TABLE]

as

[TABLE]

and

[TABLE]
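The intuition behind connectedness can be checked by hand on a finite two-component Gibbs example. The sketch below is an informal finite illustration and not the formal definition above: it builds the graph of positive-probability coordinate moves for a joint distribution and computes which states are reachable from a start state. Both example distributions and all function names are assumptions made for this illustration.

```python
from itertools import product


def gibbs_support_graph(pi):
    """Edges s -> s2 that have positive probability under a coordinate update.

    Updating y with x fixed (or x with y fixed) can move to any state in the
    same row (or column) that has positive mass under pi.
    """
    states = [s for s in pi if pi[s] > 0]
    edges = {s: set() for s in states}
    for s, s2 in product(states, states):
        if s[0] == s2[0] or s[1] == s2[1]:
            edges[s].add(s2)
    return edges


def reachable(edges, start):
    """States reachable from `start` in any number of steps (graph search)."""
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for s2 in edges[s]:
            if s2 not in seen:
                seen.add(s2)
                frontier.append(s2)
    return seen


# Mass only on the diagonal: neither coordinate update can leave the start state.
disconnected = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
# Mass everywhere: every state is reachable from every other state.
connected = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

stuck = reachable(gibbs_support_graph(disconnected), (0, 0))
free = reachable(gibbs_support_graph(connected), (0, 0))
```

In the first case the sampler started at `(0, 0)` never reaches `(1, 1)` even though that state has probability one half, which is the isolated-subspace failure described in this section.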

4.5. Stochastic Alternating Class Kernels

Consider a probability space $(T,\Sigma,\pi)$ where $\Sigma$ is countably generated. We construct a new Markov Chain Transition Kernel using a finite set of class functions ${\mathcal{}F}=\{f_{1},f_{2}\ldots f_{m}\}$ and respective class Kernels $K_{f_{1}},K_{f_{2}}\ldots K_{f_{m}}$ .

Definition 0 (Stochastic Alternating Markov Chain Transition Kernel).

Given $m$ positive real numbers $p_{k}\in(0,1)$ which sum to $1$ (i.e. $\sum\limits_{k=1}^{m}p_{k}=1$ ), we define a stochastic alternating Markov Chain Transition Kernel $K:T\times\Sigma\rightarrow[0,1]$ as

$$K(t,A)=\sum\limits_{k=1}^{m}p_{k}K_{f_{k}}(t,A)$$

This transition kernel corresponds to randomly picking a class kernel $K_{f_{k}}$ with probability $p_{k}$ and using it to transition into the next Markov Chain State.

Example 0.

(Gibbs) The Markov Transition Kernel for a two-component Gibbs sampler over product probability space $(X\times Y,\mathcal{X}\otimes\mathcal{Y},\pi)$ is

[TABLE]
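On a finite state space the stochastic alternating kernel has two equivalent readings: a weighted sum of transition matrices, and a sampler that first picks a kernel with probability $p_{k}$ and then takes one step with it. The sketch below shows both under the assumption of row-stochastic matrices indexed by integer states; the matrices `K1`, `K2`, the weights, and the function names are hypothetical.

```python
import random


def mix_kernels(kernels, probs):
    """Matrix form: K[i][j] = sum_k p_k * K_k[i][j]."""
    if any(p <= 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("weights must be positive and sum to 1")
    n = len(kernels[0])
    return [
        [sum(p * Kk[i][j] for p, Kk in zip(probs, kernels)) for j in range(n)]
        for i in range(n)
    ]


def sample_step(t, kernels, probs, rng):
    """Sampler form: pick kernel k with probability p_k, then step from state t."""
    Kk = rng.choices(kernels, weights=probs)[0]
    return rng.choices(range(len(Kk)), weights=Kk[t])[0]


# Two hypothetical 2-state kernels and their mixture.
K1 = [[0.9, 0.1], [0.5, 0.5]]
K2 = [[0.2, 0.8], [0.3, 0.7]]
probs = [0.25, 0.75]
K = mix_kernels([K1, K2], probs)
```

Sampling many steps from a fixed state with `sample_step` should reproduce the corresponding row of `K` up to Monte Carlo error.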

We now consider the question of convergence of the Stochastic Alternating Transition Kernel.

Let $R_{t}^{k}=\{A\in\Sigma\ |\ \exists\,1\leq n\leq k.\ K^{n}(t,A)>0\}$ be the set of all sets in $\Sigma$ which are reachable by kernel $K$ in at most $k$ steps starting from element $t\in T$ .

Consider the limiting case $R_{t}^{\infty}$ . Let $B_{t}^{\infty}=\{t^{\prime}\in T\ |\ \forall\ A\in\Sigma.\ A\notin R_{t}^{\infty}\implies t^{\prime}\notin A\}$ , which is the set of elements $t^{\prime}\in T$ such that any set $A$ in $\Sigma$ which contains $t^{\prime}$ is reachable by kernel $K$ starting from $t$ .

Lemma 0.

For any $f\in{\mathcal{}F}$ , any element $t^{\prime}\in B_{t}^{\infty}$ such that $f(t^{\prime})=\langle x,y\rangle\in X_{i}^{f}\times Y_{i}^{f}$ , and any set $U\times V\in\sigma({\mathcal{}X}_{i}^{f}\otimes{\mathcal{}Y}_{i}^{f})$ such that $A=f^{-1}(U\times V)$ , the following condition holds:

[TABLE]

We present the proof of this lemma in Appendix A.1 (Lemma 4).

Lemma 0.

For any positive probability set $A$ and any function $f\in{\mathcal{}F}$ , if $A\subseteq f^{-1}_{x}(f_{x}(B_{t}^{\infty}))$ then $A\in R_{t}^{\infty}$ .

We present the proof of this lemma in Appendix A.1 (Lemma 5).

Lemma 0.

If ${\mathcal{}F}$ connects the space $(T,\Sigma,\pi)$ then there does not exist a positive probability set $A\in\Sigma$ , such that $A\subseteq\bigcap_{f\in{\mathcal{}F}}f_{x}^{-1}((f_{x}(B^{\infty}_{t}))^{c})$ .

We present the proof of this lemma in Appendix A.1 (Lemma 6).

Theorem 37.

If ${\mathcal{}F}$ connects the space $(T,\Sigma,\pi)$ then the Markov Transition Kernel $K$ is $\pi$ -irreducible.

We present the proof of this theorem in Appendix A.1 (Theorem 7).

Theorem 38.

$\pi$ is the stationary distribution of Markov Kernel $K$ , i.e.

[TABLE]

We present the proof of this theorem in Appendix A.1 (Theorem 8).

Theorem 39.

The Markov Transition Kernel $K$ is aperiodic if at least one of the class kernels $K_{f_{j}}$ is aperiodic.

We present the proof of this theorem in Appendix A.1 (Theorem 9).

Theorem 40.

Markov Transition Kernel $K$ converges to probability distribution $\pi$ .

Proof.

Using Theorems 15, 37, 38, and 39. ∎

5. Inference Metaprograms

We next formalize the concept of the probability of a trace, introduce inference metaprogramming, and use the results in Section 4 to prove the convergence of inference metaprograms.

5.1. Preliminaries

Here we relate the concepts in Section 4 to concepts used in probabilistic programming.

Probability of a Trace: Because two traces are equivalent if they differ (if at all) only in the choice of unique identifiers, within this section, for clarity, we drop the $id$ ’s associated with augmented expressions and stochastic choices in traces and augmented expressions wherever they are not required.

Assuming a countable set of variable names allowed in our probabilistic lambda calculus language, a countable number of expressions and a countable number of programs can be described in our probabilistic lambda calculus language.

Within our probabilistic lambda calculus language, we assume that all stochastic distributions $\mathsf{Dist}_{k}:V\times\mathcal{P}(E)\rightarrow[0,1]$ are functions from a tuple of a value and a set of lambda expressions in our language to a real number between $0$ and $1$ , such that for any $v\in V$ , $\mathsf{Dist}_{k}(v,.)$ is a probability measure over the measurable space $(E,\mathcal{P}(E))$ . We assume that for each distribution $\mathsf{Dist}_{k}$ , we are given a probability density function $\mathsf{pdf}_{\mathsf{Dist}_{k}}:V\times E\rightarrow[0,1]$ , such that

[TABLE]

Let $T_{p}=\mathsf{Traces}(p)$ be the set of valid traces of a program $p$ . $T_{p}$ contains a countable number of traces. We define a $\sigma$ -algebra $\Sigma_{p}$ over set $T_{p}$ such that, for all $t\in T_{p}$ , $\{t\}\in\Sigma_{p}$ . Given a trace $t$ , $\mathsf{pdf}\llbracket t\rrbracket$ is the unnormalized probability density of the trace $t$ (Figure 10). The normalized probability distribution $\mu_{p}:\Sigma_{p}\rightarrow[0,1]$ for a given program $p$ is defined as

[TABLE]
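For a program with finitely many traces, the normalized distribution can be computed directly. The sketch below assumes that the definition above divides the total unnormalized density of a set by the total over all of $T_{p}$, as the phrase "normalized probability distribution" suggests. The trace encoding (tuples of choices) and the density values are hypothetical.

```python
from fractions import Fraction


def normalize(pdf):
    """Map each trace t to pdf[t] divided by the sum of pdf over all traces."""
    total = sum(pdf.values())
    if total <= 0:
        raise ValueError("total unnormalized density must be positive")
    return {t: w / total for t, w in pdf.items()}


def measure(mu_p, A):
    """mu_p(A) for a finite set of traces A."""
    return sum(mu_p[t] for t in A)


# Hypothetical unnormalized densities for the four traces of a toy program.
pdf = {
    ("heads", "heads"): Fraction(1, 8),
    ("heads", "tails"): Fraction(1, 8),
    ("tails", "heads"): Fraction(1, 4),
    ("tails", "tails"): Fraction(1, 2),
}
mu_p = normalize(pdf)
```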

Reversible Subproblem Selection Strategy: Let $p$ be a probabilistic program, $t$ and $t^{\prime}$ be valid traces from program $p$ (i.e. $t,t^{\prime}\in\mathsf{Traces}(p)$ ), and $\mathsf{SS}$ be a subproblem selection strategy that returns a valid subproblem over $t$ .

Definition 0 (Reversible subproblem selection strategy).

A subproblem selection strategy $\mathsf{SS}$ is reversible if given any natural number $n$ and $n$ valid traces $t_{1},t_{2},\ldots t_{n}\in\mathsf{Traces}(p)$ ,

[TABLE]

i.e., given any $n$ traces $t_{1},t_{2}\ldots t_{n}$ , if we can transform the trace $t_{i}$ to get trace $t_{i+1}$ by only modifying parts of the trace $t_{i}$ selected by the subproblem selection strategy $\mathsf{SS}$ , then it is possible to transform the trace $t_{n}$ to get trace $t_{1}$ by modifying parts of the trace $t_{n}$ selected by the subproblem selection strategy $\mathsf{SS}$ .

In essence, reversible subproblem selection strategies always allow a subproblem based inference algorithm to reverse a countable number of changes it has made to a trace (given that the trace was not modified under a different subproblem selection strategy).

We use the shorthand $\mathsf{SS}\vdash t\equiv t^{\prime}$ to denote $\mathsf{SS}(t)\vdash t\equiv t^{\prime}$ .

Theorem 2.

A reversible subproblem selection strategy $\mathsf{SS}$ divides the trace space of program $p$ into equivalence classes.

We present the proof of this theorem in Appendix A.4 (Theorem 20).

A reversible subproblem selection strategy $\mathsf{SS}$ divides the trace space $T_{p}$ into a countable number of equivalence classes where each equivalence class contains traces which can be modified into any other trace in that class under the subproblem selection strategy. A trace from one equivalence class cannot be modified by any subproblem based inference algorithm to a trace from a different equivalence class under the given subproblem selection strategy.

Given a reversible subproblem selection strategy $\mathsf{SS}$ , let ${\mathcal{}C}_{SS}=\{c_{1},\ldots c_{n},\ldots\}$ be the countable set of equivalence classes created by $\mathsf{SS}$ over the trace space $T_{p}$ and $\{T_{c_{1}},\ldots T_{c_{n}},\ldots\}$ be the equivalence partitions created by $\mathsf{SS}$ over $T_{p}$ . Note that for all $c\in{\mathcal{}C}_{SS}$ and $t,t^{\prime}\in T_{c}$ ,

[TABLE]

and for all $c_{i},c_{j}\in{\mathcal{}C}_{SS},t\in T_{c_{i}}$ and $t^{\prime}\in T_{c_{j}}$ , where $c_{i}\neq c_{j}$

[TABLE]

In practice, subproblems are often specified by associating labels with stochastic choices, then specifying the labels whose stochastic choices should be included in the subproblem (Mansinghka et al., 2018). A standard strategy is to have a fixed set of labels, with the labels partitioning the choices into classes. Any strategy that always specifies the subproblem via a fixed subset of labels is reversible. Because of this property, all of the subproblem selection strategies presented in (Mansinghka et al., 2018) are reversible.
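A fixed-label selection strategy makes the equivalence classes easy to enumerate in a toy setting. The sketch below assumes traces are flat dictionaries from labels to values with a fixed set of labels (no structural changes), which is far simpler than the traces of the paper's language. Two traces fall in the same class exactly when they agree on every choice outside the selected labels. The names `class_key` and `partition` are illustrative.

```python
from itertools import product


def class_key(trace, selected):
    """Canonical key of the equivalence class: the choices outside the subproblem."""
    return tuple(sorted((k, v) for k, v in trace.items() if k not in selected))


def partition(traces, selected):
    """Group traces that can be turned into one another by changing selected labels only."""
    classes = {}
    for t in traces:
        classes.setdefault(class_key(t, selected), []).append(t)
    return classes


# All assignments of three binary choices; the subproblem always selects b and c.
traces = [{"a": a, "b": b, "c": c} for a, b, c in product([0, 1], repeat=3)]
classes = partition(traces, selected={"b", "c"})
```

Here the eight traces split into two classes of four, one per value of the unselected choice `a`.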

Any subproblem selection strategy that always selects a fixed set of variables is also reversible. This property ensures that the subproblem selection strategy in Block Gibbs sampling, for example, is reversible. Hence if two traces differ only in the choice of their id’s, all reversible subproblem selection strategies will assign them to the same equivalence class.

Class functions given a subproblem selection strategy: Consider a reversible subproblem selection strategy $\mathsf{SS}$ which creates equivalence classes $\mathcal{C}_{SS}=\{c_{1},\ldots c_{n}\ldots\}$ and equivalence partitions $\{T_{c_{1}},\ldots T_{c_{n}},\ldots\}$ . We create a generalized product space and class functions using the given subproblem selection strategy. Consider the countable set of disjoint measurable spaces $\big\{(C_{1},\mathcal{C}_{1}),\ldots(C_{n},\mathcal{C}_{n}),\ldots\big\}$ where $C_{k}=\{c_{k}\}$ and $\mathcal{C}_{k}=\{\emptyset,C_{k}\}$ . Also consider disjoint measurable spaces $\big\{(T_{c_{1}},\Sigma_{c_{1}}),\ldots(T_{c_{n}},\Sigma_{c_{n}}),\ldots\big\}$ , where $\Sigma_{c_{k}}=\{A\cap T_{c_{k}}|A\in\Sigma_{p}\}$ . We then construct the generalized product space

[TABLE]

Given a probability measure $\pi$ on $(C,\mathcal{C})$ , we compute the conditional distribution $\pi_{i}$ on $(C_{i}\times T_{c_{i}},\mathcal{C}_{i}\otimes\Sigma_{c_{i}})$ , when $\pi(C_{i}\times T_{c_{i}})>0$ , where

[TABLE]

We then define the regular conditional probability measure $v_{i}:C_{i}\times\Sigma_{c_{i}}\rightarrow[0,1]$ , where $v_{i}(c_{i},A)=\pi_{i}(C_{i}\times A)$ . We create the class function $f_{SS}:T_{p}\rightarrow C$ , where $f_{SS}(t)=\langle c,t\rangle$ where $c$ is the equivalence class of trace $t$ . Since $f_{SS}$ is a one-to-one function, it is straightforward to prove that $f_{SS}$ is a two-way measurable function.

Probability of the subtraces: For all traces $t,t^{\prime}\in T_{p}$ such that $\mathsf{SS}\vdash t\equiv t^{\prime}$ , the subtraces $t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))$ and $t^{\prime}_{s}=\mathsf{ExtractTrace}(t^{\prime},\mathsf{SS}(t^{\prime}))$ are traces from the same program, i.e., $t_{s},t^{\prime}_{s}\in\mathsf{Traces}(p_{s})$ , where $p_{s}$ is the subprogram (Soundness). Similarly, for all traces $t\in T_{p}$ with subtrace $t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))$ and all subtraces $t^{\prime}_{s}\in\mathsf{Traces}(p_{s})$ (where $p_{s}=\mathsf{Program}(t_{s})$ ), if $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},\mathsf{SS}(t))$ then $\mathsf{SS}\vdash t\equiv t^{\prime}$ (Completeness).

Hence given an equivalence class $c_{i}$ and partitioned trace space $T_{c_{i}}$ created by subproblem selection strategy $\mathsf{SS}$ , there exists a subprogram $p_{s}$ such that for all traces $t\in T_{c_{i}}$ , subtraces $t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))$ are valid traces of $p_{s}$ , i.e., $t_{s}\in T_{p_{s}}$ . We can therefore associate traces from an equivalence class to valid subtraces of a subprogram.
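The interplay between extraction and stitching can be mimicked on a deliberately simplified trace representation. The sketch below assumes traces are flat dictionaries and that a subproblem is a set of labels; the functions `extract_trace` and `stitch_trace` are hypothetical stand-ins and do not model the dependency tracking that $\mathsf{ExtractTrace}$ and $\mathsf{StitchTrace}$ perform on traces of the paper's lambda calculus.

```python
def extract_trace(trace, selected):
    """Toy stand-in for ExtractTrace: keep only the selected choices."""
    return {k: v for k, v in trace.items() if k in selected}


def stitch_trace(trace, subtrace, selected):
    """Toy stand-in for StitchTrace: write a subtrace back into the full trace."""
    expected = {k for k in trace if k in selected}
    if set(subtrace) != expected:
        raise ValueError("subtrace must assign exactly the selected choices")
    stitched = dict(trace)
    stitched.update(subtrace)
    return stitched


t = {"a": 1, "b": 0, "c": 0}
S = {"b", "c"}
t_s = extract_trace(t, S)
t_new = stitch_trace(t, {"b": 1, "c": 1}, S)
```

In this toy model, stitching any valid subtrace leaves the choices outside the subproblem untouched, so the result stays in the same equivalence class as `t`, mirroring the completeness property stated above.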

Theorem 3.

Given a trace $t$ and a valid subproblem ${\mathcal{}S}$ on $t$ , then for subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ ,

[TABLE]

i.e., the unnormalized densities of $t$ and $t_{s}$ are equal.

We present the proof of this theorem in Appendix A.4 (Theorem 22).

Consider an equivalence class $c_{i}$ and partitioned trace space $T_{c_{i}}$ . Let $p_{s}$ be the subprogram such that all subtraces of traces in $T_{c_{i}}$ are valid traces of $p_{s}$ . $(T_{p_{s}},\Sigma_{p_{s}})$ is the measurable space over traces of subprogram $p_{s}$ . The normalized probability distribution $\mu_{p_{s}}:\Sigma_{p_{s}}\rightarrow[0,1]$ for the subprogram $p_{s}$ is

[TABLE]

where $A^{\prime}=\big{\{}t^{\prime}\big{|}t_{s}\in A,t^{\prime}=\mathsf{StitchTrace}(t,t_{s},\mathsf{SS}(t))\big{\}}$ for any trace $t\in T_{c_{i}}$ .

Hence, sampling/inference over subprogram $p_{s}$ is equivalent to sampling/inference over the original program $p$ with the constraint that all traces belong to the equivalence class $c_{i}$ .

Theorem 4.

Consider equivalence class $c_{i}$ and partitioned trace space $T_{c_{i}}$ . Let $p_{s}$ be the subprogram such that all subtraces of traces in $T_{c_{i}}$ are valid traces of $p_{s}$ . Then given a Markov Kernel $K:T_{p_{s}}\times\Sigma_{p_{s}}\rightarrow[0,1]$ which is $\mu_{p_{s}}$ -irreducible, aperiodic, and with stationary distribution $\mu_{p_{s}}$ , the Markov kernel $K(c_{i}):T_{c_{i}}\times\Sigma_{c_{i}}\rightarrow[0,1]$ defined as

[TABLE]

where $t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))$ and $A^{\prime}=\{t_{s}|t\in A,t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))\}$ , is $v_{i}(c_{i},.)$ -irreducible, aperiodic, and with stationary distribution $v_{i}(c_{i},.)$ .

Proof.

$\mathsf{ExtractTrace}$ is a one-to-one function from $T_{c_{i}}$ to $T_{p_{s}}$ and the pushforward measure of $v_{i}(c_{i},.)$ under it is $\mu_{p_{s}}$ . ∎

Definition 0 (Generalized Markov Kernel).

A Generalized Markov Kernel $K$ is a parameterized Markov kernel which, when parameterized with a probabilistic program $p$ , defines the probability space $(T_{p},\Sigma_{p},\mu_{p})$ (as defined above), and returns a Markov Kernel $K(p):T_{p}\times\Sigma_{p}\rightarrow[0,1]$ , which is $\mu_{p}$ -irreducible, aperiodic, and has the stationary distribution $\mu_{p}$ .

A generalized Markov Kernel formalizes the concept of Markov chain inference algorithms. The inference algorithms used within the probabilistic programming framework are generally coded to work with any input probabilistic program $p$ and still provide convergence guarantees. For example, Venture (Mansinghka et al., 2018) allows the programmer to use a variety of inference algorithms which in general work on all probabilistic programs which can be written in that language.

Definition 0 (Generalized Class Kernels).

Given a generalized Markov Kernel $K$ and a subproblem selection strategy $\mathsf{SS}$ , a generalized class Kernel $K_{f_{\mathsf{SS}}}$ is parameterized with a probabilistic program $p$ (which defines space $(T_{p},\Sigma_{p},\mu_{p})$ ), where

[TABLE]

and $t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))$ , $p_{s}=\mathsf{Program}(t_{s})$ , $f_{\mathsf{SS}}(t)=\langle c_{i},t\rangle$ and $A^{\prime}=\{t_{s}|t\in A,t_{s}=\mathsf{ExtractTrace}(t,\mathsf{SS}(t))\}$ .

5.2. Inference Metaprogramming

Using the concept of independent subproblem inference (Section 3) and generalized Markov Kernels, we define an Inference Metaprogramming Language (Figure 11). An inference metaprogram is either one of the black box Generalized Markov Kernel inference algorithms $b_{i}:T_{p}\rightarrow T_{p}$ in our framework, which takes a trace from an arbitrary program $p$ as an input and returns another trace from the same program, or a finite set $S=\{p_{1}\ ic_{1},p_{2}\ ic_{2},\ldots,p_{k}\ ic_{k}\}$ of inference statements, each with an attached probability value $p_{i}\in(0,1)$ , such that $\sum\limits_{i=1}^{k}p_{i}=1$ . These probability values are used to randomly select a subproblem inference statement to execute. Each $\mathsf{infer}$ statement is parameterized with a subproblem selection strategy $\mathsf{SS}$ , which returns a valid subproblem over input trace $t$ , and an inference metaprogram that is executed over the subtrace. Figure 12 presents the execution semantics of our inference metaprogramming language. In comparison with entangled subproblem inference, one benefit of the approach is that it is straightforward to apply independent subproblem inference recursively.
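The recursive structure of inference metaprograms can be sketched as a small interpreter. This is a toy model under strong assumptions and not the semantics of Figure 12: traces are flat dictionaries, a subproblem selection strategy is a fixed set of labels, extraction and stitching are dictionary restriction and update, and the black box kernel `resample_bits` is hypothetical. A metaprogram is either a callable kernel or a list of `(p_i, labels_i, ip_i)` triples standing in for $p_{i}\ \mathsf{infer}(\mathsf{SS}_{i},ip_{i})$.

```python
import random


def run(ip, trace, rng):
    """Execute one step of a toy inference metaprogram on `trace`."""
    if callable(ip):
        # Base case: a black box kernel applied to the (sub)trace.
        return ip(trace, rng)
    weights = [p for p, _, _ in ip]
    # Randomly select one inference statement according to its probability.
    _, selected, sub_ip = rng.choices(ip, weights=weights)[0]
    subtrace = {k: v for k, v in trace.items() if k in selected}  # extract
    new_subtrace = run(sub_ip, subtrace, rng)                     # recurse
    stitched = dict(trace)
    stitched.update(new_subtrace)                                 # stitch
    return stitched


def resample_bits(trace, rng):
    """Hypothetical black box kernel: redraw every choice as a fair bit."""
    return {k: rng.randint(0, 1) for k in trace}


# Either update `a`, or descend into the subproblem {b, c} and update one of them.
metaprogram = [
    (0.5, {"a"}, resample_bits),
    (0.5, {"b", "c"}, [(0.5, {"b"}, resample_bits), (0.5, {"c"}, resample_bits)]),
]
```

Because each level only sees the extracted subtrace, the nested statement is itself an ordinary metaprogram over a smaller trace, which is what makes the recursion straightforward in this sketch.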

Theorem 7.

If all the subproblem selection strategies used in our inference metaprograms are reversible and connect the space of their respective input probabilistic programs, then all inference metaprograms in our inference metaprogramming language implement a generalized Markov kernel.

Proof.

Proof by induction over structure of inference metaprograms.

Base Case: All black box inference algorithms in our inference metaprogramming language are generalized Markov kernels. Hence given traces of program $p$ (which define the probability space $(T_{p},\Sigma_{p},\mu_{p})$ ) the black box inference algorithm is $\mu_{p}$ -irreducible, aperiodic, and has the stationary distribution $\mu_{p}$ .

Induction Case: Consider the inference metaprogram $ip=\{p_{1}\ ic_{1},p_{2}\ ic_{2},\ldots,p_{k}\ ic_{k}\}$ , where $ic_{i}=\mathsf{infer}(\mathsf{SS}_{i},ip_{i})$ and $\sum\limits_{i=1}^{k}p_{i}=1$ .

Using the induction hypothesis, we assume that, for all $i\in\{1,2,\ldots k\}$ , $ip_{i}$ implements a generalized Markov kernel $K^{ip_{i}}$ . Since each subproblem selection strategy $\mathsf{SS}_{i}$ is reversible, we lift the generalized Markov kernel to a generalized class kernel (Definition 6) $K_{f_{SS_{i}}}$ , which for any program $p$ (which defines space $(T_{p},\Sigma_{p},\mu_{p})$ ), is a class kernel.

Given a probabilistic program $p$ , the inference metaprogram $ip$ implements the Generalized Markov Kernel $K$ , where

[TABLE]

Using Theorems 37, 38, and 39, if the class functions $f_{SS_{1}},f_{SS_{2}},\ldots,f_{SS_{k}}$ connect the space $(T_{p},\Sigma_{p},\mu_{p})$ , $K(p)$ is $\mu_{p}$ -irreducible, aperiodic, and has $\mu_{p}$ as its stationary distribution. ∎

Corollary 0.

Given a probabilistic program $p$ (defining trace space $(T_{p},\Sigma_{p},\mu_{p})$ ), inference metaprograms which use reversible subproblem selection strategies which connect the space of their respective probabilistic programs asymptotically converge to $\mu_{p}$ .

6. Related Work

Probabilistic Programming Languages: Over the last several decades researchers have developed a range of probabilistic programming languages. With current practice each language typically comes paired with one or a few black box inference strategies. Example language/inference pairs include Stan (Carpenter et al., 2016) with Hamiltonian Monte Carlo inference (Andrieu et al., 2003) and Anglican (Tolpin et al., 2015) with particle Gibbs. Languages like LibBi (Murray, 2013), Edward (Tran et al., 2017) and Pyro (Labs, 2017) provide inference customization mechanisms, but without subproblems or asymptotic convergence guarantees.

Compilation Strategies for Probabilistic Programs: Techniques for efficiently executing probabilistic programs are a prerequisite for their widespread adoption. The Swift (Wu et al., 2016) and Augur (Huang et al., 2017; Tristan et al., 2014) compilers generate efficient compiled implementations of inference algorithms that operate over probabilistic programs. We anticipate that applying these compilation techniques to subproblems can deliver significant performance improvements for the hybrid inference algorithms we study in this paper.

Subproblem Inference: Both Turing (Ge et al., 2018) and Venture (Mansinghka et al., 2018) provide inference metaprogramming constructs with subproblems and different inference algorithms that operate on these subproblems. In Venture subproblem inference is performed over full program traces, with subproblems entangled with the full trace. The inference algorithms in Venture must therefore operate over the entire trace while ensuring that the inference effects do not escape the specified subproblem. Our extraction and stitching technique eliminates this entanglement and enables the use of standard inference algorithms that operate over complete traces while still supporting subproblem identification and inference. Turing only provides mechanisms that target specific stochastic choices in the context of the complete probabilistic computation.

There is work on extending Gibbs-like algorithms to work on Open Universe Probabilistic Models (Arora et al., 2012; Milch and Russell, 2010). This research studies algorithms that apply one strategy to choose a single variable at each iteration and rely on empirical evidence of convergence. Our research, in contrast, supports a wide range of subproblem selection strategies and provides formal proofs of asymptotic convergence properties for MCMC algorithms applied to these subproblems.

Asymptotic Convergence: There is a vast literature on asymptotic convergence of Markov chain algorithms in various statistics and probability settings (Meyn and Tweedie, 2012; Tierney, 1994). Our work is unique in that it provides the first characterization of asymptotic convergence for subproblem inference in probabilistic programs. Complications that occur in this setting include mixtures of discrete and continuous variables, stochastic choices with cascading effects that may change the number of stochastic choices in the computation, and resulting sample spaces with unbounded numbers of random variables. Standard results from computational statistics, computational physics, and Monte-Carlo methods focus on finite-dimensional discrete state spaces, a context in which linear algebra (i.e., spectral analysis of the transition matrix of the underlying Markov chain (Diaconis and Stroock, 1991) or coupling arguments (Levin and Peres, 2017)) is sufficient to prove convergence. State spaces with continuous random variables are outside the scope of these formal analyses. Measure-theoretic treatments are more general (Roberts et al., 2004). Our results show how to apply the concepts in these treatments to prove asymptotic convergence results for probabilistic programs with inference metaprogramming.

Proving Properties of Probabilistic Inference: Researchers have recently developed techniques for proving a variety of properties of different inference algorithms for probabilistic programs (Scibior et al., 2018; Ścibior et al., 2017; Atkinson and Carbin, 2017; Anonymous, 2020). Our unique contribution relates to the treatment of convergence in the context of subproblems, specifically 1) the identification of subproblem extraction and stitching to obtain independent subproblems, 2) support for a general class of (potentially state-dependent) subproblems, including programmable subproblem selection strategies that may depend on the values of stochastic choices from the current execution trace, and 3) the mathematical formulation that enables us to state and prove asymptotic convergence results for hybrid inference metaprograms that apply (a general class of potentially very different) MCMC algorithms to different parts of the inference problem. We see our research and the research cited above as synergistic — one potential synergy is that the research cited above can prove properties of the black-box MCMC algorithms that our inference metaprograms deploy.

7. Conclusion

Inference metaprogramming, subproblem inference, and asymptotic convergence are key issues in probabilistic programming. Disentangling the subproblem from the surrounding program trace allows us to cleanly analyze subproblem-based inference. Our mathematical framework introduces new concepts that enable us to model subproblem-based inference and prove asymptotic convergence properties of the resulting hybrid probabilistic inference algorithms.

Appendix A

A.1. Convergence

Lemma 0.

For all $t\in T$ and $A\in\Sigma$ ,

[TABLE]

where $f(t)=\langle x,y\rangle\in X_{i}\times Y_{i}$ and $U\times V=f(A)\cap X_{i}\times Y_{i}$ .

Proof.

Proof by induction.

Base case:

[TABLE]

using definition of $K_{f}$ .

Induction Hypothesis:

For all $1\leq n\leq m$ , the following statement is true

[TABLE]

Induction Case:

[TABLE]

Note that $f(x^{\prime},y^{\prime})\in X_{i}\times Y_{i}$

[TABLE]

Hence proved. ∎

Lemma 0.

[TABLE]

Proof.

Every set $A\in\Sigma$ can be written as $f^{-1}(U\times V)$ for some $U\times V\in{\mathcal{C}}^{f}$, as $f$ is a two-way measurable function.

[TABLE]

We can split the integral into sum over the constituent product spaces $X_{i}\times Y_{i}$ ,

[TABLE]

We can rewrite $K_{f}(f^{-1}(x,y),f^{-1}(U\times V))$ as $K_{i}(x)(y,V\cap Y_{i})I(x,U\cap X_{i})$.

[TABLE]

$\int_{x\in X_{i}}f(x)I(x,U\cap X_{i})m(dx)=\int_{x\in U\cap X_{i}}f(x)m(dx)$ as $I(x,U\cap X_{i})$ is zero for all $x\notin U\cap X_{i}$ .

[TABLE]

We can rewrite $f_{*}(\pi)_{i}(dx^{\prime}\times dy)$ as $v_{f_{*}(\pi)_{i}}(x,dy)f_{*}(\pi)_{i}(dx^{\prime}\times Y_{i})$ using the definition of regular conditional probability distribution.

[TABLE]

$\int_{y\in Y_{i}}K_{i}(x)(y,V\cap Y_{i})v_{f_{*}(\pi)_{i}}(x,dy)=v_{f_{*}(\pi)_{i}}(x,V\cap Y_{i})$ as $v_{f_{*}(\pi)_{i}}(x,.)$ is the stationary distribution for kernel $K_{i}(x)$ .

[TABLE]

∎
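As a finite-state illustration of the lemma just proved, the following sketch builds a class kernel that holds $x$ fixed and moves only $y$ with a kernel $K(x)$ whose stationary distribution is the conditional of $\pi$ given $x$, then checks that $\pi$ is preserved. The joint table `pi`, the sets `X` and `Y`, and the choice of a Metropolis kernel with a uniform proposal are assumptions made for this example; the paper's setting is measure-theoretic and considerably more general.

```python
from fractions import Fraction as F

# Hypothetical joint target pi(x, y) on X = {0, 1}, Y = {0, 1, 2}.
X, Y = [0, 1], [0, 1, 2]
pi = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}

def conditional(x):
    """Conditional distribution of y given x under pi."""
    marginal = sum(pi[(x, y)] for y in Y)
    return {y: pi[(x, y)] / marginal for y in Y}

def metropolis_kernel(x):
    """Metropolis kernel on Y (uniform proposal) targeting conditional(x)."""
    target = conditional(x)
    K = {y: {y2: F(0) for y2 in Y} for y in Y}
    for y in Y:
        for y2 in Y:
            if y2 != y:
                accept = min(F(1), target[y2] / target[y])
                K[y][y2] = F(1, len(Y)) * accept
        # Rejected proposals (and the self-proposal) stay at y.
        K[y][y] = 1 - sum(K[y][y2] for y2 in Y if y2 != y)
    return K

def class_kernel(t, t2):
    """Finite analogue of K_f: x is held fixed, y moves according to K(x)."""
    (x, y), (x2, y2) = t, t2
    return metropolis_kernel(x)[y][y2] if x == x2 else F(0)

def push_forward(dist):
    """One step of the chain applied to a distribution over (x, y)."""
    return {t2: sum(dist[t] * class_kernel(t, t2) for t in dist) for t2 in dist}

# pi is left invariant by the class kernel.
assert push_forward(pi) == pi
```

Exact rational arithmetic is used so that the invariance check is an equality rather than a floating-point approximation.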

Lemma 0.

$K_{f}$ is aperiodic if for at least one $x\in X_{i}$ for some $i\in I$, $K_{i}(x):Y_{i}\times{\mathcal{Y}}_{i}\rightarrow[0,1]$ is aperiodic.

Proof.

Proof by contradiction. Let us assume $K_{f}$ is periodic, i.e., there exist an integer $d\geq 2$, a sequence $\{E_{0},E_{1},\ldots,E_{d-1}\}$ of $d$ non-empty disjoint sets in $\tau$, and a set $N$ such that, for all $i=0,1,\ldots,d-1$ and for all $t\in E_{i}$,

(1) $(\cup_{i=0}^{d-1}E_{i})\cup N=T$

(2) $K_{f}(t,E_{j})=1\text{ for }j=i+1(\mathsf{mod}\leavevmode\nobreak\ d)$

(3) $\pi(N)=0$

In particular, for all $t\in E_{i}$, $K_{f}(t,E_{j})=1\text{ for }j=i+1(\mathsf{mod}\leavevmode\nobreak\ d)$. Consider any $k\in I$, any $x\in X_{k}$ and $U_{i}\times V_{i}=f(E_{i})\cap X_{k}\times Y_{k}$.

Consider a trace $t\in E_{i}$ , such that $f_{x}(t)=x$ . Since $K_{f}(t,E_{i+1})=1\leq I(x,U_{i+1})$ , for all $i=0,1\ldots d-1$ , there exists a trace $t^{\prime}\in E_{i}$ , such that $f(t^{\prime})=\langle x,y\rangle$ .

$K_{f}(t,E_{i+1})=1\leq K(x)(y,V_{i+1})$. Hence if $K_{f}$ is periodic, then for all $x$, $K(x)$ is periodic, which contradicts the assumption that at least one $K_{i}(x)$ is aperiodic.

∎

Lemma 0.

For any $f\in{\mathcal{}F}$ , any element $t^{\prime}\in B_{t}^{\infty}$ such that $f(t^{\prime})=\langle x,y\rangle\in X_{i}^{f}\times Y_{i}^{f}$ , and any set $U\times V\in\sigma({\mathcal{}X}_{i}^{f}\otimes{\mathcal{}Y}_{i}^{f})$ such that $A=f^{-1}(U\times V)$ , the following condition holds true

[TABLE]

Proof.

Consider the class kernel $K_{f}(t^{\prime},A)=K_{i}(x)(y,V)I(x,U)$. Since $v_{f_{*}(\pi)_{i}}(x,V)>0$ and $K_{i}$ is $v_{f_{*}(\pi)_{i}}$-irreducible, there exists an $n$ such that

[TABLE]

Since $x\in U$ , using Lemma 27

[TABLE]

Since $t^{\prime}\in B_{t}^{\infty}$ , there exists an $n^{\prime}$ such that, for all sets $B\in\Sigma$ with $t^{\prime}\in B$ , $K^{n^{\prime}}(t,B)>0$ . Hence

[TABLE]

∎

Lemma 0.

For any positive probability set $A$ and any function $f\in{\mathcal{}F}$ , if $A\subseteq f^{-1}_{x}(f_{x}(B_{t}^{\infty}))$ then $A\in R_{t}^{\infty}$ .

Proof.

Given a set $A$, we can treat $f(A)$ as a union of sets $\{U_{i}\times V_{i}|i\in I^{f}\}$, where $U_{i}\times V_{i}$ consists of the elements of $f(A)$ that lie in $X_{i}^{f}\times Y_{i}^{f}$ (i.e. $U_{i}\times V_{i}=f(A)\cap X_{i}^{f}\times Y_{i}^{f}$). Since $\pi(A)>0$, $f_{*}(\pi)(f(A))>0$ and for at least one $i\in I^{f}$, $f_{*}(\pi)_{i}(U_{i}\times V_{i})>0$.

Since $A\subseteq f^{-1}_{x}(f_{x}(B_{t}^{\infty}))$ , for each $x\in U_{i}$ there exists at least one element $t^{\prime}\in B_{t}^{\infty}$ such that $f_{x}(t^{\prime})=x$ .

If $f_{*}(\pi)_{i}(U_{i}\times V_{i})>0$ , there exists at least one $t^{\prime}\in B_{t}^{\infty}$ such that $f(t^{\prime})=\langle x,y\rangle\in U_{i}\times Y_{i}^{f}$ and $v_{f_{*}(\pi)_{i}}(x,V_{i})>0$ . Hence $f^{-1}(U_{i}\times V_{i})\in R_{t}^{\infty}$ . ∎

Lemma 0.

If ${\mathcal{}F}$ connects the space $(T,\Sigma,\pi)$ then there does not exist a positive probability set $A\in\Sigma$ , such that $A\subseteq\bigcap_{f\in{\mathcal{}F}}f_{x}^{-1}((f_{x}(B^{\infty}_{t}))^{c})$ .

Proof.

Proof by Contradiction.

Let us assume such a set $A$ exists. If $A$ is a positive probability set, then $\pi(\bigcap_{f\in{\mathcal{}F}}f_{x}^{-1}((f_{x}(B^{\infty}_{t}))^{c}))>0$ .

For any two functions $f,g\in{\mathcal{}F}$ ,

[TABLE]

The set $f_{x}^{-1}((f_{x}(B^{\infty}_{t}))^{c})$ only contains elements which are not in $B^{\infty}_{t}$ and $g_{x}^{-1}(g_{x}(B^{\infty}_{t}))$ contains elements $t^{\prime}$ such that there exists at least one element $t^{\prime\prime}\in B^{\infty}_{t}$ with $g_{x}(t^{\prime})=g_{x}(t^{\prime\prime})$ .

Any positive probability set $B\subseteq g_{x}^{-1}(g_{x}(B^{\infty}_{t}))$ is also a subset of $B^{\infty}_{t}$ (using Lemma 35). Hence

[TABLE]

Similarly

[TABLE]

Since $\pi(\bigcap_{f\in{\mathcal{}F}}f_{x}^{-1}((f_{x}(B^{\infty}_{t}))^{c}))>0$ ,

[TABLE]

But this contradicts the fact that ${\mathcal{F}}$ connects the space $(T,\Sigma,\pi)$. Hence no such set $A$ exists. ∎

Theorem 7.

If ${\mathcal{}F}$ connects the space $(T,\Sigma,\pi)$ then the Markov Transition Kernel $K$ is $\pi$ -irreducible.

Proof.

Proof by contradiction. Let us assume $K$ is not $\pi$-irreducible. Then there exists a positive probability set $A\in\Sigma$ such that $A\notin R_{t}^{\infty}$.

If $\pi(A\cap B_{t}^{\infty})>0$ , there exists a set $B\subseteq B_{t}^{\infty}$ and $B\subseteq A$ which implies $A\in R_{t}^{\infty}$ . Hence $\pi(A\cap B_{t}^{\infty})=0$ .

For any $f\in{\mathcal{F}}$, if $\pi(A\cap f_{x}^{-1}(f_{x}(B^{\infty}_{t})))>0$, there exists a set $B\in R_{t}^{\infty}$ with $B\subseteq A$, which implies $A\in R_{t}^{\infty}$. Hence $\pi(A\cap f_{x}^{-1}(f_{x}(B^{\infty}_{t})))=0$.

Since $f_{x}$ is a two-way measurable function (and a one-one function from sets to sets), for any set $B$, $f_{x}^{-1}(f_{x}(B))^{c}=f_{x}^{-1}(f_{x}(B)^{c})$.

Since $\pi(A)>0$ and, for any $f\in{\mathcal{F}}$, $\pi(A\cap f_{x}^{-1}(f_{x}(B^{\infty}_{t})))=0$, this means

[TABLE]

which means there exists a positive probability set $B\in\Sigma$ with $B\subseteq f_{x}^{-1}((f_{x}(B^{\infty}_{t}))^{c})$, which is impossible.

Hence no such set $A$ exists. $K$ is $\pi$ -irreducible. ∎
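A finite-state analogue may help build intuition for Theorem 7. In the sketch below, two hypothetical class kernels each resample one coordinate of a state in $\{0,1\}^{2}$ while holding the other fixed. Neither kernel alone can reach every state, but an equal-weight mixture of the two can. The state space, the kernels `k_f` and `k_g`, and the graph-reachability test are all assumptions of this example and only loosely mirror the measure-theoretic notions of "connects" and $\pi$-irreducibility used in the proof.

```python
from itertools import product

STATES = list(product([0, 1], repeat=2))

def k_f(t, t2):
    """Holds the first coordinate fixed and resamples the second uniformly."""
    return 0.5 if t[0] == t2[0] else 0.0

def k_g(t, t2):
    """Holds the second coordinate fixed and resamples the first uniformly."""
    return 0.5 if t[1] == t2[1] else 0.0

def mixture(t, t2):
    """Equal-weight mixture of the two class kernels."""
    return 0.5 * k_f(t, t2) + 0.5 * k_g(t, t2)

def reachable(kernel, start):
    """States reachable from `start` with positive probability in finitely many steps."""
    seen, frontier = {start}, [start]
    while frontier:
        t = frontier.pop()
        for t2 in STATES:
            if kernel(t, t2) > 0 and t2 not in seen:
                seen.add(t2)
                frontier.append(t2)
    return seen

def irreducible(kernel):
    """True if every state can reach every other state under `kernel`."""
    return all(reachable(kernel, t) == set(STATES) for t in STATES)

assert not irreducible(k_f)
assert not irreducible(k_g)
assert irreducible(mixture)
```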

Theorem 8.

$\pi$ is the stationary distribution of Markov Kernel $K$, i.e.

[TABLE]

Proof.

[TABLE]

∎
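The statement of Theorem 8 can be checked numerically on a small example. The sketch below assumes that $K$ is a convex combination of class kernels with positive weights, which is how the proof of Theorem 9 appears to use it; the target `pi`, the two Gibbs-style kernels, and the weights are hypothetical choices for illustration only.

```python
from fractions import Fraction as F
from itertools import product

STATES = list(product([0, 1], repeat=2))
# Hypothetical target distribution over pairs of bits.
pi = {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(2, 8), (1, 1): F(2, 8)}

def gibbs(coord):
    """Kernel that resamples coordinate `coord` from its conditional under pi."""
    other = 1 - coord
    def kernel(t, t2):
        if t[other] != t2[other]:
            return F(0)  # the other coordinate is held fixed
        norm = sum(pi[s] for s in STATES if s[other] == t[other])
        return pi[t2] / norm
    return kernel

def mix(kernels, weights):
    """Convex combination of kernels (weights are assumed to sum to one)."""
    return lambda t, t2: sum(w * k(t, t2) for k, w in zip(kernels, weights))

def step(dist, kernel):
    """Apply one transition of `kernel` to the distribution `dist`."""
    return {t2: sum(dist[t] * kernel(t, t2) for t in STATES) for t2 in STATES}

K = mix([gibbs(0), gibbs(1)], [F(1, 3), F(2, 3)])

# Each component preserves pi, and therefore so does the mixture.
assert step(pi, gibbs(0)) == pi
assert step(pi, gibbs(1)) == pi
assert step(pi, K) == pi
```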

Theorem 9.

The Markov Transition Kernel $K$ is aperiodic if at least one of the class kernels $K_{f_{j}}$ is aperiodic.

Proof.

Proof by Contradiction.

Let us assume $K$ is periodic, i.e., there exist an integer $d\geq 2$, a sequence $\{E_{0},E_{1},\ldots,E_{d-1}\}$ of $d$ non-empty disjoint sets in $\tau$, and a set $N$ such that, for all $i=0,1,\ldots,d-1$ and for all $t\in E_{i}$,

(1) $(\cup_{i=0}^{d-1}E_{i})\cup N=T$

(2) $K(t,E_{j})=1\text{ for }j=i+1(\mathsf{mod}\leavevmode\nobreak\ d)$

(3) $\pi(N)=0$

If $K(t,E_{j})=1$ then for all $f\in{\mathcal{}F}$ , $K_{f}(t,E_{j})=1$ . Therefore, for all $i=0,1,\ldots d-1$ and for all $t\in E_{i}$ , $K_{f}(t,E_{j})=1\text{ for }j=i+1(\mathsf{mod}\leavevmode\nobreak\ d)$ .

Hence if $K$ is periodic, then for all $f\in{\mathcal{}F}$ , $K_{f}$ is periodic.

Hence by contradiction, $K$ is aperiodic. ∎
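The argument above can be illustrated on a two-state chain. The sketch below computes the period of a state as the gcd of the step counts at which a return has positive probability, up to a fixed horizon. The kernels `flip` (periodic) and `lazy` (aperiodic), the equal mixture weights, and the finite `horizon` are assumptions of this example, not constructions from the paper.

```python
from math import gcd

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def period(kernel, state=0, horizon=12):
    """gcd of all n <= horizon such that kernel^n(state, state) > 0."""
    power, g = kernel, 0
    for n in range(1, horizon + 1):
        if power[state][state] > 0:
            g = gcd(g, n)
        power = matmul(power, kernel)
    return g

flip = [[0.0, 1.0], [1.0, 0.0]]   # deterministic swap: returns only at even steps
lazy = [[0.5, 0.5], [0.5, 0.5]]   # can stay in place: aperiodic
mixture = [[0.5 * flip[i][j] + 0.5 * lazy[i][j] for j in range(2)]
           for i in range(2)]

assert period(flip) == 2
assert period(lazy) == 1
assert period(mixture) == 1  # one aperiodic component makes the mixture aperiodic
```

The mixture inherits a positive self-transition probability from `lazy`, so it cannot cycle deterministically through disjoint sets, which is the finite-state counterpart of the contradiction derived in the proof.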

A.2. Soundness

Observation 1.

Note that whenever a rule in $\Rightarrow_{ex}$ introduces an $\mathsf{assume}$ statement in the subtrace, it creates a new variable name. Therefore variable names in the new subtrace do not conflict with any variable names previously introduced in another part of the trace. This fact will be used at various points within this paper.

Observation 2.

Note that whenever ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t_{s}$ , ${\mathcal{}V}(ae)={\mathcal{}V}(ae_{s})$ and whenever ${\mathcal{}S}\vdash ae,ae_{s},t_{s}\Rightarrow_{st}ae^{\prime}$ , ${\mathcal{}V}(ae^{\prime})={\mathcal{}V}(ae_{s})$ .

To prove soundness of our interface we start by proving that for a given trace $t$ and a valid subproblem ${\mathcal{}S}$ on trace $t$ , if for any subtrace $t_{s}$ the stitching process succeeds, the output trace $t^{\prime}$ differs from trace $t$ only in parts which are within the subproblem (i.e. ${\mathcal{}S}\vdash t\equiv t^{\prime}$ ).

Formally, for any trace $t$ and subproblem ${\mathcal{}S}$ ,

[TABLE]

Using the definition of $\mathsf{StitchTrace}$ , the above lemma can be rewritten as

[TABLE]

One will note that the structure of $t_{s}$ does not play a significant role in proving the above condition.

To prove the above statement we require a similar condition over augmented expressions embedded within traces. The lemma over augmented expressions is given below:

Lemma 0.

For all augmented expressions $ae,ae^{\prime}$ ,

[TABLE]

Proof.

Proof using induction.

Base Case:

Case 1: $ae=(x:x)\#id$ ,

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=(x:x)\#id^{\prime}$ .

By definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(x:x)\#id$ ,

[TABLE]

Case 2: $ae=(x(id_{v}):v)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}$ .

By definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(x(id_{v}):v)\#id$ ,

[TABLE]

Case 3: $ae=(\lambda.x\leavevmode\nobreak\ e:v)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=(\lambda.x\leavevmode\nobreak\ e:v^{\prime})\#id^{\prime}$ .

By definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\lambda.x\leavevmode\nobreak\ e:v)\#id$ ,

[TABLE]

Induction Cases:

Case 1: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})\perp:v^{\prime})\#id^{\prime}$, ${\mathcal{S}}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$, and ${\mathcal{S}}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$.

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ and ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id$ ,

[TABLE]

Case 2: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id$ and $ID(ae_{1})\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})aa^{\prime}:v^{\prime})\#id^{\prime}$, ${\mathcal{S}}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$, and ${\mathcal{S}}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$.

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ and ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id$ ,

[TABLE]

Case 3: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})x=ae_{3}:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})y=ae_{3}^{\prime}:v^{\prime})\#id^{\prime}$, ${\mathcal{S}}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$, ${\mathcal{S}}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$, and ${\mathcal{S}}\vdash ae_{3},\_,\_\Rightarrow_{st}ae^{\prime}_{3}$.

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{3},\_,\_\Rightarrow_{st}ae^{\prime}_{3}$

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ and ${\mathcal{}S}\vdash ae_{3}\equiv ae^{\prime}_{3}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})x=ae_{3}:v)\#id$ ,

[TABLE]

Case 4: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\notin{\mathcal{S}}$.

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=(\mathsf{Dist}(ae^{\prime}_{1}\#id_{e})=ae^{\prime}_{2}:v^{\prime})\#id^{\prime}$, ${\mathcal{S}}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$, and ${\mathcal{S}}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$.

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2},\_,\_\Rightarrow_{st}ae^{\prime}_{2}$

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ and ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\notin{\mathcal{S}}$,

[TABLE]

Case 5: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\in{\mathcal{S}}$.

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $ae^{\prime}=(\mathsf{Dist}(ae^{\prime}_{1}\#id_{e})=ae^{\prime}_{2}:v^{\prime})\#id^{\prime}$ and ${\mathcal{S}}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$.

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1},\_,\_\Rightarrow_{st}ae^{\prime}_{1}$

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\in{\mathcal{S}}$,

[TABLE]

Because all cases are covered, using induction, the following statement is true for all augmented expressions $ae,ae^{\prime}$ and subproblems ${\mathcal{}S}$ .

[TABLE]

∎

Next we use Lemma 10 to prove the lemma below:

Lemma 0.

Given traces $t$ and $t^{\prime}$ and a subproblem ${\mathcal{}S}$ ,

[TABLE]

Proof.

Proof by Induction

Base Case: $t=\emptyset$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $t^{\prime}=\emptyset$ .

By definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $t=\emptyset$

[TABLE]

Induction Case:

Case 1: $t=t_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $t^{\prime}=t^{\prime}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{\prime}$ , ${\mathcal{}S}\vdash ae,\_,\_\Rightarrow_{st}ae^{\prime}$ , and ${\mathcal{}S}\vdash t_{s},\_\Rightarrow_{st}t^{\prime}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash t_{s},\_\Rightarrow_{st}t^{\prime}_{s}$

[TABLE]

From Lemma 10

[TABLE]

Because ${\mathcal{}S}\vdash ae,\_,\_\Rightarrow_{st}ae^{\prime}$

[TABLE]

Because ${\mathcal{}S}\vdash t^{\prime}_{s}\equiv t_{s}$ , and ${\mathcal{}S}\vdash ae\equiv ae^{\prime}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $t=t_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae$

[TABLE]

Case 2: $t=t_{s};\mathsf{observe}(\mathsf{Dist}(ae)=e_{v})$

By assumption

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Then $t^{\prime}=t^{\prime}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{\prime})=e_{v})$ , ${\mathcal{}S}\vdash ae,\_,\_\Rightarrow_{st}ae^{\prime}$ , and ${\mathcal{}S}\vdash t_{s},\_\Rightarrow_{st}t^{\prime}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash t_{s},\_\Rightarrow_{st}t^{\prime}_{s}$

[TABLE]

From Lemma 10

[TABLE]

Because ${\mathcal{}S}\vdash ae,\_,\_\Rightarrow_{st}ae^{\prime}$

[TABLE]

Because ${\mathcal{}S}\vdash t^{\prime}_{s}\equiv t_{s}$ and ${\mathcal{}S}\vdash ae\equiv ae^{\prime}$ , by definition of $\equiv$

[TABLE]

Therefore

[TABLE]

Therefore when $t=t_{s};\mathsf{observe}(\mathsf{Dist}(ae)=e_{v})$

[TABLE]

Because all cases have been covered, using induction, for all traces $t,t^{\prime}$ , subproblems ${\mathcal{}S}$ and subtrace $t_{s}$ ,

[TABLE]

∎

Corollary 0.

Given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ , a valid subtrace $t_{s}$ , for all traces $t^{\prime}$ :

[TABLE]

Therefore, given a trace $t$ and a valid subproblem ${\mathcal{S}}$ on $t$, if the stitching process succeeds, then the output trace $t^{\prime}$ differs from trace $t$ only in parts that are within the subproblem ${\mathcal{S}}$.
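The property stated by the corollary can be pictured with a deliberately simplified model. In the sketch below a trace is flattened to a dictionary from node identifiers to values, a subproblem is a set of identifiers, and stitching overwrites only the nodes inside the subproblem. The names `stitch` and `equal_outside`, and the dictionary representation itself, are hypothetical; they stand in for $\mathsf{StitchTrace}$ and the relation ${\mathcal{S}}\vdash t\equiv t^{\prime}$ only at the level of intuition and do not implement the $\Rightarrow_{st}$ rules.

```python
def stitch(trace, subtrace, subproblem):
    """Copy `trace`, taking the values of nodes in `subproblem` from `subtrace`."""
    missing = subproblem - set(subtrace)
    if missing:
        # Toy stand-in for a stitching failure.
        raise ValueError(f"subtrace lacks nodes: {sorted(missing)}")
    return {node: subtrace[node] if node in subproblem else value
            for node, value in trace.items()}

def equal_outside(trace, other, subproblem):
    """True if the two traces agree on every node outside `subproblem`."""
    return (trace.keys() == other.keys()
            and all(trace[n] == other[n] for n in trace if n not in subproblem))

# Hypothetical trace with three stochastic choices; only id2 is in the subproblem.
trace = {"id1": 0.3, "id2": True, "id3": 1.7}
subproblem = {"id2"}
new_subtrace = {"id2": False}

stitched = stitch(trace, new_subtrace, subproblem)
assert equal_outside(trace, stitched, subproblem)
assert stitched["id2"] is False
```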

Next, we prove that given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ on trace $t$ , and a subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ , for any subtrace $t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))$ , the stitched trace $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},{\mathcal{}S})$ is a valid trace from the program of trace $t$ (i.e. $t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))$ ).

Formally, given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ on $t$ , and a subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ ,

[TABLE]

To prove the above statement, we first require a few lemmas that prove a similar condition for augmented expressions and traces under a non-empty environment.

Lemma 0.

Given environments $\sigma_{v},\sigma_{id},\sigma^{\prime}_{v}$ and $\sigma^{\prime}_{id}$ such that $\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma_{id}=\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{id}$, an augmented expression $ae$ within a trace $t$, and a valid subproblem ${\mathcal{S}}$ over trace $t$,

[TABLE]

Proof.

Proof by induction

Base Case:

Case 1: $ae=(x:x)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=x$ and $x\notin\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(x:x)\#id$ and $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=\mathsf{assume}\leavevmode\nobreak\ z=x$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ , $\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}$

[TABLE]

Then $t^{\prime}_{s}=\emptyset$ and $ae^{\prime}_{s}=(x:x)\#id^{\prime}$ .

Consider $ae^{\prime}=(x:x)\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ and $x\notin\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(x:x)\#id$

[TABLE]

Case 2: $ae=(x(id_{v}):v)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=x$ , $v=\sigma_{v}(x)$ , and $id_{v}=\sigma_{id}(x)$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(x(id_{v}):v)\#id$ and $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=\mathsf{assume}\leavevmode\nobreak\ z=x$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ , $\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}$

[TABLE]

Then $t^{\prime}_{s}=\emptyset$ , $ae^{\prime}_{s}=(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}$ , $v^{\prime}=\sigma^{\prime}_{v}(x)$ , and $id^{\prime}_{v}=\sigma^{\prime}_{id}(x)$ .

Consider $ae^{\prime}=(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ , $v^{\prime}=\sigma^{\prime}_{v}(x)$ , and $id^{\prime}_{v}=\sigma^{\prime}_{id}(x)$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(x(id_{v}):v)\#id$

[TABLE]

Case 3: $ae=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=\lambda.x\leavevmode\nobreak\ e^{\prime}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v)\#id$ and $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=\mathsf{assume}\leavevmode\nobreak\ z=\lambda.x\leavevmode\nobreak\ e^{\prime}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ , $\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}$

[TABLE]

Then $t^{\prime}_{s}=\emptyset$ and $ae^{\prime}_{s}=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v^{\prime})\#id^{\prime}$ .

Consider $ae^{\prime}=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash\lambda.x\leavevmode\nobreak\ e^{\prime}\Rightarrow_{s}\_,\_,(\lambda.x\leavevmode\nobreak\ e^{\prime}:v^{\prime})\#id^{\prime}$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v)\#id$

[TABLE]

Induction Cases:

Case 1: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{v}:v)\#id$ and $id_{e}\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=\mathsf{Dist}(e_{1})$ , $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{v}:v)\#id$ , $t_{s}=t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=\mathsf{Dist}(e^{1}_{s})$ and $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ ,

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s}$, $ae^{\prime}_{s}=(\mathsf{Dist}(ae^{2}_{s}\#id_{e})=ae^{\prime}_{v}:v^{\prime})\#id^{\prime}$, and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$.

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{1}_{s}\Rightarrow_{r}p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ .

[TABLE]

Consider $ae^{\prime}=(\mathsf{Dist}(ae^{\prime}_{1}\#id^{\prime}_{e})=ae^{\prime}_{v}:v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$,

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{v}:v)\#id$

[TABLE]

Case 2: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=\mathsf{Dist}(e_{1})$ , $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , $ae_{2}\Rightarrow_{r}e_{2}$ and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=ae^{3}_{s}$ , $t_{s}=t^{1}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{1}_{s})=e_{2});t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};\mathsf{observe}(\mathsf{Dist}(e^{1}_{s})=e_{2});p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}$ , $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ , and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ ,

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{2}_{s})=e_{2});t^{4}_{s}$ , $ae^{\prime}_{s}=ae^{4}_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{1}_{s}\Rightarrow_{r}p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ .

[TABLE]

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , $t^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{3}_{s}\Rightarrow_{r}p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

[TABLE]

Consider $ae^{\prime}=(\mathsf{Dist}(ae^{\prime}_{1}\#id_{e})=ae^{\prime}_{2}:v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , and ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ ,

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$, $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$, and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$,

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$

[TABLE]

Case 3: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=(e_{1}\leavevmode\nobreak\ e_{2})$ , $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=((ae^{1}_{s}\leavevmode\nobreak\ ae^{3}_{s})\perp:v)\#id$ , $t_{s}=t^{1}_{s};t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=(e^{1}_{s}\leavevmode\nobreak\ e^{2}_{s})$ , $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ , and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ ,

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s};t^{4}_{s}$ , $ae^{\prime}_{s}=((ae^{2}_{s}\leavevmode\nobreak\ ae^{4}_{s})\perp:v)\#id$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{1}_{s}\Rightarrow_{r}p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ .

[TABLE]

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , $t^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{3}_{s}\Rightarrow_{r}p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

[TABLE]

Consider $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})\perp:v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , and ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ ,

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$, $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$, $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$, and because $ae_{1}$ is not in the subproblem ${\mathcal{S}}$ (so its value does not change),

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id$ , $ID(ae_{1})\notin{\mathcal{}S}$

[TABLE]

Case 4: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id$ and $ID(ae_{1})\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=(e_{1}\leavevmode\nobreak\ e_{2})$ , $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=((ae^{1}_{s}\leavevmode\nobreak\ ae^{3}_{s})aa:v)\#id$ , $t_{s}=t^{1}_{s};t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=(e^{1}_{s}\leavevmode\nobreak\ e^{2}_{s})$ , $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ , and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ ,

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s};t^{4}_{s}$ , $ae^{\prime}_{s}=((ae^{2}_{s}\leavevmode\nobreak\ ae^{4}_{s})aa^{\prime}:v)\#id$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{1}_{s}\Rightarrow_{r}p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ .

[TABLE]

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , $t^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{3}_{s}\Rightarrow_{r}p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

[TABLE]

Consider $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})aa^{\prime}:v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , and ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ ,

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ , and because ${\mathcal{}V}(ae^{\prime}_{1})={\mathcal{}V}(ae^{2}_{s})$ and ${\mathcal{}V}(ae^{\prime}_{2})={\mathcal{}V}(ae^{4}_{s})$ (Observation 2),

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id$ , $ID(ae_{1})\in{\mathcal{}S}$

[TABLE]

Case 5: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})y=ae_{3}:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $e=(e_{1}\leavevmode\nobreak\ e_{2})$ , $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ , $\sigma_{v}^{\prime\prime},\sigma_{id}^{\prime\prime}\vdash e_{3}\Rightarrow_{s}\_,\_,ae_{3}$ , and ${\mathcal{}V}(ae_{1})=\langle\lambda.x\leavevmode\nobreak\ e_{3},\sigma^{\prime\prime}_{v},\sigma^{\prime\prime}_{id}\rangle$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=ae^{5}_{s}$ , $t_{s}=t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{1}_{s};t^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ y=ae^{3}_{s};t^{5}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , and ${\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{5}_{s},t^{5}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=e^{1}_{s};p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ y=e^{2}_{s};p^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{3}_{s}$ , $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ , $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , $ae^{5}_{s}\Rightarrow_{r}e^{3}_{s}$ , and $t^{5}_{s}\Rightarrow_{r}p^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$ ,

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{2}_{s};t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ y=ae^{4}_{s};t^{6}_{s}$ , $ae^{\prime}_{s}=ae^{6}_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ , $\sigma_{v}^{\prime\prime\prime},\sigma_{id}^{\prime\prime\prime}\vdash p^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{3}_{s}\Rightarrow_{s}t^{6}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{6}_{s}$ .

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae_{1}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{1}_{s}\Rightarrow_{r}p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ .

[TABLE]

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae_{2}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , $t^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{3}_{s}\Rightarrow_{r}p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ .

[TABLE]

By induction hypothesis

[TABLE]

Because $\sigma_{v}^{\prime\prime},\sigma_{id}^{\prime\prime}\vdash e_{3}\Rightarrow_{s}\_,\_,ae_{3}$ , ${\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{5}_{s},t^{5}_{s}$ , $t^{5}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{5}_{s}\Rightarrow_{r}p^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{3}_{s}$ , and $\sigma_{v}^{\prime\prime\prime},\sigma_{id}^{\prime\prime\prime}\vdash p^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{3}_{s}\Rightarrow_{s}t^{6}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{6}_{s}$ .

[TABLE]

Consider $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})y=ae^{\prime}_{3}:v^{\prime})\#id^{\prime}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ , and ${\mathcal{}S}\vdash ae_{3},ae^{6}_{s},t^{6}_{s}\Rightarrow_{st}ae^{\prime}_{3}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ , and because ${\mathcal{}V}(ae^{\prime}_{1})={\mathcal{}V}(ae^{2}_{s})$ and ${\mathcal{}V}(ae^{\prime}_{2})={\mathcal{}V}(ae^{4}_{s})$ (Observation 2), $\sigma_{v}^{\prime\prime\prime},\sigma_{id}^{\prime\prime\prime}\vdash e_{3}\Rightarrow_{s}\_,\_,ae^{\prime}_{3}$

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})y=ae_{3}:v)\#id$ , $ID(ae_{1})\notin{\mathcal{}S}$

[TABLE]

Since all cases are covered, the lemma follows by induction. ∎

Lemma 14.

Given environments $\sigma_{v},\sigma_{id},\sigma^{\prime}_{v}$ and $\sigma^{\prime}_{id}$ such that $\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma_{id}=\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{id}$ , a partial trace $t$ within a trace $t_{p}$ ( $t$ is a suffix of trace $t_{p}$ ), and a valid subproblem ${\mathcal{}S}$ over trace $t_{p}$ :

[TABLE]

Proof.

Proof by induction

Base Case: $t=\emptyset$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $p=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $t^{\prime}_{s}=\emptyset$ .

Consider $t^{\prime}=\emptyset$ .

By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Therefore

[TABLE]

Therefore when $t=\emptyset$

[TABLE]

Induction Case:

Case 1: $t=\mathsf{assume}\leavevmode\nobreak\ x=ae;t_{1}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $p=\mathsf{assume}\leavevmode\nobreak\ x=e;p_{1}$ , $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}v,id,ae$ , $\sigma^{\prime\prime}_{v}=\sigma_{v}[x\rightarrow v]$ , $\sigma^{\prime\prime}_{id}=\sigma_{id}[x\rightarrow id]$ , and $\sigma_{v}^{\prime\prime},\sigma_{id}^{\prime\prime}\vdash p_{1}\Rightarrow_{s}t_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae_{s};t^{3}_{s}$ , ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=e_{s};p^{2}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $ae_{s}\Rightarrow_{r}e_{s}$ , and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{\prime}_{s};t^{4}_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=e_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{\prime}_{s}$ , $\sigma^{\prime\prime\prime}_{v}=\sigma^{\prime}_{v}[x\rightarrow{\mathcal{}V}(ae^{\prime}_{s})]$ , $\sigma^{\prime\prime\prime}_{id}=\sigma^{\prime}_{id}[x\rightarrow ID(ae^{\prime}_{s})]$ , and $\sigma_{v}^{\prime\prime\prime},\sigma_{id}^{\prime\prime\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ (Observation 1, variable names in $t^{2}_{s}$ do not collide with variable names in $t^{4}_{s}$ ).

By induction hypothesis

[TABLE]

Because $\sigma_{v}^{\prime\prime},\sigma_{id}^{\prime\prime}\vdash p_{1}\Rightarrow_{s}t_{1}$ , ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ , $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , and $\sigma_{v}^{\prime\prime\prime},\sigma_{id}^{\prime\prime\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ ,

[TABLE]

By statement 13

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}\_,\_,ae$ , ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae_{s}\Rightarrow_{r}p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e_{s}$ and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{\prime}_{s}$ ,

[TABLE]

Consider $t^{\prime}=\mathsf{assume}\leavevmode\nobreak\ x=ae^{\prime};t^{\prime}_{1}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash t_{1},t^{4}_{s}\Rightarrow_{st}t^{\prime}_{1}$ , and ${\mathcal{}S}\vdash ae,ae^{\prime}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}$ ,

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e\Rightarrow_{s}\_,\_,ae^{\prime}$ , and $\sigma_{v}^{\prime\prime\prime},\sigma_{id}^{\prime\prime\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ (where ${\mathcal{}V}(ae^{\prime})={\mathcal{}V}(ae^{\prime}_{s})$ and $ID(ae^{\prime})=ID(ae^{\prime}_{s})$ by Observation 2)

[TABLE]

Therefore

[TABLE]

Therefore when $t=\mathsf{assume}\leavevmode\nobreak\ x=ae;t_{1}$

[TABLE]

Case 2: $t=\mathsf{observe}(\mathsf{Dist}(ae)=e_{v});t_{1}$

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $p=\mathsf{observe}(\mathsf{Dist}(e)=e_{v});p_{1}$ , $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}v,id,ae$ , and $\sigma_{v},\sigma_{id}\vdash p_{1}\Rightarrow_{s}t_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=t^{1}_{s};\mathsf{observe}(\mathsf{Dist}(ae_{s})=e_{v});t^{3}_{s}$ , ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p_{s}=p^{1}_{s};\mathsf{observe}(\mathsf{Dist}(e_{s})=e_{v});p^{2}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $ae_{s}\Rightarrow_{r}e_{s}$ , and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $t^{\prime}_{s}=t^{2}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{\prime}_{s})=e_{v});t^{4}_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{observe}(\mathsf{Dist}(e_{s})=e_{v})\Rightarrow_{s}t^{2}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{\prime}_{s})=e_{v})$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ (Observation 1, variable names in $t^{2}_{s}$ do not collide with variable names in $t^{4}_{s}$ ).

By induction hypothesis

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p_{1}\Rightarrow_{s}t_{1}$ , ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ , $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ ,

[TABLE]

By statement 13

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}\_,\_,ae$ , ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , $t^{1}_{s};\mathsf{observe}(\mathsf{Dist}(ae_{s})=e_{v})\Rightarrow_{r}p^{1}_{s};\mathsf{observe}(\mathsf{Dist}(e_{s})=e_{v})$ and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{1}_{s};\mathsf{observe}(\mathsf{Dist}(e_{s})=e_{v})\Rightarrow_{s}t^{2}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{\prime}_{s})=e_{v})$ ,

[TABLE]

Consider $t^{\prime}=\mathsf{observe}(\mathsf{Dist}(ae^{\prime})=e_{v});t^{\prime}_{1}$ .

By definition of $\Rightarrow_{st}$ , ${\mathcal{}S}\vdash t_{1},t^{4}_{s}\Rightarrow_{st}t^{\prime}_{1}$ , and ${\mathcal{}S}\vdash ae,ae^{\prime}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}$ ,

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash e\Rightarrow_{s}\_,\_,ae^{\prime}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$

[TABLE]

Therefore

[TABLE]

Therefore when $t=\mathsf{observe}(\mathsf{Dist}(ae)=e_{v});t_{1}$

[TABLE]

Since all cases are covered, the following statement holds by induction.

[TABLE]

∎
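The trace-level recursion that this proof follows can be sketched in code. The Python below is only an illustration under stated assumptions, not the paper's implementation: `Assume`, `Observe`, `extract_trace`, and the `extract_expr` callback are hypothetical names, and the expression-level relation $\Rightarrow_{ex}$ is treated as a black box that returns the pair $(ae_{s},t^{1}_{s})$. The function mirrors the three shapes used in the proof: the empty subtrace for the empty trace, $t^{1}_{s};\mathsf{assume}\ x=ae_{s};t^{3}_{s}$ for an assume statement, and $t^{1}_{s};\mathsf{observe}(\mathsf{Dist}(ae_{s})=e_{v});t^{3}_{s}$ for an observe statement.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass(frozen=True)
class Assume:
    name: str     # variable x bound by the statement
    expr: object  # augmented expression ae (kept opaque in this sketch)


@dataclass(frozen=True)
class Observe:
    dist_arg: object  # augmented expression ae passed to Dist
    value: object     # observed value expression e_v


Stmt = Union[Assume, Observe]
Trace = List[Stmt]
# Hypothetical stand-in for expression-level extraction: ae -> (ae_s, t1_s).
ExprExtract = Callable[[object], Tuple[object, Trace]]


def extract_trace(t: Trace, extract_expr: ExprExtract) -> Trace:
    """Structural recursion on the trace, one statement at a time."""
    if not t:
        # Base case: the empty trace yields the empty subtrace.
        return []
    head, rest = t[0], t[1:]
    ae = head.expr if isinstance(head, Assume) else head.dist_arg
    ae_s, t1_s = extract_expr(ae)            # subtrace hoisted out of ae
    t3_s = extract_trace(rest, extract_expr)  # subtrace of the remainder t_1
    if isinstance(head, Assume):
        return t1_s + [Assume(head.name, ae_s)] + t3_s
    return t1_s + [Observe(ae_s, head.value)] + t3_s
```

With an extractor that hoists nothing, such as `lambda ae: (ae, [])`, the function returns the input trace unchanged, which corresponds to the situation where every expression falls under a base case with an empty hoisted subtrace.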

Lemma 15.

Given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ , and the subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ , for all possible subtraces $t^{\prime}_{s}$ :

[TABLE]

Proof.

Since statement 14 is true for all $\sigma_{v},\sigma_{id},\sigma^{\prime}_{v}$ , and $\sigma^{\prime}_{id}$ , given $\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma_{id}=\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{v}=\mathsf{dom}\leavevmode\nobreak\ \sigma^{\prime}_{id}$ , replacing $\sigma_{v},\sigma_{id},\sigma^{\prime}_{v}$ , and $\sigma^{\prime}_{id}$ with $\emptyset$ (empty environment) will result in

[TABLE]

∎

Theorem 16.

Given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ , and the subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ : for all possible subtraces $t_{s}^{\prime}$ , $t_{s}^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))$ implies that there exists a trace $t^{\prime}$ such that:

• $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},{\mathcal{}S})$

• $t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))$

• ${\mathcal{}S}\vdash t\equiv t^{\prime}$

Proof.

From Corollary 12 and Lemma 15. ∎
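To make the extract-then-stitch round trip concrete, here is a deliberately simplified sketch. It is not the paper's $\Rightarrow_{ex}$ and $\Rightarrow_{st}$ relations: a trace is modelled as a flat dictionary from node ids to values, the functions `extract_flat`, `stitch_flat`, and `equivalent_outside` are hypothetical, and ${\mathcal{}S}\vdash t\equiv t^{\prime}$ is assumed to mean that the two traces have the same nodes and agree on every node outside ${\mathcal{}S}$. The flat model has no program structure, so membership in $\mathsf{Traces}(\mathsf{Program}(t))$ is not modelled at all.

```python
from typing import Dict, Set

# Flat, hypothetical trace model: node id -> recorded value.
FlatTrace = Dict[str, float]


def extract_flat(t: FlatTrace, S: Set[str]) -> FlatTrace:
    """Keep only the nodes whose ids belong to the subproblem S."""
    return {i: v for i, v in t.items() if i in S}


def stitch_flat(t: FlatTrace, t_s_new: FlatTrace, S: Set[str]) -> FlatTrace:
    """Overwrite the subproblem nodes of t with the values in t_s_new."""
    if set(t_s_new) != (set(t) & S):
        raise ValueError("subtrace must cover exactly the subproblem nodes")
    return {i: (t_s_new[i] if i in S else v) for i, v in t.items()}


def equivalent_outside(t: FlatTrace, t_new: FlatTrace, S: Set[str]) -> bool:
    """Assumed reading of S |- t == t': same nodes, equal values outside S."""
    return set(t) == set(t_new) and all(
        t[i] == t_new[i] for i in t if i not in S
    )


# Usage: change the subproblem node "b" and stitch the result back.
t = {"a": 1.0, "b": 2.0, "c": 3.0}
S = {"b"}
t_s = extract_flat(t, S)
t_new = stitch_flat(t, {"b": 5.0}, S)
assert equivalent_outside(t, t_new, S)
```

In this toy setting the first and third properties of the theorem hold by construction. The substance of the actual proof lies in the second property, which depends on the dependency structure that this sketch omits.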

A.3. Completeness

In this section we prove that our interface is complete: given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ on $t$ , and the subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ , for any trace $t^{\prime}$ that can be reached through the entangled subproblem interface (i.e. $t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))$ and ${\mathcal{}S}\vdash t\equiv t^{\prime}$ ), there exists a subtrace $t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))$ such that $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},{\mathcal{}S})$ .

Formally, given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ on $t$ , a subtrace $t_{s}$ , and any trace $t^{\prime}$

[TABLE]
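Written as a single implication, the prose statement that opens this section reads as follows. This is a transcription of that sentence, taking $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ as in Theorem 16. The paper's own display may group the hypotheses differently.

```latex
\begin{array}{l}
t_{s}=\mathsf{ExtractTrace}(t,{\mathcal S})\ \wedge\
t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))\ \wedge\
{\mathcal S}\vdash t\equiv t^{\prime}\\
\quad\implies\ \exists\, t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s})).\
t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},{\mathcal S})
\end{array}
```

Read next to Theorem 16, this is the converse direction. Theorem 16 says every subtrace of the extracted program stitches back into a valid equivalent trace, and the statement here says every valid equivalent trace arises that way.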

We need to prove a few lemmas which will aid us in proving the above statement.

Lemma 17.

Given an augmented expression $ae$ , a subproblem ${\mathcal{}S}$ , a subtrace $t_{s}$ , a sub-augmented expression $ae_{s}$ , and an augmented expression $ae^{\prime}$ such that

[TABLE]

then there exist $ae^{\prime}_{s}$ , $t^{\prime}_{s}$ , $p_{s}$ and $e_{s}$ such that

[TABLE]

Proof.

Proof by Induction

Base Case:

Case 1: $ae=(x:x)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(x:x)\#id$ and $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=(x:x)\#id^{\prime}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=x$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $x\notin\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}$ and $id^{\prime}$ is a unique id.

Consider $ae^{\prime}_{s}=(x:x)\#id^{\prime}$ , $t^{\prime}_{s}=\emptyset$ , $p_{s}=\emptyset$ and $e_{s}=x$ . By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

Because $x\notin\mathsf{dom}\leavevmode\nobreak\ \sigma_{v}$ , the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(x:x)\#id$ ,

[TABLE]

implies

[TABLE]

Case 2: $ae=(x(id_{v}):v)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(x(id_{v}):v)\#id$ and $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=x$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v}(x)=v^{\prime},\sigma_{id}(x)=id^{\prime}_{v}$ , and $id^{\prime}$ is a unique id.

Consider $ae^{\prime}_{s}=(x(id^{\prime}_{v}):v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=\emptyset$ , $p_{s}=\emptyset$ and $e_{s}=x$ . By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v}(x)=v^{\prime},\sigma_{id}(x)=id^{\prime}_{v}$ , the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(x(id_{v}):v)\#id$ ,

[TABLE]

implies

[TABLE]

Case 3: $ae=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v)\#id$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v)\#id$ and $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v^{\prime})\#id^{\prime}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=\lambda.x\leavevmode\nobreak\ e^{\prime}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $id^{\prime}$ is a unique id.

Consider $ae^{\prime}_{s}=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=\emptyset$ , $p_{s}=\emptyset$ and $e_{s}=\lambda.x\leavevmode\nobreak\ e^{\prime}$ . By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash\lambda.x\leavevmode\nobreak\ e^{\prime}\Rightarrow_{s}\_,\_,(\lambda.x\leavevmode\nobreak\ e^{\prime}:v^{\prime})\#id^{\prime}$ , the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\lambda.x\leavevmode\nobreak\ e^{\prime}:v)\#id$ ,

[TABLE]

implies

[TABLE]
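Each base case above uses $\Rightarrow_{r}$ to recover a plain expression from an augmented one ( $e=x$ for a variable, $e=\lambda.x\ e^{\prime}$ for a lambda), and the induction cases use it in the same way on $\mathsf{Dist}$ and application forms ( $e=\mathsf{Dist}(e_{1})$ and $e=(e_{1}\ e_{2})$ ). A minimal sketch of that erasure, assuming a small hypothetical AST: the class names and the `erase` function are illustrative only and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Plain expressions, restricted to the forms used in the proof.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Expr"


@dataclass(frozen=True)
class Dist:
    arg: "Expr"


@dataclass(frozen=True)
class App:
    fn: "Expr"
    arg: "Expr"


Expr = Union[Var, Lam, Dist, App]


# Augmented expressions: the same shapes plus a recorded value and an id.
@dataclass(frozen=True)
class AVar:
    name: str
    value: object
    id: int


@dataclass(frozen=True)
class ALam:
    param: str
    body: Expr  # the lambda body is stored unevaluated
    value: object
    id: int


@dataclass(frozen=True)
class ADist:
    arg: "AExpr"
    sample: object  # the sampled side (ae_2 in the text), opaque here
    value: object
    id: int


@dataclass(frozen=True)
class AApp:
    fn: "AExpr"
    arg: "AExpr"
    body: Optional["AExpr"]  # evaluated body; None plays the role of "bottom"
    value: object
    id: int


AExpr = Union[AVar, ALam, ADist, AApp]


def erase(ae: AExpr) -> Expr:
    """Drop values, ids, and evaluated bodies, keeping only the syntax."""
    if isinstance(ae, AVar):
        return Var(ae.name)
    if isinstance(ae, ALam):
        return Lam(ae.param, ae.body)
    if isinstance(ae, ADist):
        return Dist(erase(ae.arg))
    if isinstance(ae, AApp):
        return App(erase(ae.fn), erase(ae.arg))
    raise TypeError(f"unexpected augmented expression: {ae!r}")
```

The application case discards the evaluated body entirely, which matches the proof's use of $e=(e_{1}\ e_{2})$ both when the application annotation is $\perp$ and when it records $y=ae_{3}$.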

Induction Case:

Case 1: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(\mathsf{Dist}(ae^{1}_{s}\#id_{e})=ae_{2}:v)\#id$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ and $t_{s}=t^{1}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=(\mathsf{Dist}(ae^{\prime}_{1}\#id^{\prime}_{e})=ae^{\prime}_{2}:v^{\prime})\#id^{\prime}$ and ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=\mathsf{Dist}(e_{1})$ and $ae_{1}\Rightarrow_{r}e_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ , $e_{2}\in\mathsf{dom}\leavevmode\nobreak\ \mathsf{Dist}(v^{\prime})$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ ,

[TABLE]

Consider $ae^{\prime}_{s}=(\mathsf{Dist}(ae^{2}_{s}\#id_{e})=ae^{\prime}_{2}:v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=t^{2}_{s}$ , $p_{s}=p^{1}_{s}$ , and $e_{s}=\mathsf{Dist}(e^{1}_{s})$ . Because ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ and $id_{e}\in{\mathcal{}S}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ ,

[TABLE]

Because $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ , and all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $ae^{\prime}_{2}$ (Observation 1), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\in{\mathcal{}S}$ , and assuming induction hypothesis,

[TABLE]

Case 2: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(ae^{3}_{s}:v)\#id$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , $ae_{2}\Rightarrow_{r}e_{2}$ , and $t_{s}=t^{1}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{1}_{s})=e_{2});t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=(\mathsf{Dist}(ae^{\prime}_{1}\#id_{e})=ae^{\prime}_{2}:v^{\prime})\#id^{\prime}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=\mathsf{Dist}(e_{1})$ and $ae_{1}\Rightarrow_{r}e_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ , $e_{2}\in\mathsf{dom}\leavevmode\nobreak\ \mathsf{Dist}(v^{\prime})$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , $ae_{2}\Rightarrow_{r}e_{2}$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ ,

[TABLE]

Consider $ae^{\prime}_{s}=(ae^{4}_{s}:v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=t^{2}_{s};\mathsf{observe}(\mathsf{Dist}(ae^{2}_{s})=e_{2});t^{4}_{s}$ , $p_{s}=p^{1}_{s};\mathsf{observe}(\mathsf{Dist}(e^{1}_{s})=e_{2});p^{2}_{s}$ , and $e_{s}=e^{2}_{s}$ . Because ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ , and $id_{e}\notin{\mathcal{}S}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , $\sigma_{v},\sigma_{id}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ , and all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $ae^{4}_{s}$ and $t^{4}_{s}$ (Observation 1), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\notin{\mathcal{}S}$ , and assuming induction hypothesis,

[TABLE]

Case 3: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id$ and $ID(ae_{1})\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=((ae^{1}_{s}\leavevmode\nobreak\ ae^{3}_{s})aa:v)\#id$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , and $t_{s}=t^{1}_{s};t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})aa^{\prime}:v^{\prime})\#id^{\prime}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=(e_{1}\leavevmode\nobreak\ e_{2})$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $ae_{2}\Rightarrow_{r}e_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ .

If ${\mathcal{}V}(ae^{\prime}_{1})=\langle\lambda.x\leavevmode\nobreak\ e^{\prime},\sigma^{\prime}_{v},\sigma^{\prime}_{id}\rangle$ and $\sigma_{v}^{\prime}[y\rightarrow{\mathcal{}V}(ae^{\prime}_{2})],\sigma_{id}^{\prime}[y\rightarrow ID(ae^{\prime}_{2})]\vdash e^{\prime}[y/x]\Rightarrow_{s}\_,\_,ae_{e}$ , then $aa^{\prime}$ is $y=ae_{e}$ ; otherwise $aa^{\prime}=\perp$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , $ae_{2}\Rightarrow_{r}e_{2}$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ ,

[TABLE]

Consider $ae^{\prime}_{s}=((ae^{2}_{s}\leavevmode\nobreak\ ae^{4}_{s})aa^{\prime}:v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=t^{2}_{s};t^{4}_{s}$ , $p_{s}=p^{1}_{s};p^{2}_{s}$ , and $e_{s}=(e^{1}_{s}\leavevmode\nobreak\ e^{2}_{s})$ . Because ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ , and $ID(ae_{1})\in{\mathcal{}S}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ and $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , $\sigma_{v},\sigma_{id}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ , and all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $ae^{4}_{s}$ and $t^{4}_{s}$ (Observation 1), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})aa:v)\#id$ and $ID(ae_{1})\in{\mathcal{}S}$ , and assuming induction hypothesis,

[TABLE]

Case 4: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=((ae^{1}_{s}\leavevmode\nobreak\ ae^{3}_{s})\perp:v)\#id$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , and $t_{s}=t^{1}_{s};t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})\perp:v^{\prime})\#id^{\prime}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=(e_{1}\leavevmode\nobreak\ e_{2})$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $ae_{2}\Rightarrow_{r}e_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , $ae_{2}\Rightarrow_{r}e_{2}$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ ,

[TABLE]

Consider $ae^{\prime}_{s}=((ae^{2}_{s}\leavevmode\nobreak\ ae^{4}_{s})\perp:v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=t^{2}_{s};t^{4}_{s}$ , $p_{s}=p^{1}_{s};p^{2}_{s}$ , and $e_{s}=(e^{1}_{s}\leavevmode\nobreak\ e^{2}_{s})$ . Because ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ , and $ID(ae_{1})\notin{\mathcal{}S}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ and $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ and $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , $\sigma_{v},\sigma_{id}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ , and all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $ae^{4}_{s}$ and $t^{4}_{s}$ (Observation 1), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v)\#id$ and $id_{e}\notin{\mathcal{}S}$ , and assuming the induction hypothesis,

[TABLE]

Case 5: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})y=ae_{3}:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(ae^{5}_{s}:v)\#id$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{5}_{s},t^{5}_{s}$ , and $t_{s}=t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{1}_{s};t^{3}_{s};\mathsf{assume}\leavevmode\nobreak\ y=ae^{3}_{s};t^{5}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $ae^{\prime}=((ae^{\prime}_{1}\leavevmode\nobreak\ ae^{\prime}_{2})y=ae^{\prime}_{3}:v^{\prime})\#id^{\prime}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , and ${\mathcal{}S}\vdash ae_{3}\equiv ae^{\prime}_{3}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $e=(e_{1}\leavevmode\nobreak\ e_{2})$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $ae_{2}\Rightarrow_{r}e_{2}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ , ${\mathcal{}V}(ae^{\prime}_{1})=\langle\lambda.x\leavevmode\nobreak\ e^{\prime}_{3},\sigma^{\prime}_{v},\sigma^{\prime}_{id}\rangle$ , $e_{3}=e^{\prime}_{3}[y/x]$ , and $\sigma_{v}^{\prime}[y\rightarrow{\mathcal{}V}(ae^{\prime}_{2})],\sigma_{id}^{\prime}[y\rightarrow ID(ae^{\prime}_{2})]\vdash e_{3}\Rightarrow_{s}\_,\_,ae^{\prime}_{3}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\equiv ae^{\prime}_{1}$ , $ae_{1}\Rightarrow_{r}e_{1}$ , and $\sigma_{v},\sigma_{id}\vdash e_{1}\Rightarrow_{s}\_,\_,ae^{\prime}_{1}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\equiv ae^{\prime}_{2}$ , $ae_{2}\Rightarrow_{r}e_{2}$ , and $\sigma_{v},\sigma_{id}\vdash e_{2}\Rightarrow_{s}\_,\_,ae^{\prime}_{2}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{5}_{s},t^{5}_{s}$ , ${\mathcal{}S}\vdash ae_{3}\equiv ae^{\prime}_{3}$ , $ae_{3}\Rightarrow_{r}e_{3}$ , and $\sigma_{v},\sigma_{id}\vdash e_{3}\Rightarrow_{s}\_,\_,ae^{\prime}_{3}$ ,

[TABLE]

Consider $ae^{\prime}_{s}=(ae^{6}_{s}:v^{\prime})\#id^{\prime}$ , $t^{\prime}_{s}=t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{2}_{s};t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ y=ae^{4}_{s};t^{6}_{s}$ , $p_{s}=p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=e^{1}_{s};p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ y=e^{2}_{s};p^{3}_{s}$ , and $e_{s}=e^{3}_{s}$ . Because ${\mathcal{}S}\vdash ae_{1},ae^{2}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae_{2},ae^{4}_{s},t^{4}_{s}\Rightarrow_{st}ae^{\prime}_{2}$ , and ${\mathcal{}S}\vdash ae_{3},ae^{6}_{s},t^{6}_{s}\Rightarrow_{st}ae^{\prime}_{3}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , $t^{5}_{s}\Rightarrow_{r}p^{3}_{s}$ , $ae^{1}_{s}\Rightarrow_{r}e^{1}_{s}$ , and $ae^{3}_{s}\Rightarrow_{r}e^{2}_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $ae^{5}_{s}\Rightarrow_{r}e^{3}_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{1}_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{2}_{s}$ , $\sigma_{v},\sigma_{id}\vdash p^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e^{2}_{s}\Rightarrow_{s}t^{4}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{4}_{s}$ , and all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $ae^{4}_{s}$ and $t^{4}_{s}$ (Observation 1), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})y=ae_{3}:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$ , and assuming the induction hypothesis,

[TABLE]

Because we have covered all cases, using induction, the following statement is true

[TABLE]

∎

Lemma 18.

For any two traces $t,t^{\prime}$ , a valid subproblem ${\mathcal{}S}$ on $t$ , and a subtrace $t_{s}$ with ${\mathcal{}S}\vdash t\Rightarrow_{ex}t_{s}$ ,

[TABLE]

implies there exists a subtrace $t^{\prime}_{s}$ such that

[TABLE]

Proof.

Proof by Induction

Base Case: $t=\emptyset$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=\emptyset$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $t^{\prime}=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p=\emptyset$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Consider $t^{\prime}_{s}=\emptyset$ , $p_{s}=\emptyset$ . By definition of $\Rightarrow_{r}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Therefore

[TABLE]

By definition of $\Rightarrow_{st}$

[TABLE]

Therefore

[TABLE]

Therefore when $t=\emptyset$ ,

[TABLE]

Induction Case:

Case 1: $t=\mathsf{assume}\leavevmode\nobreak\ x=ae;t_{1}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae_{s};t^{3}_{s}$ , ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ and ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $t^{\prime}=\mathsf{assume}\leavevmode\nobreak\ x=ae^{\prime};t^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae\equiv ae^{\prime}$ , and ${\mathcal{}S}\vdash t_{1}\equiv t^{\prime}_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p=\mathsf{assume}\leavevmode\nobreak\ x=e;p_{1}$ , $ae\Rightarrow_{r}e$ , and $t_{1}\Rightarrow_{r}p_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}\_,\_,ae^{\prime}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p_{1}\Rightarrow_{s}t^{\prime}_{1}$ , and $\sigma^{\prime}_{v}=\sigma_{v}[x\rightarrow{\mathcal{}V}(ae^{\prime})]$ , $\sigma^{\prime}_{id}=\sigma_{id}[x\rightarrow ID(ae^{\prime})]$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ , ${\mathcal{}S}\vdash t_{1}\equiv t^{\prime}_{1}$ , $t_{1}\Rightarrow_{r}p_{1}$ , and $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p_{1}\Rightarrow_{s}t^{\prime}_{1}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae\equiv ae^{\prime}$ , $ae\Rightarrow_{r}e$ , and $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}\_,\_,ae^{\prime}$ ,

[TABLE]

Because ${\mathcal{}S}\vdash ae,ae^{\prime}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}$ and ${\mathcal{}S}\vdash t_{1},t^{4}_{s}\Rightarrow_{st}t^{\prime}_{1}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , and $ae_{s}\Rightarrow_{r}e_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{\prime}_{s}$ , $\sigma_{v}^{\prime},\sigma_{id}^{\prime}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ , all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $t^{4}_{s}$ (Observation 1), ${\mathcal{}V}(ae^{\prime})={\mathcal{}V}(ae^{\prime}_{s})$ , and $ID(ae^{\prime})=ID(ae^{\prime}_{s})$ (Observation 2), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $t=\mathsf{assume}\leavevmode\nobreak\ x=ae;t_{1}$ , and assuming the induction hypothesis,

[TABLE]

Case 2: $t=\mathsf{observe}(\mathsf{Dist}(ae)=e_{v});t_{1}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=t^{1}_{s};\mathsf{observe}(\mathsf{Dist}(ae_{s})=e_{v});t^{3}_{s}$ , ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ and ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ .

By assumption

[TABLE]

By definition of $\equiv$

[TABLE]

Then $t^{\prime}=\mathsf{observe}(\mathsf{Dist}(ae^{\prime})=e_{v});t^{\prime}_{1}$ , ${\mathcal{}S}\vdash ae\equiv ae^{\prime}$ , and ${\mathcal{}S}\vdash t_{1}\equiv t^{\prime}_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{r}$

[TABLE]

Then $p=\mathsf{observe}(\mathsf{Dist}(e)=e_{v});p_{1}$ , $ae\Rightarrow_{r}e$ , and $t_{1}\Rightarrow_{r}p_{1}$ .

By assumption

[TABLE]

By definition of $\Rightarrow_{s}$

[TABLE]

Then $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}\_,\_,ae^{\prime}$ and $\sigma_{v},\sigma_{id}\vdash p_{1}\Rightarrow_{s}t^{\prime}_{1}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{3}_{s}$ , ${\mathcal{}S}\vdash t_{1}\equiv t^{\prime}_{1}$ , $t_{1}\Rightarrow_{r}p_{1}$ , and $\sigma_{v},\sigma_{id}\vdash p_{1}\Rightarrow_{s}t^{\prime}_{1}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae\equiv ae^{\prime}$ , $ae\Rightarrow_{r}e$ , and $\sigma_{v},\sigma_{id}\vdash e\Rightarrow_{s}\_,\_,ae^{\prime}$ ,

[TABLE]

Because ${\mathcal{}S}\vdash ae,ae^{\prime}_{s},t^{2}_{s}\Rightarrow_{st}ae^{\prime}$ and ${\mathcal{}S}\vdash t_{1},t^{4}_{s}\Rightarrow_{st}t^{\prime}_{1}$ , the definition of $\Rightarrow_{st}$ implies

[TABLE]

Therefore

[TABLE]

Because $t^{3}_{s}\Rightarrow_{r}p^{2}_{s}$ , $t^{1}_{s}\Rightarrow_{r}p^{1}_{s}$ , and $ae_{s}\Rightarrow_{r}e_{s}$ , the definition of $\Rightarrow_{r}$ implies

[TABLE]

Therefore

[TABLE]

Because $\sigma_{v},\sigma_{id}\vdash p^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ z=e_{s}\Rightarrow_{s}t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ z=ae^{\prime}_{s}$ , $\sigma_{v},\sigma_{id}\vdash p^{2}_{s}\Rightarrow_{s}t^{4}_{s}$ , and all variable names introduced by $t^{2}_{s}$ do not conflict with variable names in $t^{4}_{s}$ (Observation 1), the definition of $\Rightarrow_{s}$ implies

[TABLE]

Therefore

[TABLE]

Therefore when $t=\mathsf{observe}(\mathsf{Dist}(ae)=e_{v});t_{1}$ , and assuming the induction hypothesis,

[TABLE]

All cases have been covered; therefore, by induction, the following statement is true.

[TABLE]

∎

Theorem 19.

Given a valid trace $t$ , a valid subproblem ${\mathcal{}S}$ of $t$ , and the subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ , for every trace $t^{\prime}$ , $t^{\prime}\in\mathsf{Traces}(\mathsf{Program}(t))\leavevmode\nobreak\ \wedge\leavevmode\nobreak\ {\mathcal{}S}\vdash t\equiv t^{\prime}$ implies there exists a subtrace $t^{\prime}_{s}$ such that

• $t^{\prime}_{s}\in\mathsf{Traces}(\mathsf{Program}(t_{s}))$

• $t^{\prime}=\mathsf{StitchTrace}(t,t^{\prime}_{s},{\mathcal{}S})$

Proof.

From definitions of $\mathsf{Traces}$ , $\mathsf{Program}$ , $\mathsf{ExtractTrace}$ , and $\mathsf{StitchTrace}$ , given a trace $t$ and a valid subproblem ${\mathcal{}S}$

[TABLE]

Lemma 18 holds for all environments $\sigma_{v},\sigma_{id},\sigma^{\prime}_{v}$ , and $\sigma^{\prime}_{id}$ . The theorem is the special case of the lemma in which $\sigma_{v},\sigma_{id},\sigma^{\prime}_{v}$ , and $\sigma^{\prime}_{id}$ are all set to $\emptyset$ . ∎
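The extract-and-stitch round trip stated in Theorem 19 can be illustrated with a deliberately simplified model. The sketch below is not the paper's $\Rightarrow_{ex}$ and $\Rightarrow_{st}$ rules: it assumes that a trace is a flat dictionary from node ids to sampled values, that a subproblem is a set of node ids, and that ${\mathcal{}S}\vdash t\equiv t^{\prime}$ can be read as agreement outside the subproblem. The names `extract_trace`, `stitch_trace`, and `agree_outside` are hypothetical.

```python
from typing import Dict, Set

# Hypothetical toy representation: node id -> sampled value.
Trace = Dict[str, float]


def extract_trace(t: Trace, s: Set[str]) -> Trace:
    """Keep only the choices whose ids lie in the subproblem s."""
    return {node: value for node, value in t.items() if node in s}


def stitch_trace(t: Trace, t_sub: Trace, s: Set[str]) -> Trace:
    """Overwrite the choices of t inside s with the choices of t_sub."""
    inside = {node: value for node, value in t_sub.items() if node in s}
    return {**t, **inside}


def agree_outside(t: Trace, t_prime: Trace, s: Set[str]) -> bool:
    """Toy stand-in for S |- t == t': same ids, same values outside s."""
    return t.keys() == t_prime.keys() and all(
        t[node] == t_prime[node] for node in t if node not in s
    )


t = {"a": 0.1, "b": 0.7, "c": 2.0}
t_prime = {"a": 0.1, "b": -1.3, "c": 5.5}
s = {"b", "c"}

# t and t_prime differ only inside the subproblem ...
assert agree_outside(t, t_prime, s)
# ... so stitching the subtrace of t_prime back into t recovers t_prime.
assert stitch_trace(t, extract_trace(t_prime, s), s) == t_prime
```

In this toy setting the subtrace of $t^{\prime}$ plays the role of $t^{\prime}_{s}$: it is the witness whose existence the theorem asserts.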

A.4. Metaprogramming

Theorem 20.

A reversible subproblem selection strategy $\mathsf{SS}$ divides the trace space of program $p$ into equivalence classes.

Proof.

$\mathsf{SS}\vdash t\equiv t^{\prime}$ is an equivalence relation over traces $t,t^{\prime}\in T$ .

Reflexivity : $\mathsf{SS}\vdash t\equiv t$ is true by definition.

Symmetry : $\mathsf{SS}\vdash t\equiv t\wedge\mathsf{SS}\vdash t\equiv t^{\prime}\implies\mathsf{SS}\vdash t^{\prime}\equiv t$ . Hence the relation is symmetric.

Transitivity : If $\mathsf{SS}\vdash t_{1}\equiv t_{2}$ and $\mathsf{SS}\vdash t_{2}\equiv t_{3}$ , then $\mathsf{SS}\vdash t_{1}\equiv t_{3}$ (by definition of reversibility and symmetry). ∎
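The step from an equivalence relation to a partition of the trace space can be made concrete on a finite toy example. The sketch below assumes that traces are pairs of discrete values and that a hypothetical selection strategy always selects the second coordinate, so two traces are related exactly when they agree on the first coordinate. The helper `partition` and the relation `related` are illustrative names, not constructs from the paper.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

ToyTrace = Tuple[int, int]


def partition(
    items: Sequence[ToyTrace],
    related: Callable[[ToyTrace, ToyTrace], bool],
) -> List[List[ToyTrace]]:
    """Group items into classes, assuming `related` is an equivalence relation."""
    classes: List[List[ToyTrace]] = []
    for item in items:
        for cls in classes:
            # Comparing with one representative suffices for an equivalence.
            if related(cls[0], item):
                cls.append(item)
                break
        else:
            classes.append([item])
    return classes


def related(t: ToyTrace, u: ToyTrace) -> bool:
    """Hypothetical strategy: coordinate 1 is the subproblem, coordinate 0 is fixed."""
    return t[0] == u[0]


traces = list(product([0, 1], [0, 1, 2]))
classes = partition(traces, related)

# Every trace lands in exactly one class, and each class fixes coordinate 0.
assert sorted(t for cls in classes for t in cls) == sorted(traces)
assert all(len({t[0] for t in cls}) == 1 for cls in classes)
assert len(classes) == 2
```

Comparing against a single representative per class is only valid because the relation is reflexive, symmetric, and transitive, which is what the proof above establishes for a reversible strategy.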

Lemma 21.

For any augmented expression $ae$ and subproblem ${\mathcal{}S}$ ,

[TABLE]

Proof.

Proof by Induction

Base Case:

Case 1: $ae=x:x$ ,

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=x:x$ and $t_{s}=\emptyset$ .

By definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket x:x\rrbracket=1$ and $\mathsf{pdf}\llbracket\emptyset\rrbracket=1$ . Therefore

[TABLE]

Case 2: $ae=x:v$ ,

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=x:v$ and $t_{s}=\emptyset$ .

By definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket x:v\rrbracket=1$ and $\mathsf{pdf}\llbracket\emptyset\rrbracket=1$ . Therefore

[TABLE]

Case 3: $ae=\lambda.x\leavevmode\nobreak\ e:v$ ,

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=\lambda.x\leavevmode\nobreak\ e:v$ and $t_{s}=\emptyset$ . By definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket\lambda.x\leavevmode\nobreak\ e:v\rrbracket=1$ and $\mathsf{pdf}\llbracket\emptyset\rrbracket=1$ . Therefore

[TABLE]

Induction Case:

Case 1: $ae=(ae_{1}\leavevmode\nobreak\ ae_{2})\perp:v$ ,

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(ae^{1}_{s}\leavevmode\nobreak\ ae^{2}_{s})\perp:v$ , $t_{s}=t^{1}_{s};t^{2}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$

[TABLE]

From definition of $\mathsf{pdf}$

[TABLE]

From the definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket ae_{s}\rrbracket=\mathsf{pdf}\llbracket ae^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket ae^{2}_{s}\rrbracket$ and $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{2}_{s}\rrbracket$ . Therefore

[TABLE]

Case 2: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})x=ae_{3}:v)\#id$ and $ID(ae_{1})\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=(ae^{1}_{s}\leavevmode\nobreak\ ae^{2}_{s})x=ae_{3}:v$ , $t_{s}=t^{1}_{s};t^{2}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$

[TABLE]

From definition of $\mathsf{pdf}$

[TABLE]

From the definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket ae_{s}\rrbracket=\mathsf{pdf}\llbracket ae^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket ae^{2}_{s}\rrbracket*\mathsf{pdf}\llbracket ae_{3}\rrbracket$ and $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{2}_{s}\rrbracket$ . Therefore

[TABLE]

Case 3: $ae=((ae_{1}\leavevmode\nobreak\ ae_{2})x=ae_{3}:v)\#id$ and $ID(ae_{1})\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=ae^{3}_{s}:v$ , $t_{s}=t^{1}_{s};\mathsf{assume}\leavevmode\nobreak\ y=ae^{1}_{s};t^{2}_{s};\mathsf{assume}\leavevmode\nobreak\ x=ae^{2}_{s};t^{3}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$ , ${\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{3}\Rightarrow_{ex}ae^{3}_{s},t^{3}_{s}$

[TABLE]

From definition of $\mathsf{pdf}$

[TABLE]

From the definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket ae_{s}\rrbracket=\mathsf{pdf}\llbracket ae^{3}_{s}\rrbracket$ and $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket ae^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{2}_{s}\rrbracket*\mathsf{pdf}\llbracket ae^{2}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{3}_{s}\rrbracket$ . Therefore

[TABLE]

Case 4: $ae=\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v$ and $id_{e}\in{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=\mathsf{Dist}(ae^{1}_{s})=ae_{2}:v$ , $t_{s}=t^{1}_{s}$ , and ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ ,

[TABLE]

From the definition of $\mathsf{pdf}$ , with $ae_{2}\Rightarrow_{r}e$ ,

[TABLE]

From the definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket ae_{s}\rrbracket=\mathsf{pdf}\llbracket ae^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket ae_{2}\rrbracket*\mathsf{pdf}_{\mathsf{Dist}}({\mathcal{}V}(ae_{1}),e)$ and $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket$ . Therefore

[TABLE]

Case 5: $ae=(\mathsf{Dist}(ae_{1}\#id_{e})=ae_{2}:v)\#id$ and $id_{e}\notin{\mathcal{}S}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $ae_{s}=ae^{2}_{s}:v$ , $t_{s}=t^{1}_{s}$ , ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ , $ae_{2}\Rightarrow_{r}e_{v}$ , and ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{1}\Rightarrow_{ex}ae^{1}_{s},t^{1}_{s}$ ,

[TABLE]

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash ae_{2}\Rightarrow_{ex}ae^{2}_{s},t^{2}_{s}$ ,

[TABLE]

From definition of $\mathsf{pdf}$ ,

[TABLE]

From the definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket ae_{s}\rrbracket=\mathsf{pdf}\llbracket ae^{2}_{s}\rrbracket$ , $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket ae^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{2}_{s}\rrbracket*\mathsf{pdf}_{\mathsf{Dist}}({\mathcal{}V}(ae^{1}_{s}),e)$ and ${\mathcal{}V}(ae^{1}_{s})={\mathcal{}V}(ae_{1})$ (Observation 2). Therefore

[TABLE]

Because we have considered all cases, by induction, for augmented expression $ae$ , subproblem ${\mathcal{}S}$ , augmented subexpression $ae_{s}$ , and a subtrace $t_{s}$ ,

[TABLE]

∎
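The case analysis above repeatedly uses one bookkeeping fact: extraction moves density factors between the extracted expression and the extracted subtrace, but never drops or duplicates one. A minimal sketch of that bookkeeping follows, under strong simplifying assumptions: an expression is modelled as a pair of a local density factor and a list of child expressions, extraction hoists every child into a flat list, and exact rational arithmetic stands in for real-valued densities. The names `pdf_expr`, `pdf_trace`, and `extract` are hypothetical and do not reproduce the paper's $\Rightarrow_{ex}$ rules.

```python
from fractions import Fraction
from math import prod

# Hypothetical toy expression: (local density factor, list of child expressions).


def pdf_expr(e) -> Fraction:
    """Density of an expression: its own factor times the densities of its children."""
    factor, children = e
    return factor * prod((pdf_expr(c) for c in children), start=Fraction(1))


def pdf_trace(t) -> Fraction:
    """Density of a flat list of expressions: the product of their densities."""
    return prod((pdf_expr(e) for e in t), start=Fraction(1))


def extract(e):
    """Hoist all descendants of e into a flat list; return (childless e, hoisted list)."""
    factor, children = e
    hoisted = []
    for child in children:
        child_s, child_t = extract(child)
        hoisted.extend(child_t)   # factors hoisted out of the child
        hoisted.append(child_s)   # the child itself, now childless
    return (factor, []), hoisted


ae = (Fraction(1, 2), [(Fraction(1, 3), []), (Fraction(1, 5), [(Fraction(1, 7), [])])])
ae_s, t_s = extract(ae)

# The product of factors is unchanged by the rearrangement.
assert pdf_expr(ae) == pdf_expr(ae_s) * pdf_trace(t_s)
assert pdf_expr(ae) == Fraction(1, 210)
```

Exact fractions are used so that the equality check is not affected by the reordering of floating-point multiplications.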

Theorem 22.

Given a trace $t$ and a valid subproblem ${\mathcal{}S}$ on $t$ , then for subtrace $t_{s}=\mathsf{ExtractTrace}(t,{\mathcal{}S})$ ,

[TABLE]

i.e., the unnormalized densities of $t$ and $t_{s}$ are equal.

Proof.

Proof by induction

Base Case: $t=\emptyset$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then $t_{s}=\emptyset$ .

By definition of $\mathsf{pdf}$ , $\mathsf{pdf}\llbracket\emptyset\rrbracket=1$

[TABLE]

Induction Case:

Case 1: $t=\mathsf{assume}\leavevmode\nobreak\ x=ae;t_{1}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{2}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{2}_{s}$ ,

[TABLE]

From Lemma 21 over augmented expressions

[TABLE]

Because ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$

[TABLE]

From definition of $\mathsf{pdf}$

[TABLE]

Because $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket ae_{s}\rrbracket*\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{2}_{s}\rrbracket$

[TABLE]

Case 2: $t=\mathsf{observe}(\mathsf{Dist}(ae)=e);t_{1}$

By assumption

[TABLE]

By definition of $\Rightarrow_{ex}$

[TABLE]

Then ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$ , ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{2}_{s}$ .

By induction hypothesis

[TABLE]

Because ${\mathcal{}S}\vdash t_{1}\Rightarrow_{ex}t^{2}_{s}$ ,

[TABLE]

From Lemma 21 over augmented expressions

[TABLE]

Because ${\mathcal{}S}\vdash ae\Rightarrow_{ex}ae_{s},t^{1}_{s}$

[TABLE]

From definition of $\mathsf{pdf}$

[TABLE]

Because $\mathsf{pdf}\llbracket t_{s}\rrbracket=\mathsf{pdf}\llbracket ae_{s}\rrbracket*\mathsf{pdf}\llbracket t^{1}_{s}\rrbracket*\mathsf{pdf}\llbracket t^{2}_{s}\rrbracket*\mathsf{pdf}_{\mathsf{Dist}}({\mathcal{}V}(ae),e)$ and ${\mathcal{}V}(ae)={\mathcal{}V}(ae_{s})$ (Observation 2)

[TABLE]

Because we have covered all cases, by induction, for any trace $t$ and subproblem ${\mathcal{}S}$

[TABLE]

Therefore for any trace $t$ and subproblem ${\mathcal{}S}$

[TABLE]

∎
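The induction over traces in the proof above follows the grammar of traces: the empty trace, an $\mathsf{assume}$ statement followed by a trace, or an $\mathsf{observe}$ statement followed by a trace. The sketch below mirrors that recursion on a toy encoding. It assumes statements are tuples, that each statement carries a precomputed expression density `weight`, and that $\mathsf{Dist}$ is a Bernoulli distribution. All of these are illustrative choices rather than the paper's definitions.

```python
from fractions import Fraction

# Hypothetical statement encodings:
#   ("assume", weight)               contributes the expression density `weight`
#   ("observe", weight, p, outcome)  contributes `weight` times a Bernoulli likelihood


def bernoulli_pdf(p: Fraction, outcome: bool) -> Fraction:
    """Probability of `outcome` under a Bernoulli distribution with parameter p."""
    return p if outcome else 1 - p


def pdf_trace(trace) -> Fraction:
    """Structural recursion: the empty trace has density 1, else head times tail."""
    if not trace:
        return Fraction(1)  # base case, t = empty
    head, tail = trace[0], trace[1:]
    if head[0] == "assume":
        factor = head[1]
    else:  # "observe"
        _, weight, p, outcome = head
        factor = weight * bernoulli_pdf(p, outcome)
    return factor * pdf_trace(tail)


t = [("assume", Fraction(1, 2)), ("observe", Fraction(1, 3), Fraction(1, 4), False)]

# A toy "extracted" trace: the expression inside the observe statement is hoisted
# into its own assume statement, leaving only the likelihood factor behind.
t_s = [
    ("assume", Fraction(1, 2)),
    ("assume", Fraction(1, 3)),
    ("observe", Fraction(1), Fraction(1, 4), False),
]

assert pdf_trace(t) == pdf_trace(t_s) == Fraction(1, 8)
```

The hoisting changes the shape of the trace but leaves the product of factors, and hence the unnormalized density, unchanged in this toy model.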
