Prepaid parameter estimation without likelihoods

Merijn Mestdagh; Stijn Verdonck; Kristof Meers; Tim Loossens; Francis Tuerlinckx

arXiv:1812.09799·stat.CO·July 1, 2020·PLoS Comput. Biol.

TL;DR

This paper introduces a prepaid database approach for statistical inference that precomputes model outcomes, enabling rapid and accurate parameter estimation across various models and priors, significantly outperforming existing methods.

Contribution

The authors develop a prepaid database method that allows fast, accurate inference for intractable models by precomputing and interpolating model outcomes, facilitating resource sharing among researchers.

Findings

01

Achieves 23,000 to 100,000-fold speed improvements.

02

Handles models previously considered quasi inestimable.

03

Demonstrates effectiveness on three challenging models.

Abstract

In various fields, statistical models of interest are analytically intractable. As a result, statistical inference is greatly hampered by computational constraints. However, given a model, different users with different data are likely to perform similar computations. Computations done by one user are potentially useful for other users with different data sets. We propose a pooling of resources across researchers to capitalize on this. More specifically, we preemptively chart out the entire space of possible model outcomes in a prepaid database. Using advanced interpolation techniques, any individual estimation problem can now be solved on the spot. The prepaid method can easily accommodate different priors as well as constraints on the parameters. We created prepaid databases for three challenging models and demonstrate how they can be distributed through an online parameter estimation…
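The prepaid idea in the abstract can be illustrated with a deliberately small sketch. The toy model (a normal mean with unit variance), the grid, the nearest-neighbour lookup and all function names below are assumptions made for illustration; they are not the models or the interpolation techniques used in the paper.

```python
import random

def simulate_summary(mu, t_sim, rng):
    """Toy simulator (assumption): mean of t_sim draws from Normal(mu, 1)."""
    return sum(rng.gauss(mu, 1.0) for _ in range(t_sim)) / t_sim

def build_prepaid_database(mu_grid, t_sim, seed=0):
    """Expensive step, done once: simulate at every grid point and store the result."""
    rng = random.Random(seed)
    return [(mu, simulate_summary(mu, t_sim, rng)) for mu in mu_grid]

def estimate(database, s_obs):
    """Cheap step, done per data set: return the stored parameter whose
    summary statistic lies closest to the observed one."""
    return min(database, key=lambda row: abs(row[1] - s_obs))[0]

mu_grid = [i * 0.1 for i in range(101)]            # 0.0, 0.1, ..., 10.0
database = build_prepaid_database(mu_grid, t_sim=2000)
mu_hat = estimate(database, s_obs=3.14)            # lookup only, no new simulations
```

The database is built once and can be reused for any number of observed data sets, so each new estimate costs a lookup rather than a fresh round of simulation.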

Tables (9)

Table 1. The RMSE of the estimates of the test set of the trait model. $T_{obs}$ refers to the number of observations (i.e., vectors with species frequencies) and $Ω$ is the number of prepaid points.

$T_{obs}$	version	$Ω$	$\log (I)$	$\log (A)$	$h$	$\log (σ)$
1	${ABC}^{Orig}$	/	0.17	0.67	7.45	0.74
1	${SL}_{PM}^{Grid}$	100000	0.17	0.66	7.49	0.7
1	${ABC}_{PM}^{Grid}$	100000	0.16	0.63	7.9	0.7
1	${ABC}_{PM}^{Grid}$	500000	0.16	0.62	8.17	0.7
1000	${ABC}_{PM}^{Grid}$	100000	0.07	0.35	6.41	0.61
1000	${ABC}_{PM}^{Grid}$	500000	0.05	0.27	4.83	0.48
1000	${ABC}_{PM}^{SVM}$	100000	0.03	0.23	5.24	0.42
1000	${ABC}_{PM}^{SVM}$	500000	0.03	0.21	4.39	0.4

Table 2. RMSE for the estimation of the parameters of the Ricker model for $T = 10^{5}$ using the ${SL}_{ML}^{Grid}$, ${SL}_{ML}^{SVM}$ and ${SL}_{ML}^{Lin}$ prepaid methods.

	r	$σ$	$ϕ$
${SL}_{ML}^{Grid}$	1.2	0.021	0.14
${SL}_{ML}^{SVM}$	0.43	0.0044	0.023
${SL}_{ML}^{Lin}$	0.54	0.013	0.091

Table 3. Average time in seconds needed for the ${SL}^{Orig}$ estimation for multiple $T_{obs}$ and the speed-up for the ${SL}_{ML}^{Grid}$ and ${SL}_{ML}^{SVM}$ methods. The times for $T_{obs} = 10^{4}$ and $T_{obs} = 10^{5}$ were not measured, so these values are estimated and shown in brackets. (Figure 7 shows the corresponding accuracies.)

$T_{obs}$	$10^{2}$	$5 \cdot 10^{2}$	$10^{3}$	$10^{4}$	$10^{5}$
time ${SL}^{Orig}$	716 s	3549 s	5841 s	(50000 s)	(500000 s)
${SL}_{ML}^{GRID}$ times faster	16273	80659	132750	(1000000)	(10000000)
${SL}_{ML}^{SVM}$ times faster	194	959	1578	(10000)	(100000)
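As a quick reading aid for Table 3, the measured columns can be converted back into implied prepaid run times. The sketch below assumes that "times faster" means the original time divided by the prepaid time; the variable names are illustrative, and only the three measured columns are used.

```python
# Measured values copied from Table 3 (T_obs = 10^2, 5*10^2, 10^3).
orig_seconds = {100: 716, 500: 3549, 1000: 5841}
grid_speedup = {100: 16273, 500: 80659, 1000: 132750}
svm_speedup = {100: 194, 500: 959, 1000: 1578}

# Implied prepaid time = original time / speed-up factor (assumed definition).
grid_seconds = {t: orig_seconds[t] / grid_speedup[t] for t in orig_seconds}
svm_seconds = {t: orig_seconds[t] / svm_speedup[t] for t in orig_seconds}

for t in sorted(orig_seconds):
    print(f"T_obs={t}: grid {grid_seconds[t]:.3f} s, SVM {svm_seconds[t]:.2f} s")
```

Under that reading, the implied prepaid times are roughly constant in $T_{obs}$ (about 0.044 s for the grid variant and about 3.7 s for the SVM variant), so the growth in the speed-up factor comes from the growing cost of the original method.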

Table 4. The effective coverages of the test set for different $T_{obs}$.

method	$T_{obs}$	r	$σ$	$ϕ$
${SL}^{Orig}$	$10^{2}$	0.9	0.89	0.93
${SL}^{Orig}$	$5 \cdot 10^{2}$	0.94	0.92	0.94
${SL}^{Orig}$	$10^{3}$	0.92	0.91	0.92
prepaid	$10^{2}$	0.95	0.84	0.97
prepaid	$5 \cdot 10^{2}$	0.96	0.94	0.96
prepaid	$10^{3}$	0.97	0.95	0.97

Table 5. Population dynamics of Chilo partellus [32, 26]. We show the estimates, the 95% confidence intervals and the computation time of the prepaid and synthetic likelihood estimation techniques.

	r	$σ$	$ϕ$	Time (in seconds)
${SL}^{Orig}$	1.05 (1.01–1.1)	0.41 (0.31–0.51)	248.17 (139.53–493.2)	830
${SL}_{ML}^{GRID}$	1.10 (1.06–1.34)	0.43 (0.30–0.54)	140.60 (43.94–208.19)	0.2
${SL}_{ML}^{SVM}$	1.06 (1.01–1.24)	0.41 (0.21–0.56)	176.15 (19.27–427.65)	4

Table 6. RMSE of ${SL}_{MAP}^{GRID}$ estimation of test sets with $T_{obs} = 100$ created with priors $P_{1}$, $P_{2}$ and $P_{3}$ and estimated by using priors $P_{1}$, $P_{2}$ and $P_{3}$. For each test set and parameter the best result is shown in bold.

	estimated with $P_{1}$			estimated with $P_{2}$			estimated with $P_{3}$
parameter	r	$σ$	$ϕ$	r	$σ$	$ϕ$	r	$σ$	$ϕ$
test set created with $P_{1}$	8.2	0.13	0.53	10	0.12	0.82	16	0.17	0.94
test set created with $P_{2}$	10	0.13	0.55	6.5	0.072	0.43	11	0.12	0.60
test set created with $P_{3}$	4.4	0.15	0.33	6.9	0.19	0.51	3.5	0.065	0.28

Table 7. RMSE for Ricker model data where $T_{obs} = 100$ for an experimental setup with two conditions where $r$ and $σ$ are equal across the conditions. Parameters are estimated by using ${SL}_{MAP}^{GRID}$ with a flat prior (same as ${SL}_{ML}^{GRID}$) and with a prior from Equation 12.

prior	r	$σ$	$ϕ$
flat prior	88	0.17	0.42
prior Equation 12	61	0.11	0.36

Table 8. The MAE of the estimates of the test set of the trait model.

$T_{obs}$	version	$Ω$	$\log (I)$	$\log (A)$	$h$	$\log (σ)$
1	${ABC}^{Orig}$	/	0.11	0.45	1.4	0.45
1	${SL}_{PM}^{Grid}$	100000	0.1	0.39	0.96	0.38
1	${ABC}_{PM}^{Grid}$	100000	0.1	0.4	1	0.4
1	${ABC}_{PM}^{Grid}$	500000	0.1	0.38	1	0.39
1000	${ABC}_{PM}^{Grid}$	100000	0.03	0.14	0.39	0.32
1000	${ABC}_{PM}^{Grid}$	500000	0.02	0.09	0.27	0.22
1000	${ABC}_{PM}^{SVM}$	100000	0.02	0.07	0.18	0.14
1000	${ABC}_{PM}^{SVM}$	500000	0.01	0.07	0.17	0.15

Table 9. The effective 95% coverage of the estimates of the test set of the trait model.

$T_{obs}$	version	$Ω$	$\log (I)$	$\log (A)$	$h$	$\log (σ)$
1	${ABC}^{Orig}$	/	0.97	0.97	0.99	0.96
1	${ABC}^{Orig}$	100000	0.84	0.87	0.86	0.86
1	${ABC}_{PM}^{Grid}$	100000	0.94	0.95	0.95	0.94
1	${ABC}_{PM}^{Grid}$	500000	0.94	0.95	0.94	0.94
1000	${ABC}_{PM}^{Grid}$	100000	0.27	0.3	0.29	0.27
1000	${ABC}_{PM}^{Grid}$	500000	0.47	0.5	0.48	0.48
1000	${ABC}_{PM}^{SVM}$	100000	0.93	0.94	0.96	0.93
1000	${ABC}_{PM}^{SVM}$	500000	0.96	0.95	0.96	0.95

Equations (117)

y_{t} \sim \mathrm{Pois}(ϕ N_{t}), \qquad N_{t+1} = r N_{t} e^{-N_{t} + e_{t}}
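A minimal simulation sketch of the Ricker model above, using only the Python standard library. The noise distribution $e_{t} \sim N(0, σ^{2})$, the starting value `n0`, the parameter values in the usage line and both function names are assumptions for illustration; the equation as listed here does not state them.

```python
import math
import random

def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method.
    Adequate for moderate lam; exp(-lam) underflows for lam above ~700."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def simulate_ricker(r, sigma, phi, t_obs, n0=1.0, seed=0):
    """Simulate t_obs observed counts y_t from the Ricker model."""
    rng = random.Random(seed)
    n, ys = n0, []
    for _ in range(t_obs):
        e_t = rng.gauss(0.0, sigma)          # process noise (assumed Gaussian)
        n = r * n * math.exp(-n + e_t)       # latent population update
        ys.append(poisson(phi * n, rng))     # observed count given the latent state
    return ys

ys = simulate_ricker(r=44.7, sigma=0.3, phi=10.0, t_obs=100)
```

Only the counts `ys` are returned because the latent population sizes are unobserved; summary statistics of such series are what a simulation-based method compares against.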

M_{μ} = \frac{1}{N} \sum_{j=0}^{N-1} (μ_{(1)} + j Δ)

= μ_{(1)} + \frac{Δ}{N} \sum_{j=1}^{N-1} j

= μ_{(1)} + \frac{Δ (N - 1)}{2}

μ_{(1)} \equiv \min_{i \in \{1, 2, \ldots, N\}} (μ_{(i)})

V_{μ} = \frac{1}{N} \left( \sum_{j=0}^{N-1} (μ_{(1)} + j Δ)^{2} \right) - M_{μ}^{2}

= \frac{1}{N} \left( \sum_{j=0}^{N-1} (μ_{(1)}^{2} + 2 j Δ μ_{(1)} + j^{2} Δ^{2}) \right) - M_{μ}^{2}

= μ_{(1)}^{2} + \frac{2 Δ μ_{(1)}}{N} \left( \sum_{j=1}^{N-1} j \right) + \frac{Δ^{2}}{N} \left( \sum_{j=1}^{N-1} j^{2} \right) - M_{μ}^{2}

= μ_{(1)}^{2} + Δ μ_{(1)} (N - 1) + \frac{Δ^{2} (N - 1)(2N - 1)}{6} - M_{μ}^{2}

= \frac{Δ^{2} (N - 1)(2N - 1)}{6} - \frac{Δ^{2} (N - 1)^{2}}{4}

= \frac{Δ^{2} (N - 1)(N + 1)}{12}

\approx \frac{Δ^{2} N^{2}}{12}
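The closed forms for the mean and variance of an equally spaced grid can be checked numerically. The grid values below are arbitrary illustrative choices, and `grid_moments` is a hypothetical helper name.

```python
def grid_moments(mu_min, delta, n):
    """Mean and (population) variance of the grid mu_min + j*delta, j = 0..n-1."""
    grid = [mu_min + j * delta for j in range(n)]
    mean = sum(grid) / n
    var = sum((g - mean) ** 2 for g in grid) / n
    return mean, var

mu_min, delta, n = 2.0, 0.5, 50
mean, var = grid_moments(mu_min, delta, n)

mean_formula = mu_min + delta * (n - 1) / 2           # M_mu
var_formula = delta ** 2 * (n - 1) * (n + 1) / 12     # exact V_mu
var_approx = delta ** 2 * n ** 2 / 12                 # large-N approximation
```

The exact and approximate variances differ by the factor $N^{2} / (N^{2} - 1)$, which is already within 0.1% of one for $N = 50$.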

(\hat{β}_{0}, \hat{β}_{1})

σ_{0}^{2}

σ_{1}^{2}

σ_{01}

\hat{μ} = \frac{\bar{y} - \hat{β}_{0}}{\hat{β}_{1}} .

(\bar{y} - \hat{β}_{0}, \hat{β}_{1})
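The estimator $\hat{μ} = (\bar{y} - \hat{β}_{0}) / \hat{β}_{1}$ can be sketched as a linear fit through prepaid points followed by an inversion. The toy setup below (normal data with standard deviation `s`, one simulated mean per grid point, ordinary least squares) and every name and numeric value in it are assumptions chosen to match the symbols $N$, $Δ$, $T_{sim}$ and $s$ in the surrounding formulas, not a reproduction of the paper's code.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of ys = beta0 + beta1 * xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

rng = random.Random(1)
mu_min, delta, n_grid, t_sim, s = 0.0, 0.1, 101, 500, 1.0

# Prepaid points: one simulated sample mean per grid value of mu.
grid = [mu_min + j * delta for j in range(n_grid)]
sim_means = [sum(rng.gauss(mu, s) for _ in range(t_sim)) / t_sim for mu in grid]

beta0_hat, beta1_hat = fit_line(grid, sim_means)

y_bar_obs = 4.2                                   # observed sample mean (made up)
mu_hat = (y_bar_obs - beta0_hat) / beta1_hat      # invert the fitted line
```

In this toy setup the fitted slope is close to one and the intercept close to zero, which is why the approximations in the listed derivation substitute 1 for $E(\hat{β}_{1} ∣ \bar{y})$.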

E(\hat{μ} ∣ \bar{y}) \approx \frac{E(\bar{y} - \hat{β}_{0} ∣ \bar{y})}{E(\hat{β}_{1} ∣ \bar{y})} - \frac{1}{E(\hat{β}_{1} ∣ \bar{y})^{2}} \mathrm{Cov}(\bar{y} - \hat{β}_{0}, \hat{β}_{1} ∣ \bar{y}) + \frac{E(\bar{y} - \hat{β}_{0} ∣ \bar{y})}{E(\hat{β}_{1} ∣ \bar{y})^{3}} \mathrm{Var}(\hat{β}_{1} ∣ \bar{y})

\approx \frac{\bar{y}}{1} - \frac{1}{1^{2}} \frac{s^{2}}{T_{sim}} \frac{12 M_{μ}}{Δ^{2} N^{3}} + \frac{\bar{y}}{1^{3}} \frac{s^{2}}{T_{sim}} \frac{12}{Δ^{2} N^{3}}

= \bar{y} \left( 1 + \frac{s^{2}}{T_{sim}} \frac{12}{Δ^{2} N^{3}} \right) - \frac{M_{μ}}{T_{sim}} \frac{12 s^{2}}{Δ^{2} N^{3}}

\mathrm{Var}(\hat{μ} ∣ \bar{y}) \approx \frac{E(\bar{y} - \hat{β}_{0} ∣ \bar{y})^{2}}{E(\hat{β}_{1} ∣ \bar{y})^{2}} \left( \frac{\mathrm{Var}(\bar{y} - \hat{β}_{0} ∣ \bar{y})}{E(\bar{y} - \hat{β}_{0} ∣ \bar{y})^{2}} + \frac{\mathrm{Var}(\hat{β}_{1} ∣ \bar{y})}{E(\hat{β}_{1} ∣ \bar{y})^{2}} - \frac{2 \mathrm{Cov}(\bar{y} - \hat{β}_{0}, \hat{β}_{1} ∣ \bar{y})}{E(\bar{y} - \hat{β}_{0} ∣ \bar{y}) E(\hat{β}_{1} ∣ \bar{y})} \right)

= \frac{\bar{y}^{2}}{1^{2}} \left( \frac{σ_{0}^{2}}{\bar{y}^{2}} + \frac{σ_{1}^{2}}{1^{2}} - \frac{2 (-σ_{01})}{\bar{y} \cdot 1} \right)

= σ_{0}^{2} + \bar{y}^{2} σ_{1}^{2} - 2 \bar{y} M_{μ} σ_{1}^{2}

\approx \frac{s^{2}}{T_{sim} N} \left( 1 + \frac{12 M_{μ}^{2} + 12 \bar{y}^{2} - 24 \bar{y} M_{μ}}{Δ^{2} N^{2}} \right)

= \frac{s^{2}}{T_{sim} N} \left( 1 + \frac{12 (M_{μ} - \bar{y})^{2}}{Δ^{2} N^{2}} \right) .

E(\hat{μ}) \approx E(\bar{y}) \left( 1 + \frac{s^{2}}{T_{sim}} \frac{12}{Δ^{2} N^{3}} \right) - \frac{M_{μ}}{T_{sim}} \frac{12 s^{2}}{Δ^{2} N^{3}}

= μ \left( 1 + \frac{s^{2}}{T_{sim}} \frac{12}{Δ^{2} N^{3}} \right) - \frac{M_{μ}}{T_{sim}} \frac{12 s^{2}}{Δ^{2} N^{3}}

= μ - \frac{α}{T_{sim}} \frac{12 s^{2}}{Δ^{2} N^{3}},

\mathrm{Var}(\hat{μ}) \approx \frac{s^{2}}{T_{sim} N} \left( \frac{12 \left( \frac{s^{2}}{T_{obs}} + μ^{2} \right) + 12 M_{μ}^{2} - 24 M_{μ} μ}{Δ^{2} N^{2}} + 1 \right) + \frac{s^{2}}{T_{obs}} \left( 1 + \frac{12 s^{2}}{T_{sim} Δ^{2} N^{3}} \right)^{2}

= \frac{s^{2}}{T_{obs}} + \frac{24 s^{4}}{T_{obs} T_{sim} Δ^{2} N^{3}} + \frac{144 s^{6}}{T_{obs} T_{sim}^{2} Δ^{4} N^{6}} + \frac{s^{2}}{T_{sim} N} + \frac{12 s^{2} \left( \frac{s^{2}}{T_{obs}} + μ^{2} \right) + 12 s^{2} M_{μ}^{2} - 24 s^{2} M_{μ} μ}{T_{sim} Δ^{2} N^{3}}

= \frac{s^{2}}{T_{obs}} + \frac{s^{2}}{T_{sim} N} + \frac{24 s^{4}}{T_{sim} T_{obs} Δ^{2} N^{3}} + \frac{144 s^{6}}{T_{sim}^{2} T_{obs} Δ^{4} N^{6}} + \frac{12 s^{2}}{T_{sim} Δ^{2} N^{3}} \left( \frac{s^{2}}{T_{obs}} + (μ - M_{μ})^{2} \right)

= \frac{s^{2}}{T_{obs}} + \frac{s^{2}}{T_{sim} N} + \frac{12 s^{2} (μ - M_{μ})^{2}}{T_{sim} Δ^{2} N^{3}} + \frac{36 s^{4}}{T_{sim} T_{obs} Δ^{2} N^{3}} + \frac{144 s^{6}}{T_{sim}^{2} T_{obs} Δ^{4} N^{6}}

l_{s}(θ) = - \frac{1}{2} (s^{obs} - \hat{μ}_{θ})^{T} \hat{Σ}_{θ}^{-1} (s^{obs} - \hat{μ}_{θ}) - \frac{1}{2} \log |\hat{Σ}_{θ}|,
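A small sketch of evaluating this synthetic log-likelihood, restricted to two summary statistics so that the matrix inverse and determinant can be written out without a linear algebra library. The function name and the restriction to the 2x2 case are assumptions for illustration, and the log term is taken to be the log-determinant of the covariance matrix.

```python
import math

def synthetic_loglik_2d(s_obs, mu_hat, cov_hat):
    """Gaussian synthetic log-likelihood (up to an additive constant) for
    two summary statistics; cov_hat is a 2x2 nested list."""
    (a, b), (c, d) = cov_hat
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d0 = s_obs[0] - mu_hat[0]
    d1 = s_obs[1] - mu_hat[1]
    # Quadratic form diff^T * inverse(cov) * diff with inverse = [[d, -b], [-c, a]] / det
    quad = (d0 * (d * d0 - b * d1) + d1 * (-c * d0 + a * d1)) / det
    return -0.5 * quad - 0.5 * math.log(det)

at_mean = synthetic_loglik_2d([1.0, 2.0], [1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
off_mean = synthetic_loglik_2d([3.0, 2.0], [1.0, 2.0], [[2.0, 0.0], [0.0, 2.0]])
```

The first call evaluates to zero (no misfit, unit determinant); the second is penalised both by the misfit and by the larger covariance determinant.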

r \sim U(1, 90), \qquad σ \sim U(0.05, 0.7), \qquad ϕ \sim U(0, 20) .

\hat{Σ}_{θ, T_{obs}} = \frac{T_{prepaid}}{T_{obs}} \hat{Σ}_{θ, T_{prepaid}}
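The rescaling above amounts to multiplying every entry of the stored covariance matrix by $T_{prepaid} / T_{obs}$. A minimal sketch follows; the function name and the example numbers are assumptions, and the rule presumably relies on the summary statistics behaving like averages whose covariance shrinks in proportion to the series length.

```python
def rescale_covariance(cov_prepaid, t_prepaid, t_obs):
    """Rescale a covariance matrix stored for series of length t_prepaid
    so that it applies to observed series of length t_obs."""
    factor = t_prepaid / t_obs
    return [[factor * value for value in row] for row in cov_prepaid]

cov_prepaid = [[0.02, 0.005], [0.005, 0.04]]     # made-up stored covariance
cov_obs = rescale_covariance(cov_prepaid, t_prepaid=1000, t_obs=100)
```

With a shorter observed series the statistics are noisier, so the rescaled covariance is larger than the stored one (here by a factor of ten).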

\log(r) \sim U(\log(1), \log(200)), \qquad σ \sim U(0.05, 0.7), \qquad \log(ϕ) \sim U(-2, 7) .

l_{s}^{PP}(θ) = - \frac{1}{2} (s^{obs} - \hat{f}_{svm}(θ))^{T} \hat{Σ}_{θ, T_{obs}}^{-1} (s^{obs} - \hat{f}_{svm}(θ)) - \frac{1}{2} \log |\hat{Σ}_{θ, T_{obs}}|,

Full text

Prepaid parameter estimation without likelihoods

Merijn Mestdagh