Sparsity enforcing priors in inverse problems via Normal variance mixtures: model selection, algorithms and applications
Mircea Dumitru

TL;DR
This paper develops hierarchical Bayesian models using Normal variance mixtures to enforce sparsity in inverse problems, deriving algorithms for estimation and comparing their performance in applications like 3D-CT and chronobiology.
Contribution
It introduces a unified framework for sparsity enforcing priors via Normal variance mixtures and develops iterative algorithms for inverse problems, including theoretical comparisons with regularization methods.
Findings
Algorithms perform well in 3D-CT and chronobiology applications.
Bayesian sparsity algorithms compare favorably with regularization techniques.
Hierarchical models effectively incorporate uncertainties in inverse problems.
Mircea Dumitru¹
¹ Laboratoire des signaux et systèmes (L2S), CNRS – CentraleSupélec – Université Paris-Sud, CentraleSupélec, Plateau de Moulon, 91192 Gif-sur-Yvette, France
Abstract
The sparse structure of the solution of an inverse problem can be modelled using different sparsity enforcing priors when the Bayesian approach is considered. Analytical expressions for the unknowns of the model can be obtained by building hierarchical models based on sparsity enforcing distributions expressed via conjugate priors. We consider heavy tailed distributions with this property: the Student-t distribution, which is expressed as a Normal scale mixture with the Inverse Gamma distribution as the mixing distribution; the Laplace distribution, which can be expressed either as a Normal scale mixture with the Exponential distribution as the mixing distribution or as a Normal inverse scale mixture with the Inverse Gamma distribution as the mixing distribution; and the Hyperbolic, Variance-Gamma and Normal-Inverse Gaussian distributions, all three expressed via conjugate distributions using the Generalized Hyperbolic distribution. For all distributions, iterative algorithms are derived based on hierarchical models that account for the uncertainties of the forward model. For estimation, Maximum A Posteriori (MAP) and Posterior Mean (PM) via variational Bayesian approximation (VBA) are used. The performance of the resulting algorithms is compared in applications in 3D computed tomography (3D-CT) and chronobiology. Finally, a theoretical study is developed to compare the sparsity enforcing algorithms obtained via the Bayesian approach with the sparsity enforcing algorithms issued from regularization techniques, such as LASSO, among others.
Keywords: inverse problems, sparsity enforcing priors, conjugate priors, Student-t prior model (StPM), Laplace prior model (LPM), uncertainties model, estimation, variational Bayesian approximation (VBA), 3D computed tomography, chronobiology
Contents
2.1 General Perspective: from the forward model to inversion via a Bayesian approach
2.3 Student-t prior: expressed via conjugate priors: Normal and Inverse Gamma
2.3.2 Student-t distribution via the Generalized Hyperbolic distribution
2.4.1 Hyperbolic prior: via Generalized Hyperbolic distribution
2.5.3 Laplace prior: via the Generalized Hyperbolic distribution
2.6 Variance-Gamma distribution: expressed via conjugate priors
2.6.1 Variance-Gamma: via Generalized Hyperbolic distribution
2.7 Normal-Inverse Gaussian distribution: expressed via conjugate priors
2.7.1 Normal-Inverse Gaussian distribution: via Generalized Hyperbolic distribution
4.6.2 Posterior Mean estimation via VBA, partial separability
4.7.2 Posterior Mean estimation via VBA, partial separability
5.6.2 Posterior Mean estimation via VBA, partial separability
5.8.2 Posterior Mean estimation via VBA, partial separability
1 Introduction
In many applications, the prior information concerning the unknown(s) of the model, namely the classical linear forward model used in inverse problems, Equation (1),
$$\boldsymbol{g} = \boldsymbol{H}\boldsymbol{f} + \boldsymbol{\epsilon} \qquad (1)$$
can be translated as the sparse structure of the unknown(s), i.e. of $\boldsymbol{f}$ in Equation (1). In particular, the linear forward model expressed in Equation (1) corresponds to many applications such as signal deconvolution, image restoration, Computed Tomography (CT) image reconstruction, Fourier Synthesis (FS) inversion, microwave imaging [NMD94], [FDMD07], ultrasound echography, seismic imaging, radio astronomy [KTB04], fluorescence imaging, inverse scattering [CMD97], [FDMD05], [AMD10], [Gharsalli2013b], eddy current non-destructive testing [NMD96] or SAR imaging [AKZ06]. In all these examples the common inverse problem is to estimate $\boldsymbol{f}$ from the observations $\boldsymbol{g}$. In general, inverse problems are ill-posed [Had01], since the condition number of the matrix $\boldsymbol{H}$ is very high. This means that, in practice, the data alone are not sufficient to define a unique and satisfactory solution. The interpretation of the linear forward model, Equation (1), is presented in Figure (1).
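The effect of a high condition number can be illustrated with a toy example. The following is a minimal sketch, not taken from the paper: the 2x2 matrix `H`, the vector `f_true` and the helper `solve_2x2` are hypothetical choices made only to show how a tiny perturbation of the observations changes the naive inverse solution completely.

```python
def solve_2x2(H, g):
    """Solve H f = g for a 2x2 matrix H using Cramer's rule."""
    (a, b), (c, d) = H
    det = a * d - b * c
    return [(g[0] * d - b * g[1]) / det, (a * g[1] - c * g[0]) / det]


# Hypothetical, nearly singular forward matrix (very large condition number).
H = [[1.0, 1.0], [1.0, 1.0001]]
f_true = [1.0, 1.0]

# Noise-free observations g = H f.
g_clean = [sum(h * f for h, f in zip(row, f_true)) for row in H]
# Perturb one observation by 1e-4; this stands in for the error term.
g_noisy = [g_clean[0], g_clean[1] + 1e-4]

f_clean = solve_2x2(H, g_clean)   # close to [1, 1]
f_noisy = solve_2x2(H, g_noisy)   # close to [0, 2]
print(f_clean, f_noisy)
```

A perturbation of order 1e-4 in the data moves the solution from roughly (1, 1) to roughly (0, 2), which is why additional prior information, here sparsity, is needed to select a satisfactory solution.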
When the Bayesian approach is considered, one way to build hierarchical models that favour a sparse solution is to use, for the prior, distributions that are known to enforce sparsity. Such an approach makes it possible to estimate the hyperparameters of the hierarchical model, i.e. the variances associated with $\boldsymbol{f}$ and $\boldsymbol{\epsilon}$. A typical hierarchical model associated with the forward model, Equation (1), is presented in Figure (2).
However, Figure (2) presents a hierarchical model for direct sparsity, i.e. a hierarchical model that assumes the sparse structure of $\boldsymbol{f}$. In many applications, $\boldsymbol{f}$ is not sparse but can be expressed via a transformation of a sparse structure. Evidently, when considering the transformation of the sparse structure, the uncertainties and modelling errors have to be accounted for, Equation (2):
[TABLE]
In this case, a more general hierarchical model is presented in Figure (3).
When referring to the strategy used in the Bayesian approach for searching for a sparse solution in the inverse problem context, we have used the word favouring. It is important to mention that, generally, the linear forward model, Equation (1), may have an infinite number of solutions. Using a sparsity enforcing prior to model $\boldsymbol{f}$ results in algorithms that select sparse solutions, but this is possible only when the linear forward model, Equation (1), allows such solutions. Therefore, for this type of algorithm there is no guarantee that the solution has a sparse structure.
In this work we present three classes of sparsity enforcing priors and show how a hierarchical model can be built using this kind of prior. We then discuss the mechanism of sparsity enforcing and present the advantages of iterative algorithms using sparsity enforcing priors that can be expressed via conjugate priors. We place this work in the context of heavy tailed distributions that, in particular, can be expressed via conjugate priors.
2 Sparsity enforcing priors via conjugate priors
This section presents some sparsity enforcing priors, namely heavy tailed distributions that can be expressed via conjugate priors. First, a brief presentation of the Bayesian approach in inverse problems is given in Subsection (2.1). The sparsity mechanism and the key factors to be considered when selecting a good sparsity enforcing prior are presented in Subsection (2.2). It is shown that not only the heavy-tailed form of the distribution is of great interest for enforcing sparsity, but that the associated variance vector of the sparse structure also plays a crucial role, for which we aim at a specific behaviour: small variances associated with the zero or close to zero values and large variances for the other values.
Subsection (2.3) presents the Student-t distribution, expressed via a Normal variance mixture, using the Inverse Gamma distribution as the mixing distribution. In particular, it is shown that via those two conjugate priors a two-parameter version of the Student-t distribution is obtained when no condition is imposed on the scale and shape parameters of the Inverse Gamma distribution, for which the corresponding variance can be decreased to any positive value, which is a crucial fact in the sparsity mechanism. The same Normal variance mixture is obtained if the Student-t distribution is viewed as a Generalized Hyperbolic distribution.
In Subsection (2.5) the Laplace distribution is expressed via two conjugate distributions as a Normal variance mixture where the mixing distribution is the Exponential distribution. The same Normal variance mixture is obtained if the Laplace distribution is viewed as a Generalized Hyperbolic distribution. Another way to express the Laplace distribution using conjugate distributions is via a Normal inverse-variance mixture where the mixing distribution is the Inverse Gamma distribution.
In Subsection (2.6) the Variance-Gamma distribution is expressed via two conjugate distributions as a Normal mean-variance mixture where the mixing distribution is the Gamma distribution. The expression is obtained using the Generalized Hyperbolic distribution. In Subsection (2.7) the Normal-Inverse Gaussian distribution is expressed via two conjugate distributions as a Normal mean-variance mixture where the mixing distribution is the Inverse Gaussian distribution.
2.1 General Perspective: from the forward model to inversion via a Bayesian approach
The strategy adopted for doing the inversion in Equation (1) (or an equivalent one) is to build, based on a Bayesian approach, a hierarchical model accounting for the available prior information (i.e. accounting for the sparsity when the prior information concerning $\boldsymbol{f}$ is its sparse structure) and also accounting for the particularities of the errors and uncertainties of the model, modelled via $\boldsymbol{\epsilon}$. Considering the linear forward model, Equation (1), the Bayesian inference is based on the fundamental relation given by the Bayes rule:
$$p(\boldsymbol{f}|\boldsymbol{g},\boldsymbol{\theta}) = \frac{p(\boldsymbol{g}|\boldsymbol{f},\boldsymbol{\theta}_{\epsilon})\;p(\boldsymbol{f}|\boldsymbol{\theta}_{f})}{p(\boldsymbol{g}|\boldsymbol{\theta})} \qquad (3)$$
where $\boldsymbol{g}$ denotes the observations and $\boldsymbol{\theta}=(\boldsymbol{\theta}_{\epsilon},\boldsymbol{\theta}_{f})$ represents the hyperparameters appearing in the hierarchical model (namely the variances $\boldsymbol{v}_{f}$ and $\boldsymbol{v}_{\epsilon}$ associated with the unknowns of the linear forward model, $\boldsymbol{f}$ and $\boldsymbol{\epsilon}$). The Bayes rule can be interpreted as a proportionality relation between the posterior law and the product of the prior law (the prior information, sparsity in our case) and the likelihood:
$$p(\boldsymbol{f}|\boldsymbol{g},\boldsymbol{\theta}) \propto p(\boldsymbol{g}|\boldsymbol{f},\boldsymbol{\theta}_{\epsilon})\;p(\boldsymbol{f}|\boldsymbol{\theta}_{f}) \qquad (4)$$
Generally, the likelihood is obtained via the linear forward model, Equation (1), and the distribution considered for modelling the errors and the uncertainties $\boldsymbol{\epsilon}$. More details on this are given in Section (3). An extension of Equation (4) is the general Bayesian inference, where the hyperparameters $\boldsymbol{\theta}=(\boldsymbol{\theta}_{\epsilon},\boldsymbol{\theta}_{f})$ from Equation (4) are considered to be unknown and are to be estimated, along with the unknowns of the forward model, Equation (1):
[TABLE]
For the case when the sparsity appears via a transformation, the forward linear model Equation (1) and Equation (2) are considered (Figure (3)):
[TABLE]
Evidently, considering the general Bayesian inference implies assigning distributions for the hyperparameters, i.e. assigning $p(\boldsymbol{\theta}|\psi)$. The set of distributions assigned for the prior, the likelihood and the hyperparameters represents the hierarchical model, Figure (4).
The distributions are chosen in accordance with the application and with the available prior information. In particular, when the prior information is the sparse structure of the unknown of the forward model, Equation (1), sparsity enforcing priors will be used. From the hierarchical model the posterior distribution is obtained via Equation (5), from which the unknowns and the hyperparameters can be estimated. Another representation of a hierarchical model built on the linear forward model, Equation (1), is presented in Figure (4). We will see that when sparsity enforcing priors that can be expressed via conjugate distributions are considered, analytical expressions can be obtained for the unknowns of the hierarchical model. This is a great advantage and the fundamental reason for the great interest in such priors.
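The structure of such a hierarchical model can be sketched as an unnormalised log-posterior, i.e. the sum of the log-likelihood, the log-prior and the log-hyperpriors. The sketch below is only an illustration under stated assumptions: a Gaussian likelihood with per-sample variances `v_eps`, a zero-mean Normal prior with per-element variances `v_f`, and Inverse Gamma hyperpriors. The function names, the argument names and the toy matrix are hypothetical.

```python
import math


def log_normal(x, mean, var):
    """Log-density of a univariate Normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def log_inv_gamma(v, alpha, beta):
    """Log-density of an Inverse Gamma distribution (shape alpha, scale beta)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(v) - beta / v)


def log_joint(g, H, f, v_f, v_eps, a_f, b_f, a_eps, b_eps):
    """Unnormalised log p(f, v_f, v_eps | g) for the assumed hierarchy."""
    total = 0.0
    for i, g_i in enumerate(g):
        pred = sum(H[i][j] * f[j] for j in range(len(f)))
        total += log_normal(g_i, pred, v_eps[i])        # likelihood
        total += log_inv_gamma(v_eps[i], a_eps, b_eps)  # noise hyperprior
    for f_j, v_j in zip(f, v_f):
        total += log_normal(f_j, 0.0, v_j)              # prior on f
        total += log_inv_gamma(v_j, a_f, b_f)           # prior hyperprior
    return total


# Toy underdetermined problem: both candidates reproduce g exactly.
H = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
g = [1.0, 0.0]
shared = dict(v_f=[1.0, 1.0, 1.0], v_eps=[0.1, 0.1],
              a_f=2.0, b_f=1.0, a_eps=2.0, b_eps=1.0)
lp_sparse = log_joint(g, H, [1.0, 0.0, 0.0], **shared)
lp_dense = log_joint(g, H, [0.0, -1.0, 1.0], **shared)
print(lp_sparse - lp_dense)
```

Both candidates fit the data equally well, so the likelihood cannot separate them; the difference in the log-posterior comes entirely from the prior term, which is where the choice of a sparsity enforcing prior enters.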
2.2 Iterative algorithms sparsity mechanism
In this type of approach, the mechanism of sparsity is based not only on the heavy tailed property of the prior distribution (or its property to induce sparsity) but also on a particular behaviour of the associated variances. A bivariate prior is set for the unknown of the model that needs to be estimated and for the corresponding variance. The resulting algorithm is iterative, updating at every iteration both the unknown of the model and the corresponding variance. In order to obtain a sparse solution for the unknowns, the structure of the variance must itself be sparse. In particular, the variances associated with the zero or close to zero elements of the unknown must be small, and the variances associated with the non-zero elements of the sparse unknown must be significant.
Therefore, the parameters of the distribution modelling the variance (for example the shape and scale parameters of the Inverse Gamma distribution appearing in the conjugate prior models for sparsity enforcing via Student-t or Laplace) must be chosen such that the variance vector $\boldsymbol{v}_{f}$ is sparse, i.e. the expected value of the elements $v_{f_j}$ is close to zero, $\mbox{E}\left[v_{f_j}\right]\searrow 0$. Furthermore, in a Bayesian approach, we may have prior knowledge of the numerical value of the variance of the distribution modelling $v_{f_j}$, i.e. $\mbox{Var}\left[v_{f_j}\right]=w$, where $w$ is the numerical value obtained via prior knowledge. Evidently, the behaviour of the marginal, i.e. of the sparsity enforcing prior distribution, depends on the parameters of the distribution modelling the variance.
We note that in order to select a prior model that enforces sparsity, those parameters must be chosen such that the prior distribution expressed via conjugate prior distributions, modelling the unknown of the linear forward model, Equation (1), is concentrated around zero, i.e. its variance is very small, $\mbox{Var}_{Prior}\left[f_{j}\right]\searrow 0$.
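The alternating update of the unknown and of its variance can be sketched on a deliberately simplified case. The code below is not the paper's algorithm: it assumes a denoising setting in which the forward matrix is the identity, a known noise variance `v_eps`, a zero-mean Normal prior with Inverse Gamma variances, and joint MAP estimation by alternating maximisation. The function name and all numerical values are hypothetical.

```python
def alternating_map_denoise(g, v_eps, alpha, beta, n_iter=50):
    """Alternate between variance and unknown updates (identity forward model)."""
    f = list(g)
    v = [1.0] * len(g)
    for _ in range(n_iter):
        # Variance update: mode of the Inverse Gamma conditional
        # IG(alpha + 1/2, beta + f_j^2 / 2), i.e. scale / (shape + 1).
        v = [(beta + 0.5 * f_j * f_j) / (alpha + 1.5) for f_j in f]
        # Unknown update: mode (= mean) of the Normal conditional of f_j
        # given g_j and v_j, which shrinks g_j by v_j / (v_j + v_eps).
        f = [g_j * v_j / (v_j + v_eps) for g_j, v_j in zip(g, v)]
    return f, v


# Noisy observation of a sparse signal: two large entries, three near zero.
g = [0.05, -0.03, 3.0, 0.02, -2.5]
f_hat, v_hat = alternating_map_denoise(g, v_eps=0.01, alpha=1.0, beta=1e-3)
print(f_hat)
print(v_hat)
```

With a small scale parameter `beta`, the estimated variances of the near-zero entries collapse, which in turn shrinks those entries further towards zero, while the large entries keep large variances and are left almost unchanged. This is the sparse variance structure described above.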
2.3 Student-t prior: expressed via conjugate priors: Normal and Inverse Gamma
In the following, the Student-t distribution is discussed. It is a sparsity enforcing prior because of its heavy-tailed form. It can be expressed via conjugate priors as the marginal of a Normal variance mixture distribution, with an Inverse Gamma distribution as the mixing distribution. The standard form, with one parameter representing the degrees of freedom $\nu$, is obtained when the shape and scale parameters of the Inverse Gamma distribution are considered equal, $\alpha=\beta=\frac{\nu}{2}$. When this equality is not imposed, a two-parameter Student-t distribution is obtained, which is of great importance in the context of sparsity enforcing since in this case the corresponding variance (which generally needs to be small) can take any positive value.
In probability and statistics, Student’s t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown. It was developed by William Sealy Gosset under the pseudonym Student. Whereas a normal distribution describes a full population, t-distributions describe samples drawn from a full population; accordingly, the t-distribution for each sample size is different, and the larger the sample, the more the distribution resembles a normal distribution. The t-distribution plays a role in a number of widely used statistical analyses, including Student’s t-test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and linear regression analysis. The Student’s t-distribution also arises in the Bayesian analysis of data from a normal family.
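The remark on the variance can be made concrete with the standard closed-form expressions. The sketch below assumes the usual results: the standard Student-t with $\nu>2$ degrees of freedom has variance $\nu/(\nu-2)$, and a zero-mean Normal variance mixture with Inverse Gamma mixing (shape $\alpha>1$, scale $\beta$) has variance equal to the mean of the mixing distribution, $\beta/(\alpha-1)$. The function names are hypothetical.

```python
def student_t_variance(nu):
    """Variance of the standard (one-parameter) Student-t distribution."""
    if nu <= 2.0:
        raise ValueError("the variance is finite only for nu > 2")
    return nu / (nu - 2.0)


def two_param_student_t_variance(alpha, beta):
    """Variance of the marginal of N(0, v) with v ~ IG(alpha, beta)."""
    if alpha <= 1.0:
        raise ValueError("the variance is finite only for alpha > 1")
    return beta / (alpha - 1.0)


print(student_t_variance(3.0))                   # 3.0
print(student_t_variance(100.0))                 # about 1.02, never below 1
print(two_param_student_t_variance(2.0, 0.01))   # 0.01
```

In the one-parameter form the variance is always larger than one, whereas the scale parameter of the two-parameter form allows the prior variance to be made as small as desired.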
A random variable has a Student-t distribution if its probability density function is:
[TABLE]
The comparison between the standard Normal distribution and the standard Student-t distribution is presented in Figure (6).
2.3.1 Student-t distribution via Normal variance mixture
For the linear model expressed in Equation (1), the Student-t distribution can be used as a prior via the Student-t Prior Model (StPM), Equation (7), which considers a zero-mean Normal distribution for $f_{j}|v_{f_j}$, with the variance $v_{f_j}|\alpha,\beta$ modelled as an Inverse Gamma distribution, with the corresponding shape and scale parameters, $\alpha$ and $\beta$:
$$\begin{cases} p(f_{j}|v_{f_j}) = {\cal N}(f_{j}|0,v_{f_j}) = (2\pi)^{-\frac{1}{2}}\;v_{f_j}^{-\frac{1}{2}}\;\exp\left\{-\frac{f_{j}^{2}}{2v_{f_j}}\right\} \\ p(v_{f_j}|\alpha,\beta) = {\cal I}{\cal G}(v_{f_j}|\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\;v_{f_j}^{-\alpha-1}\;\exp\left\{-\frac{\beta}{v_{f_j}}\right\} \end{cases} \qquad (7)$$
The joint probability distribution $p(f_{j},v_{f_j}|\alpha,\beta)$ is a bivariate Normal-Inverse Gamma distribution, Equation (8),
$$p(f_{j},v_{f_j}|\alpha,\beta) = \frac{\beta^{\alpha}}{\sqrt{2\pi}\;\Gamma(\alpha)}\;v_{f_j}^{-\left(\alpha+\frac{1}{2}\right)-1}\;\exp\left\{-\frac{\beta+\frac{f_{j}^{2}}{2}}{v_{f_j}}\right\} \qquad (8)$$
and the marginal $p(f_{j}|\alpha,\beta)$ is, Equation (9):
$$p(f_{j}|\alpha,\beta) = \frac{\beta^{\alpha}}{\sqrt{2\pi}\;\Gamma(\alpha)}\int_{0}^{\infty} v_{f_j}^{-\left(\alpha+\frac{1}{2}\right)-1}\;\exp\left\{-\frac{\beta+\frac{f_{j}^{2}}{2}}{v_{f_j}}\right\}\mbox{d}v_{f_j} \qquad (9)$$
In Equation (9) an unnormalised Inverse Gamma distribution, ${\cal I}{\cal G}\left(v_{f_j}|\alpha+\frac{1}{2},\;\beta+\frac{f_{j}^{2}}{2}\right)$, can be identified inside the integral,
$$\int_{0}^{\infty} v_{f_j}^{-\left(\alpha+\frac{1}{2}\right)-1}\;\exp\left\{-\frac{\beta+\frac{f_{j}^{2}}{2}}{v_{f_j}}\right\}\mbox{d}v_{f_j} = \frac{\Gamma\left(\alpha+\frac{1}{2}\right)}{\left(\beta+\frac{f_{j}^{2}}{2}\right)^{\alpha+\frac{1}{2}}} \qquad (10)$$
so the marginal from Equation (9) is a two-parameter Student-t distribution:
$$p(f_{j}|\alpha,\beta) = \frac{\Gamma\left(\alpha+\frac{1}{2}\right)}{\sqrt{2\pi\beta}\;\Gamma(\alpha)}\left(1+\frac{f_{j}^{2}}{2\beta}\right)^{-\left(\alpha+\frac{1}{2}\right)} \qquad (11)$$
If $\alpha=\beta=\frac{\nu}{2}$, from Equation (11) we can conclude that the marginal is the Student-t distribution, $p(f_{j}|\alpha=\frac{\nu}{2},\beta=\frac{\nu}{2})={\cal S}t\left(f_{j}|\nu\right)$:
$${\cal S}t\left(f_{j}|\nu\right) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\;\Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{f_{j}^{2}}{\nu}\right)^{-\frac{\nu+1}{2}} \qquad (12)$$
Using the Student-t Prior Model (StPM), Equation (7), the Student-t distribution can be used as a sparsity enforcing prior for the forward linear model, Equation (1). The interest of expressing the prior distribution via prior models using conjugate priors is the possibility to compute analytical expressions for the unknown estimates.
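The marginalisation behind the Student-t Prior Model can be checked numerically. The following sketch integrates the product of a zero-mean Normal density with variance $v$ and an Inverse Gamma density over $v$ (trapezoidal rule in $u=\log v$) and compares the result with the standard closed form of the two-parameter Student-t density, $\Gamma(\alpha+\frac{1}{2})\,/\,(\sqrt{2\pi\beta}\,\Gamma(\alpha))\,(1+f^{2}/(2\beta))^{-(\alpha+\frac{1}{2})}$. The function names, the integration limits and the step size are assumptions made for illustration.

```python
import math


def log_normal_pdf(x, var):
    """Log-density of a zero-mean Normal distribution with variance var."""
    return -0.5 * math.log(2.0 * math.pi * var) - x * x / (2.0 * var)


def log_inv_gamma_pdf(v, alpha, beta):
    """Log-density of an Inverse Gamma distribution (shape alpha, scale beta)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(v) - beta / v)


def mixture_marginal(f, alpha, beta, u_min=-30.0, u_max=30.0, step=0.005):
    """Integrate N(f|0,v) IG(v|alpha,beta) over v, using u = log(v)."""
    n = int(round((u_max - u_min) / step))
    total = 0.0
    for k in range(n + 1):
        u = u_min + k * step
        v = math.exp(u)
        # The extra "+ u" is the log-Jacobian of the substitution dv = v du.
        value = math.exp(log_normal_pdf(f, v)
                         + log_inv_gamma_pdf(v, alpha, beta) + u)
        total += value * (0.5 if k in (0, n) else 1.0)
    return total * step


def two_param_student_t_pdf(f, alpha, beta):
    """Closed-form marginal of the Normal-Inverse Gamma mixture."""
    log_c = (math.lgamma(alpha + 0.5) - math.lgamma(alpha)
             - 0.5 * math.log(2.0 * math.pi * beta))
    return math.exp(log_c - (alpha + 0.5) * math.log1p(f * f / (2.0 * beta)))


for f in (0.0, 0.7, -2.0):
    print(f, mixture_marginal(f, 2.0, 0.3), two_param_student_t_pdf(f, 2.0, 0.3))
```

Setting `alpha = beta = nu / 2` in `two_param_student_t_pdf` gives the standard Student-t density with `nu` degrees of freedom.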
Section (4) presents the developments of hierarchical models based on Student-t prior distribution for the linear forward models Figure (2) and Figure (3).
2.3.2 Student-t distribution via the Generalized Hyperbolic distribution
In the following, a short introduction to the Generalized Hyperbolic distribution is given. The interest of this distribution is its heavy tailed form for different parameters and the fact that it can be expressed via conjugate priors, namely the Normal distribution and the generalized Inverse Gaussian distribution.
The Generalized Hyperbolic distribution (${\cal G}{\cal H}$), introduced by Ole Barndorff-Nielsen, is a continuous probability distribution defined as the Normal variance-mean mixture where the mixing distribution is the generalized Inverse Gaussian (${\cal G}{\cal I}{\cal G}$) distribution. Its probability density function is given in terms of the modified Bessel function of the second kind, $K_{\lambda}$. A random variable has a Generalized Hyperbolic distribution if its probability density function is:
[TABLE]
For the linear model expressed in Equation (1), the Generalized Hyperbolic Prior Model (GHPM), Equation (14), considers a Normal distribution for $f_{j}|v_{f_j}$ with the mean $\mu+\beta v_{f_j}$ and variance $v_{f_j}$. The variance $v_{f_j}|\gamma^{2},\;\delta^{2},\;\lambda$ is modelled as a generalized Inverse Gaussian distribution, with the corresponding parameters $\gamma^{2}$, $\delta^{2}$ and $\lambda$:
$$\begin{cases} p(f_{j}|v_{f_j}) = {\cal N}(f_{j}|\mu+\beta v_{f_j},\;v_{f_j}) \\ p(v_{f_j}|\gamma^{2},\;\delta^{2},\;\lambda) = {\cal G}{\cal I}{\cal G}(v_{f_j}|\gamma^{2},\;\delta^{2},\;\lambda) \end{cases} \qquad (14)$$
The expression of the joint probability distribution $p(f_{j},v_{f_j}|\mu,\;\beta,\;\gamma^{2},\;\delta^{2},\;\lambda)$ is given by Equation (15),
[TABLE]
so the marginal is, Equation (16):
[TABLE]
To solve Equation (16), we can identify a ${\cal G}{\cal I}{\cal G}$ distribution inside the integral:
[TABLE]
Introducing the notations:
[TABLE]
the marginal from Equation (16) can be written as:
[TABLE]
Introducing between the parameters the parameter and excluding the parameter , considering as parameters instead of and reordering the parameters we obtain:
[TABLE]
The interest of this distribution is that it is of a very general form, being the superclass of, among others, the Student-t distribution, the Hyperbolic distribution (Subsection (2.4.1)), the Laplace distribution (Subsection (2.5.3)), the Variance-Gamma distribution (Subsection (2.6.1)) and the Normal-Inverse Gaussian distribution (Subsection (2.7.1)), each obtained for a particular set of parameters.
In the following we consider the Student-t distribution expressed via the generalized Hyperbolic distribution:
has a Student’s t-distribution with degrees of freedom, .
Fixing the asymmetry parameter , as an immediate consequence and . The particular case of the probability density function is:
[TABLE]
Fixing and considering , the particular case of the probability density function is:
[TABLE]
Since we considered , for both arguments of the modified Bessel functions of the second kind appearing in Equation (22) we have and so for computing the values of the two modified Bessel functions of the second kind we can use the asymptotic relations for small arguments:
[TABLE]
We have:
[TABLE]
Using Equation (24) in Equation (22), the particular case of the probability density function becomes:
[TABLE]
Finally, fixing , Equation (25) becomes:
[TABLE]
Figure (7(a)) presents four Student-t probability density functions (plotted in blue, red, yellow and violet) with different means and different degrees of freedom. Figure (7(b)) presents the Generalized Hyperbolic probability density functions whose parameters are set to the same numerical values as the corresponding ones from Figure (7(a)). Figure (7(c)) presents the comparison between the four Student-t probability density functions and the corresponding Generalized Hyperbolic probability density functions. In all four cases the probability density functions are superposed. Figure (7(d)) presents the comparison between the logarithms of the distributions.
Figure (8(c)) presents the comparison between the standard Student-t probability density function (reported in Figure (8(a))) and the corresponding Generalized Hyperbolic density function (reported in Figure (8(b))), showing that the two probability density functions are almost superposed. Figure (8(d)) presents the comparison between the logarithms of the two distributions.
The behaviour of the Generalized Hyperbolic density function as its parameter varies is presented in Figure (9(a)), which compares the standard Student-t probability density function (in blue) with the Generalized Hyperbolic density function for four parameter values (in red, yellow, violet and green). The difference between those four Generalized Hyperbolic density functions and the Student-t density function is presented in Figure (9(b)).
Considering the Generalized Hyperbolic Prior Model (GHPM), Equation (14), for the structure of the parameters corresponding to the Student-t distribution we obtain a Normal variance mixture (since $\beta=0$) with the Inverse Gamma distribution as the mixing distribution (since ${\cal G}{\cal I}{\cal G}(v_{f_j}|\gamma^{2}\searrow 0,\;\nu,\;\frac{-\nu}{2})={\cal I}{\cal G}(v_{f_j}|\frac{\nu}{2},\;\frac{\nu}{2})$). Indeed, with $\delta^{2}=\nu$, $\lambda=\frac{-\nu}{2}$ and $\gamma^{2}\searrow 0$ we have $p(v_{f_j}|\gamma^{2},\;\delta^{2},\;\lambda)=p(v_{f_j}|\gamma^{2}\searrow 0,\;\nu,\;\frac{-\nu}{2})={\cal G}{\cal I}{\cal G}(v_{f_j}|\gamma^{2}\searrow 0,\;\nu,\;\frac{-\nu}{2})$.
Since $\gamma\searrow 0$, for the modified Bessel function of the second kind appearing in the expression of the generalized Inverse Gaussian distribution the asymptotic relation for small arguments can be used, $K_{\lambda}(x)\sim\frac{1}{2}\Gamma(|\lambda|)\left(\frac{2}{x}\right)^{|\lambda|}$ for $x\searrow 0$, so the generalized Inverse Gaussian can be written as an Inverse Gamma with equal shape and scale parameters, ${\cal G}{\cal I}{\cal G}(v_{f_j}|\gamma^{2}\searrow 0,\;\nu,\;\frac{-\nu}{2})\sim\frac{\left(\frac{\nu}{2}\right)^{\frac{\nu}{2}}}{\Gamma\left(\frac{\nu}{2}\right)}\;v_{f_j}^{-\frac{\nu}{2}-1}\;\exp\left\{-\frac{\nu}{2}v_{f_j}^{-1}\right\}={\cal I}{\cal G}(v_{f_j}|\frac{\nu}{2},\;\frac{\nu}{2})$. We obtain the same conjugate prior model for the Student-t distribution, the Normal variance mixture with an Inverse Gamma distribution as the mixing distribution. If the equality between the shape and scale parameters is not imposed in the Generalized Hyperbolic distribution, the two-parameter Student-t distribution is obtained.
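The limit of the generalized Inverse Gaussian distribution towards the Inverse Gamma distribution can be checked numerically. The sketch below assumes the standard density ${\cal G}{\cal I}{\cal G}(v|\gamma^{2},\delta^{2},\lambda)=\frac{(\gamma/\delta)^{\lambda}}{2K_{\lambda}(\delta\gamma)}\,v^{\lambda-1}\exp\{-\frac{1}{2}(\gamma^{2}v+\delta^{2}/v)\}$ and evaluates the Bessel function through the integral representation $K_{\lambda}(x)=\int_{0}^{\infty}e^{-x\cosh t}\cosh(\lambda t)\,\mbox{d}t$. The function names, the step size and the chosen values of $\nu$ and $\gamma^{2}$ are hypothetical.

```python
import math


def bessel_k(order, x, step=1e-3):
    """Modified Bessel function of the second kind, for x > 0.

    Trapezoidal rule on K(x) = int_0^inf exp(-x cosh t) cosh(order t) dt.
    """
    # Beyond t_max the factor exp(-x cosh t) underflows to zero.
    t_max = math.acosh(max(1.0, 745.0 / x)) + 1.0
    n = int(math.ceil(t_max / step))
    total = 0.0
    for k in range(n + 1):
        t = k * step
        value = math.exp(-x * math.cosh(t)) * math.cosh(order * t)
        total += value * (0.5 if k in (0, n) else 1.0)
    return total * step


def gig_pdf(v, gamma2, delta2, lam):
    """Generalized Inverse Gaussian density (assumed standard form)."""
    gamma, delta = math.sqrt(gamma2), math.sqrt(delta2)
    norm = (gamma / delta) ** lam / (2.0 * bessel_k(lam, gamma * delta))
    return norm * v ** (lam - 1.0) * math.exp(-0.5 * (gamma2 * v + delta2 / v))


def inv_gamma_pdf(v, alpha, beta):
    """Inverse Gamma density with shape alpha and scale beta."""
    return math.exp(alpha * math.log(beta) - math.lgamma(alpha)
                    - (alpha + 1.0) * math.log(v) - beta / v)


nu = 3.0
for v in (0.5, 1.0, 3.0):
    # gamma^2 close to zero, delta^2 = nu, lambda = -nu / 2
    print(v, gig_pdf(v, 1e-6, nu, -nu / 2.0), inv_gamma_pdf(v, nu / 2.0, nu / 2.0))
```

The two densities agree to several digits for a small but non-zero $\gamma^{2}$, consistent with the Inverse Gamma being the limiting mixing distribution of the Student-t case.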
2.4 Hyperbolic prior: expressed via conjugate priors
The Hyperbolic distribution is a heavy-tailed distribution and can therefore be used as a sparsity enforcing prior. It is a continuous probability distribution for which the logarithm of the probability density function is a hyperbola; the density therefore decreases exponentially, which is slower than the decay of the Normal distribution. It is suitable for modelling phenomena where numerically large values are more probable than under the Normal distribution. The origin of the distribution is the observation by Ralph Alger Bagnold that the logarithm of the histogram of the empirical size distribution of sand deposits tends to form a hyperbola. The Hyperbolic distribution has the following probability density function:
[TABLE]
2.4.1 Hyperbolic prior: via Generalized Hyperbolic distribution
has a Hyperbolic distribution, .
The goal of this section is to derive the Hyperbolic distribution from the Generalized Hyperbolic distribution. For the particular case of the Generalized Hyperbolic distribution with , the probability density function is:
[TABLE]
For , the modified Bessel function of the second kind can be stated explicitly with:
[TABLE]
so:
[TABLE]
Plugging Equation (30) in Equation (28):
[TABLE]
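The explicit half-order form of the modified Bessel function of the second kind used in this derivation is the standard identity $K_{1/2}(x)=\sqrt{\pi/(2x)}\,e^{-x}$. As a hedged sanity check, the sketch below compares it with the integral representation $K_{\lambda}(x)=\int_{0}^{\infty}e^{-x\cosh t}\cosh(\lambda t)\,dt$, evaluated with a plain trapezoid rule. The function names, the truncation point `t_max` and the number of steps are assumptions chosen for arguments of order one.

```python
import math

def bessel_k(order, x, t_max=12.0, steps=6000):
    """K_order(x) for x > 0 from the integral of exp(-x cosh t) cosh(order t)
    over t in [0, t_max], using the trapezoid rule."""
    h = t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h

def bessel_k_half(x):
    """Closed form K_{1/2}(x) = sqrt(pi / (2 x)) exp(-x)."""
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)

if __name__ == "__main__":
    for x in (0.3, 1.0, 2.5):
        print(x, bessel_k(0.5, x), bessel_k_half(x))
```

The closed form is what removes the Bessel function from the Generalized Hyperbolic density in this particular case, leaving only exponential terms.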
Figure (10(a)) presents four Hyperbolic probability density functions with different parameters. Figure (10(b)) presents the corresponding Generalized Hyperbolic density functions for parameters . The Hyperbolic probability density functions and the corresponding Generalized Hyperbolic density functions are superposed in Figure (10(c)). Figure (10(d)) presents the comparison between the logarithm of the two distributions, vs. .
The Hyperbolic distribution can be expressed using the Generalized Hyperbolic Prior Model (GHPM), Equation (14) by fixing , .
[TABLE]
2.5 Laplace prior: expressed via conjugate priors
In the following, we present how the Laplace distribution, a sparsity enforcing prior because of its heavy-tailed form, can be expressed via conjugate priors, namely the Normal distribution and the Exponential distribution.
For this part of the work, the following resources were used:
A variational Bayes framework for sparse adaptive estimation - K. E. THEMELIS, A. A. RONTOGIANNIS, arXiv: 1401.2771v1 [stat.ML], 13 Jan 2014.
In probability theory and statistics, the Laplace distribution is a continuous probability distribution named after Pierre-Simon Laplace. It is also sometimes called the double exponential distribution, because it can be thought of as two exponential distributions (with an additional location parameter) spliced together back-to-back, although the term ’double exponential distribution’ is also sometimes used to refer to the Gumbel distribution. The difference between two independent identically distributed exponential random variables is governed by a Laplace distribution, as is a Brownian motion evaluated at an exponentially distributed random time. Increments of Laplace motion or a variance gamma process evaluated over the time scale also have a Laplace distribution.
A random variable has a Laplace distribution if its probability density function is:
[TABLE]
Figure (11) presents the comparison between the Normal distribution and the standard Laplace distribution.
We present two ways to obtain the Laplace distribution via conjugate priors. The first possibility is to consider a zero-mean Normal distribution for which the variance is modelled as an Exponential distribution. The second possibility considers an Inverse Gamma distribution for the inverse of the variance, with the shape parameter set at 1.
2.5.1 Laplace prior: Normal and Exponential
We start with the linear model expressed in Equation (1). We consider the Laplace Prior Model (LPM), Equation (34), which considers a zero-mean Normal distribution for $f_{j}|v_{f_{j}}$ while the variance $v_{f_{j}}|b$ is modelled as an Exponential distribution, with the corresponding parameter :
[TABLE]
The expression of the joint probability distribution $f_{j},v_{f_{j}}|b$ is given by Equation (35),
[TABLE]
so the marginal is, Equation (36):
[TABLE]
The probability distribution of the generalized inverse Gaussian distribution is presented in Equation (37)
[TABLE]
and is the modified Bessel function of the second kind. So, to solve Equation (36) we can identify a generalized inverse Gaussian density inside the integral:
[TABLE]
So, the marginal from Equation (36) can be written as:
[TABLE]
The following equality stands:
[TABLE]
From Equation (39) and Equation (40) the marginal from Equation (36) becomes:
[TABLE]
So, we conclude that the marginal is a Laplace distribution, $p(f_{j}|v_{f_{j}},\lambda)={\cal L}\left(f_{j}|0,\;\left(2\lambda\right)^{-\frac{1}{2}}\right)$:
[TABLE]
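The conclusion of this subsection can be verified numerically. The sketch below is an illustration under stated assumptions rather than the derivation itself: it takes the Exponential mixing density in its rate form, $\lambda e^{-\lambda v_{f_{j}}}$, integrates ${\cal N}(f_{j}|0,v_{f_{j}})$ against it with a trapezoid rule in $\log v_{f_{j}}$, and compares the result with the Laplace density of scale $(2\lambda)^{-\frac{1}{2}}$. The function names and grid limits are hypothetical, and the default grid is only intended for rates of order one.

```python
import math

def normal_pdf(f, variance):
    """Density of N(f | 0, variance)."""
    return math.exp(-f * f / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def exponential_pdf(v, rate):
    """Density of the Exponential distribution in its rate parametrization."""
    return rate * math.exp(-rate * v)

def laplace_pdf(f, scale):
    """Density of the zero-mean Laplace distribution with the given scale."""
    return math.exp(-abs(f) / scale) / (2.0 * scale)

def normal_exponential_marginal(f, rate, u_min=-40.0, u_max=10.0, steps=10000):
    """Trapezoid rule in u = log(v) for the integral of
    N(f | 0, v) * Exp(v | rate) over v > 0, intended for f != 0."""
    h = (u_max - u_min) / steps
    total = 0.0
    for k in range(steps + 1):
        v = math.exp(u_min + k * h)
        weight = 0.5 if k in (0, steps) else 1.0
        # The extra factor v is the Jacobian of the substitution dv = v du.
        total += weight * normal_pdf(f, v) * exponential_pdf(v, rate) * v
    return total * h

if __name__ == "__main__":
    rate = 2.0
    scale = (2.0 * rate) ** -0.5
    for f in (-1.2, 0.4, 2.0):
        print(f, normal_exponential_marginal(f, rate), laplace_pdf(f, scale))
```

With $\lambda=2$ the Laplace scale is $0.5$, and the two columns should coincide up to the quadrature error.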
2.5.2 Laplace prior: Normal and Inverse Gamma
We start with the linear model expressed in Equation (1). We consider the Laplace Prior Model (LPM), Equation (43), which considers a zero-mean Normal distribution for $f_{j}|v_{f_{j}}$ and works with the inverse of the variance. The variance $v_{f_{j}}|b$ is modelled as an Inverse Gamma distribution, with the corresponding shape parameter equal to 1:
[TABLE]
The expression of the joint probability distribution $f_{j},v_{f_{j}}|b$ is given by Equation (44),
[TABLE]
so the marginal is, Equation (45):
[TABLE]
The probability distribution of the generalized inverse Gaussian distribution is presented in Equation (46)
[TABLE]
and is the modified Bessel function of the second kind. So, to solve Equation (45) we can identify a generalized inverse Gaussian density inside the integral:
[TABLE]
So, the marginal from Equation (45) can be written as:
[TABLE]
The following equality stands:
[TABLE]
From Equation (48) and Equation (49) the marginal from Equation (45) becomes:
[TABLE]
So, we conclude that the marginal is a Laplace distribution, $p(f_{j}|v_{f_{j}},b)={\cal L}\left(f_{j}|0,\;b^{-\frac{1}{2}}\right)$:
[TABLE]
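A Monte Carlo sketch of this second construction is given below. It rests on two assumptions that are not confirmed by the text shown here: the Normal distribution is parametrized by the inverse variance, i.e. $f_{j}|v_{f_{j}}\sim{\cal N}(f_{j}|0,v_{f_{j}}^{-1})$, and the Inverse Gamma scale parameter is $b/2$, which is the value that yields a Laplace scale of $b^{-\frac{1}{2}}$. Under these assumptions the reciprocal $1/v_{f_{j}}$ is Gamma distributed with shape 1, so the sampler draws it directly. The function name and the seed are hypothetical.

```python
import math
import random

def sample_laplace_via_inverse_gamma(b, n, seed=0):
    """Draw n samples of f with f | w ~ N(0, 1/w) and w ~ IG(shape=1, scale=b/2).

    The scale b/2 is an assumption made for this sketch."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # If w ~ IG(1, b/2) then 1/w ~ Gamma(shape=1, scale=2/b), so the
        # variance 1/w of the Normal distribution can be drawn directly.
        variance = rng.gammavariate(1.0, 2.0 / b)
        samples.append(rng.gauss(0.0, math.sqrt(variance)))
    return samples

if __name__ == "__main__":
    b = 4.0
    draws = sample_laplace_via_inverse_gamma(b, 100000, seed=1)
    mean_abs = sum(abs(x) for x in draws) / len(draws)
    # For a zero-mean Laplace distribution the mean absolute value equals the scale.
    print(mean_abs, b ** -0.5)
```

The empirical mean absolute value should be close to $b^{-\frac{1}{2}}$, which is the mean absolute deviation of a zero-mean Laplace distribution with that scale.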
2.5.3 Laplace prior: via the Generalized Hyperbolic distribution
has a Laplace distribution, .
The goal of this section is to derive the Laplace distribution from the Generalized Hyperbolic distribution. The Laplace distribution has the following probability density function:
[TABLE]
Considering in the expression of the Generalized Hyperbolic probability density function, a Hyperbolic distribution is obtained, Subsection (2.4.1). Fixing implies and the expression of the probability density function becomes:
[TABLE]
Further, considering , for the expression of the modified Bessel function of the second kind the asymptotic relation for small arguments presented in Equation (23) can be used, obtaining:
[TABLE]
Using Equation (54) in Equation (53), the expression of the probability density function becomes:
[TABLE]
Finally, for , the standard Laplace probability density function is obtained:
[TABLE]
Figure (12(a)) presents four Laplace probability density functions with different mean values (, blue and red, , yellow and , violet) and different scale parameter values (, blue and violet, , red and , yellow). Figure (12(b)) presents the Generalized Hyperbolic probability density function for parameters with and set with the same numerical values as the corresponding ones from Figure (12(a)). For , the numerical value is . Figure (12(c)) presents the comparison between the four Laplace probability density functions and the corresponding Generalized Hyperbolic probability density functions, vs. . In all four cases the probability density functions are superposed. Figure (12(d)) presents the comparison between the logarithm of distributions, vs. .
Figure (13(c)) presents the comparison between the standard Laplace probability density function, (reported in Figure (13(a))) and the Generalized Hyperbolic density function for parameters (presented in Figure (13(b))), showing the two distributions superposed. Figure (13(d)) presents the comparison between the logarithm of the two distributions, vs. .
The behaviour of the Generalized Hyperbolic density function depending on is presented in Figure (14(a)), which compares the Laplace probability density function (in blue) with the Generalized Hyperbolic density function for (in red), (in yellow), (in violet) and (in green). The difference between those four Generalized Hyperbolic density functions and the Laplace density function is presented in Figure (14(b)).
The Laplace distribution can be expressed using the Generalized Hyperbolic Prior Model (GHPM), Equation (14) by fixing :
[TABLE]
2.6 Variance-Gamma distribution: expressed via conjugate priors
The Variance-Gamma distribution, also called the generalized Laplace distribution or the Bessel function distribution, is a continuous probability distribution defined as the Normal variance-mean mixture where the mixing density is the Gamma distribution. The tails of the distribution decrease more slowly than those of the Normal distribution. It is therefore suitable for modelling phenomena where numerically large values are more probable than under the Normal distribution; examples are returns from financial assets and turbulent wind speeds. In particular, it can be used for modelling sparse phenomena. The distribution was introduced in the financial literature by Madan and Seneta (D.B. Madan and E. Seneta (1990): The variance gamma (V.G.) model for share market returns, Journal of Business, 63, pp. 511–524). The Variance-Gamma distributions form a subclass of the Generalized Hyperbolic distributions. Because the moment generating function has a simple expression, simple expressions for all moments are available. The Variance-Gamma distribution has the following probability density function:
[TABLE]
2.6.1 Variance-Gamma: via Generalized Hyperbolic distribution
The goal of this subsection is to derive the Variance-Gamma distribution from the Generalized Hyperbolic distribution.
has a variance-gamma distribution, . Considering , the particular form of the Generalized Hyperbolic distribution becomes:
[TABLE]
Since , for the expression of the modified Bessel function of the second kind the asymptotic relation for small arguments presented in Equation (23) can be used, obtaining:
[TABLE]
Using Equation (60) in Equation (59), the particular form of the Generalized Hyperbolic distribution becomes:
[TABLE]
Figure (15(c)) presents the comparison between the Variance-Gamma probability density function, (presented in Figure (15(a))) and the Generalized Hyperbolic density function for parameters (presented in Figure (15(b))). Figure (15(d)) presents the comparison between the logarithm of the two distributions, vs. .
The Variance-Gamma distribution can be expressed using the Generalized Hyperbolic Prior Model (GHPM), Equation (14) by fixing :
[TABLE]
2.7 Normal-Inverse Gaussian distribution: expressed via conjugate priors
The Normal-Inverse Gaussian distribution is of interest because it can be heavy-tailed and can therefore be used as a sparsity enforcing prior. It is a continuous probability distribution, defined as the Normal variance-mean mixture where the mixing density is the Inverse Gaussian distribution. The parameters of the Normal-Inverse Gaussian distribution can be used to construct a heaviness and skewness plot called the NIG-triangle. The Normal-Inverse Gaussian distribution has the following probability density function:
[TABLE]
2.7.1 Normal-Inverse Gaussian distribution: via Generalized Hyperbolic distribution
The goal of this subsection is to derive the Normal-Inverse Gaussian distribution from the Generalized Hyperbolic distribution.
has a Normal-Inverse Gaussian distribution.
Considering , the Generalized Hyperbolic distribution is:
[TABLE]
Using the fact that for , the modified Bessel function of the second kind can be stated explicitly, Equation (29), we express :
[TABLE]
Plugging Equation (65) in Equation (64):
[TABLE]
Figure (16(c)) presents the comparison between the Normal-Inverse Gaussian probability density function, (presented in Figure (16(a))) and the Generalized Hyperbolic density function for parameters (presented in Figure (16(b))). Figure (16(d)) presents the comparison between the logarithm of the two distributions, vs. .
3 From uncertainties models to likelihood models
Depending on the application, the variances corresponding to the linear forward model might be unknown and need to be estimated. For the linear forward model, Equation (1), the unknowns are $\boldsymbol{f}$, representing a signal or an image, and $\boldsymbol{\epsilon}$, which accounts for measurement errors and uncertainties. When the corresponding variances are also considered unknown, we want to estimate them as well, i.e. $\boldsymbol{v}_{f}$ and $\boldsymbol{v}_{\epsilon}$. Typically, the construction of the hierarchical models is based on general Bayesian inference, which makes it possible to derive the posterior distribution from the prior and the likelihood. In this section, we use the Student-t distribution as the prior, in order to enforce sparsity on $\boldsymbol{f}$. The likelihood is obtained from the distribution proposed for modelling the uncertainties of the model, $\boldsymbol{\epsilon}$. Different distributions can be proposed for modelling the uncertainties, resulting in different likelihoods.
3.1 Stationary Gaussian Uncertainties Model
A stationary Gaussian uncertainties model (sGUM) can be proposed, under the assumption that the associated uncertainties variances are equal, Equation (67):
[TABLE]
The uncertainties vector elements are modelled using zero-mean Normal distributions, with the same associated variance, $v_{\epsilon}$, Equation (68):
[TABLE]
leading to the sGUM, a multivariate normal distribution, Equation (69):
[TABLE]
The likelihood $p\left(\boldsymbol{g}|\boldsymbol{f},v_{\epsilon}\right)$ is obtained via the linear forward model, Equation (1), and the stationary Gaussian uncertainties model (sGUM), Equation (69). The distribution modelling the likelihood is also a multivariate normal distribution with the same covariance matrix $v_{\epsilon}\boldsymbol{I}$ and mean $\boldsymbol{H}\boldsymbol{f}$, the stationary Gaussian likelihood (sGL), Equation (70):
[TABLE]
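As an illustration of the stationary Gaussian likelihood, the following minimal sketch evaluates $\log{\cal N}(\boldsymbol{g}|\boldsymbol{H}\boldsymbol{f},v_{\epsilon}\boldsymbol{I})$ for small dense problems. It uses plain Python lists; the function names and the toy matrix are assumptions made for the example, and a practical implementation for large inverse problems would rely on an array library or a matrix-free operator instead.

```python
import math

def forward(H, f):
    """Matrix-vector product H f, with H given as a list of rows."""
    return [sum(h_ij * f_j for h_ij, f_j in zip(row, f)) for row in H]

def stationary_gaussian_loglik(g, H, f, v_eps):
    """log N(g | H f, v_eps I) for a single scalar uncertainties variance v_eps."""
    residual = [g_i - m_i for g_i, m_i in zip(g, forward(H, f))]
    n = len(g)
    quad = sum(r * r for r in residual)
    return -0.5 * n * math.log(2.0 * math.pi * v_eps) - quad / (2.0 * v_eps)

if __name__ == "__main__":
    H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy forward operator (assumption)
    f = [1.0, 2.0]
    g = [1.0, 2.0, 4.0]                       # last measurement is off by 1
    print(stationary_gaussian_loglik(g, H, f, 0.5))
```

Only the squared norm of the residual $\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}$ depends on $\boldsymbol{f}$, which is why this likelihood leads to a quadratic data-fidelity term.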
3.2 Non-stationary Gaussian Uncertainties Model
A non-stationary Gaussian uncertainties model (nsGUM) can be proposed when the assumption that the associated uncertainties variances are equal, Equation (67), is not imposed.
The uncertainties vector elements are modelled using zero-mean Normal distributions like in sGUM, but with different associated variances, $v_{\epsilon_{i}}$ for each uncertainties vector element $\epsilon_{i}$, Equation (71):
[TABLE]
leading to the nsGUM, a multivariate normal distribution, Equation (72):
[TABLE]
where the following notations were used:
[TABLE]
The likelihood $p\left(\boldsymbol{g}|\boldsymbol{f},\boldsymbol{v}_{\epsilon}\right)$ is obtained via the linear forward model, Equation (1), and the non-stationary Gaussian uncertainties model (nsGUM), Equation (72). The distribution modelling the likelihood is also a multivariate normal distribution with the same covariance matrix $\boldsymbol{V}_{\epsilon}$ and mean $\boldsymbol{H}\boldsymbol{f}$, the non-stationary Gaussian likelihood (nsGL), Equation (74):
[TABLE]
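The non-stationary likelihood differs from the stationary one only in that each measurement carries its own variance. The sketch below evaluates $\log{\cal N}(\boldsymbol{g}|\boldsymbol{H}\boldsymbol{f},\boldsymbol{V}_{\epsilon})$ under the assumption that $\boldsymbol{V}_{\epsilon}$ is the diagonal matrix built from the variance vector $\boldsymbol{v}_{\epsilon}$. Function names and the toy data are hypothetical.

```python
import math

def forward(H, f):
    """Matrix-vector product H f, with H given as a list of rows."""
    return [sum(h_ij * f_j for h_ij, f_j in zip(row, f)) for row in H]

def nonstationary_gaussian_loglik(g, H, f, v_eps):
    """log N(g | H f, diag(v_eps)) with one variance per measurement."""
    total = 0.0
    for g_i, m_i, v_i in zip(g, forward(H, f), v_eps):
        total += -0.5 * math.log(2.0 * math.pi * v_i) - (g_i - m_i) ** 2 / (2.0 * v_i)
    return total

if __name__ == "__main__":
    H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy forward operator (assumption)
    f = [1.0, 2.0]
    g = [1.0, 2.0, 4.0]
    # A large variance on the third measurement down-weights its residual.
    print(nonstationary_gaussian_loglik(g, H, f, [1.0, 2.0, 4.0]))
```

When all entries of the variance vector are equal, the value reduces to the stationary Gaussian likelihood.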
3.3 Stationary Student-t Uncertainties Model
A stationary Student-t uncertainties model (sStUM) can be proposed, under the assumption that the associated uncertainties variances are equal, Equation (67).
The uncertainties vector elements, given the same associated variance $v_{\epsilon}$, are modelled using zero-mean Normal distributions, $\epsilon|v_{\epsilon}\sim{\cal N}\left(\epsilon|0,v_{\epsilon}\right)$, and the variance $v_{\epsilon}$ is modelled using an Inverse Gamma distribution, Equation (75):
[TABLE]
such that via the Student-t Prior Model, Equation (7), the error vector is modelled by a Student-t distribution, Equation (76):
[TABLE]
The sStUM is represented by a multivariate Student-t distribution, Equation (77):
[TABLE]
and is expressed by a multivariate Normal distribution and an Inverse Gamma distribution, Equation (78):
[TABLE]
The likelihood $p\left(\boldsymbol{g}|\boldsymbol{f},v_{\epsilon}\right)$ is obtained via the linear forward model, Equation (1), and the stationary Student-t uncertainties model (sStUM), Equation (78). The distribution modelling the likelihood is also a multivariate normal distribution with the same covariance matrix $v_{\epsilon}\boldsymbol{I}$ and mean $\boldsymbol{H}\boldsymbol{f}$, the stationary Student-t likelihood (sStL), Equation (79):
[TABLE]
3.4 Non-stationary Student-t Uncertainties Model
A non-stationary Student-t uncertainties model (nsStUM) can be proposed when the assumption that the associated uncertainties variances are equal, Equation (67), is not imposed.
The uncertainties vector elements, given the associated variances $v_{\epsilon_{i}}$, are modelled using zero-mean Normal distributions, $\epsilon_{i}|v_{\epsilon_{i}}\sim{\cal N}\left(\epsilon_{i}|0,v_{\epsilon_{i}}\right)$, and the variances $v_{\epsilon_{i}}$ are modelled using Inverse Gamma distributions having the same shape and scale parameters, Equation (80):
[TABLE]
such that via the Student-t Prior Model, Equation (7), every element of the uncertainties vector is modelled by a Student-t distribution, Equation (76). The nsStUM is represented by a multivariate Student-t distribution, Equation (77), and is expressed by a multivariate Normal distribution and a product of Inverse Gamma distributions, Equation (81):
[TABLE]
The likelihood $p\left(\boldsymbol{g}|\boldsymbol{f},\boldsymbol{v}_{\epsilon}\right)$ is obtained via the linear forward model, Equation (1), and the non-stationary Student-t uncertainties model (nsStUM), Equation (81). The distribution modelling the likelihood is also a multivariate normal distribution with the same covariance matrix $\boldsymbol{V}_{\epsilon}$ and mean $\boldsymbol{H}\boldsymbol{f}$, the non-stationary Student-t likelihood (nsStL), Equation (82):
[TABLE]
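The difference between the stationary and the non-stationary Student-t uncertainties models is easiest to see by sampling from them. In the sketch below a single variance is drawn and shared by all elements in the stationary case, while one variance per element is drawn in the non-stationary case. The hyperparameter names `alpha` (shape) and `beta` (scale) of the Inverse Gamma distribution, as well as the function name, are assumptions for this example; the numerical values are arbitrary.

```python
import math
import random

def sample_student_t_uncertainties(n, alpha, beta, stationary, rng):
    """Sample an uncertainties vector of length n together with its variances.

    stationary=True : one variance v ~ IG(alpha, beta) shared by all elements.
    stationary=False: independent variances v_i ~ IG(alpha, beta), one per element.
    """
    def draw_variance():
        # If x ~ Gamma(shape=alpha, scale=1/beta) then 1/x ~ IG(alpha, beta).
        return 1.0 / rng.gammavariate(alpha, 1.0 / beta)

    if stationary:
        shared = draw_variance()
        variances = [shared] * n
    else:
        variances = [draw_variance() for _ in range(n)]
    # Conditionally on the variances, the elements are zero-mean Normal.
    eps = [rng.gauss(0.0, math.sqrt(v)) for v in variances]
    return eps, variances

if __name__ == "__main__":
    rng = random.Random(1)
    eps_s, var_s = sample_student_t_uncertainties(5, 3.0, 2.0, True, rng)
    eps_ns, var_ns = sample_student_t_uncertainties(5, 3.0, 2.0, False, rng)
    print(var_s)
    print(var_ns)
```

In the non-stationary case each element is marginally Student-t distributed, so isolated large errors are absorbed by their own large variance instead of inflating a common one.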
3.5 Stationary Laplace Uncertainties Model
A stationary Laplace uncertainties model (sLUM) can be proposed, under the assumption that the associated uncertainties variances are equal, Equation (67).
The uncertainties vector elements, given the same associated variance $v_{\epsilon}$, are all modelled using zero-mean Normal distributions, $\epsilon_{i}|v_{\epsilon}\sim{\cal N}\left(\epsilon_{i}|0,v_{\epsilon}^{-1}\right)$, and the variance $v_{\epsilon}$ is modelled using an Inverse Gamma distribution, for which the shape parameter is set at 1 and the considered scale parameter is , Equation (83):
[TABLE]
such that via the Laplace Prior Model, Equation (43), the error vector is modelled by a Laplace distribution, Equation (84):
[TABLE]
The sLUM is represented by a multivariate Laplace distribution, Equation (85):
[TABLE]
and is expressed by a multivariate Normal distribution and an Inverse Gamma distribution, Equation (86):
[TABLE]
The likelihood $p\left(\boldsymbol{g}|\boldsymbol{f},v_{\epsilon}\right)$ is obtained via the linear forward model, Equation (1), and the stationary Laplace uncertainties model (sLUM), Equation (86). The distribution modelling the likelihood is also a multivariate normal distribution with the same covariance matrix $v_{\epsilon}^{-1}\boldsymbol{I}$ and mean $\boldsymbol{H}\boldsymbol{f}$, the stationary Laplace likelihood (sLL), Equation (87):
[TABLE]
3.6 Non-stationary Laplace Uncertainties Model
A non-stationary Laplace uncertainties model (nsLUM) can be proposed when the assumption that the associated uncertainties variances are equal, Equation (67), is not imposed.
The uncertainties vector elements, given the associated variances $v_{\epsilon_{i}}$, are modelled using zero-mean Normal distributions, $\epsilon_{i}|v_{\epsilon_{i}}\sim{\cal N}\left(\epsilon_{i}|0,v_{\epsilon_{i}}^{-1}\right)$, and the variances $v_{\epsilon_{i}}$ are modelled using Inverse Gamma distributions, for which the shape parameter is set at 1 and the considered scale parameter is , Equation (88):
[TABLE]
such that via the Laplace Prior Model, Equation (43), every element of the uncertainties vector is modelled by a Laplace distribution, Equation (84). The nsLUM is represented by a multivariate Laplace distribution, Equation (85) and is expressed by a multivariate Normal distribution and a product of Inverse Gamma distributions, Equation (89):
[TABLE]
The likelihood $p\left(\boldsymbol{g}|\boldsymbol{f},\boldsymbol{v}_{\epsilon}\right)$ is obtained via the linear forward model, Equation (1), and the non-stationary Laplace uncertainties model (nsLUM), Equation (89). The distribution modelling the likelihood is also a multivariate normal distribution with the same covariance matrix $\boldsymbol{V}_{\epsilon}^{-1}$ and mean $\boldsymbol{H}\boldsymbol{f}$, the non-stationary Laplace likelihood (nsLL), Equation (90):
[TABLE]
4 Hierarchical Models with Student-t prior via StPM
Considering the Student-t distribution for modelling the sparse structure of $\boldsymbol{f}$ in the linear forward model, Equation (1), expressed via the conjugate priors StPM, and depending on the model proposed for the uncertainties $\boldsymbol{\epsilon}$, and implicitly on the corresponding likelihood, sGL or nsGL, different hierarchical models can be proposed:
4.1 Student-t hierarchical model: stationary Gaussian uncertainties model, known uncertainties variance
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector $\boldsymbol{\epsilon}$;
- for the uncertainties vector $\boldsymbol{\epsilon}$ a stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following two assumptions:
  a) each element of the uncertainties vector has the same variance, $v_{\epsilon}$;
  b) the variance $v_{\epsilon}$ is known;
[TABLE]
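To make the structure of this hierarchical model concrete, the sketch below evaluates an unnormalized log posterior of $(\boldsymbol{f},\boldsymbol{v}_{f})$ given $\boldsymbol{g}$ as the sum of three terms: the stationary Gaussian log-likelihood with known $v_{\epsilon}$, the zero-mean Normal log-prior of each $f_{j}$ given $v_{f_{j}}$, and an Inverse Gamma log-prior on each $v_{f_{j}}$. The hyperparameter names `alpha` and `beta`, the function names and the toy values are assumptions; additive constants that do not depend on $\boldsymbol{f}$ or $\boldsymbol{v}_{f}$ are dropped.

```python
import math

def forward(H, f):
    """Matrix-vector product H f, with H given as a list of rows."""
    return [sum(h_ij * f_j for h_ij, f_j in zip(row, f)) for row in H]

def log_posterior_unnormalized(f, v_f, g, H, v_eps, alpha, beta):
    """Unnormalized log p(f, v_f | g) for a Student-t prior written as a
    Normal / Inverse Gamma hierarchy, with known scalar variance v_eps."""
    # Likelihood: g | f ~ N(H f, v_eps I).
    residual = [g_i - m_i for g_i, m_i in zip(g, forward(H, f))]
    log_lik = -sum(r * r for r in residual) / (2.0 * v_eps)
    # Prior: f_j | v_fj ~ N(0, v_fj).
    log_prior_f = sum(-0.5 * math.log(v_j) - f_j * f_j / (2.0 * v_j)
                      for f_j, v_j in zip(f, v_f))
    # Hyperprior: v_fj ~ IG(alpha, beta).
    log_prior_v = sum(-(alpha + 1.0) * math.log(v_j) - beta / v_j for v_j in v_f)
    return log_lik + log_prior_f + log_prior_v

if __name__ == "__main__":
    H = [[1.0, 0.0], [0.0, 1.0]]  # toy forward operator (assumption)
    g = [1.0, 1.0]
    print(log_posterior_unnormalized([1.0, 0.0], [1.0, 1.0], g, H, 1.0, 1.0, 1.0))
```

Because every term is conjugate, the conditional of each block of unknowns given the others has a closed form, which is what makes iterative estimation of $\boldsymbol{f}$ and $\boldsymbol{v}_{f}$ tractable.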
4.2 Student-t hierarchical model: non-stationary Gaussian uncertainties model, known uncertainties variances
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector $\boldsymbol{\epsilon}$;
- for the uncertainties vector $\boldsymbol{\epsilon}$ a non-stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following assumption:
  a) the variance vector $\boldsymbol{v}_{\epsilon}$ is known;
[TABLE]
4.3 Student-t hierarchical model: stationary Gaussian uncertainties model, unknown uncertainties variance
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector $\boldsymbol{\epsilon}$;
- for the uncertainties vector $\boldsymbol{\epsilon}$ a stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following two assumptions:
  a) each element of the uncertainties vector has the same variance, $v_{\epsilon}$;
  b) the variance $v_{\epsilon}$ is unknown;
[TABLE]
4.4 Student-t hierarchical model: non-stationary Gaussian uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector $\boldsymbol{\epsilon}$;
- for the uncertainties vector $\boldsymbol{\epsilon}$, a non-stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following assumption:
  a) the variance vector $\boldsymbol{v}_{\epsilon}$ is unknown;
[TABLE]
4.5 Student-t hierarchical model: stationary Student-t uncertainties model, unknown uncertainties variance
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector $\boldsymbol{\epsilon}$;
- for the uncertainties vector $\boldsymbol{\epsilon}$, a stationary Student-t uncertainties model is proposed, i.e. a multivariate Student-t distribution expressed via StPM is used under the following two assumptions:
  a) each element of the uncertainties vector has the same variance, $v_{\epsilon}$;
  b) the variance $v_{\epsilon}$ is unknown;
[TABLE]
4.6 Student-t hierarchical model: non-stationary Student-t uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector $\boldsymbol{\epsilon}$;
- for the uncertainties vector $\boldsymbol{\epsilon}$, a non-stationary Student-t uncertainties model is proposed, i.e. a multivariate Student-t distribution expressed via StPM is used under the following assumption:
  a) the variance vector $\boldsymbol{v}_{\epsilon}$ is unknown;
[TABLE]
The hierarchical model built over the linear forward model, Equation (1), using a Student-t distribution as a prior for $\boldsymbol{f}$, expressed via the Student-t Prior Model (StPM), Equation (7), and modelling the uncertainties of the model via the non-stationary Student-t Uncertainties Model (nsStUM), Equation (81), is presented in Equation (96). The posterior distribution is obtained via Bayes' rule, Equation (97):
[TABLE]
The goal is to estimate the unknowns of the hierarchical model: $\boldsymbol{f}$, the main unknown of the linear forward model, Equation (1), which is assumed sparse and is consequently modelled via the Student-t distribution, and the two variances appearing in the hierarchical model, Equation (96), namely the variance corresponding to the sparse structure $\boldsymbol{f}$, $\boldsymbol{v}_{f}$, and the variance corresponding to the uncertainties of the model $\boldsymbol{\epsilon}$, $\boldsymbol{v}_{\epsilon}$.
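To make the structure of the hierarchy concrete, the following minimal sketch draws one realization of all the quantities involved. It assumes (since Equations (7), (81) and (96) are not reproduced here) the usual Normal / Inverse Gamma form: $v_{f_j}\sim\mathcal{IG}(\alpha_f,\beta_f)$, $f_j|v_{f_j}\sim\mathcal{N}(0,v_{f_j})$, $v_{\epsilon_i}\sim\mathcal{IG}(\alpha_\epsilon,\beta_\epsilon)$, $\epsilon_i|v_{\epsilon_i}\sim\mathcal{N}(0,v_{\epsilon_i})$ and $\boldsymbol{g}=\boldsymbol{H}\boldsymbol{f}+\boldsymbol{\epsilon}$. The function name and the numerical values are illustrative only.

```python
import numpy as np

def sample_hierarchical_model(H, alpha_f, beta_f, alpha_eps, beta_eps, rng):
    """Draw (f, v_f, v_eps, g) from the assumed Normal / Inverse Gamma hierarchy."""
    N, M = H.shape
    # If X ~ Gamma(shape=alpha, scale=1/beta), then 1/X ~ InverseGamma(alpha, beta).
    v_f = 1.0 / rng.gamma(shape=alpha_f, scale=1.0 / beta_f, size=M)
    v_eps = 1.0 / rng.gamma(shape=alpha_eps, scale=1.0 / beta_eps, size=N)
    f = rng.normal(0.0, np.sqrt(v_f))        # f_j | v_f_j ~ N(0, v_f_j)
    eps = rng.normal(0.0, np.sqrt(v_eps))    # eps_i | v_eps_i ~ N(0, v_eps_i)
    g = H @ f + eps                          # linear forward model
    return f, v_f, v_eps, g

rng = np.random.default_rng(0)
H = rng.normal(size=(30, 20))                # hypothetical forward operator
f, v_f, v_eps, g = sample_hierarchical_model(H, 2.0, 1.0, 2.0, 0.1, rng)
```

Marginally over the variances, each $f_j$ and each $\epsilon_i$ drawn this way follows a Student-t distribution, which is what makes the Normal variance mixture representation useful.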
4.6.1 Joint MAP estimation
First, the Joint Maximum A Posteriori (JMAP) estimation is considered: the unknowns are estimated on the basis of the available data $\boldsymbol{g}$, by maximizing the posterior distribution:
[TABLE]
where for the second equality the criterion $\mathcal{L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon})$ is defined as:
[TABLE]
The MAP estimation corresponds to the solution minimizing the criterion $\mathcal{L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon})$. From the analytical expression of the posterior distribution, Equation (97), and the definition of the criterion $\mathcal{L}$, Equation (99), we obtain:
[TABLE]
One of the simplest optimization algorithms that can be used is an alternating optimization of the criterion $\mathcal{L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon})$ with respect to each unknown:
- With respect to $\boldsymbol{f}$:
[TABLE]
- With respect to $v_{f_{j}}$, $j\in\{1,2,\ldots,M\}$:
[TABLE]
- With respect to $v_{\epsilon_{i}}$, $i\in\{1,2,\ldots,N\}$:
First, we expand the norm $\|\boldsymbol{V}_{\epsilon}^{-\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|$:
[TABLE]
where $\boldsymbol{H}_{i}$ denotes row $i$ of the matrix $\boldsymbol{H}$, $i\in\{1,2,\ldots,N\}$, i.e. $\boldsymbol{H}_{i}=\left[h_{i1},h_{i2},\ldots,h_{iM}\right]$.
[TABLE]
The iterative algorithm obtained via JMAP estimation is presented in Figure (17).
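The alternating scheme can be sketched as follows. This is a minimal illustration, not a transcription of Figure (17): it assumes that the criterion has the standard form obtained from Normal likelihood and prior terms combined with Inverse Gamma densities proportional to $v^{-(\alpha+1)}\exp(-\beta/v)$, in which case each partial minimization has a closed form. The function names, the initialization at unit variances and the fixed number of iterations are choices made here for illustration.

```python
import numpy as np

def jmap_criterion(f, v_f, v_eps, g, H, alpha_f, beta_f, alpha_eps, beta_eps):
    """Negative log-posterior up to an additive constant (assumed standard form)."""
    r = g - H @ f
    data = 0.5 * np.sum(r**2 / v_eps)
    hyper_eps = (alpha_eps + 1.5) * np.sum(np.log(v_eps)) + beta_eps * np.sum(1.0 / v_eps)
    prior = 0.5 * np.sum(f**2 / v_f)
    hyper_f = (alpha_f + 1.5) * np.sum(np.log(v_f)) + beta_f * np.sum(1.0 / v_f)
    return data + hyper_eps + prior + hyper_f

def jmap_student_t(g, H, alpha_f, beta_f, alpha_eps, beta_eps, n_iter=50):
    """Alternating minimization over f, v_f and v_eps."""
    N, M = H.shape
    v_f, v_eps = np.ones(M), np.ones(N)
    history = []
    for _ in range(n_iter):
        # f-step: quadratic in f, solved as a weighted regularized least squares
        HtW = H.T / v_eps                         # H^T V_eps^{-1}
        f = np.linalg.solve(HtW @ H + np.diag(1.0 / v_f), HtW @ g)
        # v_f-step: stationary point of (alpha_f + 3/2) ln v + (beta_f + f_j^2 / 2) / v
        v_f = (beta_f + 0.5 * f**2) / (alpha_f + 1.5)
        # v_eps-step: same form, with the squared residual of row i
        r = g - H @ f
        v_eps = (beta_eps + 0.5 * r**2) / (alpha_eps + 1.5)
        history.append(jmap_criterion(f, v_f, v_eps, g, H,
                                      alpha_f, beta_f, alpha_eps, beta_eps))
    return f, v_f, v_eps, history
```

Because each step minimizes the criterion exactly with respect to one block of unknowns, the recorded values in `history` should be non-increasing, which gives a simple sanity check for an implementation.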
4.6.2 Posterior Mean estimation via VBA, partial separability
In this subsection, the Posterior Mean (PM) estimation is considered. The Joint MAP computes the mode of the posterior distribution, whereas the PM computes its mean. One of the advantages of this estimator is that it minimizes the Mean Square Error (MSE). Computing the posterior mean of any unknown requires high-dimensional integration. For example, the mean corresponding to $\boldsymbol{f}$ can be computed from the posterior distribution using Equation (101),
[TABLE]
In general, these computations are not easy. One way to obtain approximate estimates is to approximate $p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$ by a separable one, $q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})=q_{1}(\boldsymbol{f})\;q_{2}(\boldsymbol{v}_{f})\;q_{3}(\boldsymbol{v}_{\epsilon})$, and then to compute the posterior means using the separability. The mean corresponding to $\boldsymbol{f}$ is computed using the corresponding separable distribution $q_{1}(\boldsymbol{f})$, Equation (102),
[TABLE]
If the approximation of the posterior distribution with a separable one can be done in such a way that conserves the mean, i.e. Equation (103),
[TABLE]
for all the unknowns of the model, a great amount of computational cost is saved. In particular, for the proposed hierarchical model, Equation (96), the posterior distribution, Equation (97), is not separable, making the analytical computation of the PM very difficult. One way to compute the PM in this case is to first approximate the posterior law $p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$ with a separable law $q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$, Equation (104),
[TABLE]
where the notations from Equation (105) are used
[TABLE]
by minimizing the Kullback-Leibler divergence, defined as:
[TABLE]
where the notations from Equation (107) are used
[TABLE]
Equation (105) selects a partial separability for the approximated posterior distribution $q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$, in the sense that a total separability is imposed for the distributions corresponding to the two variances appearing in the hierarchical model, $q_{2}\left(\boldsymbol{v}_{f}\right)$ and $q_{3}\left(\boldsymbol{v}_{\epsilon}\right)$, but not for the distribution corresponding to $\boldsymbol{f}$. A full separability can also be imposed, by adding the supplementary condition $q_{1}(\boldsymbol{f})=\prod_{j=1}^{M}q_{1j}(f_{j})$ in Equation (105). This case is considered in Subsection (4.6.3). The minimization can be done via alternating optimization, resulting in the following proportionalities, Equations (108a), (108b) and (108c),
[TABLE]
using the notations:
[TABLE]
and also
[TABLE]
Via Equation (99) and Equation (100), the analytical expression of the logarithm of the posterior distribution is obtained, Equation (111):
[TABLE]
Computation of the analytical expression of $q_{1}(\boldsymbol{f})$.
The proportionality relation corresponding to $q_{1}(\boldsymbol{f})$ is established in Equation (108a). In the expression of $\ln p\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right)$, all the terms free of $\boldsymbol{f}$ can be regarded as constants. Via Equation (111), the integral defined in Equation (110) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral from Equation (112) becomes:
[TABLE]
Noting that $\left\langle\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})\right\rangle_{q_{2}(\boldsymbol{v}_{f})\;q_{3}(\boldsymbol{v}_{\epsilon})}$ is a quadratic criterion and considering the proportionality from Equation (108a), it can be concluded that $q_{1}\left(\boldsymbol{f}\right)$ is a multivariate Normal distribution. Minimizing the criterion leads to the analytical expression of the corresponding mean. The variance is obtained by identification:
[TABLE]
We note that the expressions of both the mean and the variance depend on expectations corresponding to the two variances of the hierarchical model.
Computation of the analytical expression of $q_{2j}(v_{f_{j}})$.
The proportionality relation corresponding to $q_{2j}(v_{f_{j}})$ is established in Equation (108b). In the expression of $\ln p\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right)$, all the terms free of $v_{f_{j}}$ can be regarded as constants. Via Equation (111), the integral defined in Equation (110) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\left\langle\|\boldsymbol{V}_{f}^{-\frac{1}{2}}\boldsymbol{f}\|\right\rangle_{q_{1}(\boldsymbol{f})\;q_{2-j}(v_{f_{j}})}$ can be written:
[TABLE]
Considering that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution, Equation (115):
[TABLE]
From Equation (116) and Equation (119):
[TABLE]
from which the proportionality corresponding to $q_{2j}(v_{f_{j}})$ can be established:
[TABLE]
leading to the conclusion that $q_{2j}(v_{f_{j}})$ is an Inverse Gamma distribution with the following shape and scale parameters:
[TABLE]
Computation of the analytical expression of $q_{3i}(v_{\epsilon_{i}})$.
The proportionality relation corresponding to $q_{3i}(v_{\epsilon_{i}})$ is established in Equation (108c). In the expression of $\ln p\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right)$, all the terms free of $v_{\epsilon_{i}}$ can be regarded as constants. Via Equation (111), the integral defined in Equation (110) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\left\langle\|\boldsymbol{V}_{\epsilon}^{-\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|\right\rangle_{q_{1}(\boldsymbol{f})\;q_{3-i}(v_{\epsilon_{i}})}$ can be written:
[TABLE]
Considering that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution, Equation (115):
[TABLE]
and considering as constants all terms free of $v_{\epsilon_{i}}$:
[TABLE]
where $\boldsymbol{H}_{i}$ is row $i$ of the matrix $\boldsymbol{H}$, so we can conclude:
[TABLE]
From Equation (123) and Equation (128):
[TABLE]
from which the proportionality corresponding to $q_{3i}(v_{\epsilon_{i}})$ can be established:
[TABLE]
leading to the conclusion that $q_{3i}(v_{\epsilon_{i}})$ is an Inverse Gamma distribution with the following shape and scale parameters:
[TABLE]
Equations (115), (122) and (131) summarize the distribution families and the corresponding parameters for $q_{1}(\boldsymbol{f})$, a multivariate Normal distribution, and for $q_{2j}(v_{f_{j}})$, $j\in\{1,2,\ldots,M\}$, and $q_{3i}(v_{\epsilon_{i}})$, $i\in\{1,2,\ldots,N\}$, Inverse Gamma distributions. However, the parameters corresponding to the multivariate Normal distribution are expressed via $\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}$ and $\widetilde{\boldsymbol{V}}_{f}^{-1}$ (and by extension via all the elements forming the two matrices: $\widetilde{v}_{\epsilon_{i}}^{-1}$, $i\in\{1,2,\ldots,N\}$, and $\widetilde{v}_{f_{j}}^{-1}$, $j\in\{1,2,\ldots,M\}$).
Computation of the analytical expressions of $\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}$ and $\widetilde{\boldsymbol{V}}_{f}^{-1}$.
For an Inverse Gamma distribution with shape and scale parameters $\alpha$ and $\beta$, the following relation holds:
[TABLE]
The proof of the above relation is done by direct computation, using the analytical expression of the Inverse Gamma distribution:
[TABLE]
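The relation can be checked numerically. The sketch below assumes the common parameterization $\mathcal{IG}(v|\alpha,\beta)=\frac{\beta^{\alpha}}{\Gamma(\alpha)}v^{-\alpha-1}\exp(-\beta/v)$, for which the expectation of the inverse is $\langle v^{-1}\rangle=\alpha/\beta$; if Equation (132) uses another parameterization, the value changes accordingly. The integration grid and the function names are illustrative choices.

```python
import math
import numpy as np

def inverse_gamma_pdf(v, alpha, beta):
    """Density beta^alpha / Gamma(alpha) * v^(-alpha-1) * exp(-beta / v)."""
    log_pdf = (alpha * math.log(beta) - math.lgamma(alpha)
               - (alpha + 1.0) * np.log(v) - beta / v)
    return np.exp(log_pdf)

def expected_inverse(alpha, beta, n=200_001):
    """Trapezoidal estimate of E[1/v] on a logarithmic grid v = exp(t)."""
    t = np.linspace(-12.0, 12.0, n) + math.log(beta)
    v = np.exp(t)
    # With dv = v dt, the integrand (1/v) * pdf(v) * v reduces to pdf(v).
    integrand = inverse_gamma_pdf(v, alpha, beta)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))

alpha, beta = 2.5, 0.7
numeric = expected_inverse(alpha, beta)
closed_form = alpha / beta
```

Note that $\langle v^{-1}\rangle=\alpha/\beta$ is not the inverse of the mean $\langle v\rangle=\beta/(\alpha-1)$ (defined for $\alpha>1$), which is why the expectation of the inverse variance has to be computed explicitly.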
Since $q_{2j}(v_{f_{j}})$, $j\in\{1,2,\ldots,M\}$, and $q_{3i}(v_{\epsilon_{i}})$, $i\in\{1,2,\ldots,N\}$, are Inverse Gamma distributions, with the corresponding shape and scale parameters established in Equations (122) and (131), the expectations $\widetilde{v}_{f_{j}}^{-1}$ and $\widetilde{v}_{\epsilon_{i}}^{-1}$ can be expressed via the parameters of the two Inverse Gamma distributions using Equation (132):
[TABLE]
Using the notation introduced in (113):
[TABLE]
In Equation (134), new notations are introduced for $\widetilde{\boldsymbol{V}}_{f}^{-1}$ and $\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}$. Both quantities were expressed during the derivation via unknown expectations, but via Equation (134) they no longer contain any integrals to be computed. Therefore, the new notations represent the final analytical expressions used for expressing the density functions of the separable approximation. Using Equation (134) and Equations (115), (122) and (131), the final analytical expressions of the separable distributions are presented in Equations (135c), (135f) and (135i).
[TABLE]
Equation (135c) establishes the dependency between the parameters corresponding to the multivariate Normal distribution $q_{1}(\boldsymbol{f})$ and the other parameters involved in the hierarchical model: the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}$ depend on $\widehat{\boldsymbol{V}}_{\epsilon}^{-1}$ and $\widehat{\boldsymbol{V}}_{f}^{-1}$ which, via Equation (134), are defined using the shape and scale parameters of the Inverse Gamma distributions. The dependency between the parameters of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$ and the parameters of the Inverse Gamma distributions $q_{2j}(v_{f_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and $q_{3i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, is presented in Figure (18).
Equation (135f) establishes the dependency between the parameters corresponding to the Inverse Gamma distributions $q_{2j}(v_{f_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}$ of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, Figure (19).
Equation (135i) establishes the dependency between the parameters corresponding to the Inverse Gamma distributions $q_{3i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}$ of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, Figure (20).
The iterative algorithm obtained via PM estimation is presented in Figure (21).
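The circular dependency between the three families of parameters suggests a fixed-point iteration, sketched below. Since Equations (135c), (135f) and (135i) are not reproduced here, the updates are the standard variational ones for this Normal / Inverse Gamma hierarchy and should be read as an assumption: a Normal $q_{1}$ with precision $\boldsymbol{H}^{T}\widehat{\boldsymbol{V}}_{\epsilon}^{-1}\boldsymbol{H}+\widehat{\boldsymbol{V}}_{f}^{-1}$, and Inverse Gamma factors whose shape grows by $1/2$ and whose scale grows by half the expected squared quantity. The function name, the initialization and the iteration count are illustrative.

```python
import numpy as np

def vba_student_t(g, H, alpha_f, beta_f, alpha_eps, beta_eps, n_iter=50):
    """Partially separable VBA: q1(f) Normal, q2j and q3i Inverse Gamma."""
    N, M = H.shape
    inv_v_f = np.ones(M)      # current expectations <1 / v_f_j>
    inv_v_eps = np.ones(N)    # current expectations <1 / v_eps_i>
    for _ in range(n_iter):
        # q1(f) = N(f_hat, Sigma), given the expected inverse variances
        HtW = H.T * inv_v_eps
        Sigma = np.linalg.inv(HtW @ H + np.diag(inv_v_f))
        f_hat = Sigma @ (HtW @ g)
        # q2j = IG(a_f, b_f[j]); <f_j^2> = f_hat_j^2 + Sigma_jj
        a_f = alpha_f + 0.5
        b_f = beta_f + 0.5 * (f_hat**2 + np.diag(Sigma))
        # q3i = IG(a_eps, b_eps[i]); <(g_i - H_i f)^2> = r_i^2 + H_i Sigma H_i^T
        a_eps = alpha_eps + 0.5
        r = g - H @ f_hat
        h_sigma_h = np.einsum("ij,jk,ik->i", H, Sigma, H)
        b_eps = beta_eps + 0.5 * (r**2 + h_sigma_h)
        # expectations of the inverse variances: shape / scale
        inv_v_f = a_f / b_f
        inv_v_eps = a_eps / b_eps
    return f_hat, Sigma, (a_f, b_f), (a_eps, b_eps)
```

Compared with the JMAP iteration, the only structural difference is that the squared quantities are replaced by their expectations under $q_{1}$, which brings in the diagonal of $\widehat{\boldsymbol{\Sigma}}$ and the terms $\boldsymbol{H}_{i}\widehat{\boldsymbol{\Sigma}}\boldsymbol{H}_{i}^{T}$. Forming $\widehat{\boldsymbol{\Sigma}}$ explicitly costs a matrix inversion of size $M$ per iteration, which is a practical limitation for large problems.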
4.6.3 Posterior Mean estimation via VBA, full separability
In this subsection, the Posterior Mean (PM) estimation is again considered, but via a fully separable approximation. The posterior distribution is approximated by a fully separable distribution $q\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}\right)$, i.e. a supplementary condition is added in Equation (105):
[TABLE]
As in Subsection (4.6.2), the approximation is done by minimizing the Kullback-Leibler divergence, Equation (106), via alternate optimization, resulting in the following proportionalities, Equations (137a), (137b) and (137c),
[TABLE]
using the notations introduced in Equation (109), Equation (110) and Equation (138).
[TABLE]
The analytical expression of the logarithm of the posterior distribution $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$ is obtained in Equation (111).
Computation of the analytical expression of $q_{1j}(f_{j})$.
The proportionality relation corresponding to $q_{1j}(f_{j})$ is established in Equation (137a). In the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$ all the terms free of $f_{j}$ can be regarded as constants:
[TABLE]
Using Equation (113) the integral from Equation (139) becomes:
[TABLE]
Considering all the terms free of $f_{j}$ as constants, the first norm can be written:
[TABLE]
where $\boldsymbol{H}^{j}$ represents column $j$ of the matrix $\boldsymbol{H}$, $\boldsymbol{H}^{-j}$ represents the matrix $\boldsymbol{H}$ without column $j$, and $\boldsymbol{f}^{-j}$ represents the vector $\boldsymbol{f}$ without the element $f_{j}$. Introducing the notation
[TABLE]
the expectation of the first norm becomes:
[TABLE]
Considering all the terms free of $f_{j}$ as constants, the expectation of the second norm becomes:
[TABLE]
From Equations (137a), (140), (143) and (144), the proportionality for $q_{1j}(f_{j})$ becomes:
[TABLE]
Considering the criterion $J(f_{j})=\left(\|\widetilde{\boldsymbol{V}}_{\epsilon}^{-\frac{1}{2}}\boldsymbol{H}^{j}\|^{2}+\widetilde{v}_{f_{j}}^{-1}\right)f_{j}^{2}-2\boldsymbol{H}^{jT}\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}\left(\boldsymbol{g}-\boldsymbol{H}^{-j}\boldsymbol{f}^{-j}\right)f_{j}$, which is quadratic, we conclude that $q_{1j}(f_{j})$ is a Normal distribution.
For computing the mean of the Normal distribution, it is sufficient to compute the solution that minimizes the criterion $J(f_{j})$:
[TABLE]
The variance can be obtained by identification. The analytical expressions for the mean and the variance of the Normal distributions $q_{1j}(f_{j})$ are presented in Equation (147).
[TABLE]
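The coordinate-wise update implied by the quadratic criterion $J(f_{j})$ can be sketched numerically. The snippet below is a minimal illustration, not the implementation used in this work: it assumes $q_{1j}(f_{j})\propto\exp\left(-\frac{1}{2}J(f_{j})\right)$, so that the mean is the minimizer of $J$ and the variance is the inverse of the coefficient of $f_{j}^{2}$. The names `update_q1j`, `sweep`, `inv_v_eps` and `inv_v_f` are hypothetical; the last two stand for the diagonal of $\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}$ and for the values $\widetilde{v}_{f_{j}}^{-1}$.

```python
import numpy as np


def update_q1j(H, g, f_hat, inv_v_eps, inv_v_f, j):
    """Mean and variance of q_1j(f_j) read off the quadratic criterion J(f_j).

    H         : (N, M) forward matrix
    g         : (N,) data vector
    f_hat     : (M,) current means of all q_1j
    inv_v_eps : (N,) diagonal of the expected inverse noise covariance (assumed diagonal)
    inv_v_f   : (M,) expected inverse prior variances
    """
    h_j = H[:, j]
    # Coefficient of f_j^2: ||V_eps^{-1/2} H^j||^2 + v_fj^{-1}
    a = h_j @ (inv_v_eps * h_j) + inv_v_f[j]
    # Residual with column j removed: g - H^{-j} f^{-j}
    r = g - H @ f_hat + h_j * f_hat[j]
    # Half the coefficient of the linear term: H^{jT} V_eps^{-1} (g - H^{-j} f^{-j})
    b = h_j @ (inv_v_eps * r)
    return b / a, 1.0 / a


def sweep(H, g, f_hat, inv_v_eps, inv_v_f):
    """One pass over all coordinates, updating the means in place."""
    var = np.empty_like(f_hat)
    for j in range(H.shape[1]):
        f_hat[j], var[j] = update_q1j(H, g, f_hat, inv_v_eps, inv_v_f, j)
    return f_hat, var
```

With the expected inverse variances held fixed, repeated sweeps form a Gauss-Seidel iteration on the linear system $(\boldsymbol{H}^{T}\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}\boldsymbol{H}+\mathrm{diag}(\widetilde{v}_{f_{j}}^{-1}))\boldsymbol{f}=\boldsymbol{H}^{T}\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}\boldsymbol{g}$, so no matrix inversion is needed; the price is that the variances $1/a$ ignore the correlations between the components of $\boldsymbol{f}$.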
Computation of the analytical expression of $q_{2j}(v_{f_{j}})$.
The proportionality relation corresponding to $q_{2j}(v_{f_{j}})$ established in Equation (137b) refers to $v_{f_{j}}$, so in the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$ all the terms free of $v_{f_{j}}$ can be regarded as constants,
[TABLE]
so the integral of the logarithm becomes:
[TABLE]
Equation (149) leads to the conclusion that $q_{2j}(v_{f_{j}})$ is an Inverse Gamma distribution. Equation (150) presents the analytical expressions for the shape and scale parameters of this Inverse Gamma distribution.
[TABLE]
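As an illustration of this conjugate update, the sketch below assumes the standard Student-t prior model, $f_{j}|v_{f_{j}}\sim\mathcal{N}(0,v_{f_{j}})$ with $v_{f_{j}}\sim\mathcal{IG}(\alpha_{f},\beta_{f})$, and Normal factors $q_{1j}(f_{j})$ with given means and variances. Under these assumptions the shape increases by $\frac{1}{2}$ and the scale by half the second moment of $f_{j}$. The function and argument names are hypothetical, and the result should be checked against Equation (150).

```python
import numpy as np


def update_q2(f_hat, f_var, alpha_f, beta_f):
    """Inverse Gamma parameters of q_2j(v_fj) for all j, plus E[1 / v_fj].

    Assumes f_j | v_fj ~ N(0, v_fj) and v_fj ~ IG(alpha_f, beta_f).
    """
    f_hat = np.asarray(f_hat, dtype=float)
    f_var = np.asarray(f_var, dtype=float)
    # Second moment of f_j under the Normal q_1j: mean^2 + variance
    second_moment = f_hat**2 + f_var
    alpha_new = np.full_like(f_hat, alpha_f + 0.5)
    beta_new = beta_f + 0.5 * second_moment
    # For v ~ IG(alpha, beta), E[1 / v] = alpha / beta
    inv_v_mean = alpha_new / beta_new
    return alpha_new, beta_new, inv_v_mean
```

Components with a small second moment receive a large expected inverse variance, so they are shrunk more strongly towards zero in the next update of $q_{1j}(f_{j})$; this is the mechanism through which the prior enforces sparsity.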
Computation of the analytical expression of $q_{3i}(v_{\epsilon_{i}})$.
The proportionality relation corresponding to $q_{3i}(v_{\epsilon_{i}})$ established in Equation (137c) refers to $v_{\epsilon_{i}}$, so in the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\alpha_{f},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})$ all the terms free of $v_{\epsilon_{i}}$ can be regarded as constants:
[TABLE]
Introducing the notation
[TABLE]
the expectation of the logarithm becomes
[TABLE]
so the proportionality relation for $q_{3i}(v_{\epsilon_{i}})$ from Equation (137c) can be written:
[TABLE]
Equation (154) shows that $q_{3i}(v_{\epsilon_{i}})$ are Inverse Gamma distributions. The analytical expressions of the corresponding parameters are presented in Equation (155).
[TABLE]
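The corresponding update for the noise variances can be sketched in the same way. The snippet assumes the non-stationary model $g_{i}|\boldsymbol{f},v_{\epsilon_{i}}\sim\mathcal{N}(\boldsymbol{H}_{i}\boldsymbol{f},v_{\epsilon_{i}})$ with $v_{\epsilon_{i}}\sim\mathcal{IG}(\alpha_{\epsilon},\beta_{\epsilon})$ and independent Normal factors $q_{1j}(f_{j})$, so that $\langle(g_{i}-\boldsymbol{H}_{i}\boldsymbol{f})^{2}\rangle=(g_{i}-\boldsymbol{H}_{i}\widehat{\boldsymbol{f}})^{2}+\sum_{j}h_{ij}^{2}\,\mathrm{var}_{j}$. All names are hypothetical, and the result should be checked against Equation (155).

```python
import numpy as np


def update_q3(H, g, f_hat, f_var, alpha_eps, beta_eps):
    """Inverse Gamma parameters of q_3i(v_eps_i) for all i, plus E[1 / v_eps_i].

    Assumes g_i | f, v_eps_i ~ N(H_i f, v_eps_i), v_eps_i ~ IG(alpha_eps, beta_eps),
    and independent Normal factors q_1j(f_j) with means f_hat and variances f_var.
    """
    residual = g - H @ f_hat
    # <(g_i - H_i f)^2> = (g_i - H_i f_hat)^2 + sum_j h_ij^2 var_j
    expected_sq = residual**2 + (H**2) @ f_var
    alpha_new = np.full(g.shape, alpha_eps + 0.5)
    beta_new = beta_eps + 0.5 * expected_sq
    # For v ~ IG(alpha, beta), E[1 / v] = alpha / beta
    return alpha_new, beta_new, alpha_new / beta_new
```

The term $\sum_{j}h_{ij}^{2}\,\mathrm{var}_{j}$ is what distinguishes this update from a plug-in estimate based on the residual alone: the uncertainty on $\boldsymbol{f}$ inflates the scale parameter and therefore the estimated noise variance.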
Since $q_{2j}(v_{f_{j}})$, $j\in\{1,2,\ldots,M\}$, and $q_{3i}(v_{\epsilon_{i}})$, $i\in\{1,2,\ldots,N\}$, are Inverse Gamma distributions, it is easy to obtain analytical expressions for $\widetilde{\boldsymbol{V}}_{\epsilon}^{-1}$, defined in Equation (113), and $\widetilde{v}_{f_{j}}^{-1}$, $j\in\{1,2,\ldots,M\}$, obtaining the same expressions as in Equation (134). Using Equation (134) and Equations (147), (150) and (155), the final analytical expressions of the separable distributions are presented in Equations (156c), (156f) and (156i).
[TABLE]
Equations (156c), (156f) and (156i) establish dependencies between the parameters of the distributions, very similar to the ones presented in Figures (18), (19) and (20). The iterative algorithm obtained via PM estimation with full separability is presented in Figure (22).
4.7 IS: Student-t hierarchical model: non-stationary Student-t uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance $\boldsymbol{v}_{f}$ as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector;
- for the uncertainties vector a non-stationary Student-t uncertainties model is proposed, i.e. a multivariate Student-t distribution expressed via StPM is used under the following assumption:
  a) the variance vector $\boldsymbol{v}_{\epsilon}$ is unknown;
[TABLE]
The hierarchical model shown in Figure (3), which is built over the linear forward model, Equation (1) and Equation (2), using as a prior for $\boldsymbol{z}$ a Student-t distribution expressed via the Student-t Prior Model (StPM), Equation (7), and modelling the uncertainties via the non-stationary Student-t Uncertainties Model (nsStUM), Equation (81), is presented in Equation (157). The posterior distribution is obtained via the Bayes rule, Equation (158):
[TABLE]
The goal is to estimate the unknowns of the hierarchical model, namely $\boldsymbol{f}$, the main unknown of the linear forward model expressed in Equation (1), $\boldsymbol{z}$, the main unknown of Equation (2), supposed sparse and consequently modelled using the Student-t distribution, and the three variances appearing in the hierarchical model, Equation (157): the variance corresponding to the sparse structure $\boldsymbol{z}$, namely $\boldsymbol{v}_{z}$, and the two variances corresponding to the uncertainties of the model, $\boldsymbol{\epsilon}$ and $\boldsymbol{\xi}$, namely $\boldsymbol{v}_{\epsilon}$ and $\boldsymbol{v}_{\xi}$.
4.7.1 Joint MAP estimation
First, the Joint Maximum A Posteriori (JMAP) estimation is considered: the unknowns are estimated on the basis of the available data $\boldsymbol{g}$, by maximizing the posterior distribution:
[TABLE]
where for the second equality the criterion $\mathcal{L}(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z})$ is defined as:
[TABLE]
The MAP estimation corresponds to the solution minimizing the criterion $\mathcal{L}(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z})$. From the analytical expression of the posterior distribution, Equation (158), and the definition of the criterion $\mathcal{L}$, Equation (160), we obtain:
[TABLE]
One of the simplest optimization algorithms that can be used is an alternate optimization of the criterion $\mathcal{L}(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z})$ with respect to each unknown:
- With respect to $\boldsymbol{f}$:
[TABLE]
- With respect to $\boldsymbol{z}$:
[TABLE]
- With respect to $v_{\epsilon_{i}}$, $i\in\{1,2,\ldots,N\}$:
First, we develop the norm $\|\boldsymbol{V}_{\epsilon}^{-\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|^{2}$:
[TABLE]
where $\boldsymbol{H}_{i}$ denotes row $i$ of the matrix $\boldsymbol{H}$, $i\in\{1,2,\ldots,N\}$, i.e. $\boldsymbol{H}_{i}=\left[h_{i1},h_{i2},\ldots,h_{iM}\right]$.
[TABLE]
- With respect to $v_{\xi_{j}}$, $j\in\{1,2,\ldots,M\}$:
First, we develop the norm $\|\boldsymbol{V}_{\xi}^{-\frac{1}{2}}\left(\boldsymbol{f}-\boldsymbol{D}\boldsymbol{z}\right)\|^{2}$:
[TABLE]
where $\boldsymbol{D}_{j}$ denotes row $j$ of the matrix $\boldsymbol{D}$, $j\in\{1,2,\ldots,M\}$, i.e. $\boldsymbol{D}_{j}=\left[d_{j1},d_{j2},\ldots,d_{jM}\right]$.
[TABLE]
- With respect to $v_{z_{j}}$, $j\in\{1,2,\ldots,M\}$:
[TABLE]
The iterative algorithm obtained via JMAP estimation is presented in Figure (23).
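The alternate optimization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of Figure (23): it assumes the criterion is the negative log-posterior of the model $\boldsymbol{g}=\boldsymbol{H}\boldsymbol{f}+\boldsymbol{\epsilon}$, $\boldsymbol{f}=\boldsymbol{D}\boldsymbol{z}+\boldsymbol{\xi}$ with Normal conditionals, diagonal covariances and Inverse Gamma priors on every variance, in which case each block has a closed-form minimizer. The function names, the initialization and the `hyper` argument (pairs $(\alpha,\beta)$ for $\boldsymbol{\epsilon}$, $\boldsymbol{\xi}$ and $\boldsymbol{z}$) are hypothetical.

```python
import numpy as np


def ig_mode_update(sq_residual, alpha, beta):
    """Minimizer over v of (alpha + 3/2) ln v + (beta + r^2 / 2) / v."""
    return (beta + 0.5 * sq_residual) / (alpha + 1.5)


def criterion(H, D, g, f, z, v_eps, v_xi, v_z, hyper):
    """Negative log-posterior up to an additive constant (assumed form)."""
    (a_e, b_e), (a_x, b_x), (a_z, b_z) = hyper
    val = 0.5 * np.sum((g - H @ f) ** 2 / v_eps)
    val += 0.5 * np.sum((f - D @ z) ** 2 / v_xi)
    val += 0.5 * np.sum(z**2 / v_z)
    for v, a, b in ((v_eps, a_e, b_e), (v_xi, a_x, b_x), (v_z, a_z, b_z)):
        val += np.sum((a + 1.5) * np.log(v) + b / v)
    return val


def jmap(H, D, g, hyper, n_iter=50):
    """Alternate minimization of the criterion over f, z, v_eps, v_xi, v_z."""
    (a_e, b_e), (a_x, b_x), (a_z, b_z) = hyper
    N, M = H.shape
    K = D.shape[1]
    f, z = np.zeros(M), np.zeros(K)
    v_eps, v_xi, v_z = np.ones(N), np.ones(M), np.ones(K)
    history = []
    for _ in range(n_iter):
        # f: weighted least squares coupling the data term and the term f - D z
        A = H.T @ (H / v_eps[:, None]) + np.diag(1.0 / v_xi)
        f = np.linalg.solve(A, H.T @ (g / v_eps) + (D @ z) / v_xi)
        # z: weighted least squares with the sparsity enforcing variances v_z
        B = D.T @ (D / v_xi[:, None]) + np.diag(1.0 / v_z)
        z = np.linalg.solve(B, D.T @ (f / v_xi))
        # Variances: closed-form componentwise minimizers
        v_eps = ig_mode_update((g - H @ f) ** 2, a_e, b_e)
        v_xi = ig_mode_update((f - D @ z) ** 2, a_x, b_x)
        v_z = ig_mode_update(z**2, a_z, b_z)
        history.append(criterion(H, D, g, f, z, v_eps, v_xi, v_z, hyper))
    return f, z, v_eps, v_xi, v_z, history
```

Because each block update minimizes the assumed criterion exactly with the other blocks held fixed, the recorded values in `history` cannot increase from one iteration to the next, which gives a simple sanity check on an implementation.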
4.7.2 Posterior Mean estimation via VBA, partial separability
In this subsection, the Posterior Mean (PM) estimation is considered. The Joint MAP computes the mode of the posterior distribution, whereas the PM computes its mean. One of the advantages of this estimator is that it minimizes the Mean Square Error (MSE). Computing the posterior mean of any unknown requires high-dimensional integration. For example, the mean corresponding to $\boldsymbol{f}$ can be computed from the posterior distribution using Equation (162),
[TABLE]
In general, these computations are not easy. One way to obtain approximate estimates is to approximate $p(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{\xi},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{\xi},\beta_{\xi},\alpha_{z},\beta_{z})$ by a separable one, $q(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{\xi},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{\xi},\beta_{\xi},\alpha_{z},\beta_{z})=q_{1}(\boldsymbol{f})\,q_{2}(\boldsymbol{z})\,q_{3}(\boldsymbol{v}_{\xi})\,q_{4}(\boldsymbol{v}_{\epsilon})\,q_{5}(\boldsymbol{v}_{z})$, and then to compute the posterior means using the separability. The mean corresponding to $\boldsymbol{f}$ is computed using the corresponding separable distribution $q_{1}(\boldsymbol{f})$, Equation (102),
[TABLE]
If the approximation of the posterior distribution with a separable one can be done in such a way that conserves the mean, i.e. Equation (164),
[TABLE]
for all the unknowns of the model, a great amount of computational cost is saved. In particular, for the proposed hierarchical model, Equation (157), the posterior distribution, Equation (158), is not a separable one, making the analytical computations of the PM very difficult. One way to compute the PM in this case is to first approximate the posterior law $p(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{\xi},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{\xi},\beta_{\xi},\alpha_{z},\beta_{z})$ with a separable law $q(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{\xi},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{\xi},\beta_{\xi},\alpha_{z},\beta_{z})$, Equation (165),
[TABLE]
where the notations from Equation (166) are used
[TABLE]
by minimizing the Kullback-Leibler divergence, defined as:
[TABLE]
where the notations from Equation (168) are used
[TABLE]
Equation (166) selects a partial separability for the approximated posterior distribution $q(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{\xi},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{\xi},\beta_{\xi},\alpha_{z},\beta_{z})$, in the sense that a total separability is imposed for the distributions corresponding to the three variances appearing in the hierarchical model, $q_{3}(\boldsymbol{v}_{\xi})$, $q_{4}(\boldsymbol{v}_{\epsilon})$ and $q_{5}(\boldsymbol{v}_{z})$, but not for the distributions corresponding to $\boldsymbol{f}$ and $\boldsymbol{z}$.
Evidently, a full separability can be imposed by adding the supplementary conditions $q_{1}(\boldsymbol{f})=\prod_{j=1}^{M}q_{1j}(f_{j})$ and $q_{2}(\boldsymbol{z})=\prod_{j=1}^{M}q_{2j}(z_{j})$ in Equation (166). This case is considered in Subsection (4.7.3). The minimization can be done via alternate optimization, resulting in the following proportionalities, Equations (169a), (169b), (169c), (169d) and (169e),
[TABLE]
using the notations:
[TABLE]
and also
[TABLE]
Via Equation (160) and Equation (161), the analytical expression of the logarithm of the posterior distribution is obtained, Equation (172):
[TABLE]
Computation of the analytical expression of $q_{1}(\boldsymbol{f})$ (first part).
The proportionality relation corresponding to $q_{1}(\boldsymbol{f})$ is established in Equation (169a). In the expression of $p(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z})$ all the terms free of $\boldsymbol{f}$ can be regarded as constants. Via Equation (172) the integral defined in Equation (171) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral from Equation (173) becomes:
[TABLE]
Evidently, $\Bigl\langle\ln p(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z})\Bigr\rangle_{q_{2}(\boldsymbol{z})\,q_{3}(\boldsymbol{v}_{\xi})\,q_{4}(\boldsymbol{v}_{\epsilon})}$ is a quadratic form with respect to $\boldsymbol{f}$. The proportionality from Equation (169a) leads to the following corollary:
Corollary 4.0.1**.**
q_{1}\left({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}}\right)* is a multivariate Normal distribution.*
Computation of the analytical expression of q_{2}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathz}}).
The proportionality relation corresponding to q_{2}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathz}}) is established in Equation (169b). In the expression of p\left({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathz}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\xi},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}z}|{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathg}},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right) all the terms free of can be regarded as constants. Via Equation (172) the integral defined in Equation (171) becomes:
[TABLE]
Using the notations introduced in Equation (174) and introducing the notations:
[TABLE]
the integral from Equation (176) becomes:
[TABLE]
The norm from the first term in Equation (178) can be expanded as follows:
[TABLE]
so Equation (178) can be expanded as follows:
[TABLE]
Using Corollary (4.0.1), we can easily compute $\left\langle\|\widetilde{\boldsymbol{V}}_{\xi}^{\frac{1}{2}}\boldsymbol{f}\|^{2}\right\rangle_{q_{1}(\boldsymbol{f})}$ and $\left\langle\boldsymbol{f}\right\rangle_{q_{1}(\boldsymbol{f})}$. For the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, we denote by $\widehat{\boldsymbol{f}}$ the corresponding mean and by $\widehat{\boldsymbol{\Sigma}}_{f}$ the corresponding covariance matrix, obtaining:
[TABLE]
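As an illustration of the two expectations above, the following Python sketch checks the standard identity $\langle\|\boldsymbol{W}^{\frac{1}{2}}\boldsymbol{f}\|^{2}\rangle = \|\boldsymbol{W}^{\frac{1}{2}}\widehat{\boldsymbol{f}}\|^{2} + \mathrm{Tr}(\boldsymbol{W}\widehat{\boldsymbol{\Sigma}}_{f})$ for a Normal distribution and a diagonal weighting matrix $\boldsymbol{W}$. The function name, the numerical values and the Monte Carlo comparison are assumptions made for illustration only; they are not taken from the text.

```python
import numpy as np


def expected_weighted_sq_norm(mean, cov, weights):
    """Closed form of <||W^{1/2} f||^2> for f ~ N(mean, cov), W = diag(weights)."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # ||W^{1/2} mean||^2 + Tr(W cov); for a diagonal W the trace reduces to
    # sum_j w_j * cov[j, j].
    return float(np.sum(weights * mean**2) + np.sum(weights * np.diag(cov)))


# Hypothetical values standing in for the mean, covariance and weights.
f_hat = np.array([1.0, -2.0, 0.5])
sigma_f = np.array([[1.0, 0.3, 0.0],
                    [0.3, 2.0, 0.1],
                    [0.0, 0.1, 0.5]])
w = np.array([0.5, 1.0, 2.0])

closed_form = expected_weighted_sq_norm(f_hat, sigma_f, w)

# Monte Carlo estimate of the same expectation, used only as a sanity check.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(f_hat, sigma_f, size=200_000)
monte_carlo = float(np.mean(np.sum(w * samples**2, axis=1)))

print(closed_form, monte_carlo)
```

The closed form avoids any sampling, which is why expectations of this type can be evaluated analytically at each iteration once the mean and the covariance matrix are available.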
From Equations (178), (180), (181) we obtain:
[TABLE]
Evidently, $\biggl\langle p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right)\biggr\rangle_{q_{1}(\boldsymbol{f})\;q_{3}(\boldsymbol{v}_{\xi})\;q_{5}(\boldsymbol{v}_{z})}$ is a quadratic form with respect to $\boldsymbol{z}$. The proportionality from Equation (169b) leads to the following corollary:
**Corollary 4.0.2.**
*$q_{2}\left(\boldsymbol{z}\right)$ is a multivariate Normal distribution.*
Minimizing the criterion
[TABLE]
leads to the analytical expression of the corresponding mean, denoted $\widehat{\boldsymbol{z}}$:
[TABLE]
The variance can be obtained by identification:
[TABLE]
Finally, we conclude that $q_{2}(\boldsymbol{z})$ is a multivariate Normal distribution with the following parameters:
[TABLE]
We note that the expressions of both the mean $\widehat{\boldsymbol{z}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{z}$ depend on expectations corresponding to the variances of the hierarchical model.
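The step "minimize the criterion, then identify the covariance" can be sketched numerically. The Python fragment below assumes that the criterion has the generic form $J(\boldsymbol{z}) = \|\boldsymbol{W}_{\xi}^{\frac{1}{2}}(\widehat{\boldsymbol{f}}-\boldsymbol{D}\boldsymbol{z})\|^{2} + \|\boldsymbol{W}_{z}^{\frac{1}{2}}\boldsymbol{z}\|^{2}$, with diagonal matrices $\boldsymbol{W}_{\xi}$ and $\boldsymbol{W}_{z}$ holding the expected inverse variances. This form, the function name and the test values are assumptions for illustration; the exact criterion is the one given in the equation above.

```python
import numpy as np


def gaussian_from_criterion(D, f_hat, w_xi, w_z):
    """Mean and covariance of the Normal density proportional to exp(-J(z)/2),
    where J(z) = ||W_xi^{1/2} (f_hat - D z)||^2 + ||W_z^{1/2} z||^2 (assumed form)."""
    # Precision matrix: Hessian of J(z)/2.
    precision = D.T @ (w_xi[:, None] * D) + np.diag(w_z)
    cov = np.linalg.inv(precision)
    # Minimizer of J(z), obtained by setting the gradient to zero.
    mean = cov @ (D.T @ (w_xi * f_hat))
    return mean, cov


# Hypothetical small problem, used only to exercise the function.
rng = np.random.default_rng(1)
D = rng.normal(size=(5, 4))
f_hat = rng.normal(size=5)
w_xi = rng.uniform(0.5, 2.0, size=5)
w_z = rng.uniform(0.5, 2.0, size=4)

z_hat, sigma_z = gaussian_from_criterion(D, f_hat, w_xi, w_z)

# The gradient of J(z)/2 vanishes at the mean.
gradient = D.T @ (w_xi * (D @ z_hat - f_hat)) + w_z * z_hat
print(np.max(np.abs(gradient)))
```

For large problems, solving the linear system with `np.linalg.solve` (or an iterative solver) would usually be preferred over forming the explicit inverse; the inverse is kept here because the covariance matrix itself is needed.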
Computation of the analytical expression of $q_{1}(\boldsymbol{f})$ (second part).
We now return to the computation of the parameters of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$. Since $q_{2}(\boldsymbol{z})$ was shown to be a multivariate Normal distribution, the expectation of the norm in the latter term of Equation (175) can be computed. The norm was expanded in Equation (179), and the computation of the integral uses the particular form of $q_{2}(\boldsymbol{z})$:
[TABLE]
Minimizing the criterion
[TABLE]
leads to the analytical expression of the corresponding mean, denoted $\widehat{\boldsymbol{f}}$:
[TABLE]
The variance can be obtained by identification:
[TABLE]
Finally, we conclude that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution with the following parameters:
[TABLE]
We note that the expressions of both the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{f}$ depend on expectations corresponding to the variances of the hierarchical model.
Computation of the analytical expression of $q_{3j}(v_{\xi_{j}})$.
The proportionality relation corresponding to $q_{3j}(v_{\xi_{j}})$ is established in Equation (169c). In the expression of $p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right)$, all the terms free of $v_{\xi_{j}}$ can be regarded as constants. Via Equation (172), the integral defined in Equation (171) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\biggl\langle\|\boldsymbol{V}_{\xi}^{-\frac{1}{2}}\left(\boldsymbol{f}-\boldsymbol{D}\boldsymbol{z}\right)\|^{2}\biggr\rangle_{q_{1}(\boldsymbol{f})\;q_{2}(\boldsymbol{z})\;q_{3-j}(\boldsymbol{v}_{\xi_{j}})}$ can be written:
[TABLE]
Considering that $q_{1}(\boldsymbol{f})$ and $q_{2}(\boldsymbol{z})$ are multivariate Normal distributions, Equations (191) and (186), and considering the expansion of the norm in Equation (179), we have:
[TABLE]
where the constant term gathers all the terms not containing $v_{\xi_{j}}$ and $\boldsymbol{D}_{j}$ denotes row $j$ of the matrix $\boldsymbol{D}$. From Equation (192) and Equation (195):
[TABLE]
from which the proportionality corresponding to $q_{3j}(v_{\xi_{j}})$ can be established:
[TABLE]
leading to the following corollary:
**Corollary 4.0.3.**
*$q_{3j}\left(v_{\xi_{j}}\right)$ is an Inverse Gamma distribution.*
The shape and scale parameters are obtained by identification, from Equation (197):
[TABLE]
Computation of the analytical expression of $q_{4i}(v_{\epsilon_{i}})$.
The proportionality relation corresponding to $q_{4i}(v_{\epsilon_{i}})$ is established in Equation (169d). In the expression of $p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right)$, all the terms free of $v_{\epsilon_{i}}$ can be regarded as constants. Via Equation (172), the integral defined in Equation (171) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\biggl\langle\|\boldsymbol{V}_{\epsilon}^{-\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|^{2}\biggr\rangle_{q_{1}(\boldsymbol{f})\;q_{4-i}(\boldsymbol{v}_{\epsilon_{i}})}$ can be written:
[TABLE]
Considering that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution, Equation (191), we have:
[TABLE]
where the constant term gathers all the terms not containing $v_{\epsilon_{i}}$ and $\boldsymbol{H}_{i}$ denotes row $i$ of the matrix $\boldsymbol{H}$. From Equation (199) and Equation (202):
[TABLE]
from which the proportionality corresponding to $q_{4i}(v_{\epsilon_{i}})$ can be established:
[TABLE]
leading to the following corollary:
**Corollary 4.0.4.**
*$q_{4i}\left(v_{\epsilon_{i}}\right)$ is an Inverse Gamma distribution.*
The shape and scale parameters are obtained by identification, from Equation (204):
[TABLE]
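The identification of the shape and scale parameters can be illustrated with a short Python sketch. It assumes the usual conjugate form of the update, namely a shape $\alpha_{\epsilon}+\frac{1}{2}$ and a scale $\beta_{\epsilon}+\frac{1}{2}\left[(g_{i}-\boldsymbol{H}_{i}\widehat{\boldsymbol{f}})^{2}+\boldsymbol{H}_{i}\widehat{\boldsymbol{\Sigma}}_{f}\boldsymbol{H}_{i}^{T}\right]$; this form, the function name and the numerical values are assumptions for illustration and should be checked against the equation above.

```python
import numpy as np


def update_noise_variances(g, H, f_hat, sigma_f, alpha_eps, beta_eps):
    """Shape and scale of the Inverse Gamma factors q_{4i}(v_eps_i), i = 1..N,
    under the assumed conjugate update described in the lead-in."""
    residual = g - H @ f_hat
    # H_i Sigma_f H_i^T for every row i, without forming the N x N product.
    spread = np.einsum("ij,jk,ik->i", H, sigma_f, H)
    shape = np.full(g.shape, alpha_eps + 0.5)
    scale = beta_eps + 0.5 * (residual**2 + spread)
    return shape, scale


# Hypothetical small problem, used only to exercise the function.
rng = np.random.default_rng(2)
H = rng.normal(size=(6, 4))
f_hat = rng.normal(size=4)
A = rng.normal(size=(4, 4))
sigma_f = A @ A.T + np.eye(4)  # symmetric positive definite stand-in
g = H @ f_hat + 0.1 * rng.normal(size=6)

shape, scale = update_noise_variances(g, H, f_hat, sigma_f, alpha_eps=2.0, beta_eps=1.0)

# Expected inverse variances <1/v_eps_i> = shape / scale, used in the Normal factors.
expected_inverse_variance = shape / scale
print(expected_inverse_variance)
```

The scale grows with both the squared residual and the posterior uncertainty on $\boldsymbol{f}$ seen through row $i$ of $\boldsymbol{H}$, so poorly explained or poorly constrained measurements receive a smaller expected inverse variance.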
Computation of the analytical expression of $q_{5j}(v_{z_{j}})$.
The proportionality relation corresponding to $q_{5j}(v_{z_{j}})$ is established in Equation (169e). In the expression of $p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right)$, all the terms free of $v_{z_{j}}$ can be regarded as constants. Via Equation (172), the integral defined in Equation (171) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\biggl\langle\|\boldsymbol{V}_{z}^{-\frac{1}{2}}\boldsymbol{z}\|^{2}\biggr\rangle_{q_{2}(\boldsymbol{z})\;q_{5-j}(\boldsymbol{v}_{z_{j}})}$ can be written:
[TABLE]
Considering that $q_{2}(\boldsymbol{z})$ is a multivariate Normal distribution, Equation (186), we have:
[TABLE]
From Equation (206) and Equation (209):
[TABLE]
from which the proportionality corresponding to $q_{5j}(v_{z_{j}})$ can be established:
[TABLE]
leading to the following corollary:
**Corollary 4.0.5.**
*$q_{5j}\left(v_{z_{j}}\right)$ is an Inverse Gamma distribution.*
The shape and scale parameters are obtained by identification, from Equation (211):
[TABLE]
Equations (191), (186), (198), (205) and (212) summarize the distribution families and the corresponding parameters for $q_{1}(\boldsymbol{f})$ and $q_{2}(\boldsymbol{z})$ (multivariate Normal distributions), and for $q_{3j}(v_{\xi_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, $q_{4i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, and $q_{5j}(v_{z_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$ (Inverse Gamma distributions).
However, the parameters corresponding to the multivariate Normal distributions are expressed via $\widetilde{\boldsymbol{V}}_{\xi}$, $\widetilde{\boldsymbol{V}}_{\epsilon}$ and $\widetilde{\boldsymbol{V}}_{z}$ (and, by extension, via all the elements forming the three matrices: $\widetilde{v}_{\xi_{j}}$, $j\in\left\{1,2,\ldots,M\right\}$, and $\widetilde{v}_{\epsilon_{i}}$, $i\in\left\{1,2,\ldots,N\right\}$, both defined in Equation (174), and $\widetilde{v}_{z_{j}}$, $j\in\left\{1,2,\ldots,M\right\}$, defined in Equation (177)).
Computation of the analytical expressions of $\widetilde{\boldsymbol{V}}_{\xi}$, $\widetilde{\boldsymbol{V}}_{\epsilon}$ and $\widetilde{\boldsymbol{V}}_{z}$.
For an Inverse Gamma distribution with shape and scale parameters $\alpha$ and $\beta$, the following relation holds:
[TABLE]
The proof of the above relation follows by direct computation, using the analytical expression of the Inverse Gamma distribution:
[TABLE]
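The relation can also be checked numerically. The sketch below assumes that it is the standard result $\left\langle x^{-1}\right\rangle = \alpha/\beta$ for an Inverse Gamma density $\frac{\beta^{\alpha}}{\Gamma(\alpha)}x^{-\alpha-1}\exp\left(-\beta/x\right)$ with shape $\alpha$ and scale $\beta$; the function names, the integration grid and the parameter values are choices made here for illustration.

```python
import math

import numpy as np


def inverse_gamma_log_pdf(x, alpha, beta):
    """Log-density of the Inverse Gamma distribution with shape alpha, scale beta."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * np.log(x) - beta / x)


def expected_inverse(alpha, beta, num=20001):
    """Numerical value of <1/x> by trapezoidal integration on a logarithmic grid."""
    # With x = exp(u): integral of (1/x) p(x) dx equals integral of p(exp(u)) du.
    u = np.linspace(math.log(1e-4), math.log(1e6), num)
    y = np.exp(inverse_gamma_log_pdf(np.exp(u), alpha, beta))
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(u)))


alpha, beta = 3.0, 2.0
numerical = expected_inverse(alpha, beta)
analytical = alpha / beta
print(numerical, analytical)
```

The agreement between the two values reflects the fact that $x^{-1}$ follows a Gamma distribution with shape $\alpha$ and rate $\beta$, whose mean is $\alpha/\beta$.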
Since $q_{3j}(v_{\xi_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, $q_{4i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, and $q_{5j}(v_{z_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, are Inverse Gamma distributions, with the corresponding shape and scale parameters given in Equations (198), (205) and (212), respectively, the expectations $\widetilde{v}_{\xi_{j}}$, $\widetilde{v}_{\epsilon_{i}}$ and $\widetilde{v}_{z_{j}}$ can be expressed via the parameters of the three Inverse Gamma distributions using Equation (213):
[TABLE]
Using the notation introduced in (174):
[TABLE]
In Equation (215), new notations are introduced for $\widetilde{\boldsymbol{V}}_{\xi}$, $\widetilde{\boldsymbol{V}}_{\epsilon}$ and $\widetilde{\boldsymbol{V}}_{z}$. These quantities were so far expressed via unknown expectations, but via Equation (215) they no longer contain any integrals to be computed. Therefore, the new notations represent the final analytical expressions used for expressing the separable density functions. Using Equation (215) and Equations (191), (186), (198), (205) and (212), the final analytical expressions of the separable distributions are presented in Equations (216c), (216f), (216i), (216l) and (216o).
[TABLE]
Equation (216c) establishes the dependency between the parameters of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$ and the other parameters involved in the hierarchical model: the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{f}$ depend on $\widehat{\boldsymbol{V}}_{\epsilon}$ and $\widehat{\boldsymbol{V}}_{\xi}$, which, via Equation (215), are defined using the shape and scale parameters of the corresponding Inverse Gamma distributions, and on $\widehat{\boldsymbol{z}}$. The dependency between the parameters of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, the parameters of the Inverse Gamma distributions $q_{3j}(v_{\xi_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and $q_{4i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, and the multivariate Normal distribution $q_{2}(\boldsymbol{z})$ is presented in Figure (24).
Equation (216f) establishes the dependency between the parameters of the multivariate Normal distribution $q_{2}(\boldsymbol{z})$ and the other parameters involved in the hierarchical model: the mean $\widehat{\boldsymbol{z}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{z}$ depend on $\widehat{\boldsymbol{V}}_{\xi}$ and $\widehat{\boldsymbol{V}}_{z}$, which, via Equation (215), are defined using the shape and scale parameters of the corresponding Inverse Gamma distributions, and on $\widehat{\boldsymbol{f}}$. The dependency between the parameters of the multivariate Normal distribution $q_{2}(\boldsymbol{z})$, the parameters of the Inverse Gamma distributions $q_{3j}(v_{\xi_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and $q_{5j}(v_{z_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and the multivariate Normal distribution $q_{1}(\boldsymbol{f})$ is presented in Figure (25).
Equation (216i) establishes the dependency between the parameters of the Inverse Gamma distributions $q_{3j}(v_{\xi_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the corresponding element of the mean $\widehat{\boldsymbol{f}}$ of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, i.e. $\widehat{f}_{j}$, and on the mean $\widehat{\boldsymbol{z}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{z}$ of the multivariate Normal distribution $q_{2}(\boldsymbol{z})$, Figure (26).
Equation (216l) establishes the dependency between the parameters of the Inverse Gamma distributions $q_{4i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{f}$ of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, Figure (27).
Equation (216o) establishes the dependency between the parameters of the Inverse Gamma distributions $q_{5j}(v_{z_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the corresponding element of the mean $\widehat{\boldsymbol{z}}$ and the corresponding element of the covariance matrix $\widehat{\boldsymbol{\Sigma}}_{z}$ of the multivariate Normal distribution $q_{2}(\boldsymbol{z})$, Figure (28).
The iterative algorithm obtained via PM estimation is presented in Figure (29).
More formally, the iterative algorithm obtained via PM estimation is presented in Algorithm (4).
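To make the structure of the iterations concrete, the following Python sketch implements one possible version of the alternate updates for the model $\boldsymbol{g}=\boldsymbol{H}\boldsymbol{f}+\boldsymbol{\epsilon}$, $\boldsymbol{f}=\boldsymbol{D}\boldsymbol{z}+\boldsymbol{\xi}$ with Inverse Gamma priors on the three sets of variances. It is a minimal sketch and not a transcription of Algorithm (4): the update formulas are the standard mean-field ones for this model, and the function name, the initialization, the update order, the default hyperparameters and the fixed number of iterations are all assumptions made for illustration.

```python
import numpy as np


def vba_partial_separability(g, H, D, alpha=(2.1, 2.1, 2.1), beta=(0.1, 0.1, 0.1),
                             n_iter=50):
    """Sketch of the alternate VBA updates (assumed form, see lead-in).

    alpha and beta hold the (eps, xi, z) shape and scale hyperparameters."""
    N, M = H.shape
    K = D.shape[1]
    a_eps, a_xi, a_z = alpha
    b_eps, b_xi, b_z = beta
    # Expected inverse variances <1/v>, initialized from the priors (assumption).
    w_eps = np.full(N, a_eps / b_eps)
    w_xi = np.full(M, a_xi / b_xi)
    w_z = np.full(K, a_z / b_z)
    z_hat = np.zeros(K)
    for _ in range(n_iter):
        # q1(f): multivariate Normal.
        sigma_f = np.linalg.inv(H.T @ (w_eps[:, None] * H) + np.diag(w_xi))
        f_hat = sigma_f @ (H.T @ (w_eps * g) + w_xi * (D @ z_hat))
        # q2(z): multivariate Normal.
        sigma_z = np.linalg.inv(D.T @ (w_xi[:, None] * D) + np.diag(w_z))
        z_hat = sigma_z @ (D.T @ (w_xi * f_hat))
        # q3j(v_xi_j): Inverse Gamma, summarized by <1/v> = shape / scale.
        s_xi = ((f_hat - D @ z_hat) ** 2 + np.diag(sigma_f)
                + np.einsum("ij,jk,ik->i", D, sigma_z, D))
        w_xi = (a_xi + 0.5) / (b_xi + 0.5 * s_xi)
        # q4i(v_eps_i): Inverse Gamma.
        s_eps = (g - H @ f_hat) ** 2 + np.einsum("ij,jk,ik->i", H, sigma_f, H)
        w_eps = (a_eps + 0.5) / (b_eps + 0.5 * s_eps)
        # q5j(v_z_j): Inverse Gamma.
        w_z = (a_z + 0.5) / (b_z + 0.5 * (z_hat**2 + np.diag(sigma_z)))
    return f_hat, z_hat, sigma_f, sigma_z


# Hypothetical synthetic problem with a sparse z, used only to run the sketch.
rng = np.random.default_rng(0)
N, M = 30, 20
H = rng.normal(size=(N, M))
D = np.eye(M)
z_true = np.zeros(M)
z_true[[2, 7, 15]] = [3.0, -2.0, 4.0]
g = H @ (D @ z_true) + 0.05 * rng.normal(size=N)

f_hat, z_hat, sigma_f, sigma_z = vba_partial_separability(g, H, D)
print(np.round(z_hat, 2))
```

Each pass only requires the means and covariance matrices of the two Normal factors and the ratios shape/scale of the Inverse Gamma factors, which mirrors the dependency structure described above. In practice a convergence test would normally replace the fixed number of iterations, and the explicit matrix inversions become the computational bottleneck for large $M$.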
4.7.3 Posterior Mean estimation via VBA, full separability
In this subsection, the Posterior Mean (PM) estimation is again considered, but via a fully separable approximation. The posterior distribution is approximated by a fully separable distribution $q\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}\right)$, i.e. a supplementary condition is added in Equation (166):
[TABLE]
As in Subsection (4.7.2), the approximation is obtained by minimizing the Kullback-Leibler divergence, Equation (167), via alternate optimization, resulting in the proportionalities from Equations (218a), (218b), (218c), (218d) and (218e),
[TABLE]
using the notations introduced in Equation (170), Equation (171) and Equation (219).
[TABLE]
The analytical expression of the logarithm of the posterior distribution \ln p(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{\xi},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{\xi},\beta_{\xi},\alpha_{z},\beta_{z}) is obtained in Equation (172).
Computation of the analytical expression of q_{1j}(f_{j}).
The proportionality relation corresponding to q_{1j}(f_{j}) is established in Equation (218a). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right) all the terms free of f_{j} can be regarded as constants:
[TABLE]
Using the notations introduced in Equation (174), the integral from Equation (220) becomes:
[TABLE]
Introducing the notations
[TABLE]
[TABLE]
and denoting by \boldsymbol{H}^{j} the j-th column of the matrix \boldsymbol{H} and by \boldsymbol{H}^{-j} the matrix \boldsymbol{H} without column j, the development of the norm from the first term of Equation (232) is
[TABLE]
Using the equality \boldsymbol{H}\boldsymbol{f}=\boldsymbol{H}^{-j}\boldsymbol{f}^{-j}+\boldsymbol{H}^{j}f_{j} we establish the equalities from Equation (225) and Equation (226):
[TABLE]
[TABLE]
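The column split used above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `matvec` and `split_column` and the small example matrix are illustrative assumptions and do not appear in the original derivation.

```python
def matvec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def split_column(H, f, j):
    """Return (H^{-j} f^{-j}, H^{j} f_j) for the 0-based column index j."""
    # Contribution of every column except column j.
    rest = [sum(row[k] * f[k] for k in range(len(f)) if k != j) for row in H]
    # Contribution of column j alone.
    own = [row[j] * f[j] for row in H]
    return rest, own


H = [[1.0, 2.0, 0.5],
     [0.0, 1.0, 3.0]]
f = [2.0, -1.0, 4.0]

rest, own = split_column(H, f, j=1)
total = [r + o for r, o in zip(rest, own)]  # should reproduce H f
```

Isolating the single column that multiplies f_{j} is what turns the norm into a quadratic polynomial in the scalar f_{j}, with everything collected in `rest` acting as a constant.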
so the corresponding integrals are:
[TABLE]
Considering all the terms that do not contain f_{j} as constants, using Equation (227) via Equation (224), the integral of the norm from the first term of Equation (232) is:
[TABLE]
The norm corresponding to the latter term of Equation (232) is developed considering all the terms that do not contain f_{j} as constants:
[TABLE]
Introducing the notations
[TABLE]
the integral of the norm from the latter term of Equation (232) is:
[TABLE]
From Equation (232) via Equation (228) and Equation (231):
[TABLE]
so \biggl\langle\ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right)\biggr\rangle_{q_{1-j}(\boldsymbol{f})\;q_{2}(\boldsymbol{z})\;q_{3}(\boldsymbol{v}_{\xi})\;q_{4}(\boldsymbol{v}_{\epsilon})\;q_{5}(\boldsymbol{v}_{z})} is a quadratic form with respect to f_{j}. The proportionality from Equation (218a) leads to the following corollary:
Corollary 4.0.6**.**
q_{1j}\left(f_{j}\right)* is a Normal distribution.*
Minimizing the criterion
[TABLE]
leads to the mean of the Normal distribution q_{1j}\left(f_{j}\right):
[TABLE]
By identification, the variance can be easily derived:
[TABLE]
Finally, we conclude that q_{1j}\left(f_{j}\right) is a Normal distribution with the following parameters:
[TABLE]
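The coordinate-wise update for q_{1j}(f_{j}) can be sketched in code. This is an illustrative sketch only, under assumptions that are not restated in this subsection: a linear model \boldsymbol{g}=\boldsymbol{H}\boldsymbol{f}+\boldsymbol{\epsilon}, \boldsymbol{f}=\boldsymbol{D}\boldsymbol{z}+\boldsymbol{\xi} with independent components, where the hypothetical arguments `w_eps[i]` and `w_xi[j]` stand for the expected inverse variances and `prior_mean[j]` stands for the j-th element of \boldsymbol{D}\widehat{\boldsymbol{z}}. The function name `update_q1j` is likewise hypothetical.

```python
def update_q1j(H, g, f_hat, prior_mean, w_eps, w_xi, j):
    """Mean and variance of a Normal factor for f_j (illustrative sketch).

    Assumed criterion, viewed as a function of f_j only:
        J(f_j) = sum_i w_eps[i] * (g[i] - (H f)[i])**2
                 + w_xi[j] * (f_j - prior_mean[j])**2
    with all other components of f held at their current means f_hat.
    """
    n, m = len(g), len(f_hat)
    # Residual without column j: g - H^{-j} f^{-j}.
    r = [g[i] - sum(H[i][k] * f_hat[k] for k in range(m) if k != j)
         for i in range(n)]
    # J(f_j) = a * f_j**2 - 2 * b * f_j + const.
    a = sum(w_eps[i] * H[i][j] ** 2 for i in range(n)) + w_xi[j]
    b = (sum(w_eps[i] * H[i][j] * r[i] for i in range(n))
         + w_xi[j] * prior_mean[j])
    # exp(-J / 2) is a Normal density with mean b / a and variance 1 / a.
    return b / a, 1.0 / a


H = [[1.0, 0.5], [0.2, 1.0], [0.3, 0.4]]
g = [1.0, 2.0, 0.5]
w_eps = [1.0, 2.0, 1.5]
w_xi = [0.5, 0.5]
prior_mean = [0.0, 1.0]

f_hat = [0.0, 0.0]
variances = [0.0, 0.0]
for _ in range(200):  # repeated sweeps over the coordinates
    for j in range(len(f_hat)):
        f_hat[j], variances[j] = update_q1j(
            H, g, f_hat, prior_mean, w_eps, w_xi, j)
```

With the weights held fixed, repeated sweeps amount to a Gauss-Seidel iteration on the quadratic criterion; in the full algorithm the weights would themselves be refreshed from the Inverse Gamma factors between sweeps.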
Computation of the analytical expression of q_{2j}(z_{j}).
The proportionality relation corresponding to q_{2j}(z_{j}) is established in Equation (218b). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right) all the terms free of z_{j} can be regarded as constants:
[TABLE]
Using the notations introduced in Equation (174) and Equation (177), the integral from Equation (237) becomes:
[TABLE]
Introducing the notations
[TABLE]
[TABLE]
and denoting by \boldsymbol{D}^{j} the j-th column of the matrix \boldsymbol{D} and by \boldsymbol{D}^{-j} the matrix \boldsymbol{D} without column j, the development of the norm from the first term of Equation (238) is
[TABLE]
Using the equality \boldsymbol{D}\boldsymbol{z}=\boldsymbol{D}^{-j}\boldsymbol{z}^{-j}+\boldsymbol{D}^{j}z_{j} we establish the equalities from Equation (242) and Equation (243):
[TABLE]
[TABLE]
so the corresponding integrals are:
[TABLE]
Considering all the terms that do not contain z_{j} as constants, using Equation (244) via Equation (241), the integral of the norm from the first term of Equation (238) is:
[TABLE]
The norm corresponding to the latter term of Equation (238) is developed considering all the terms that do not contain z_{j} as constants:
[TABLE]
so the integral of the norm from the latter term of Equation (238) is:
[TABLE]
From Equation (238) via Equation (245) and Equation (247):
[TABLE]
so \biggl\langle\ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right)\biggr\rangle_{q_{1}(\boldsymbol{f})\;q_{2-j}(\boldsymbol{z})\;q_{3}(\boldsymbol{v}_{\xi})\;q_{4}(\boldsymbol{v}_{\epsilon})\;q_{5}(\boldsymbol{v}_{z})} is a quadratic form with respect to z_{j}. The proportionality from Equation (218b) leads to the following corollary:
Corollary 4.0.7**.**
q_{2j}\left(z_{j}\right)* is a Normal distribution.*
Minimizing the criterion
[TABLE]
leads to the mean of the Normal distribution q_{2j}\left(z_{j}\right):
[TABLE]
By identification, the variance can be easily derived:
[TABLE]
Finally, we conclude that q_{2j}\left(z_{j}\right) is a Normal distribution with the following parameters:
[TABLE]
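The identification step used for the Normal factors can be made explicit. As a generic sketch, with a>0 and b standing for hypothetical coefficients collected from the criterion (they are not symbols of the original text), completing the square gives:

```latex
\ln q_{2j}(z_{j})
= -\frac{1}{2}\left(a\,z_{j}^{2} - 2\,b\,z_{j}\right) + \mathrm{const}
= -\frac{a}{2}\left(z_{j} - \frac{b}{a}\right)^{2} + \mathrm{const}
\quad\Longrightarrow\quad
q_{2j}(z_{j}) = \mathcal{N}\!\left(z_{j}\,\middle|\,\frac{b}{a},\,\frac{1}{a}\right)
```

The minimizer of the criterion is therefore the mean b/a, and the inverse of the coefficient of z_{j}^{2} is the variance 1/a. The same argument applies to q_{1j}(f_{j}).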
Computation of the analytical expression of q_{3j}(v_{\xi_j}).
The proportionality relation corresponding to q_{3j}(v_{\xi_j}) is established in Equation (218c). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right) all the terms free of v_{\xi_j} can be regarded as constants:
[TABLE]
Using the notations introduced in Equation (193), Corollary (4.0.6) and Corollary (4.0.7), the integral from Equation (253) becomes:
[TABLE]
The proportionality from Equation (218c) via Equation (254) leads to the following corollary:
Corollary 4.0.8**.**
q_{3j}\left(v_{\xi_j}\right)* is an Inverse Gamma distribution.*
By identification we conclude that q_{3j}\left(v_{\xi_j}\right) is an Inverse Gamma distribution with the following parameters:
[TABLE]
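The structure of this Inverse Gamma update can be illustrated with a short sketch. It rests on assumptions that are not restated here: v_{\xi_j} has an Inverse Gamma prior with shape \alpha_{\xi} and scale \beta_{\xi} (density proportional to v^{-\alpha-1}e^{-\beta/v}), and f_{j} given \boldsymbol{z} and v_{\xi_j} is Normal with mean equal to the j-th element of \boldsymbol{D}\boldsymbol{z} and variance v_{\xi_j}. The function name and argument names are hypothetical.

```python
def update_q3j(alpha_xi, beta_xi, f_hat_j, var_f_j, D_row_j, z_hat, var_z):
    """Shape, scale and <1/v> of an Inverse Gamma factor (illustrative sketch).

    Under fully separable Normal factors, the expected squared residual is
        <(f_j - D_j z)^2> = (f_hat_j - D_j z_hat)^2
                            + var_f_j + sum_k D_jk^2 * var_z[k].
    """
    mean_resid = f_hat_j - sum(d * z for d, z in zip(D_row_j, z_hat))
    expected_sq = (mean_resid ** 2 + var_f_j
                   + sum(d * d * v for d, v in zip(D_row_j, var_z)))
    # One Normal observation adds 1/2 to the shape and half the
    # expected squared residual to the scale.
    alpha_post = alpha_xi + 0.5
    beta_post = beta_xi + 0.5 * expected_sq
    # Expected inverse variance under the shape/scale convention above.
    return alpha_post, beta_post, alpha_post / beta_post


alpha_post, beta_post, inv_var = update_q3j(
    alpha_xi=2.0, beta_xi=1.0,
    f_hat_j=1.0, var_f_j=0.1,
    D_row_j=[1.0, 0.0], z_hat=[0.5, 3.0], var_z=[0.2, 0.4])
```

The variances of the Normal factors enter the scale parameter, which is how the uncertainty on f_{j} and \boldsymbol{z} is carried into the estimate of the variance v_{\xi_j}.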
Computation of the analytical expression of q_{4i}(v_{\epsilon_i}).
The proportionality relation corresponding to q_{4i}(v_{\epsilon_i}) is established in Equation (218d). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right) all the terms free of v_{\epsilon_i} can be regarded as constants:
[TABLE]
Using the notations introduced in Equation (200) and Corollary (4.0.6), the integral from Equation (256) becomes:
[TABLE]
The proportionality from Equation (218d) via Equation (257) leads to the following corollary:
Corollary 4.0.9**.**
q_{4i}\left(v_{\epsilon_i}\right)* is an Inverse Gamma distribution.*
By identification we conclude that q_{4i}\left(v_{\epsilon_i}\right) is an Inverse Gamma distribution with the following parameters:
[TABLE]
Computation of the analytical expression of q_{5j}(v_{z_j}).
The proportionality relation corresponding to q_{5j}(v_{z_j}) is established in Equation (218e). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{z},\boldsymbol{v}_{\xi},\boldsymbol{v}_{\epsilon},\boldsymbol{v}_{z}|\boldsymbol{g},\alpha_{\xi},\beta_{\xi},\alpha_{\epsilon},\beta_{\epsilon},\alpha_{z},\beta_{z}\right) all the terms free of v_{z_j} can be regarded as constants:
[TABLE]
Using the notations introduced in Equation (207) and Corollary (4.0.7), the integral from Equation (259) becomes:
[TABLE]
The proportionality from Equation (218e) via Equation (260) leads to the following corollary:
Corollary 4.0.10**.**
q_{5j}\left(v_{z_j}\right)* is an Inverse Gamma distribution.*
By identification we conclude that q_{5j}\left(v_{z_j}\right) is an Inverse Gamma distribution with the following parameters:
[TABLE]
Equations (236), (252), (255), (258) and (261) summarize the distribution families and the corresponding parameters for q_{1j}(f_{j}) and q_{2j}(z_{j}) (Normal distributions) and for q_{3j}(v_{\xi_j}), q_{4i}(v_{\epsilon_i}) and q_{5j}(v_{z_j}) (Inverse Gamma distributions).
However, the parameters corresponding to the Normal distributions are expressed via \widetilde{\boldsymbol{V}}_{\xi}, \widetilde{\boldsymbol{V}}_{\epsilon} and \widetilde{\boldsymbol{V}}_{z} (and by extension via all the elements forming the three matrices: \widetilde{v}_{\xi_j} and \widetilde{v}_{\epsilon_i}, both defined in Equation (174), and \widetilde{v}_{z_j}, defined in Equation (177)).
Computation of the analytical expressions of \widetilde{\boldsymbol{V}}_{\xi}, \widetilde{\boldsymbol{V}}_{\epsilon} and \widetilde{\boldsymbol{V}}_{z}.
For an Inverse Gamma distribution with given scale and shape parameters, the following relation holds:
[TABLE]
The proof of the above relation is done by direct computation, using the analytical expression of the Inverse Gamma distribution:
[TABLE]
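A relation of this kind can also be checked numerically. The sketch below assumes the common shape/scale convention, in which the density is proportional to x^{-\alpha-1}e^{-\beta/x} with shape \alpha and scale \beta; under that convention the expectation of 1/x equals \alpha/\beta, which is assumed here to be the relation referred to above. The function names are illustrative.

```python
import math


def inv_gamma_log_pdf(x, alpha, beta):
    """Log density of an Inverse Gamma with shape alpha and scale beta."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(x) - beta / x)


def expected_inverse(alpha, beta, lo=-20.0, hi=20.0, steps=40000):
    """Approximate <1/x> by the trapezoidal rule in u = ln(x).

    With x = exp(u), (1/x) p(x) dx = p(exp(u)) du, so the integrand is
    simply the density evaluated at exp(u).
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(inv_gamma_log_pdf(math.exp(u), alpha, beta))
    return total * h


approx = expected_inverse(alpha=3.0, beta=2.0)
exact = 3.0 / 2.0  # alpha / beta under the assumed convention
```

Because the expectation of the inverse is available in closed form, the quantities \widetilde{v} that appear in the Normal factors require no numerical integration inside the iterative algorithm.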
Since q_{3j}(v_{\xi_j}), q_{4i}(v_{\epsilon_i}) and q_{5j}(v_{z_j}) are Inverse Gamma distributions, the expectations \widetilde{v}_{\xi_j}, \widetilde{v}_{\epsilon_i} and \widetilde{v}_{z_j} can be expressed via the parameters of the three Inverse Gamma distributions using Equation (262):
[TABLE]
Using the notation introduced in (174):
[TABLE]
In Equation (264) new notations are introduced for \widetilde{\boldsymbol{V}}_{\xi}, \widetilde{\boldsymbol{V}}_{\epsilon} and \widetilde{\boldsymbol{V}}_{z}. These quantities were expressed throughout the derivation via unknown expectations, but via Equation (264) they no longer contain any integrals to be computed. Therefore, the new notations represent the final analytical expressions used for expressing the density functions. Using Equation (264) and Equations (236), (252), (255), (258) and (261), the final analytical expressions of the separable distributions are presented in Equations (265c), (265f), (265i), (265l) and (265o).
[TABLE]
Equation (265c) establishes the dependency between the parameters corresponding to the Normal distributions q_{1j}(f_{j}) and the other parameters involved in the hierarchical model: the mean \widehat{f}_{j} and the variance depend on \widehat{\boldsymbol{V}}_{\epsilon} and \widehat{\boldsymbol{V}}_{\xi}, which are defined via Equation (264), and on \widehat{\boldsymbol{z}}. The dependency between the parameters of the Normal distribution q_{1j}(f_{j}) and the parameters of the Inverse Gamma distributions q_{3j}(v_{\xi_j}), j\in\left\{1,2,\ldots,M\right\} and q_{4i}(v_{\epsilon_i}), i\in\left\{1,2,\ldots,N\right\} and the Normal distributions q_{2j}(z_{j}) is presented in Figure (30).
Equation (265f) establishes the dependency between the parameters corresponding to the Normal distributions q_{2j}(z_{j}) and the other parameters involved in the hierarchical model: the mean \widehat{z}_{j} and the variance depend on \widehat{\boldsymbol{V}}_{\xi} and \widehat{\boldsymbol{V}}_{z}, which are defined via Equation (264), and on \widehat{\boldsymbol{f}}. The dependency between the parameters of the Normal distribution q_{2j}(z_{j}) and the parameters of the Inverse Gamma distributions q_{3j}(v_{\xi_j}), j\in\left\{1,2,\ldots,M\right\} and q_{5j}(v_{z_j}), j\in\left\{1,2,\ldots,M\right\} and the Normal distributions q_{1j}(f_{j}) is presented in Figure (31).
Equation (265i) establishes the dependency between the parameters corresponding to the Inverse Gamma distributions q_{3j}(v_{\xi_j}), j\in\left\{1,2,\ldots,M\right\} and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the mean \widehat{f}_{j} of the Normal distribution q_{1j}(f_{j}) and on the mean \widehat{z}_{j} and the variance of the Normal distribution q_{2j}(z_{j}), Figure (32).
Equation (265l) establishes the dependency between the parameters corresponding to the Inverse Gamma distributions q_{4i}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}}),i\in\left\{1,2,\ldots,N\right\} and the others parameters involved in the hierarchical model: the shape and scale parameters depend on the mean {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widehat{f}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j} and the variance of the Normal distribution q_{1j}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}), Figure (33).
Equation (265o) establishes the dependency between the parameters of the Inverse Gamma distributions q_{5j}(\boldsymbol{v}_{z_j}), j\in\left\{1,2,\ldots,M\right\}, and the other parameters involved in the hierarchical model: the shape and scale parameters depend on the mean \widehat{z}_j and the variance of the Normal distribution q_{2j}(z_j), Figure (34).
The iterative algorithm obtained via PM estimation is presented in Figure (35).
More formally, the iterative algorithm obtained via PM estimation is presented in Algorithm (5).
4.8 Student-t hierarchical model: stationary Laplace uncertainties model, unknown uncertainties variance
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a stationary Laplace uncertainties model is proposed, i.e. a multivariate Laplace distribution expressed via LPM is used under the following two assumptions:
a) each element of the uncertainties vector has the same variance, v_{\epsilon};
b) the variance v_{\epsilon} is unknown;
[TABLE]
4.9 Student-t hierarchical model: non-stationary Laplace uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Student-t distribution as a prior;
- the Student-t prior distribution is expressed via StPM, Equation (7), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a non-stationary Laplace uncertainties model is proposed, i.e. a multivariate Laplace distribution expressed via LPM is used under the following assumption:
a) the variance vector \boldsymbol{v}_{\epsilon} is unknown;
[TABLE]
5 Hierarchical Models with Laplace prior via LPM
Considering the Laplace distribution for modelling the sparse structure of \boldsymbol{f} in the linear forward model, Equation (1), expressed via the conjugate priors LPM, and depending on the model proposed for the uncertainties of the model, \boldsymbol{\epsilon}, and implicitly on the corresponding likelihood, sGL or nsGL, different hierarchical models can be proposed:
5.1 Laplace hierarchical model: stationary Gaussian uncertainties model, known uncertainties variance
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following two assumptions:
a) each element of the uncertainties vector has the same variance, v_{\epsilon};
b) the variance v_{\epsilon} is known;
[TABLE]
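The Laplace prior used in the models of this section is handled as a Normal mixture. The following is a minimal sketch of that idea, assuming the Exponential mixing form mentioned in the abstract; the exact parametrization of Equation (43) is not reproduced here, and the names `beta` and `sample_laplace_via_mixture` are illustrative only. Drawing a variance w from an Exponential distribution with rate beta and then f from a zero-mean Normal with variance w gives samples whose distribution is Laplace with scale 1/sqrt(2 beta); equivalently, the precision 1/w follows an Inverse Gamma distribution with shape 1 and scale beta.

```python
import math
import random


def sample_laplace_via_mixture(beta, n, seed=0):
    """Draw n samples of f using the two-stage (hierarchical) scheme.

    Assumption (not taken from the paper's Equation (43)):
      w ~ Exponential(rate=beta),  f | w ~ Normal(0, w).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        w = rng.expovariate(beta)                      # mixing variable (variance)
        samples.append(rng.gauss(0.0, math.sqrt(w)))   # conditionally Normal draw
    return samples


beta = 2.0
samples = sample_laplace_via_mixture(beta, 200_000)

# Moments of a Laplace distribution with scale s: E|f| = s, Var(f) = 2 s^2.
laplace_scale = 1.0 / math.sqrt(2.0 * beta)
mean_abs = sum(abs(x) for x in samples) / len(samples)
variance = sum(x * x for x in samples) / len(samples)

print("E|f|   empirical:", mean_abs, " Laplace:", laplace_scale)
print("Var(f) empirical:", variance, " Laplace:", 2.0 * laplace_scale ** 2)
```

The empirical moments agree with the Laplace values up to Monte Carlo error, which is what makes the hierarchical (conditionally Normal) formulation usable in place of the Laplace density itself.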
5.2 Laplace hierarchical model: non-stationary Gaussian uncertainties model, known uncertainties variances
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a non-stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following assumption:
a) the variance vector \boldsymbol{v}_{\epsilon} is known;
[TABLE]
5.3 Laplace hierarchical model: stationary Gaussian uncertainties model, unknown uncertainties variance
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following two assumptions:
a) each element of the uncertainties vector has the same variance, v_{\epsilon};
b) the variance v_{\epsilon} is unknown;
[TABLE]
5.4 Laplace hierarchical model: non-stationary Gaussian uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a non-stationary Gaussian uncertainties model is proposed, i.e. a multivariate Gaussian distribution is used under the following assumption:
a) the variance vector \boldsymbol{v}_{\epsilon} is unknown;
[TABLE]
5.5 Laplace hierarchical model: stationary Student-t uncertainties model, unknown uncertainties variance
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a stationary Student-t uncertainties model is proposed, i.e. a multivariate Student-t distribution expressed via StPM is used under the following two assumptions:
a) each element of the uncertainties vector has the same variance, v_{\epsilon};
b) the variance v_{\epsilon} is unknown;
[TABLE]
5.6 Laplace hierarchical model: non-stationary Student-t uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a non-stationary Student-t uncertainties model is proposed, i.e. a multivariate Student-t distribution expressed via StPM is used under the following assumption:
a) the variance vector \boldsymbol{v}_{\epsilon} is unknown;
[TABLE]
The hierarchical model built over the linear forward model, Equation (1), using a Laplace distribution as a prior for \boldsymbol{f}, expressed via the Laplace Prior Model (LPM), Equation (43), and modelling the uncertainties of the model using the non-stationary Student Uncertainties Model (nsStUM), Equation (81), is presented in Equation (273). The posterior distribution is obtained via the Bayes rule, Equation (274):
[TABLE]
The goal is to estimate the unknowns of the hierarchical model, namely \boldsymbol{f}, the main unknown of the linear forward model, Equation (1), which was supposed sparse and consequently modelled via the Laplace distribution, and the two variances appearing in the hierarchical model, Equation (273): the variance corresponding to the sparse structure of \boldsymbol{f}, namely \boldsymbol{v}_{f}, and the variance corresponding to the uncertainties of the model \boldsymbol{\epsilon}, namely \boldsymbol{v}_{\epsilon}.
5.6.1 Joint MAP estimation
First, the Joint Maximum A Posteriori (JMAP) estimation is considered: the unknowns are estimated on the basis of the available data \boldsymbol{g}, by maximizing the posterior distribution:
[TABLE]
where for the second equality the criterion {\cal L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}) is defined as:
[TABLE]
The MAP estimation corresponds to the solution minimizing the criterion {\cal L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}). From the analytical expression of the posterior distribution, Equation (274), and the definition of the criterion {\cal L}, Equation (276), we obtain:
[TABLE]
One of the simplest optimisation algorithms that can be used is an alternate optimization of the criterion {\cal L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}) with respect to each unknown:
- With respect to \boldsymbol{f}:
[TABLE]
- With respect to v_{f_j}, j\in\left\{1,2,\ldots,M\right\}:
[TABLE]
- With respect to v_{\epsilon_i}, i\in\left\{1,2,\ldots,N\right\}:
First, we develop the norm \|\boldsymbol{V}_{\epsilon}^{-\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|:
[TABLE]
where \boldsymbol{H}_{i} denotes row i of the matrix \boldsymbol{H}, i\in\left\{1,2,\ldots,N\right\}, i.e. \boldsymbol{H}_{i}=\left[h_{i1},h_{i2},\ldots,h_{iM}\right].
[TABLE]
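The row-wise development of the weighted norm can be checked numerically. The sketch below assumes the squared norm and a diagonal matrix V_eps = diag(v_eps); the function names and the small test problem are illustrative and do not come from the paper. It compares the squared norm of V_eps^{-1/2}(g - Hf) with the sum over rows of (g_i - H_i f)^2 / v_eps_i.

```python
import math


def weighted_residual_norm_sq(H, f, g, v_eps):
    """Squared Euclidean norm of V_eps^{-1/2} (g - H f), V_eps = diag(v_eps)."""
    scaled = []
    for H_i, g_i, v_i in zip(H, g, v_eps):
        residual = g_i - sum(h_ij * f_j for h_ij, f_j in zip(H_i, f))
        scaled.append(residual / math.sqrt(v_i))   # i-th entry of the scaled vector
    return sum(s * s for s in scaled)


def rowwise_sum(H, f, g, v_eps):
    """Sum over rows i of (g_i - H_i f)^2 / v_eps_i."""
    total = 0.0
    for H_i, g_i, v_i in zip(H, g, v_eps):
        residual = g_i - sum(h_ij * f_j for h_ij, f_j in zip(H_i, f))
        total += residual ** 2 / v_i
    return total


# Illustrative problem with N = 3 observations and M = 2 unknowns.
H = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
f = [0.5, -1.0]
g = [0.0, 1.0, 2.0]
v_eps = [0.5, 2.0, 1.0]

norm_sq = weighted_residual_norm_sq(H, f, g, v_eps)
row_sum = rowwise_sum(H, f, g, v_eps)
print(norm_sq, row_sum)
```

Because each variance v_eps_i appears in exactly one term of the sum, the criterion can be optimized with respect to each v_eps_i separately.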
The iterative algorithm obtained via JMAP estimation is presented in Figure (36).
5.6.2 Posterior Mean estimation via VBA, partial separability
In this subsection, the Posterior Mean (PM) estimation is considered. The Joint MAP computes the mode of the posterior distribution, whereas the PM computes its mean. One of the advantages of this estimator is that it minimizes the Mean Square Error (MSE). Computing the posterior mean of any unknown requires high-dimensional integration. For example, the mean corresponding to \boldsymbol{f} can be computed from the posterior distribution using Equation (278),
[TABLE]
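The MSE property of the posterior mean can be illustrated on a toy example. The sketch below assumes a made-up one-dimensional discrete posterior (the values and probabilities are illustrative, not from the paper) and compares the expected squared error of the posterior mean with that of the posterior mode.

```python
def expected_squared_error(estimate, values, probs):
    """Expected squared error of a point estimate under a discrete posterior."""
    return sum(p * (x - estimate) ** 2 for x, p in zip(values, probs))


# Hypothetical skewed posterior over five candidate values of a scalar unknown.
values = [0.0, 1.0, 2.0, 3.0, 10.0]
probs = [0.40, 0.30, 0.15, 0.10, 0.05]

posterior_mean = sum(x * p for x, p in zip(values, probs))
posterior_mode = values[probs.index(max(probs))]

mse_mean = expected_squared_error(posterior_mean, values, probs)
mse_mode = expected_squared_error(posterior_mode, values, probs)

print("mean:", posterior_mean, "expected squared error:", mse_mean)
print("mode:", posterior_mode, "expected squared error:", mse_mode)
```

For any estimate c the expected squared error equals the posterior variance plus (c - mean)^2, so no other point estimate can do better than the mean in this sense.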
In general, these computations are not easy. One way to obtain approximate estimates is to approximate p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}) by a separable one q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})=q_{1}(\boldsymbol{f})\;q_{2}(\boldsymbol{v}_{f})\;q_{3}(\boldsymbol{v}_{\epsilon}), then computing the posterior means using the separability. The mean corresponding to \boldsymbol{f} is computed using the corresponding separable distribution q_{1}(\boldsymbol{f}), Equation (279),
[TABLE]
If the approximation of the posterior distribution with a separable one can be done in such a way that conserves the mean, i.e. Equation (280),
[TABLE]
for all the unknowns of the model, a great amount of computational cost is saved. In particular, for the proposed hierarchical model, Equation (273), the posterior distribution, Equation (274), is not separable, making the analytical computation of the PM very difficult. One way to compute the PM in this case is to first approximate the posterior law p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}) with a separable law q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}), Equation (281),
[TABLE]
where the notations from Equation (282) are used
[TABLE]
by minimizing the Kullback-Leibler divergence, defined as:
[TABLE]
where the notations from Equation (284) are used
[TABLE]
Equation (282) selects a partial separability for the approximated posterior distribution q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}), in the sense that a total separability is imposed for the distributions corresponding to the two variances appearing in the hierarchical model, q_{2}\left(\boldsymbol{v}_{f}\right) and q_{3}\left(\boldsymbol{v}_{\epsilon}\right), but not for the distribution corresponding to \boldsymbol{f}. Evidently, a full separability can be imposed by adding the supplementary condition q_{1}(\boldsymbol{f})=\prod_{j=1}^{M}q_{1j}(f_{j}) in Equation (282). This case is considered in Subsection (5.6.3). The minimization can be done via alternate optimization, resulting in the following proportionalities, Equations (285a), (285b) and (285c),
[TABLE]
using the notations:
[TABLE]
and also
[TABLE]
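The alternate optimization behind these proportionalities can be sketched on a toy problem. The example below is not the model of this section: it assumes a small discrete joint distribution `p` over two variables and a fully separable approximation q1(i) q2(j), with all names and numbers chosen for illustration. Each factor is updated in turn as proportional to the exponential of the expected log joint under the other factor, and the Kullback-Leibler divergence is tracked across updates.

```python
import math


def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(q1, q2, p):
    """KL divergence between the separable law q1(i) q2(j) and the joint p[i][j]."""
    return sum(
        q1[i] * q2[j] * math.log(q1[i] * q2[j] / p[i][j])
        for i in range(len(q1))
        for j in range(len(q2))
    )


def update_q1(q2, p):
    # q1(i) proportional to exp( < ln p(i, j) >_{q2(j)} )
    return normalise([
        math.exp(sum(q2[j] * math.log(p[i][j]) for j in range(len(q2))))
        for i in range(len(p))
    ])


def update_q2(q1, p):
    # q2(j) proportional to exp( < ln p(i, j) >_{q1(i)} )
    return normalise([
        math.exp(sum(q1[i] * math.log(p[i][j]) for i in range(len(q1))))
        for j in range(len(p[0]))
    ])


# Hypothetical non-separable joint distribution (entries sum to 1).
p = [[0.30, 0.10, 0.05],
     [0.05, 0.20, 0.30]]

q1 = [0.5, 0.5]
q2 = [1.0 / 3.0] * 3
kl_history = [kl_divergence(q1, q2, p)]
for _ in range(20):
    q1 = update_q1(q2, p)
    kl_history.append(kl_divergence(q1, q2, p))
    q2 = update_q2(q1, p)
    kl_history.append(kl_divergence(q1, q2, p))

print("initial KL:", kl_history[0], "final KL:", kl_history[-1])
```

Each update is the exact minimizer of the divergence with respect to one factor while the other is held fixed, so the recorded divergence never increases. In the continuous hierarchical model the same scheme is applied, with the sums replaced by integrals that turn out to have closed forms.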
Via Equation (276) and Equation (277), the analytical expression of the logarithm of the posterior distribution is obtained, Equation (288):
[TABLE]
Computation of the analytical expression of q_{1}(\boldsymbol{f}).
The proportionality relation corresponding to q_{1}(\boldsymbol{f}) is established in Equation (285a). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right) all the terms free of \boldsymbol{f} can be regarded as constants. Via Equation (288) the integral defined in Equation (287) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral from Equation (289) becomes:
[TABLE]
Noting that \biggl\langle\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon})\biggr\rangle_{q_{2}(\boldsymbol{v}_{f})\;q_{3}(\boldsymbol{v}_{\epsilon})} is a quadratic criterion in \boldsymbol{f} and considering the proportionality from Equation (285a), it can be concluded that q_{1}\left(\boldsymbol{f}\right) is a multivariate Normal distribution. Minimizing the criterion leads to the analytical expression of the corresponding mean. The variance is obtained by identification:
[TABLE]
We note that the expressions of both the mean and the variance depend on expectations corresponding to the two variances of the hierarchical model.
Computation of the analytical expression of q_{2j}(v_{f_j}).
The proportionality relation corresponding to q_{2j}(v_{f_j}) is established in Equation (285b). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right) all the terms free of v_{f_j} can be regarded as constants. Via Equation (288) the integral defined in Equation (287) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral \biggl\langle\|\boldsymbol{V}_{f}^{\frac{1}{2}}\boldsymbol{f}\|\biggr\rangle_{q_{1}(\boldsymbol{f})\;q_{2-j}(v_{f_j})} can be written:
[TABLE]
Considering that q_{1}(\boldsymbol{f}) is a multivariate Normal distribution, Equation (292):
[TABLE]
From Equation (293) and Equation (296):
[TABLE]
from which the proportionality corresponding to q_{2j}(v_{f_j}) can be established:
[TABLE]
leading to the conclusion that q_{2j}(v_{f_j}) is a generalized inverse Gaussian distribution (see Equation (46)) with the following parameters:
[TABLE]
Computation of the analytical expression of q_{3i}(v_{\epsilon_i}).
The proportionality relation corresponding to q_{3i}(v_{\epsilon_i}) is established in Equation (285c). In the expression of \ln p\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right) all the terms free of v_{\epsilon_i} can be regarded as constants. Via Equation (288) the integral defined in Equation (287) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral \biggl\langle\|\boldsymbol{V}_{\epsilon}^{-\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|\biggr\rangle_{q_{1}(\boldsymbol{f})\;q_{3-i}(v_{\epsilon_i})} can be written:
[TABLE]
Considering that q_{1}(\boldsymbol{f}) is a multivariate Normal distribution, Equation (292):
[TABLE]
and considering as constants all terms free of v_{\epsilon_i}:
[TABLE]
where \boldsymbol{H}_{i} is row i of the matrix \boldsymbol{H}, so we can conclude:
[TABLE]
From Equation (300) and Equation (305):
[TABLE]
from which the proportionality corresponding to q_{3i}(v_{\epsilon_i}) can be established:
[TABLE]
leading to the conclusion that q_{3i}(v_{\epsilon_i}) are Inverse Gamma distributions with the following parameters:
[TABLE]
Equations (292), (299) and (308) resume the distributions families and the corresponding parameters for q_{1}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}}), a multivariate Normal distribution, q_{2j}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}), Inverse Gamma distributions and q_{3i}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}}), , generalized inverse Gaussian distributions. However, the parameters corresponding to the multivariate Normal distribution are expressed via {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{\mbox{\boldmathV}}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}}^{-1} and {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{\mbox{\boldmathV}}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}} (and by extension all elements forming the three matrices {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{v}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}}^{-1}, and {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{v}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}, ).
Computation of the analytical expression {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{\mbox{\boldmathV}}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}}.
For a generalized inverse Gaussian distribution, the following relation holds:
[TABLE]
The function appearing in this relation is the modified Bessel function of the second kind.
To prove the above relation we consider the direct computation, using the analytical expression of the generalized inverse Gaussian distribution:
[TABLE]
The fact that the generalized inverse Gaussian density integrates to one is obvious. The value of the ratio between the two modified Bessel functions of the second kind follows from expressing the modified Bessel function of the second kind via the modified Bessel function of the first kind:
[TABLE]
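As a numerical sanity check of this kind of moment relation, the sketch below integrates a generalized inverse Gaussian density directly. The parameterization (density proportional to x^{p-1} exp(-(a x + b/x)/2)), the order p = 1/2 and the helper name `gig_moment` are assumptions made for illustration and are not taken from the text. Under these assumptions the symmetry of the modified Bessel function of the second kind in its order makes the Bessel ratio equal to one, so the mean of 1/x should reduce to sqrt(a/b).

```python
import math


def gig_moment(power, p, a, b, t_max=40.0, steps=80001):
    """E[x**power] for the density proportional to x**(p-1) * exp(-(a*x + b/x)/2).

    Assumed parameterization (not taken from the paper). The integral over
    x in (0, inf) uses the substitution x = exp(t) and the trapezoidal rule
    on t in [-t_max, t_max].
    """
    h = 2.0 * t_max / (steps - 1)
    num = 0.0
    den = 0.0
    for k in range(steps):
        t = -t_max + k * h
        weight = 0.5 if k in (0, steps - 1) else 1.0
        # log of the unnormalised density times the Jacobian dx = exp(t) dt
        log_f = p * t - 0.5 * (a * math.exp(t) + b * math.exp(-t))
        f = math.exp(log_f) if log_f > -700.0 else 0.0
        num += weight * f * math.exp(power * t)
        den += weight * f
    # The step h cancels in the ratio, so it is not applied to either sum.
    return num / den


a, b = 3.0, 2.0
numeric = gig_moment(-1.0, 0.5, a, b)   # numerical E[1/x]
closed_form = math.sqrt(a / b)          # value implied by the Bessel symmetry
```

Comparing `numeric` with `closed_form` for a few values of `a` and `b` is a quick way to catch a sign or parameterization slip before the expectation is used inside an iterative algorithm.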
Since q_{2j}(v_{f_{j}}), j\in\{1,2,\ldots,M\}, are generalized inverse Gaussian distributions whose parameters, among them \widetilde{h}_{f_{j}}, were obtained above, the expectations \widetilde{v}_{f_{j}} can be expressed via these parameters using Equation (309):
[TABLE]
Using the notation introduced in (290):
[TABLE]
Equation (312) introduces a new notation for \widetilde{\boldsymbol{V}}_{f}. Within the model this quantity was expressed via unknown expectations, but in the form given by Equation (312) it no longer contains any integrals to be computed. The new notation therefore represents the final analytical expression used for expressing the density functions.
Computation of the analytical expression {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{\mbox{\boldmathV}}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}}^{-1}.
For an Inverse Gamma distribution with given scale and shape parameters, the following relation holds:
[TABLE]
The proof of the above relation is done by direct computation, using the analytical expression of the Inverse Gamma distribution:
[TABLE]
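For readers who want the intermediate step, the direct computation can be sketched as follows. It assumes the usual parameterization of the Inverse Gamma density with shape \alpha and scale \beta; the symbols \alpha, \beta and v are placeholders chosen here and are not necessarily those of Equation (313).

```latex
\begin{aligned}
\mathbb{E}\left[v^{-1}\right]
&= \int_{0}^{\infty} v^{-1}\,\frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
   v^{-\alpha-1}\exp\left(-\frac{\beta}{v}\right)\mathrm{d}v \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\int_{0}^{\infty}
   v^{-(\alpha+1)-1}\exp\left(-\frac{\beta}{v}\right)\mathrm{d}v
 = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
   \frac{\Gamma(\alpha+1)}{\beta^{\alpha+1}}
 = \frac{\alpha}{\beta}.
\end{aligned}
```

The second integral is the normalization constant of an Inverse Gamma density with shape \alpha+1 and the same scale, which is why no further integration is needed.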
Since q_{3i}(v_{\epsilon_{i}}), i\in\{1,2,\ldots,N\}, are Inverse Gamma distributions, the expectations \widetilde{v}_{\epsilon_{i}}^{-1} can be expressed via the parameters of these distributions using Equation (313):
[TABLE]
Using the notation introduced in (290):
[TABLE]
Equation (315) introduces a new notation for \widetilde{\boldsymbol{V}}_{\epsilon}^{-1}. Within the model this quantity was expressed via unknown expectations, but in the form given by Equation (315) it no longer contains any integrals to be computed. The new notation therefore represents the final analytical expression used for expressing the density functions. Using Equation (312) and Equations (292), (299) and (308), the final analytical expressions of the separable distributions are presented in Equations (316c), (316g) and (316j).
[TABLE]
Equation (316c) establishes the dependency between the parameters corresponding to the multivariate Normal distribution q_{1}(\boldsymbol{f}) and the other parameters involved in the hierarchical model: the mean \widehat{\boldsymbol{f}} and the covariance matrix \widehat{\boldsymbol{\Sigma}} depend on \widehat{\boldsymbol{V}}_{\epsilon} and \widehat{\boldsymbol{V}}_{f} which, via Equation (312), are defined using \{\widehat{a}_{f_{j}},\widetilde{h}_{f_{j}}\}, j\in\{1,2,\ldots,M\}, and \{\widehat{a}_{\epsilon_{i}},\widetilde{h}_{\epsilon_{i}}\}, i\in\{1,2,\ldots,N\}. The dependency between the parameters of the multivariate Normal distribution q_{1}(\boldsymbol{f}) and the parameters of the generalized inverse Gaussian distributions q_{2j}(v_{f_{j}}), j\in\{1,2,\ldots,M\}, and of the Inverse Gamma distributions q_{3i}(v_{\epsilon_{i}}), i\in\{1,2,\ldots,N\}, is presented in Figure (37).
Equation (316g) establishes the dependency between the parameters corresponding to the generalized inverse Gaussian distributions q_{2j}(v_{f_{j}}), j\in\{1,2,\ldots,M\}, and the other parameters involved in the hierarchical model: the parameters \{\widehat{a}_{f_{j}},\widetilde{h}_{f_{j}}\}, j\in\{1,2,\ldots,M\}, depend on the mean \widehat{\boldsymbol{f}} and the covariance matrix \widehat{\boldsymbol{\Sigma}} of the multivariate Normal distribution q_{1}(\boldsymbol{f}), Figure (38).
Equation (316j) establishes the dependency between the parameters corresponding to the Inverse Gamma distributions q_{3i}(v_{\epsilon_{i}}), i\in\{1,2,\ldots,N\}, and the other parameters involved in the hierarchical model: the shape and scale parameters \{\widehat{a}_{\epsilon_{i}},\widetilde{h}_{\epsilon_{i}}\}, i\in\{1,2,\ldots,N\}, depend on the mean \widehat{\boldsymbol{f}} and the covariance matrix \widehat{\boldsymbol{\Sigma}} of the multivariate Normal distribution q_{1}(\boldsymbol{f}), Figure (39).
The iterative algorithm obtained via PM estimation is presented in Figure (40).
5.6.3 Posterior Mean estimation via VBA, full separability
In this subsection, the Posterior Mean (PM) estimation is again considered, but via a fully separable approximation. The posterior distribution is approximated by a fully separable distribution q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}), i.e. a supplementary condition is added in Equation (282):
[TABLE]
As in Subsection (5.6.2), the approximation is done by minimizing the Kullback-Leibler divergence, Equation (283), via alternate optimization, resulting in the proportionalities of Equations (318a), (318b) and (318c),
[TABLE]
using the notations introduced in Equation (286), Equation (287) and Equation (319):
[TABLE]
The analytical expression of the logarithm of the posterior distribution \ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}) is obtained in Equation (288).
Computation of the analytical expression of q_{1}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}}).
The proportionality relation corresponding to q_{1}(\boldsymbol{f}) is established in Equation (318a). In the expression of \ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}) all the terms free of f_{j} can be regarded as constants:
[TABLE]
Using Equation (290) the integral from Equation (320) becomes:
[TABLE]
Considering all the terms free of f_{j} as constants, the first norm can be written:
[TABLE]
where \boldsymbol{H}^{j} represents the column j of the matrix \boldsymbol{H}, \boldsymbol{H}^{-j} represents the matrix \boldsymbol{H} except the column j, \boldsymbol{H}^{j}, and \boldsymbol{f}^{-j} represents the vector \boldsymbol{f} except the element f_{j}. Introducing the notation
[TABLE]
the expectation of the first norm becomes:
[TABLE]
Considering all the terms free of f_{j} as constants, the expectation of the second norm becomes:
[TABLE]
From Equations (318a), (321), (324) and (325), the proportionality for q_{1j}(f_{j}) becomes:
[TABLE]
Considering the criterion J({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j})=\left(\|{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{\mbox{\boldmathV}}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}}^{\frac{1}{2}}{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathH}}^{{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}j}}\|^{2}+{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{v}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}\right){\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}^{2}-2{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathH}}^{{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}j}T}{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\widetilde{\mbox{\boldmathV}}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}}\left({\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathg}}-{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathH}}^{-{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}j}}{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}}^{-{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}\right){\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j} which is quadratic, we conclude q_{1j}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}) is a Normal distribution. 
For computing the mean of the Normal distribution, it is sufficient to compute the solution that minimizes the criterion J({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}):
[TABLE]
The variance can be obtained by identification. The analytical expressions for the mean and the variance corresponding to the Normal distributions q_{1j}(f_{j}) are presented in Equation (328).
[TABLE]
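The scalar updates described above can be sketched in code. This is a minimal illustration under stated assumptions rather than the algorithm of Figure (41): `w` stands for the diagonal entries of the matrix weighting the residual in the criterion J(f_j), `u` for the scalars multiplying f_j^2, and the criterion is assumed to enter the density as exp(-J/2), which is what fixes the variance. The function name `coordinate_sweep` and the small test problem are hypothetical.

```python
def coordinate_sweep(H, g, w, u, f):
    """One pass of the scalar Normal updates over all coordinates of f.

    H : list of N rows, each a list of M floats (forward matrix)
    g : list of N observations
    w : list of N weights applied to the squared residuals (assumed given)
    u : list of M weights applied to f_j**2 (assumed given)
    f : list of M current means, updated in place
    Returns the list of M variances of the scalar Normal factors.
    """
    n_rows, n_cols = len(H), len(f)
    variances = [0.0] * n_cols
    for j in range(n_cols):
        # Residual with the contribution of column j removed: g - H^{-j} f^{-j}
        resid = [g[i] - sum(H[i][k] * f[k] for k in range(n_cols) if k != j)
                 for i in range(n_rows)]
        # Coefficient of f_j**2 in the quadratic criterion J(f_j)
        curvature = sum(w[i] * H[i][j] ** 2 for i in range(n_rows)) + u[j]
        # Half of the coefficient of the linear term of J(f_j), with sign flipped
        linear = sum(H[i][j] * w[i] * resid[i] for i in range(n_rows))
        f[j] = linear / curvature       # minimiser of J, i.e. the mean
        variances[j] = 1.0 / curvature  # variance obtained by identification
    return variances


H = [[1.0, 1.0], [0.0, 1.0]]
g = [2.0, 1.0]
f = [0.0, 0.0]
for _ in range(200):
    var = coordinate_sweep(H, g, [1.0, 1.0], [1.0, 1.0], f)
```

With fixed weights, repeated sweeps are a Gauss-Seidel iteration on the normal equations, so the means converge to the solution of the corresponding regularized least-squares problem; in the full algorithm the weights would be refreshed between sweeps from the other factors.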
Computation of the analytical expression of q_{2j}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}).
The proportionality relation corresponding to q_{2j}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}) established in Equation (318b) refers to {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}}, so in the expression of \ln p\left({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}|{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathg}},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right) all the terms free of {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}j}} can be regarded as constants,
[TABLE]
so the integral of the logarithm becomes:
[TABLE]
Equation (330) leads to the conclusion that q_{2j}(v_{f_{j}}) is a generalized inverse Gaussian distribution. Equation (331) presents the analytical expressions for the parameters of this distribution.
[TABLE]
Computation of the analytical expression of q_{3i}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}}).
The proportionality relation corresponding to q_{3i}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}}) established in Equation (318c) refers to {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}} so in the expression of \ln p\left({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}|{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\mbox{\boldmathg}},\beta_{f},\alpha_{\epsilon},\beta_{\epsilon}\right) all the terms free of {\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}} can be regarded as constants:
[TABLE]
Introducing the notation
[TABLE]
the expectancy of the logarithm becomes
[TABLE]
so the proportionality relation for q_{3i}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}v}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}_{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}i}}) from Equation (318c) can be written:
[TABLE]
Equation (335) shows that q_{3i}(v_{\epsilon_{i}}) are Inverse Gamma distributions, as in Subsection (5.6.2). The analytical expressions of the corresponding parameters are presented in Equation (336).
[TABLE]
Since q_{2j}(v_{f_{j}}), j\in\{1,2,\ldots,M\}, are generalized inverse Gaussian distributions and q_{3i}(v_{\epsilon_{i}}), i\in\{1,2,\ldots,N\}, are Inverse Gamma distributions, it is easy to obtain analytical expressions for \widetilde{\boldsymbol{V}}_{\epsilon}, defined in Equation (290), and \widetilde{v}_{f_{j}}, j\in\{1,2,\ldots,M\}, obtaining the same expressions as in Equation (312). Using Equation (312) and Equations (328), (331) and (336), the final analytical expressions of the separable distributions are presented in Equations (337c), (337g) and (337k).
[TABLE]
Equations (337c), (337g) and (337k) establish dependencies between the parameters of the distributions, very similar to the ones presented in Figures (37), (38) and (39). The iterative algorithm obtained via PM estimation with full separability is presented in Figure (41).
5.7 Laplace hierarchical model: stationary Laplace uncertainties model, unknown uncertainties variance
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a stationary Laplace uncertainties model is proposed, i.e. a multivariate Laplace distribution expressed via LPM is used under the following two assumptions:
a) each element of the uncertainties vector has the same variance, v_{\epsilon};
b) the variance v_{\epsilon} is unknown;
[TABLE]
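The idea of expressing a Laplace distribution as a Normal variance mixture can be checked by simulation. The sketch below assumes an Exponential mixing distribution parameterized by its rate, which may differ from the parameterization of Equation (43); the function name `sample_laplace_via_mixture` and the numerical values are illustrative only. Under this assumption the marginal is a Laplace distribution with scale 1/sqrt(2 * rate).

```python
import math
import random


def sample_laplace_via_mixture(rate, n, seed=0):
    """Draw n samples of f with v ~ Exponential(rate) and f | v ~ Normal(0, v)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        v = rng.expovariate(rate)  # mixing variance, mean 1 / rate
        # random.gauss expects a standard deviation, hence the square root
        samples.append(rng.gauss(0.0, math.sqrt(v)))
    return samples


rate = 2.0
scale = 1.0 / math.sqrt(2.0 * rate)  # Laplace scale implied by the mixture
draws = sample_laplace_via_mixture(rate, 200_000)
mean_abs = sum(abs(x) for x in draws) / len(draws)       # close to scale
second_moment = sum(x * x for x in draws) / len(draws)   # close to 2 * scale**2
```

For a Laplace distribution with scale s the mean absolute value is s and the second moment is 2 s^2, so both empirical quantities give a quick consistency check on the mixture representation.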
5.8 Laplace hierarchical model: non-stationary Laplace uncertainties model, unknown uncertainties variances
- the hierarchical model uses the Laplace distribution as a prior;
- the Laplace prior distribution is expressed via LPM, Equation (43), considering the variance \boldsymbol{v}_{f} as unknown;
- the likelihood is derived from the distribution proposed for modelling the uncertainties vector \boldsymbol{\epsilon};
- for the uncertainties vector \boldsymbol{\epsilon} a non-stationary Laplace uncertainties model is proposed, i.e. a multivariate Laplace distribution expressed via LPM is used under the following assumption:
a) the variance vector \boldsymbol{v}_{\epsilon} is unknown;
[TABLE]
The hierarchical model built over the linear forward model, Equation (1), using as a prior for \boldsymbol{f} a Laplace distribution, expressed via the Laplace Prior Model (LPM), Equation (43), and modelling the uncertainties of the model using the non-stationary Laplace Uncertainties Model (nsLUM), Equation (89), is presented in Equation (339). The posterior distribution is obtained via the Bayes rule, Equation (340):
[TABLE]
The goal is to estimate the unknowns of the hierarchical model, namely \boldsymbol{f}, the main unknown of the linear forward model, Equation (1), which was supposed sparse and consequently modelled via the Laplace distribution, and the two variances appearing in the hierarchical model, Equation (339): the variance corresponding to the sparse structure \boldsymbol{f}, namely \boldsymbol{v}_{f}, and the variance corresponding to the uncertainties of the model \boldsymbol{\epsilon}, namely \boldsymbol{v}_{\epsilon}.
5.8.1 Joint MAP estimation
First, the Joint Maximum A Posteriori (JMAP) estimation is considered: the unknowns are estimated on the basis of the available data \boldsymbol{g}, by maximizing the posterior distribution:
[TABLE]
where for the second equality the criterion {\cal L}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}}) is defined as:
[TABLE]
The MAP estimation corresponds to the solution minimizing the criterion {\cal L}({\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathf}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}f}},{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\mbox{\boldmathv}}_{{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\epsilon}}). From the analytical expression of the posterior distribution, Equation (340) and the definition of the criterion Equation (342), we obtain:
[TABLE]
One of the simplest optimization algorithms that can be used is an alternate optimization of the criterion \mathcal{L}(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}) with respect to each unknown:
- With respect to \boldsymbol{f}:
[TABLE]
- With respect to v_{f_{j}}, j\in\{1,2,\ldots,M\}:
[TABLE]
- With respect to v_{\epsilon_{i}}, i\in\{1,2,\ldots,N\}:
First, we develop the norm \|\boldsymbol{V}_{\epsilon}^{\frac{1}{2}}(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f})\|:
[TABLE]
where \boldsymbol{H}_{i} denotes the row i of the matrix \boldsymbol{H}, i\in\{1,2,\ldots,N\}, i.e. \boldsymbol{H}_{i}=[h_{i1},h_{i2},\ldots,h_{iM}].
[TABLE]
The iterative algorithm obtained via JMAP estimation is presented in Figure (42).
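The step of the alternate optimization that updates \boldsymbol{f} while the variances are held fixed can be sketched as a weighted, regularized least-squares solve. The sketch assumes that the part of the criterion depending on \boldsymbol{f} is a sum of weighted squared residuals plus a weighted quadratic penalty; `w` and `u` stand for the diagonal weights on the residuals and on the unknowns, whatever their exact expression in terms of the variances. The helper names `solve_linear`, `weighted_residual` and `f_step` are hypothetical, and the variance updates are not reproduced here.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def weighted_residual(H, g, w, f):
    """Row-wise form of the weighted norm: sum_i w_i * (g_i - H_i f)**2."""
    return sum(w[i] * (g[i] - sum(h * x for h, x in zip(H[i], f))) ** 2
               for i in range(len(g)))


def f_step(H, g, w, u):
    """Minimise sum_i w_i (g_i - H_i f)**2 + sum_j u_j f_j**2 over f."""
    n_rows, n_cols = len(H), len(H[0])
    # Normal equations: (H^T diag(w) H + diag(u)) f = H^T diag(w) g
    A = [[sum(w[i] * H[i][j] * H[i][k] for i in range(n_rows))
          + (u[j] if j == k else 0.0)
          for k in range(n_cols)] for j in range(n_cols)]
    b = [sum(w[i] * H[i][j] * g[i] for i in range(n_rows)) for j in range(n_cols)]
    return solve_linear(A, b)


H = [[1.0, 1.0], [0.0, 1.0]]
g = [2.0, 1.0]
w, u = [2.0, 1.0], [1.0, 1.0]
f_hat = f_step(H, g, w, u)
cost = weighted_residual(H, g, w, f_hat) + sum(uj * fj ** 2 for uj, fj in zip(u, f_hat))
```

The row-wise form in `weighted_residual` mirrors the development of the norm above: each residual weight multiplies only the squared residual of its own row, which is what makes the update of each v_{\epsilon_{i}} separable.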
5.8.2 Posterior Mean estimation via VBA, partial separability
In this subsection, the Posterior Mean (PM) estimation is considered. The Joint MAP computes the mode of the posterior distribution, whereas the PM computes its mean. One of the advantages of this estimator is that it minimizes the Mean Square Error (MSE). Computing the posterior mean of any unknown requires high-dimensional integration. For example, the mean corresponding to \boldsymbol{f} can be computed from the posterior distribution using Equation (344),
[TABLE]
In general, these computations are not easy. One way to obtain approximate estimates is to approximate p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon}) by a separable distribution q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})=q_{1}(\boldsymbol{f})\;q_{2}(\boldsymbol{v}_{f})\;q_{3}(\boldsymbol{v}_{\epsilon}), and then to compute the posterior means using the separability. The mean corresponding to \boldsymbol{f} is computed using the corresponding separable distribution q_{1}(\boldsymbol{f}), Equation (345),
[TABLE]
If the approximation of the posterior distribution with a separable one can be done in such a way that conserves the mean, i.e. Equation (346),
[TABLE]
for all the unknowns of the model, a great amount of computational cost is saved. In particular, for the proposed hierarchical model, Equation (96), the posterior distribution, Equation (340), is not separable, making the analytical computation of the PM very difficult. One way to compute the PM in this case is to first approximate the posterior law p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon}) with a separable law q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon}), Equation (347),
[TABLE]
where the notations from Equation (348) are used
[TABLE]
by minimizing the Kullback-Leibler divergence, defined as:
[TABLE]
where the notations from Equation (350) are used
[TABLE]
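For reference, the Kullback-Leibler divergence minimized in a variational Bayesian approximation is usually written in the following standard form. This is a sketch in generic notation, assuming the convention in which the approximating law $q$ is the first argument; the notation of Equation (349) may differ.

```latex
\mathrm{KL}\left(q:p\right)
  = \iiint q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon})\,
    \ln\frac{q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon})}
            {p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}\,|\,\boldsymbol{g},\beta_{f},\beta_{\epsilon})}
    \,\mathrm{d}\boldsymbol{f}\,\mathrm{d}\boldsymbol{v}_{f}\,\mathrm{d}\boldsymbol{v}_{\epsilon}
  \;\geq\; 0
```

The divergence vanishes only when $q$ and $p$ coincide almost everywhere, which is why its minimization over the separable family yields the closest separable law in this sense.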
Equation (348) selects a partial separability for the approximated posterior distribution $q(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$, in the sense that a total separability is imposed for the distributions corresponding to the two variances appearing in the hierarchical model, $q_{2}(\boldsymbol{v}_{f})$ and $q_{3}(\boldsymbol{v}_{\epsilon})$, but not for the distribution corresponding to $\boldsymbol{f}$. A full separability can also be imposed, by adding the supplementary condition $q_{1}(\boldsymbol{f})=\prod_{j=1}^{M}q_{1j}(f_{j})$ in Equation (348). This case is considered in Subsection (5.8.3). The minimization can be done via alternate optimization, resulting in the proportionalities from Equations (351a), (351b) and (351c),
[TABLE]
using the notations:
[TABLE]
and also
[TABLE]
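The alternate optimization of a Kullback-Leibler divergence over a factorized family has a well-known generic solution: each factor is proportional to the exponential of the log-posterior averaged over all the other factors. Written with the factors used here, this gives the sketch below. It is an assumed generic form, not a transcription of Equations (351a), (351b) and (351c); $q_{2-j}$ and $q_{3-i}$ denote the product of all variance factors except the $j$-th and the $i$-th one, respectively.

```latex
\begin{aligned}
q_{1}(\boldsymbol{f}) &\propto
  \exp\Bigl\{\bigl\langle \ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})
  \bigr\rangle_{q_{2}(\boldsymbol{v}_{f})\,q_{3}(\boldsymbol{v}_{\epsilon})}\Bigr\},\\
q_{2j}(v_{f_{j}}) &\propto
  \exp\Bigl\{\bigl\langle \ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})
  \bigr\rangle_{q_{1}(\boldsymbol{f})\,q_{2-j}(v_{f_{j}})\,q_{3}(\boldsymbol{v}_{\epsilon})}\Bigr\},\\
q_{3i}(v_{\epsilon_{i}}) &\propto
  \exp\Bigl\{\bigl\langle \ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})
  \bigr\rangle_{q_{1}(\boldsymbol{f})\,q_{2}(\boldsymbol{v}_{f})\,q_{3-i}(v_{\epsilon_{i}})}\Bigr\}.
\end{aligned}
```

Because each right-hand side depends on the other factors only through a few expectations, the three relations can be iterated in turn until the parameters stabilize.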
Via Equation (342) and Equation (343), the analytical expression of the logarithm of the posterior distribution is obtained, Equation (354):
[TABLE]
Computation of the analytical expression of $q_{1}(\boldsymbol{f})$.
The proportionality relation corresponding to $q_{1}(\boldsymbol{f})$ is established in Equation (351a). In the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ all the terms free of $\boldsymbol{f}$ can be regarded as constants. Via Equation (354) the integral defined in Equation (353) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral from Equation (355) becomes:
[TABLE]
Noting that $\bigl\langle\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})\bigr\rangle_{q_{2}(\boldsymbol{v}_{f})\;q_{3}(\boldsymbol{v}_{\epsilon})}$ is a quadratic criterion and considering the proportionality from Equation (351a), it can be concluded that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution. Minimizing the criterion leads to the analytical expression of the corresponding mean. The variance is obtained by identification:
[TABLE]
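The minimization and identification steps can be sketched as follows. The sketch assumes that the averaged log-posterior contributes the quadratic criterion $J(\boldsymbol{f})$ below with a factor $-\frac{1}{2}$ in the exponent, which is consistent with the coordinate-wise criterion $J(f_{j})$ of Subsection (5.8.3); it is not a transcription of Equation (358).

```latex
\begin{aligned}
J(\boldsymbol{f}) &=
  \|\widetilde{\boldsymbol{V}}_{\epsilon}^{\frac{1}{2}}(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f})\|^{2}
  +\|\widetilde{\boldsymbol{V}}_{f}^{\frac{1}{2}}\boldsymbol{f}\|^{2},\\
\nabla J(\widehat{\boldsymbol{f}}) = \boldsymbol{0}
  \;&\Longrightarrow\;
  \widehat{\boldsymbol{f}} =
  \bigl(\boldsymbol{H}^{T}\widetilde{\boldsymbol{V}}_{\epsilon}\boldsymbol{H}+\widetilde{\boldsymbol{V}}_{f}\bigr)^{-1}
  \boldsymbol{H}^{T}\widetilde{\boldsymbol{V}}_{\epsilon}\,\boldsymbol{g},\\
\widehat{\boldsymbol{\Sigma}} &=
  \bigl(\boldsymbol{H}^{T}\widetilde{\boldsymbol{V}}_{\epsilon}\boldsymbol{H}+\widetilde{\boldsymbol{V}}_{f}\bigr)^{-1}.
\end{aligned}
```

Under this assumption the covariance is the inverse of the matrix of the quadratic form, and the mean is the solution of the corresponding normal equations.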
We note that the expressions of both the mean and the variance depend on expectations corresponding to the two variances of the hierarchical model.
Computation of the analytical expression of $q_{2j}(v_{f_{j}})$.
The proportionality relation corresponding to $q_{2j}(v_{f_{j}})$ is established in Equation (351b). In the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ all the terms free of $v_{f_{j}}$ can be regarded as constants. Via Equation (354) the integral defined in Equation (353) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\bigl\langle\|\boldsymbol{V}_{f}^{\frac{1}{2}}\boldsymbol{f}\|\bigr\rangle_{q_{1}(\boldsymbol{f})\;q_{2-j}(v_{f_{j}})}$ can be written:
[TABLE]
Considering that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution, Equation (358):
[TABLE]
From Equation (359) and Equation (362):
[TABLE]
from which the proportionality corresponding to $q_{2j}(v_{f_{j}})$ can be established:
[TABLE]
leading to the conclusion that $q_{2j}(v_{f_{j}})$ is a generalized inverse Gaussian distribution (see Equation (46)) with the following parameters:
[TABLE]
Computation of the analytical expression of $q_{3i}(v_{\epsilon_{i}})$.
The proportionality relation corresponding to $q_{3i}(v_{\epsilon_{i}})$ is established in Equation (351c). In the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ all the terms free of $v_{\epsilon_{i}}$ can be regarded as constants. Via Equation (354) the integral defined in Equation (353) becomes:
[TABLE]
Introducing the notations:
[TABLE]
the integral $\bigl\langle\|\boldsymbol{V}_{\epsilon}^{\frac{1}{2}}\left(\boldsymbol{g}-\boldsymbol{H}\boldsymbol{f}\right)\|\bigr\rangle_{q_{1}(\boldsymbol{f})\;q_{3-i}(v_{\epsilon_{i}})}$ can be written:
[TABLE]
Considering that $q_{1}(\boldsymbol{f})$ is a multivariate Normal distribution, Equation (358):
[TABLE]
and considering as constants all terms free of $v_{\epsilon_{i}}$:
[TABLE]
where $\boldsymbol{H}_{i}$ is the line $i$ of the matrix $\boldsymbol{H}$, so we can conclude:
[TABLE]
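The step above relies on a standard identity: the expectation of a squared linear residual under a law with known mean and covariance splits into a squared residual at the mean plus a covariance term. A minimal Python sketch of this identity follows; the function name and the list-based argument layout are illustrative assumptions, not the document's notation.

```python
def expected_squared_residual(g_i, h_i, mean, cov):
    """Return <(g_i - h_i f)^2> for a law of f with the given mean and covariance.

    g_i  : scalar observation (hypothetical name for the i-th entry of g)
    h_i  : list, the i-th row of H
    mean : list, mean of f under q_1
    cov  : list of lists, covariance of f under q_1
    """
    m = len(mean)
    # Residual evaluated at the mean of f.
    bias = g_i - sum(h_i[k] * mean[k] for k in range(m))
    # Quadratic form h_i * cov * h_i^T: the extra term created by the spread of f.
    spread = sum(h_i[k] * cov[k][l] * h_i[l] for k in range(m) for l in range(m))
    return bias * bias + spread
```

Taking the row to be a unit vector and the observation to be zero gives the special case in which the expectation of $f_{j}^{2}$ equals the squared mean plus the $j$-th diagonal element of the covariance.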
From Equation (366) and Equation (371):
[TABLE]
from which the proportionality corresponding to $q_{3i}(v_{\epsilon_{i}})$ can be established:
[TABLE]
leading to the conclusion that $q_{3i}(v_{\epsilon_{i}})$ is a generalized inverse Gaussian distribution with the following parameters:
[TABLE]
Equations (358), (365) and (374) summarize the distribution families and the corresponding parameters for $q_{1}(\boldsymbol{f})$, a multivariate Normal distribution, and for $q_{2j}(v_{f_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and $q_{3i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, generalized inverse Gaussian distributions. However, the parameters corresponding to the multivariate Normal distribution are expressed via $\widetilde{\boldsymbol{V}}_{\epsilon}$ and $\widetilde{\boldsymbol{V}}_{f}$ (and by extension via all the elements forming the two matrices, $\widetilde{v}_{\epsilon_{i}}$, $i\in\left\{1,2,\ldots,N\right\}$, and $\widetilde{v}_{f_{j}}$, $j\in\left\{1,2,\ldots,M\right\}$).
Computation of the analytical expressions of $\widetilde{\boldsymbol{V}}_{\epsilon}$ and $\widetilde{\boldsymbol{V}}_{f}$.
For a generalized inverse Gaussian distribution, parameterized as in Equation (46), the following relation holds:
[TABLE]
where the function appearing in the relation is the modified Bessel function of the second kind.
To prove the above relation we consider the direct computation, using the analytical expression of the generalized inverse Gaussian distribution:
[TABLE]
The fact that the generalized inverse Gaussian density integrates to one is obvious. The stated value of the ratio between the two modified Bessel functions of the second kind follows from expressing the modified Bessel function of the second kind via the modified Bessel function of the first kind:
[TABLE]
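A relation of this kind can be checked numerically. The Python sketch below assumes the common parameterization in which the density is proportional to $x^{p-1}\exp\{-(ax+b/x)/2\}$, which may differ from Equation (46), and it only covers the illustrative special case $p=-\frac{1}{2}$, where the modified Bessel functions of the second kind have elementary closed forms. All function names are hypothetical.

```python
import math

def gig_moment(r, p, a, b, t_max=40.0, n=8001):
    """Numerically evaluate E[x^r] for a density proportional to
    x^(p-1) * exp(-(a*x + b/x)/2) on x > 0 (assumed parameterization).

    The integral is computed with the trapezoid rule in t = ln(x).
    """
    h = 2.0 * t_max / (n - 1)  # step of the uniform grid in t
    num = 0.0
    den = 0.0
    for k in range(n):
        t = -t_max + k * h
        # The weight underflows to 0.0 far from the mode, which is harmless.
        w = math.exp(-0.5 * (a * math.exp(t) + b * math.exp(-t)))
        den += math.exp(p * t) * w        # x^(p-1) dx = e^(p t) dt
        num += math.exp((p + r) * t) * w  # same integrand times x^r
    return num / den

def mean_for_p_minus_half(a, b):
    """Closed form of E[x] for p = -1/2, using K_{1/2}(z) / K_{-1/2}(z) = 1."""
    return math.sqrt(b / a)

def inverse_mean_for_p_minus_half(a, b):
    """Closed form of E[1/x] for p = -1/2, using K_{3/2}(z) / K_{1/2}(z) = 1 + 1/z."""
    return math.sqrt(a / b) * (1.0 + 1.0 / math.sqrt(a * b))
```

Comparing `gig_moment(1, -0.5, a, b)` with `mean_for_p_minus_half(a, b)` for a few positive values of `a` and `b` is a quick sanity check of the Bessel-ratio argument before the expectations are plugged into an iterative algorithm.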
Since $q_{2j}(v_{f_{j}})$, $j\in\left\{1,2,\ldots,M\right\}$, and $q_{3i}(v_{\epsilon_{i}})$, $i\in\left\{1,2,\ldots,N\right\}$, are generalized inverse Gaussian distributions, with the corresponding parameters (among them $\widetilde{h}_{f_{j}}$ and $\widetilde{h}_{\epsilon_{i}}$, respectively) given in Equations (365) and (374), the expectations $\widetilde{v}_{f_{j}}$ and $\widetilde{v}_{\epsilon_{i}}$ can be expressed via the parameters of the two generalized inverse Gaussian distributions using Equation (375):
[TABLE]
Using the notation introduced in Equation (356):
[TABLE]
In Equation (378), new notations are introduced for $\widetilde{\boldsymbol{V}}_{f}$ and $\widetilde{\boldsymbol{V}}_{\epsilon}$. Both quantities were so far expressed via unknown expectations, but via Equation (378) they no longer contain any integrals to be computed. Therefore, the new notations represent the final analytical expressions used for expressing the density functions of the separable distributions. Using Equation (378) and Equations (358), (365) and (374), the final analytical expressions of the separable distributions are presented in Equations (379c), (379g) and (379k).
[TABLE]
Equation (379c) establishes the dependency between the parameters corresponding to the multivariate Normal distribution $q_{1}(\boldsymbol{f})$ and the other parameters involved in the hierarchical model: the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}$ depend on $\widehat{\boldsymbol{V}}_{\epsilon}$ and $\widehat{\boldsymbol{V}}_{f}$ which, via Equation (378), are defined using $\left\{\widehat{a}_{f_{j}},\widetilde{h}_{f_{j}}\right\},j\in\left\{1,2,\ldots,M\right\}$ and $\left\{\widehat{a}_{\epsilon_{i}},\widetilde{h}_{\epsilon_{i}}\right\},i\in\left\{1,2,\ldots,N\right\}$. The dependency between the parameters of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$ and the parameters of the generalized inverse Gaussian distributions $q_{2j}(v_{f_{j}}),j\in\left\{1,2,\ldots,M\right\}$ and $q_{3i}(v_{\epsilon_{i}}),i\in\left\{1,2,\ldots,N\right\}$ is presented in Figure (43).
Equation (379g) establishes the dependency between the parameters corresponding to the generalized inverse Gaussian distributions $q_{2j}(v_{f_{j}}),j\in\left\{1,2,\ldots,M\right\}$ and the other parameters involved in the hierarchical model: the parameters $\left\{\widehat{a}_{f_{j}},\widetilde{h}_{f_{j}}\right\},j\in\left\{1,2,\ldots,M\right\}$ depend on the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}$ of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, Figure (44).
Equation (379k) establishes the dependency between the parameters corresponding to the generalized inverse Gaussian distributions $q_{3i}(v_{\epsilon_{i}}),i\in\left\{1,2,\ldots,N\right\}$ and the other parameters involved in the hierarchical model: the parameters $\left\{\widehat{a}_{\epsilon_{i}},\widetilde{h}_{\epsilon_{i}}\right\},i\in\left\{1,2,\ldots,N\right\}$ depend on the mean $\widehat{\boldsymbol{f}}$ and the covariance matrix $\widehat{\boldsymbol{\Sigma}}$ of the multivariate Normal distribution $q_{1}(\boldsymbol{f})$, Figure (45).
The iterative algorithm obtained via PM estimation is presented in Figure (46).
5.8.3 Posterior Mean estimation via VBA, full separability
In this subsection, the Posterior Mean (PM) estimation is again considered, but via a fully separable approximation. The posterior distribution is approximated by a fully separable distribution $q\left(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}\right)$, i.e. a supplementary condition is added in Equation (348):
[TABLE]
As in Subsection (5.8.2), the approximation is done by minimizing the Kullback-Leibler divergence, Equation (349), via alternate optimization, resulting in the proportionalities from Equations (381a), (381b) and (381c),
[TABLE]
using the notations introduced in Equation (352), Equation (353) and Equation (382):
[TABLE]
The analytical expression of the logarithm of the posterior distribution $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ is obtained in Equation (354).
Computation of the analytical expression of $q_{1}(\boldsymbol{f})$.
The proportionality relation corresponding to $q_{1}(\boldsymbol{f})$ is established in Equation (381a). In the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ all the terms free of $f_{j}$ can be regarded as constants:
[TABLE]
Using Equation (356) the integral from Equation (383) becomes:
[TABLE]
Considering all the terms free of $f_{j}$ as constants, the first norm can be written:
[TABLE]
where $\boldsymbol{H}^{j}$ represents the column $j$ of the matrix $\boldsymbol{H}$, $\boldsymbol{H}^{-j}$ represents the matrix $\boldsymbol{H}$ without the column $\boldsymbol{H}^{j}$, and $\boldsymbol{f}^{-j}$ represents the vector $\boldsymbol{f}$ without the element $f_{j}$. Introducing the notation
[TABLE]
the expectation of the first norm becomes:
[TABLE]
Considering all the terms free of $f_{j}$ as constants, the expectation of the second norm becomes:
[TABLE]
From Equations (381a), (384), (387) and (388), the proportionality for $q_{1j}(f_{j})$ becomes:
[TABLE]
Considering the criterion $J(f_{j})=\left(\|\widetilde{\boldsymbol{V}}_{\epsilon}^{\frac{1}{2}}\boldsymbol{H}^{j}\|^{2}+\widetilde{v}_{f_{j}}\right)f_{j}^{2}-2\boldsymbol{H}^{jT}\widetilde{\boldsymbol{V}}_{\epsilon}\left(\boldsymbol{g}-\boldsymbol{H}^{-j}\boldsymbol{f}^{-j}\right)f_{j}$, which is quadratic, we conclude that $q_{1j}(f_{j})$ is a Normal distribution.
For computing the mean of the Normal distribution, it is sufficient to compute the solution that minimizes the criterion $J(f_{j})$:
[TABLE]
The variance can be obtained by identification. The analytical expressions for the mean and the variance corresponding to the Normal distributions $q_{1j}(f_{j})$ are presented in Equation (391).
[TABLE]
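The coordinate-wise update implied by the quadratic criterion $J(f_{j})$ can be sketched in Python as follows. This is a minimal illustration and not the algorithm of Figure (47): it assumes diagonal matrices stored as lists of expected precisions, replaces the other coordinates by their current means, keeps the precisions fixed instead of updating them, and takes the variance as the inverse of the quadratic coefficient (that is, it assumes a factor $-\frac{1}{2}$ in the exponent). All names are hypothetical.

```python
def coordinate_update(H, g, v_eps, v_f, f, j):
    """Return (mean, variance) of the Normal factor for coordinate j.

    H     : list of N rows, each a list of M entries
    g     : list of N observations
    v_eps : list of N expected noise precisions (diagonal of V_eps tilde)
    v_f   : list of M expected prior precisions (diagonal of V_f tilde)
    f     : list of M current means; entry j itself is ignored
    """
    n_rows, n_cols = len(H), len(f)
    # Residual with coordinate j removed: g - H^{-j} f^{-j}.
    r = [g[i] - sum(H[i][k] * f[k] for k in range(n_cols) if k != j)
         for i in range(n_rows)]
    # Coefficient of f_j^2 in the criterion J(f_j).
    quad = sum(v_eps[i] * H[i][j] ** 2 for i in range(n_rows)) + v_f[j]
    # Minimizer of J(f_j): weighted correlation of column j with the residual.
    mean = sum(H[i][j] * v_eps[i] * r[i] for i in range(n_rows)) / quad
    return mean, 1.0 / quad

def sweep_means(H, g, v_eps, v_f, f0, n_sweeps=200):
    """Cycle the coordinate updates with the precisions held fixed."""
    f = list(f0)
    for _sweep in range(n_sweeps):
        for j in range(len(f)):
            f[j], _variance = coordinate_update(H, g, v_eps, v_f, f, j)
    return f
```

With the precisions held fixed, the sweeps amount to a Gauss-Seidel iteration on the normal equations of the joint quadratic criterion, so the means converge to the same point as the joint minimizer; no $M\times M$ matrix has to be inverted, which is the practical appeal of the fully separable approximation.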
Computation of the analytical expression of $q_{2j}(v_{f_{j}})$.
The proportionality relation corresponding to $q_{2j}(v_{f_{j}})$ established in Equation (381b) refers to $v_{f_{j}}$, so in the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ all the terms free of $v_{f_{j}}$ can be regarded as constants,
[TABLE]
so the integral of the logarithm becomes:
[TABLE]
Equation (393) leads to the conclusion that $q_{2j}(v_{f_{j}})$ is a generalized inverse Gaussian distribution. Equation (394) presents the analytical expressions for the parameters corresponding to this generalized inverse Gaussian distribution.
[TABLE]
Computation of the analytical expression of $q_{3i}(v_{\epsilon_{i}})$.
The proportionality relation corresponding to $q_{3i}(v_{\epsilon_{i}})$ established in Equation (381c) refers to $v_{\epsilon_{i}}$, so in the expression of $\ln p(\boldsymbol{f},\boldsymbol{v}_{f},\boldsymbol{v}_{\epsilon}|\boldsymbol{g},\beta_{f},\beta_{\epsilon})$ all the terms free of $v_{\epsilon_{i}}$ can be regarded as constants:
[TABLE]
Introducing the notation
[TABLE]
the expectation of the logarithm becomes
[TABLE]
so the proportionality relation for $q_{3i}(v_{\epsilon_{i}})$ from Equation (381c) can be written:
[TABLE]
Equation (398) shows that $q_{3i}(v_{\epsilon_{i}})$ are generalized inverse Gaussian distributions. The analytical expressions of the corresponding parameters are presented in Equation (399).
[TABLE]
Since $q_{2j}(v_{f_{j}}),j\in\left\{1,2,\ldots,M\right\}$ and $q_{3i}(v_{\epsilon_{i}}),i\in\left\{1,2,\ldots,N\right\}$ are generalized inverse Gaussian distributions, it is easy to obtain analytical expressions for $\widetilde{\boldsymbol{V}}_{\epsilon}$, defined in Equation (356), and $\widetilde{v}_{f_{j}},j\in\left\{1,2,\ldots,M\right\}$, obtaining the same expressions as in Equation (378). Using Equation (378) and Equations (391), (394) and (399), the final analytical expressions of the separable distributions are presented in Equations (400c), (400g) and (400k).
[TABLE]
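The closed-form expectations used above rest on a standard property of the generalized inverse Gaussian (GIG) family: its moments are ratios of modified Bessel functions of the second kind. The following minimal sketch illustrates that property. It assumes the common parameterization with density proportional to x^{\lambda-1}\exp(-(ax + b/x)/2), which may differ from the one used in Equation (399); the names `bessel_k` and `gig_moment` are hypothetical helpers introduced here for illustration, not part of the paper's algorithms.

```python
import math


def bessel_k(nu, z, steps=2000):
    """Approximate the modified Bessel function of the second kind K_nu(z), z > 0.

    Uses the integral representation
        K_nu(z) = int_0^inf exp(-z*cosh(t)) * cosh(nu*t) dt,
    truncated where the integrand is negligible, with the trapezoidal rule.
    Intended for moderate orders nu only (illustrative, not production code).
    """
    if z <= 0:
        raise ValueError("z must be positive")
    # Grow the upper limit until the integrand is bounded by about exp(-50).
    t_max = 1.0
    while z * math.cosh(t_max) - abs(nu) * t_max < 50.0:
        t_max *= 2.0
    h = t_max / steps
    total = 0.5 * math.exp(-z)  # integrand at t = 0, half weight
    for k in range(1, steps + 1):
        t = k * h
        weight = 0.5 if k == steps else 1.0
        total += weight * math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return h * total


def gig_moment(r, lam, a, b):
    """Return E[x^r] for x ~ GIG(lam, a, b), a > 0, b > 0.

    Assumed density: proportional to x^(lam - 1) * exp(-(a*x + b/x) / 2).
    Standard result: E[x^r] = (b/a)^(r/2) * K_{lam+r}(s) / K_lam(s), s = sqrt(a*b).
    """
    s = math.sqrt(a * b)
    return (b / a) ** (r / 2.0) * bessel_k(lam + r, s) / bessel_k(lam, s)


# Inverse Gaussian special case (lam = -1/2), where closed forms are known.
mean_v = gig_moment(1, -0.5, 2.0, 8.0)       # E[v]
mean_inv_v = gig_moment(-1, -0.5, 2.0, 8.0)  # E[1/v]
```

With \lambda = -1/2, a = 2 and b = 8 the GIG reduces to an inverse Gaussian distribution, for which E[v] = \sqrt{b/a} = 2 and E[1/v] = \sqrt{a/b}\,(1 + 1/\sqrt{ab}) = 0.625, so the two values above can be checked against these closed forms. Taking r = -1 gives the expected inverse variance, the kind of quantity that typically enters variational Bayesian updates.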
Equations (400c), (400g) and (400k) establish dependencies between the parameters of the distributions, very similar to those presented in Figures (43), (44) and (45). The iterative algorithm obtained via PM estimation with full separability is presented in Figure (47).
List of Figures
- 1 Interpretation of the forward linear model, Equation (1)
- 2 From the linear forward model, Equation (1), to the Hierarchical Model: Direct sparsity, i.e. f is sparse
- 3 From the linear forward model, Equation (1), to the Hierarchical Model: Sparsity via a transformation, i.e. Df is sparse
- 4 Hierarchical Model: probability density functions assigned for the unknowns
- 5 Sparsity Mechanism: For direct sparsity applications
- 6 Comparison between the Normal distribution and the Student-t distribution.
- (a) Normal vs. Student-t distribution
- (b) Heavy tailed property of the Student distribution
- 7 Four Student-t probability density functions with different means (, blue and red, , yellow and , violet) and different degrees of freedom (, blue and yellow, , red and , violet), Figure (7(a)) and the corresponding Generalized Hyperbolic probability density functions, with parameters set as in Equation (26), with , Figure (7(b)). Comparison between the distributions, vs. , Figure (7(c)) and between the logarithm of the distributions vs. , Figure (7(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 8 The standard Student-t distribution (), Figure (8(a)) and the Generalized Hyperbolic distribution, with parameters set as in Equation (26), Figure (8(b)). Comparison between the two distributions, vs. , Figure (8(c)) and between the logarithm of the distributions vs. , Figure (8(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 9 The behaviour of the Generalized Hyperbolic density function depending on .
- (a) vs. with
- (b) for
- 10 The Hyperbolic distribution, Figure (10(a)) and the Generalized Hyperbolic distribution, with parameters set as in Equation (31), Figure (10(b)). Comparison between the two distributions, vs. , Figure (10(c)) and between the logarithm of the distributions vs. , Figure (10(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 11 Comparison between the standard Normal distribution and the Laplace distribution.
- (a) Normal vs. Laplace distribution
- (b) Heavy tailed property of the Laplace distribution
- 12 Four Laplace probability density functions with different mean values (, blue and red, , yellow and , violet) and different scale parameter values (, blue and violet, , red and , yellow), Figure (12(a)) and the corresponding Generalized Hyperbolic probability density functions, with parameters set as in Equation (56), with , Figure (12(b)). Comparison between the distributions, vs. , Figure (12(c)) and between the logarithm of the distributions vs. , Figure (12(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 13 The standard Laplace distribution, Figure (13(a)) and the Generalized Hyperbolic distribution, with parameters set as in Equation (56), Figure (13(b)). Comparison between the two distributions, vs. , Figure (13(c)) and between the logarithm of the distributions vs. , Figure (13(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 14 The behaviour of the Generalized Hyperbolic density function depending on .
- (a) vs. for
- (b) for
- 15 The Variance-Gamma distribution, Figure (15(a)) and the Generalized Hyperbolic distribution, with parameters set as in Equation (61), Figure (15(b)). Comparison between the two distributions, vs. , Figure (15(c)) and between the logarithm of the distributions vs. , Figure (15(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 16 The Normal-Inverse Gaussian distribution, Figure (16(a)) and the Generalized Hyperbolic distribution, with parameters set as in Equation (66), Figure (16(b)). Comparison between the two distributions, vs. , Figure (16(c)) and between the logarithm of the distributions vs. , Figure (16(d)).
- (a)
- (b)
- (c) vs.
- (d) vs.
- 17 Iterative algorithm corresponding to Joint MAP estimation for Student-t hierarchical model, non-stationary Student-t uncertainties model
- 18 Dependency between q_{1}(\boldsymbol{f}) parameters and q_{2j}(\boldsymbol{v}_{f_j}) and q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 19 Dependency between q_{2j}(\boldsymbol{v}_{f_j}) parameters and q_{1}(\boldsymbol{f}) and q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 20 Dependency between q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters and q_{2j}(\boldsymbol{v}_{f_j}) and q_{1}(\boldsymbol{f}) parameters
- 21 Iterative algorithm corresponding to PM estimation via VBA - partial separability for Student-t hierarchical model, non-stationary Student-t uncertainties model
- 22 Iterative algorithm corresponding to PM estimation via VBA - full separability for Student-t hierarchical model, non-stationary Student-t uncertainties model
- 23 Indirect sparsity (via ) - Iterative algorithm corresponding to Joint MAP estimation for Student-t hierarchical model, non-stationary Student-t uncertainties model
- 24 Dependency between q_{1}(\boldsymbol{f}) parameters and q_{2}(\boldsymbol{z}), q_{3j}(\boldsymbol{v}_{\xi_j}) and q_{4i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 25 Dependency between q_{2}(\boldsymbol{z}) parameters and q_{1}(\boldsymbol{f}), q_{4i}(\boldsymbol{v}_{\epsilon_i}) and q_{5j}(\boldsymbol{v}_{z_j}) parameters
- 26 Dependency between q_{3j}(\boldsymbol{v}_{\xi_j}) parameters and q_{1}(\boldsymbol{f}) and q_{2}(\boldsymbol{z}) parameters
- 27 Dependency between q_{4i}(\boldsymbol{v}_{\epsilon_i}) parameters and q_{1}(\boldsymbol{f}) parameters
- 28 Dependency between q_{5j}(\boldsymbol{v}_{z_j}) parameters and q_{2}(\boldsymbol{z}) parameters
- 29 Indirect sparsity (via ) - Iterative algorithm corresponding to partial separability PM estimation for Student-t hierarchical model, non-stationary Student-t uncertainties model
- 30 Dependency between q_{1j}(f_j) parameters and q_{2j}(z_j), q_{3j}(\boldsymbol{v}_{\xi_j}) and q_{4i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 31 Dependency between q_{2j}(z_j) parameters and q_{1j}(f_j), q_{4i}(\boldsymbol{v}_{\epsilon_i}) and q_{5j}(\boldsymbol{v}_{z_j}) parameters
- 32 Dependency between q_{3j}(\boldsymbol{v}_{\xi_j}) parameters and q_{1j}(f_j) and q_{2j}(z_j) parameters
- 33 Dependency between q_{4i}(\boldsymbol{v}_{\epsilon_i}) parameters and q_{1j}(f_j) parameters
- 34 Dependency between q_{5j}(\boldsymbol{v}_{z_j}) parameters and q_{2j}(z_j) parameters
- 35 Indirect sparsity (via ) - Iterative algorithm corresponding to full separability PM estimation for Student-t hierarchical model, non-stationary Student-t uncertainties model
- 36 Iterative algorithm corresponding to Joint MAP estimation for Laplace hierarchical model, non-stationary Student-t uncertainties model
- 37 Dependency between q_{1}(\boldsymbol{f}) parameters and q_{2j}(\boldsymbol{v}_{f_j}) and q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 38 Dependency between q_{2j}(\boldsymbol{v}_{f_j}) parameters and q_{1}(\boldsymbol{f}) and q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 39 Dependency between q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters and q_{2j}(\boldsymbol{v}_{f_j}) and q_{1}(\boldsymbol{f}) parameters
- 40 Iterative algorithm corresponding to PM estimation via VBA - partial separability for Laplace hierarchical model, non-stationary Student-t uncertainties model
- 41 Iterative algorithm corresponding to PM estimation via VBA - full separability for Laplace hierarchical model, non-stationary Student-t uncertainties model
- 42 Iterative algorithm corresponding to Joint MAP estimation for Laplace hierarchical model, non-stationary Laplace uncertainties model
- 43 Dependency between q_{1}(\boldsymbol{f}) parameters and q_{2j}(\boldsymbol{v}_{f_j}) and q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 44 Dependency between q_{2j}(\boldsymbol{v}_{f_j}) parameters and q_{1}(\boldsymbol{f}) and q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters
- 45 Dependency between q_{3i}(\boldsymbol{v}_{\epsilon_i}) parameters and q_{2j}(\boldsymbol{v}_{f_j}) and q_{1}(\boldsymbol{f}) parameters
- 46 Iterative algorithm corresponding to PM estimation via VBA - partial separability for Laplace hierarchical model, non-stationary Laplace uncertainties model
- 47 Iterative algorithm corresponding to PM estimation via VBA - full separability for Laplace hierarchical model, non-stationary Laplace uncertainties model
