Bayesian Inference of the Multi-Period Optimal Portfolio for an Exponential Utility

David Bauder; Taras Bodnar; Nestor Parolya; Wolfgang Schmid

arXiv:1705.06533·math.ST·April 19, 2023·J. Multivar. Anal.

TL;DR

This paper develops a Bayesian framework for estimating multi-period optimal portfolios with exponential utility, providing uncertainty quantification, predictive wealth distributions, and risk measures, demonstrated on FTSE 100 assets post-Brexit.

Contribution

It introduces a Bayesian approach using Jeffreys' and conjugate priors for multi-period portfolio optimization, including stochastic representations and predictive wealth analysis.

Findings

01 Effective uncertainty quantification for portfolio weights.

02 Predictive distributions enable risk assessment and default probability estimation.

03 Application to FTSE 100 assets demonstrates practical utility in volatile markets.

Abstract

We consider the estimation of the multi-period optimal portfolio obtained by maximizing an exponential utility. Employing Jeffreys' non-informative prior and the conjugate informative prior, we derive stochastic representations for the optimal portfolio weights at each time point of portfolio reallocation. This provides direct access not only to the posterior distribution of the portfolio weights but also to their point estimates together with uncertainties and their asymptotic distributions. Furthermore, we present the posterior predictive distribution for the investor's wealth at each time point of the investment period in terms of a stochastic representation for the future wealth realization. This in turn makes it possible to use quantile-based risk measures or to calculate the probability of default. We apply the suggested Bayesian approach to assess the uncertainty in the…

Equations

W_{t}

V(0, W_{0}) = \max_{\{\mathbf{v}_{s}\}_{s=0}^{T-1}} E_{0}[U(W_{T})]

V(T-t, W_{T-t})

\mathbf{w}_{t}

\overline{\mathbf{x}}_{t} = \frac{1}{n} \sum_{i=t-n+1}^{t} \mathbf{x}_{i} \quad \text{and} \quad \mathbf{S}_{t} = \frac{1}{n-1} \sum_{i=t-n+1}^{t} (\mathbf{x}_{i} - \overline{\mathbf{x}}_{t}) (\mathbf{x}_{i} - \overline{\mathbf{x}}_{t})^{\top}.

\hat{\mathbf{w}}_{t} = C_{t} \mathbf{S}_{t}^{-1} (\overline{\mathbf{x}}_{t} - r_{f,t+1} \mathbf{1}) \quad \text{with} \quad C_{t} = \left(\gamma W_{t} \prod_{i=t+2}^{T} R_{f,i}\right)^{-1} \quad \text{for} \quad t = 0, ..., T-1.

\pi(\mbox{\boldmath$\mu$},\mbox{\boldmath$\Sigma$}|\mathbf{x}_{t,n})

\pi(\mbox{\boldmath$\mu$},\mbox{\boldmath$\Sigma$})

\mbox{\boldmath$\mu$}|\mbox{\boldmath$\Sigma$}

\mbox{\boldmath$\Sigma$}

\mbox{\boldmath$\mu$}|\mathbf{x}_{t,n}

\mbox{\boldmath$\Sigma$}|\mbox{\boldmath$\mu$},\mathbf{x}_{t,n}

\overline{\mathbf{x}}_{t,c} = \frac{n \overline{\mathbf{x}}_{t} + r_{0} \mathbf{m}_{0}}{n + r_{0}} \quad \text{and} \quad \mathbf{S}_{t,c} = \mathbf{S}_{t,d} + \mathbf{S}_{0} + n r_{0} \frac{(\mathbf{m}_{0} - \overline{\mathbf{x}}_{t,c}) (\mathbf{m}_{0} - \overline{\mathbf{x}}_{t,c})^{\top}}{n + r_{0}}.

\mathbf{S}^{*}_{t,c}(\mbox{\boldmath$\mu$})

\mathbf{L} \mathbf{w}_{t}

\epsilon_{d}

\mbox{\boldmath$\zeta$}_{d}

\mbox{\boldmath$\Upsilon$}_{d}

\epsilon_{c}

\mbox{\boldmath$\zeta$}_{c}

\mbox{\boldmath$\Upsilon$}_{c}

\hat{\mathbf{w}}_{t,d} = E(\mathbf{w}_{t} | \mathbf{x}_{t,n})

\hat{\mathbf{w}}_{t,c} = E(\mathbf{w}_{t} | \mathbf{x}_{t,n})

\mathbf{V}_{t,d} = Var(\mathbf{w}_{t} | \mathbf{x}_{t,n})

\mathbf{V}_{t,c} = Var(\mathbf{w}_{t} | \mathbf{x}_{t,n})

n (\mathbf{w}_{t} - \hat{\mathbf{w}}_{t}) | \mathbf{x}_{t,n}

\breve{\mathbf{x}}_{t} \equiv \lim_{n \to \infty} \overline{\mathbf{x}}_{t,d} = \lim_{n \to \infty} \overline{\mathbf{x}}_{t,c} \quad \text{and} \quad \breve{\mathbf{S}}_{t} \equiv \lim_{n \to \infty} \frac{\mathbf{S}_{t,d}}{n-1} = \lim_{n \to \infty} \frac{\mathbf{S}_{t,c}}{n + r_{0}}

\hat{\mathbf{w}}_{t} \equiv \lim_{n \to \infty} \hat{\mathbf{w}}_{t,d} = \lim_{n \to \infty} \hat{\mathbf{w}}_{t,c} = C_{t} \breve{\mathbf{S}}_{t}^{-1} (\breve{\mathbf{x}}_{t} - r_{f,t+1} \mathbf{1}).
Full text

David Bauder (a), Taras Bodnar (b, 1), Nestor Parolya (c), Wolfgang Schmid (d)

a Department of Mathematics, Humboldt-University of Berlin, D-10099 Berlin, Germany

b Department of Mathematics, Stockholm University, SE-10691 Stockholm, Sweden

c Institute of Statistics, Leibniz University Hannover, D-30167 Hannover, Germany

d Department of Statistics, European University Viadrina, PO Box 1786, 15207 Frankfurt (Oder), Germany

1 Corresponding author: Taras Bodnar. E-mail: [email protected]. Tel: +46 8 164562. Fax: +46 8 612 6717. This research was partly supported by the German Science Foundation (DFG) via the projects BO 3521/3-1 and SCHM 859/13-1 "Bayesian Estimation of the Multi-Period Optimal Portfolio Weights and Risk Measures".

Abstract

We consider the estimation of the multi-period optimal portfolio obtained by maximizing an exponential utility. Employing Jeffreys' non-informative prior and the conjugate informative prior, we derive stochastic representations for the optimal portfolio weights at each time point of portfolio reallocation. This provides direct access not only to the posterior distribution of the portfolio weights but also to their point estimates together with uncertainties and their asymptotic distributions. Furthermore, we present the posterior predictive distribution for the investor's wealth at each time point of the investment period in terms of a stochastic representation for the future wealth realization. This in turn makes it possible to use quantile-based risk measures or to calculate the probability of default. We apply the suggested Bayesian approach to assess the uncertainty in the multi-period optimal portfolio by considering assets from the FTSE 100 in the weeks after the British referendum to leave the European Union. The behaviour of the novel portfolio estimation method in a precarious market situation is illustrated by calculating the predictive wealth, the risk associated with the held portfolio, and the default probability in each period.

Keywords: Multi-period optimal portfolio, Bayesian estimation, Stochastic representation, Posterior predictive distribution, Default probability, Credible sets

JEL Classification: C11, C13, C44, C58, C63

1 Introduction

In portfolio theory, the mean-variance paradigm introduced by Markowitz (1952) is still a popular reference for understanding the relationship between systematic risk, return and investment behaviour. Here, a portfolio is determined from the expected asset returns and their covariances. Taking this work as a starting point, the literature has extended Markowitz (1952) considerably over the following 70 years. While Markowitz (1952) focused only on a single investment period, the multi-period solution was introduced in Markowitz (1959). Merton (1969) showed that the mean-variance multi-period setting in the continuous time case is equivalent to expected utility maximization for an exponential utility function. Multi-period optimal portfolio choice problems for different utility functions were considered by Mossin (1968), Samuelson (1969), Elton (1974), Brandt and Santa-Clara (2006), and Basak and Chabakauri (2010).

While these studies focus on the continuous time case, Li and Ng (2000), Çanakoğlu and Özekici (2009), and Bodnar, Parolya, and Schmid (2015a, b) presented results in the discrete time case for the quadratic utility function and the exponential utility function. In particular, Bodnar, Parolya, and Schmid (2015b) derived an analytical expression for the multi-period optimal portfolio weights under the assumption of non-tradable predictable variables and a VAR(1)-structure; the weights are described as linear combinations of the precision matrix (inverse covariance matrix) and the expected return vector. While this setting allows for flexibility in building trading strategies under quite unrestrictive assumptions, there are still shortcomings: (i) since the parameters of the asset return distribution, namely the mean vector and the covariance matrix, are unknown quantities, the optimal portfolio weights cannot be constructed in practice and are instead obtained by replacing the unknown parameters of the asset return distribution by the corresponding estimates; (ii) although the distributional properties of the estimated optimal portfolio weights and corresponding inference procedures were derived in a number of studies for single-period investment strategies (see, e.g., Gibbons, Ross, and Shanken (1989), Shanken (1992), Shanken and Zhou (2007), Okhrin and Schmid (2006), Bodnar and Schmid (2008, 2011), Bodnar and Schmid (2009)), the problem of overlapping estimation windows becomes critical in the multi-period setting; (iii) due to the multivariate structure, the determination of the joint distribution of the estimated multi-period optimal portfolio weights is a challenging task.

To tackle all three challenges, we opt for a Bayesian approach. The Bayesian approach is a well established method for building trading strategies in a single-period optimal portfolio choice problem, starting with Winkler (1973) and Winkler and Barry (1975) and continuing to this day. For an overview, see, e.g., Brandt (2010), where Bayesian portfolio methods are also discussed, or Avramov and Zhou (2010). As Avramov and Zhou (2010) pointed out, the Bayesian setting is a realistic description of human decision making processes and information utilization. Both past events and experiences influence, at least to a certain degree, the beliefs of market participants about how an investment will develop. The investor's beliefs are modeled via a prior distribution which represents the relevant information regarding the behaviour of the asset returns. While there are plenty of ways to specify the prior, we focus on the non-informative diffuse prior and the informative conjugate prior (see, e.g., Zellner (1971), and Gelman, Carlin, Stern, and Rubin (2014)) not only for computational reasons but mainly because of their popularity in the financial literature (cf., Barry (1974), Brown (1976), Klein and Bawa (1976), Frost and Savarino (1986), Aguilar and West (2000), Rachev, Hsu, Bagasheva, and Fabozzi (2008), Avramov and Zhou (2010), Sekerke (2015), Bodnar, Mazur, and Okhrin (2017)). Furthermore, their application allows us to derive the corresponding posterior distributions in closed form, which enables us to access important risk measures and to construct credible sets.

The obtained posterior distributions of the optimal portfolio weights under both employed priors are presented in terms of their stochastic representations. A stochastic representation is a well established tool in computational statistics (cf., Givens and Hoeting (2012)) and in the theory of elliptically contoured distributions (see, e.g., Gupta, Varga, and Bodnar (2013)), and it was already used in Bayesian statistics by Bodnar, Mazur, and Okhrin (2017). It turns out that the derived stochastic representations are very powerful, allowing us not only to access the posterior distribution of the multi-period optimal portfolio weights, but also to determine the predictive distribution for the wealth at each point of the holding period. Therefore, in addition to obtaining analytical Bayesian estimates for the weights together with their uncertainties, we are able to access the quantiles of the posterior predictive wealth distribution and can calculate the risk associated with the portfolio at every point over its lifetime. Moreover, the developed stochastic representations are highly efficient from a computational point of view, since Markov chain Monte Carlo methods are no longer needed. In addition to deriving these results, we illustrate the method and its properties on real data. We test the model in an exhaustive study using data from the FTSE 100, where the portfolios cover the time of Great Britain's referendum on leaving the European Union on 23 June 2016, commonly referred to as "Brexit", in which a slim majority of British voters decided to leave the European Union. Although this result was regarded in advance as the less likely option, it was also regarded as the option with the least favourable effects on the British economy and should therefore have an effect on a portfolio covering this period.

The remainder of the paper is structured as follows. In Section 2, we briefly review the solution of the multi-period optimal portfolio choice problem with exponential utility derived in Bodnar, Parolya, and Schmid (2015b). The stochastic representations for the optimal portfolio weights under both priors are presented in Theorems 1 and 2 (Section 2.2), which are used to derive the corresponding Bayes estimates for the weights (Theorem 3) together with their covariance matrix (Theorem 4) as well as to prove the posterior asymptotic normality (Theorem 5). In Section 2.3, we obtain the posterior predictive distribution for the wealth during the holding period, which is provided in terms of a stochastic representation in Theorem 6 under both employed priors. In Section 3, the suggested Bayesian approach is applied to the Brexit data by calculating the asymptotic distributions for the optimal portfolio weights, determining the credible sets for the portfolio wealth and specifying the default probabilities at each time point. Section 4 summarizes the main results of the paper, while all technical proofs are moved to the appendix (Section 5).

2 Bayesian analysis of multi-period optimal portfolios

2.1 Analytical solution of the multi-period optimization problem

Let $\mathbf{X}_{t}=(X_{t,1},X_{t,2},...,X_{t,k})^{\top}$ be a random vector of returns on $k$ assets taken at time point $t$. Throughout the paper we assume that the asset returns $\mathbf{X}_{1},\mathbf{X}_{2},...$ are infinitely exchangeable and multivariate centered spherically symmetric. This assumption, in particular, implies (see, e.g., Bernardo and Smith (2000, Proposition 4.6)) that the asset returns are independently and identically distributed given mean vector $\mu$ and covariance matrix $\Sigma$ with the conditional distribution given by $\mathbf{X}_{t}|\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}\sim\mathcal{N}_{k}(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$ ($k$-dimensional normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$). Note that the imposed assumptions imply neither that the unconditional distribution of the asset returns is normal nor that the returns are independently distributed. Moreover, the unconditional distribution of the asset returns appears to be heavy-tailed, which is usually observed for financial data.

The quantities $\mu$ and $\Sigma$ denote the parameters of the asset returns distribution where $\Sigma$ is assumed to be a $k\times k$ dimensional positive definite matrix. We consider a multi-period portfolio choice problem with the allocation of initial wealth at time point $t=0$ and with the subsequent update of the portfolio structure at time points $t\in\{1,2,...,T\}$ . Let $\mathbf{v}_{t}=(v_{t,1},...,v_{t,k})^{\top}$ stand for the vector of portfolio weights determined at time $t$ and let $r_{f,t}$ be the return on the risk-free asset in period $t$ . We assume that short-selling is allowed, i.e. the weights could also be negative. The vector $\mathbf{v}_{t}$ specifies the structure of the portfolio related to the risky assets, whereas the part of the wealth equal to $1-\mathbf{1}^{\top}\mathbf{v}_{t}$ is invested into the risk-free asset where $\mathbf{1}$ denotes the $k$ -dimensional vector of ones. Then the investor’s wealth in period $t$ is expressed as

[TABLE]

An investor seeks to maximize the utility of the final wealth, i.e. $U(W_{T})$ , where $U(x)=-\exp(-\gamma x)$ is the exponential utility function and the coefficient of absolute risk aversion, $\gamma>0$ , determines the investor’s attitude towards risk. The optimization problem is given by

$V(0,W_{0})=\max_{\{\mathbf{v}_{s}\}_{s=0}^{T-1}}E_{0}\left[U(W_{T})\right]$ (1)

where the maximum is taken with respect to all weights $\mathbf{v}_{0}$,…, $\mathbf{v}_{T-1}$ which specify the portfolio structure during the initial period of investment as well as during all subsequent reallocations. The solution of (1) is derived recursively, starting from the last period, by applying Bellman equations at [math], $1$, …, $T-1$. The optimization problem at time point $T-t$ is then given by

[TABLE]

subject to the terminal condition $U(W_{T})=-\exp(-\gamma W_{T})$ with $\mathbf{w}_{T-t+1}$ as the optimal portfolio weights in period $T-t+1$ . For details on this method, see e.g. Pennacchi (2008), while Bodnar, Parolya, and Schmid (2015b) determine an analytical solution of (1) under the exponential utility. The latter results are summarized in Proposition 1.

Proposition 1.

Let $\mathbf{X}_{t}$ , $t=0,...,T$ be a sequence of conditionally independently and identically distributed vectors of $k$ risky assets with $\mathbf{X}_{t}|\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}\sim\mathcal{N}_{k}(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$ . Let $\Sigma$ be positive definite. Then the optimal multi-period portfolio weights are given by

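$\mathbf{w}_{t}=\left(\gamma W_{t}\prod_{i=t+2}^{T}R_{f,i}\right)^{-1}\mbox{\boldmath$ \Sigma $}^{-1}(\mbox{\boldmath$ \mu $}-r_{f,t+1}\mathbf{1})$ (2)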

for $t=0,...,T-1$ where $R_{f,i}=1+r_{f,i}$ and $\prod_{i=T+1}^{T}R_{f,i}\equiv 1$ .
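A short derivation may help to see where the structure of these weights comes from. It is only a sketch for the last reallocation, $t=T-1$, with $\mu$ and $\Sigma$ treated as known. It assumes the wealth recursion $W_{T}=W_{T-1}\left(1+r_{f,T}+\mathbf{v}^{\top}(\mathbf{X}_{T}-r_{f,T}\mathbf{1})\right)$, which follows from investing the share $1-\mathbf{1}^{\top}\mathbf{v}$ of the wealth in the risk-free asset, and it uses the moment generating function of the normal distribution.

```latex
\begin{align*}
W_{T}\mid W_{T-1}
  &\sim \mathcal{N}\!\left(W_{T-1}\bigl(R_{f,T}+\mathbf{v}^{\top}(\boldsymbol{\mu}-r_{f,T}\mathbf{1})\bigr),\;
        W_{T-1}^{2}\,\mathbf{v}^{\top}\boldsymbol{\Sigma}\mathbf{v}\right),\\
E\!\left[-e^{-\gamma W_{T}}\right]
  &= -\exp\!\left(-\gamma W_{T-1}\bigl(R_{f,T}+\mathbf{v}^{\top}(\boldsymbol{\mu}-r_{f,T}\mathbf{1})\bigr)
        +\tfrac{\gamma^{2}}{2}\,W_{T-1}^{2}\,\mathbf{v}^{\top}\boldsymbol{\Sigma}\mathbf{v}\right),\\
\mathbf{0}
  &= -\gamma W_{T-1}(\boldsymbol{\mu}-r_{f,T}\mathbf{1})+\gamma^{2}W_{T-1}^{2}\,\boldsymbol{\Sigma}\mathbf{v}
  \;\Longrightarrow\;
  \mathbf{w}_{T-1}=\frac{1}{\gamma W_{T-1}}\,\boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}-r_{f,T}\mathbf{1}).
\end{align*}
```

The third line is the first-order condition for minimizing the exponent, which is equivalent to maximizing the expected utility. The result agrees with the convention $\prod_{i=T+1}^{T}R_{f,i}\equiv 1$ stated above; for earlier periods the backward recursion brings in the remaining gross risk-free rates.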

Although Proposition 1 provides a simple solution of the multi-period portfolio choice problem, formula (2) cannot be applied directly in practice since $\mu$ and $\Sigma$ are unknown parameters of the asset return distribution. As a result, these two quantities have to be estimated before the portfolio (2) is constructed. However, using the estimated mean vector and the estimated covariance matrix instead of the population ones does not ensure that the estimated portfolio weights coincide with the true ones. Two main questions then arise: (i) how strongly does the estimated portfolio deviate from the population one? and (ii) is it reasonable to invest in the estimated portfolio? Both questions have to be treated by using statistical methods and are very closely connected to the distributional properties of the estimates constructed for $\mu$ and $\Sigma$.

The traditional approach to estimating the portfolio weights relies on the methods of conventional statistics, where the sample mean vector and the sample covariance matrix are used. Let $\mathbf{x}_{t-n+1},...,\mathbf{x}_{t}$ be the observation vectors of asset returns which are considered as realizations of the corresponding random vectors $\mathbf{X}_{i}$, $i=t-n+1,...,t$. Then the mean vector and the covariance matrix at time point $t$ are estimated by

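$\overline{\mathbf{x}}_{t}=\frac{1}{n}\sum_{i=t-n+1}^{t}\mathbf{x}_{i}$ and $\mathbf{S}_{t}=\frac{1}{n-1}\sum_{i=t-n+1}^{t}(\mathbf{x}_{i}-\overline{\mathbf{x}}_{t})(\mathbf{x}_{i}-\overline{\mathbf{x}}_{t})^{\top}$. (3)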

The sample estimate of the multi-period optimal portfolio is obtained by replacing $\mu$ and $\Sigma$ in (2) by the corresponding estimates from (3). This leads to

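$\hat{\mathbf{w}}_{t}=C_{t}\mathbf{S}_{t}^{-1}(\overline{\mathbf{x}}_{t}-r_{f,t+1}\mathbf{1})$ with $C_{t}=\left(\gamma W_{t}\prod_{i=t+2}^{T}R_{f,i}\right)^{-1}$ for $t=0,...,T-1$. (4)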

Using the findings in Bodnar and Okhrin (2011), we obtain the density function, the moments and the stochastic representation of the sample multi-period optimal portfolio weights from the viewpoint of frequentist statistics. These results provide answers to the above two questions and allow us to characterize the distributional properties of each vector of weights $\hat{\mathbf{w}}_{t}$ separately. On the other hand, they do not take into account the multi-period nature of the considered investment procedure. More precisely, it is not possible to characterize the whole multi-period optimal portfolio, since overlapping samples are used and the dependence between the estimated portfolio weights becomes strong.
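The plug-in estimator can be sketched in a few lines of NumPy. This is only an illustration of the sample estimate $\hat{\mathbf{w}}_{t}=C_{t}\mathbf{S}_{t}^{-1}(\overline{\mathbf{x}}_{t}-r_{f,t+1}\mathbf{1})$ with $C_{t}=(\gamma W_{t}\prod_{i=t+2}^{T}R_{f,i})^{-1}$; the function name `plug_in_weights` and its argument names are hypothetical and do not come from the paper.

```python
import numpy as np


def plug_in_weights(returns, gamma, wealth, r_f_next, future_gross_rf=()):
    """Sample estimate w_hat_t = C_t * S_t^{-1} (x_bar_t - r_{f,t+1} 1).

    returns         : (n, k) array whose rows are x_{t-n+1}, ..., x_t
    gamma           : coefficient of absolute risk aversion (> 0)
    wealth          : current wealth W_t
    r_f_next        : risk-free return r_{f,t+1}
    future_gross_rf : gross rates R_{f,i}, i = t+2, ..., T (empty at t = T-1)
    """
    x = np.asarray(returns, dtype=float)
    n, k = x.shape
    if n <= k:
        raise ValueError("need n > k so that S_t is invertible")
    x_bar = x.mean(axis=0)                               # sample mean vector
    s = np.atleast_2d(np.cov(x, rowvar=False, ddof=1))   # sample covariance matrix
    # np.prod of an empty sequence is 1, matching the empty-product convention.
    c_t = 1.0 / (gamma * wealth * np.prod(future_gross_rf))
    # Solve S_t y = x_bar - r_f 1 instead of forming the inverse explicitly.
    return c_t * np.linalg.solve(s, x_bar - r_f_next)
```

Solving the linear system rather than inverting $\mathbf{S}_{t}$ is a numerical choice made here for stability; it gives the same weights as the formula.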

For that reason, we deal with the problem of estimating the multi-period optimal portfolio from the viewpoint of Bayesian statistics and consider the portfolio constructed by using (4) as a benchmark portfolio without investigating its distributional properties in detail. In contrast to frequentist methods, the Bayesian approach allows the available information to be updated sequentially, which is a very important property for estimating the multi-period portfolio weights.

2.2 Bayesian estimation of portfolio weights

Let $\mathbf{x}_{t,n}=(\mathbf{x}_{t-n+1},...,\mathbf{x}_{t})$ denote the observation matrix at time point $t$ which consists of $n$ asset return vectors from $t-n+1$ to $t$. According to Bayes' theorem, the beliefs regarding $\mu$ and $\Sigma$ are updated as new data arrive, yielding the posterior distribution $\pi(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}|\mathbf{x}_{t,n})$, which is proportional to the product of the likelihood function $L(\mathbf{x}_{t,n}|\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$ and the prior distribution $\pi(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$. The posterior is then used to derive Bayesian estimates for the multi-period optimal portfolio weights as well as their characteristics, like the covariance matrix and a credible region, which is the analogue of a confidence region in conventional statistics. Bayes' theorem states that

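$\pi(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}|\mathbf{x}_{t,n})\propto L(\mathbf{x}_{t,n}|\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})\,\pi(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}).$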

The choice of the prior $\pi(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$ is an important step in the Bayesian decision process. Although the prior should reflect the investor's belief regarding the parameters of the asset return distribution, it also strongly affects the model's computational properties since it influences the accessibility of the posterior distribution. Several priors for the mean vector and covariance matrix of the asset returns have been suggested in the literature (see, e.g., Barry (1974), Brown (1976), Klein and Bawa (1976), Frost and Savarino (1986), Rachev, Hsu, Bagasheva, and Fabozzi (2008), Avramov and Zhou (2010), Sekerke (2015)), with the recent paper of Bodnar, Mazur, and Okhrin (2017) summarizing these results. In the following, we choose Jeffreys' non-informative prior and a conjugate informative prior for both $\mu$ and $\Sigma$. These two priors are widely used in the context of Bayesian inference of optimal portfolios.

The Jeffreys non-informative prior, also known as the diffuse prior, is given by

[TABLE]

while the conjugate prior is expressed as

[TABLE]

where $\mathbf{m}_{0}$, $r_{0}$, $d_{0}$, $\mathbf{S}_{0}$ are additional model parameters known as hyperparameters. The symbol $\mathcal{IW}_{k}(d_{0},\mathbf{S}_{0})$ denotes the inverse Wishart distribution with $d_{0}$ degrees of freedom and parameter matrix $\mathbf{S}_{0}$. The prior mean $\mathbf{m}_{0}$ reflects our prior expectations about the expected asset returns, while $\mathbf{S}_{0}$ represents the prior beliefs about the covariance matrix. The other two hyperparameters $r_{0}$ and $d_{0}$ are known as precision parameters for $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$, respectively. Note that the prior (6)-(7) corresponds to the well-known conjugate normal-inverse-Wishart model as discussed by, e.g., Gelman, Carlin, Stern, and Rubin (2014). In this case the posterior is available in analytical form and, moreover, belongs to the same family of distributions as the prior, with updated hyperparameters.

In Proposition 2, we present the marginal posterior of $\mu$ as well as the conditional posterior of $\Sigma$ given $\mu$ . These results will be later used in the derivation of Bayesian estimates for the optimal portfolio weights. In the following the symbol $t_{k}(d,\mathbf{a},\mathbf{A})$ stands for the multivariate $k$ -dimensional $t$ -distribution with $d$ degrees of freedom, location vector $\mathbf{a}$ and dispersion matrix $\mathbf{A}$ . In the case of $k=1$ , $\mathbf{a}=0$ , and $\mathbf{A}=1$ , we use the notation $t_{d}$ to denote the standard univariate $t$ -distribution with $d$ degrees of freedom.

Proposition 2.

Let $\mathbf{X}_{t-n+1},...,\mathbf{X}_{t}$ be conditionally independently distributed with $\mathbf{X}_{i}|\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}\sim\mathcal{N}_{k}(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$ for $i=t-n+1,...,t$ with $n>k$ . Then:

(a) Under the diffuse prior (5), the marginal posterior distribution of $\mu$ is given by

[TABLE]

The conditional posterior distribution of $\Sigma$ given $\mu$ is expressed as

[TABLE]

(b) Under the conjugate prior (6) and (7), the marginal posterior distribution of $\mu$ is given by

[TABLE]

The conditional posterior distribution of $\Sigma$ given $\mu$ is expressed as

[TABLE]

The proof of Proposition 2 follows from Chapter 3 in Gelman, Carlin, Stern, and Rubin (2014), who presented the expressions of the marginal posterior distributions of $\mu$ under both the diffuse and the conjugate priors. The results for the conditional posteriors of $\Sigma$ are then obtained from the joint posterior distributions using the formulae for the marginal posteriors of $\mu$. Notably, although the marginal posteriors of both $\mu$ and $\Sigma$ are widely used in Bayesian inference and the conditional posteriors of $\mu$ given $\Sigma$ have been considered previously in the literature (see, e.g., Sun and Berger (2007)), the conditional posteriors of $\Sigma$ given $\mu$ have been neither discussed nor used. Next, we show that this last finding allows us to derive posterior distributions for functions which involve both $\mu$ and $\Sigma$.

In order to assess the risk associated with estimating the optimal portfolio weights, we need to derive results about the posterior distribution of the weights presented in Proposition 1, which are given as a product of the inverse covariance matrix and the mean vector. Next, we establish very useful stochastic representations for these weights, endowing the parameters with their diffuse and conjugate priors. The results are summarized in Theorem 1, where the stochastic representations are derived for an arbitrary linear combination of optimal portfolio weights. These findings are later used for calculating the Bayesian estimates of the portfolio weights (Theorem 3) and their covariance matrix (Theorem 4). Note that stochastic representations have been used to describe the distribution of random quantities both in conventional statistics (see, e.g., Givens and Hoeting (2012), Gupta, Varga, and Bodnar (2013)) and in Bayesian statistics (cf., Bodnar, Mazur, and Okhrin (2017)). In what follows, the symbol "$\stackrel{d}{=}$" denotes equality in distribution. The proof of Theorem 1 is presented in the appendix (Section 5).

Theorem 1.

Let $\mathbf{L}$ be a $p\times k$ -dimensional matrix of constants. Then under the assumption of Proposition 2 we get:

(a) Under the diffuse prior (5), the stochastic representation of $\mathbf{L}\mathbf{w}_{t}$ is given by

[TABLE]

where $\eta\sim\chi^{2}_{n}$, $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$, and $\mbox{\boldmath$ \mu $}|\mathbf{x}\sim t_{k}\left(n-k,\overline{\mathbf{x}}_{t,d},\mathbf{S}_{t,d}/(n(n-k))\right)$; moreover, $\eta,\mathbf{z}_{0}$ and $\mu$ are mutually independent.

(b) Under the conjugate prior (6) and (7), the stochastic representation of $\mathbf{L}\mathbf{w}_{t}$ is given by

[TABLE]

where $\eta\sim\chi^{2}_{n+d_{0}-k}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , and $\mbox{\boldmath$ \mu $}|\mathbf{x}\sim t_{k}\left(n+d_{0}-2k,\overline{\mathbf{x}}_{t,c},\mathbf{S}_{t,c}/((n+r_{0})(n+d_{0}-2k))\right)$ ; moreover, $\eta,\mathbf{z}_{0}$ and $\mu$ are mutually independent.

The results of Theorem 1 show that in both cases, i.e., when the mean vector and the covariance matrix are endowed with the diffuse prior and with the conjugate prior, the obtained stochastic representations are very similar, and the posterior distributions of the multi-period optimal portfolio weights from Proposition 1 can be described by three random variables which have standard univariate/multivariate distributions.
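As an illustration of how one of these three building blocks might be simulated, the sketch below draws $\mu$ from the multivariate $t$-distribution $t_{k}\left(n-k,\overline{\mathbf{x}}_{t,d},\mathbf{S}_{t,d}/(n(n-k))\right)$ stated in Theorem 1(a). It assumes that $\overline{\mathbf{x}}_{t,d}$ is the sample mean and that $\mathbf{S}_{t,d}$ is the matrix of centered sums of squares and cross-products, i.e. $(n-1)\mathbf{S}_{t}$; the function name `sample_mu_diffuse` is hypothetical. The draw uses the standard normal/chi-square mixture construction of the multivariate $t$-distribution, and it does not reproduce the full representation of $\mathbf{L}\mathbf{w}_{t}$.

```python
import numpy as np


def sample_mu_diffuse(returns, size, rng=None):
    """Draw `size` vectors from t_k(n - k, x_bar, S_td / (n (n - k))).

    returns : (n, k) array whose rows are x_{t-n+1}, ..., x_t
    Assumption: x_bar is the sample mean and S_td = (n - 1) * S_t.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(returns, dtype=float)
    n, k = x.shape
    if n <= k:
        raise ValueError("need n > k")
    d = n - k                                  # degrees of freedom
    x_bar = x.mean(axis=0)                     # location vector
    centered = x - x_bar
    s_td = centered.T @ centered               # centered sums of squares
    dispersion = s_td / (n * d)                # dispersion matrix
    chol = np.linalg.cholesky(dispersion)
    # Multivariate t: location + N_k(0, dispersion) / sqrt(chi2_d / d).
    z = rng.standard_normal((size, k)) @ chol.T
    g = rng.chisquare(d, size=size)
    return x_bar + z / np.sqrt(g / d)[:, None]
```

With draws of $\mu$ in hand, the remaining ingredients $\eta$ and $\mathbf{z}_{0}$ are independent chi-square and standard normal draws, as stated in the theorem.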

Another important consequence of Theorem 1 is that it indicates how these distributions can be accessed in practice via simulation, namely by drawing samples from the $\chi^{2}$-distribution, the normal distribution, and the $t$-distribution. Although the derived stochastic representations avoid Markov chain Monte Carlo sampling and are therefore fast, they are not yet as computationally efficient as they could be, because the matrices $\mathbf{S}_{t,d}^{*}(\mbox{\boldmath$ \mu $})$ and $\mathbf{S}^{*}_{t,c}(\mbox{\boldmath$ \mu $})$ depend on $\mu$ and have to be inverted in each simulation run. In the following theorem we derive further stochastic representations under both priors by applying the Sherman-Morrison-Woodbury formula to the inverse of the posterior scale matrices $\mathbf{S}_{t,d}^{*}(\mbox{\boldmath$ \mu $})$ and $\mathbf{S}^{*}_{t,c}(\mbox{\boldmath$ \mu $})$. The proof of the theorem is provided in the appendix. Let $\mathcal{F}(d_{1},d_{2})$ denote the $F$-distribution with $d_{1}$ and $d_{2}$ degrees of freedom.

Theorem 2.

Under the assumption of Theorem 1 we get:

(a) Under the diffuse prior (5), the stochastic representation of $\mathbf{L}\mathbf{w}_{t}$ is given by

[TABLE]

with

[TABLE]

where $\eta\sim\chi^{2}_{n}$, $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$, $Q\sim\mathcal{F}(k,n-k)$, and $\mathbf{u}$ uniformly distributed on the unit sphere in $\mathds{R}^{k}$; moreover, $\eta$, $\mathbf{z}_{0}$, $Q$, and $\mathbf{u}$ are mutually independent.

(b) Under the conjugate prior (6) and (7), the stochastic representation of $\mathbf{L}\mathbf{w}_{t}$ is given by

[TABLE]

with

[TABLE]

where $\eta\sim\chi^{2}_{n+d_{0}-k}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , $Q\sim\mathcal{F}(k,n+d_{0}-2k)$ , and $\mathbf{u}$ uniformly distributed on the unit sphere in $\mathds{R}^{k}$ ; moreover, $\eta$ , $\mathbf{z}_{0}$ , $Q$ , and $\mathbf{u}$ are mutually independent.

Theorem 2 provides alternative stochastic representations of the optimal portfolio weights obtained under the diffuse prior and under the conjugate prior. Although the expressions in Theorem 2 are more involved, they are computationally more efficient than the ones provided in Theorem 1. Namely, there is no need to invert the matrices $\mathbf{S}_{t,d}^{*}(\mbox{\boldmath$ \mu $})$ and $\mathbf{S}_{t,c}^{*}(\mbox{\boldmath$ \mu $})$ in each simulation run; instead, the inverses of the matrices $\mathbf{S}_{t,d}$ and $\mathbf{S}_{t,c}$ are calculated only once for the whole simulation study. This property speeds up the simulation study considerably. Finally, we note that the realizations of the random vector $\mathbf{u}$ , which is uniformly distributed on the unit sphere in $\mathds{R}^{k}$ , are obtained by drawing $\mathbf{z}$ from the $k$ -dimensional standard normal distribution and calculating $\mathbf{u}=\mathbf{z}/\sqrt{\mathbf{z}^{\top}\mathbf{z}}$ .
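The computational gain can be illustrated with a minimal sketch of the rank-one inverse update. It assumes the structure $\mathbf{S}+\mathbf{v}\mathbf{v}^{\top}$ of the posterior scale matrix (up to a scalar factor) stated in Lemma 1 of the appendix; the function name `inv_rank_one_update`, the matrix `S`, and the vector `v` are hypothetical stand-ins and not the quantities of the empirical study.

```python
import numpy as np

def inv_rank_one_update(S_inv, v):
    """Return (S + v v^T)^{-1} from a precomputed symmetric S^{-1} (Sherman-Morrison)."""
    Sv = S_inv @ v
    return S_inv - np.outer(Sv, Sv) / (1.0 + v @ Sv)

rng = np.random.default_rng(0)
k = 5
A = rng.standard_normal((k, 2 * k))
S = A @ A.T + k * np.eye(k)      # hypothetical positive definite scale matrix
S_inv = np.linalg.inv(S)         # inverted once, outside the simulation loop

max_err = 0.0
for _ in range(100):
    v = rng.standard_normal(k)   # stands in for the simulated deviation of mu
    direct = np.linalg.inv(S + np.outer(v, v))   # full inversion in every run
    updated = inv_rank_one_update(S_inv, v)      # no further inversion needed
    max_err = max(max_err, np.abs(direct - updated).max())
```

Both routes give the same matrix up to rounding error, but the update only needs matrix-vector products once `S_inv` is available.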

The results of Theorem 2 are used to derive Bayesian estimates for the weights of the multi-period optimal portfolio at the initial period of investment as well as at each time of reallocations. They are presented in Theorem 3.

Theorem 3.

Under the assumption of Theorem 1, we get

(a)

Under the diffuse prior (5), the Bayes estimate for the optimal portfolio weights at time point $t$ is given by

[TABLE]

(b)

Under the conjugate prior (6) and (7), the Bayes estimate for the optimal portfolio weights at time point $t$ is given by

[TABLE]

The proof of the theorem is given in the appendix. It is interesting to note that the estimate for the optimal portfolio weights obtained under the diffuse prior coincides with the expression derived in Section 2.1 for their frequentist estimate since $\mathbf{S}_{t,d}/(n-1)=\mathbf{S}_{t}$ .

Finally, we present the expressions for the covariance matrices of the optimal portfolio weights in Theorem 4, with the proof given in the appendix. These formulas characterize the dependencies between the portfolio weights and also allow one to assess their Bayesian risk.

Theorem 4.

Under the assumption of Theorem 1, we get:

(a)

Under the diffuse prior (5), the covariance matrix of $\mathbf{w}_{t}$ is given by

[TABLE]

where $b_{d}=n(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})^{\top}\mathbf{S}_{t,d}^{-1}(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})$ .

(b)

Under the conjugate prior (6) and (7), the covariance matrix of $\mathbf{w}_{t}$ is given by

[TABLE]

where $b_{c}=(n+r_{0})(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})^{\top}\mathbf{S}_{t,c}^{-1}(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})$ .

The results of Theorems 3 and 4 provide the first two moments of optimal portfolio weights and, consequently, they characterize their mean values, variances, and correlations. Although different formulas are obtained under the diffuse prior and under the conjugate prior, when the sample size increases the difference between the corresponding expressions becomes negligible.

More general results are provided in Theorem 5 where it is shown that $\mathbf{w}_{t}$ converges to the same asymptotic normal distribution under the diffuse prior and under the conjugate prior.

Theorem 5.

Under the assumption of Theorem 1, it holds that

[TABLE]

as $n\longrightarrow\infty$ under both the diffuse prior and the conjugate prior where

[TABLE]

and

[TABLE]

The proof of Theorem 5 is given in the appendix. Its results are in line with the Bernstein-von Mises theorem (cf. Bernardo and Smith (2000)), which shows that, under some regularity conditions, the posterior distribution converges to a normal distribution independently of the prior used as the sample size tends to infinity. In practice, the asymptotic covariance matrix of $\mathbf{w}_{t}$ is approximated by using $\overline{\mathbf{x}}_{t}$ and $\mathbf{S}_{t}$ instead of $\breve{\mathbf{x}}_{t}$ and $\breve{\mathbf{S}}_{t}$ .

2.3 Posterior predictive distribution

In this section we derive the posterior predictive distribution of the wealth at time point $t+1$ , $\widehat{W}_{t+1}$ , given the observable data $\mathbf{x}_{t,n}$ under the diffuse prior (5) and the conjugate prior (6) and (7) for the given vector of portfolio weights $\mathbf{v}_{t}$ and the current wealth $W_{t}$ . Namely, the aim is to derive the posterior predictive distribution of

[TABLE]

given information provided by the observation matrix $\mathbf{x}_{t,n}$ , i.e.

[TABLE]

where $\pi(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}|\mathbf{x}_{t,n})$ is the posterior distribution obtained under the diffuse prior or the conjugate prior. The symbol $\hat{W}_{t+1}$ denotes a random variable whose distribution coincides with the posterior predictive distribution of the wealth calculated at time point $t+1$ .

In Theorem 6 we present the stochastic representations of the posterior predictive distribution of $\hat{W}_{t+1}$ with the proof given in the appendix. The symbol $t_{d}$ stands for the standard univariate $t$ -distribution with $d$ degrees of freedom.

Theorem 6.

Under the assumption of Theorem 1 we get:

(a)

Under the diffuse prior (5), the stochastic representation of the posterior predictive distribution of $W_{t+1}$ is given by

[TABLE]

where $t_{1}$ and $t_{2}$ are independent with $t_{1}\sim t_{n-k}$ and $t_{2}\sim t_{n-k+1}$ .

(b)

Under the conjugate prior (6) and (7), the stochastic representation of the posterior predictive distribution of $W_{t+1}$ is given by

[TABLE]

where $t_{1}$ and $t_{2}$ are independent with $t_{1}\sim t_{n+d_{0}-2k}$ and $t_{2}\sim t_{n+d_{0}-2k+1}$ .

The results in Theorem 6 are very useful for analyzing the behavior of the investor’s wealth during the whole investment period as well as at the final time point $T$ . They allow us (i) to calculate the probability that the investor becomes bankrupt at each time point of the investment horizon; (ii) to construct prediction intervals for the wealth at each time point of the investment period; (iii) to determine risk measures, such as the Value-at-Risk (VaR) and the conditional VaR (CVaR), of the investment strategy at each time of future reallocation; (iv) to specify a region to which the final wealth belongs with high probability. We illustrate these results based on real data in Section 3.
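As a rough illustration of points (ii) and (iii), the following sketch summarizes a vector of simulated wealth values. It assumes that draws from the posterior predictive distribution are already available, and it adopts one common convention, namely that the loss is measured as the current wealth minus the future wealth; the function name `wealth_risk_summary` and this convention are assumptions made here and are not taken from the paper.

```python
import numpy as np

def wealth_risk_summary(wealth_draws, current_wealth, level=0.95):
    """Equal-tailed interval, VaR and CVaR computed from simulated wealth values."""
    w = np.asarray(wealth_draws, dtype=float)
    alpha = 1.0 - level
    # Equal-tailed prediction (credible) interval for the future wealth.
    lower, upper = np.quantile(w, [alpha / 2.0, 1.0 - alpha / 2.0])
    # Assumed convention: loss = current wealth - future wealth.
    loss = current_wealth - w
    var = np.quantile(loss, level)       # Value-at-Risk at the given level
    cvar = loss[loss >= var].mean()      # average loss at or beyond the VaR
    return {"interval": (lower, upper), "VaR": var, "CVaR": cvar}
```

For example, `wealth_risk_summary(draws, current_wealth=1.0)` returns the three summaries for a hypothetical array `draws` of simulated wealths.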

3 Empirical study

3.1 Data description

The data used in the empirical study consist of weekly returns on twelve stocks from the FTSE 100, namely Barclays, Glaxo Smith Kline, Standard Life, Marks and Spencer, Burberry Group plc, HSBC, Lloyds Banking, NEXT plc, Rolls-Royce Holding, The Sage Group, Tesco plc and Unilever, which represent a variety of branches with strong international activities. Since the parameters of the asset returns are usually not constant over a longer period of time, we refrain from using monthly data, which are closer to the normal distribution, and choose weekly returns as a compromise between actuality and the assumption of conditional normality. As a risk-free rate we use the weekly returns on the three-month US Treasury bill.

The portfolio weights are estimated using a rolling window estimation with different sample sizes of $n$ $\in$ $\{52,78,104,130\}$ corresponding to one year up to two and a half years of weekly data in steps of six months. The portfolio runs from 6.6.2016 until 5.9.2016 ( $T=13$ ), covering a precarious market situation due to Great Britain's referendum to leave the European Union on 23.06.2016. The gross returns of these assets are given in Figure 1. Barclays in particular suffered a loss of nearly 10 $\%$ in the week after the Brexit decision, having already suffered losses in the weeks prior to the Brexit. HSBC announced that significant parts of its banking operations would be moved from the City of London to other locations as a direct reaction to the referendum, and it is rumoured that Lloyds is seeking a German banking licence as a consequence of the Brexit. The returns of the Marks and Spencer share were not as strongly affected by the Brexit, but the company reported weakened consumer confidence in the days prior to the Brexit. This also implies price uncertainty for domestic consumer products due to the decline of the pound, which lost almost a fifth of its value against the dollar after the Brexit vote, as emphasized for example by Tesco and Unilever. In contrast, Glaxo Smith Kline and Standard Life seem to be unaffected by the Brexit decision, even yielding positive returns. Rolls Royce, finally, faced significant losses at the beginning of 2016 and was hit severely by the Brexit vote, since the company needs to hedge a large amount of British pounds against currency fluctuations because most of the contracts in aerospace are conducted in dollars.

3.2 Posterior distribution of the weights

Due to Theorem 2 it is possible to access the posterior distribution of the weights directly. The weights can be sampled using the following procedure:

1. Generate independently

• $\eta$ $\sim$ $\chi_{n}^{2}$ under the diffuse prior or $\eta$ $\sim$ $\chi_{n+d_{0}-k}^{2}$ under the conjugate prior

• $\mathbf{z}_{0}$ $\sim$ $\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$

• $Q$ $\sim$ $\mathcal{F}(k,n-k)$ under the diffuse prior or $Q$ $\sim$ $\mathcal{F}(k,n+d_{0}-2k)$ under the conjugate prior

• $\mathbf{Z}$ $\sim$ $\mathcal{N}_{k}(\mathbf{0},\mathbf{I}_{k})$ and set $\mathbf{u}=\mathbf{Z}/\sqrt{\mathbf{Z}^{\top}\mathbf{Z}}$

2. Compute the vector of portfolio weights by using the stochastic representation (8) for the diffuse prior or (9) for the conjugate prior.

3. Repeat steps (1) and (2) $B$ times.
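The procedure above can be sketched in Python as follows. The sketch only generates the random ingredients of step 1 with the degrees of freedom listed above; the stochastic representation (8) or (9) itself has to be supplied by the user as the callable `representation`, which is a hypothetical placeholder, as are the function names `draw_ingredients` and `sample_weights`.

```python
import numpy as np

def draw_ingredients(B, n, k, p, d0=None, seed=None):
    """Draw B independent copies of (eta, z0, Q, u).

    d0=None uses the degrees of freedom of the diffuse prior,
    otherwise those of the conjugate prior.
    """
    rng = np.random.default_rng(seed)
    if d0 is None:
        df_eta, df_den = n, n - k
    else:
        df_eta, df_den = n + d0 - k, n + d0 - 2 * k
    eta = rng.chisquare(df_eta, size=B)                # chi-squared draws
    z0 = rng.standard_normal((B, p))                   # rows are N_p(0, I_p)
    Q = rng.f(k, df_den, size=B)                       # F(k, df_den) draws
    Z = rng.standard_normal((B, k))
    u = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # uniform on the unit sphere
    return eta, z0, Q, u

def sample_weights(representation, B, n, k, p, d0=None, seed=None):
    """Steps 1 to 3: apply a user-supplied representation to each draw."""
    eta, z0, Q, u = draw_ingredients(B, n, k, p, d0=d0, seed=seed)
    return np.stack([representation(eta[b], z0[b], Q[b], u[b]) for b in range(B)])
```

Posterior quantiles and credible sets can then be read off the rows of the returned array, for instance with `np.quantile(draws, [0.025, 0.975], axis=0)`.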

The implementation of this simulation procedure leads to samples of optimal portfolio weights of size $B$ at each time point of the investment period. From their empirical distribution we approximate the posterior distributions of the weights, their quantiles, and the credible sets for the portfolio weights. It is remarkable that all computations can easily be done by generating samples from well-known standard distributions, and high numerical precision can be achieved by choosing $B$ sufficiently large.

In Figures 2 and 3, we analyze the finite-sample behavior of the results presented in Theorem 5. Namely, we investigate the speed of convergence of the posterior distribution of the optimal portfolio weights to the corresponding asymptotic distribution which is a normal distribution according to Theorem 5 for both priors. The choice of the hyperparameters $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ in the case of the conjugate prior is of particular interest. According to the Bayesian paradigm, $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ represent the correct belief of the decision maker. In practice, however, there are several data-driven methods for replacing $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ by data-dependent values $\hat{\mathbf{m}}_{0}$ and $\hat{\mathbf{S}}_{0}$ . We make use of the empirical Bayes approach (see Section 5.2 in the appendix for the derivation of the formulas), which is applied to the weekly returns on the corresponding assets from the time period directly before the one used to estimate the empirical counterparts of the portfolio weights, always with the same window length. Namely, they are given by

[TABLE]

with the derivation moved to the appendix (Section 5.2). The prior parameters for $t>1$ are estimated using a rolling window starting in the corresponding period. We set $d_{0}$ equal to the number of observations in the pre-sample period, i.e., $d_{0}=n$ .

We set $B=10^{5}$ for draws from the stochastic representations of Theorem 2 and compare the standardized weight of Glaxo Smith Kline (GSK) calculated for the period $T-1$ in the case of several sample sizes $n$ $\in$ $\{52,78,104,130\}$ . The corresponding histograms are given in Figure 2 for the diffuse prior and in Figure 3 for the conjugate prior. In both figures we also present the p-values of the Shapiro-Wilk test, which indicate whether the standardized weights follow a standard normal distribution. This hypothesis is rejected for $n=52$ and $n=78$ in the case of the diffuse prior at the common significance level of 5 $\%$ , but it cannot be rejected at this level for larger sample sizes. Stronger results are obtained in the case of the conjugate prior, where the null hypothesis cannot be rejected at the 5 $\%$ level for all considered sample sizes. We therefore conclude that the approximate distribution of Theorem 5 works reasonably well.

3.3 Wealth development and credibility intervals

Since the main purpose of investing is making money, investors are interested in how much money they made during an investment period. We focus again on the same investment period covering the Brexit referendum as in the previous subsection.

During the lifetime of the portfolio, no bankruptcy occurred. More importantly, the stochastic representation for the posterior predictive distribution given in Theorem 6 can be used to calculate credible intervals for the wealth. By generating $B=10^{5}$ draws from Theorem 6 and calculating the 95 $\%$ credible intervals, we obtain upper and lower bounds for the wealth in the specific period. These intervals together with the predicted and realized wealths are shown in Figure 4. As expected, we observe a difference in the width of the intervals between smaller and larger sample sizes. The credible intervals are considerably narrower for $n$ $\in$ $\{104,130\}$ compared to smaller $n$ . Note that the sample size has to be sufficiently large in relation to the number of assets. Otherwise, the credible intervals are inflated due to massive estimation uncertainty known as the curse of dimensionality.

It might happen that neither the diffuse nor the conjugate prior performs well when the sample size increases. For the diffuse prior, the reason is that the empirical counterparts might not describe the portfolio running period well, indicating a trade-off between the actuality and the stability of the parameters. This problem is amplified for the conjugate prior since the prior parameters are determined using even more distant data. While the data-driven approach to the conjugate prior is somewhat realistic, it is not completely in line with the Bayesian paradigm. If the expectations, and therefore the choice of hyperparameters, were closer to the return behaviour after the Brexit, the results could be improved. Although this is consistent with the Bayesian paradigm, such an approach is not always feasible in practice; it is, however, not impractical either: using appropriate forecasting methods, other data-driven methods are applicable as long as they yield a reliable point estimate. This subjective approach emphasizes the possibility as well as the necessity of reflecting realistic future market behaviour in the prior parameterization, and it is left for future research.

3.4 Default probability

Due to the accessibility of the posterior predictive distribution, we can also calculate the default probability of our portfolio at each time point, where default is defined as the event that our wealth becomes negative at this point in time. The predictive probability of default can easily be determined as the fraction of draws in which a default occurs, in this case out of $B=10^{5}$ draws. The development of the default probabilities is given in Figure 5. As expected, we find a pattern resembling the credible intervals of the posterior predictive distribution illustrated in the previous section.

Starting with the diffuse prior, we observe a slightly increased default probability on 27.6.2016, the week after the Brexit referendum. With the conjugate prior, this default probability is lower in the same week. The peak for $n=130$ under the diffuse prior again reflects the trade-off between parameter stability and actuality, resulting here in a slightly increased default probability. The default probability for the conjugate prior is slightly increased in the following week compared to the diffuse prior, presumably due to parameters relying on a wider estimation window.
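A minimal sketch of this Monte Carlo estimate is given below. It assumes that the simulated wealth values for one time point are stored in an array; the function name `default_probability` is hypothetical, and the binomial standard error is added here only as an indication of the simulation accuracy and is not part of the paper's procedure.

```python
import numpy as np

def default_probability(wealth_draws):
    """Fraction of simulated wealth values below zero, with a binomial standard error."""
    w = np.asarray(wealth_draws, dtype=float)
    p_hat = np.mean(w < 0.0)                        # share of draws with negative wealth
    se = np.sqrt(p_hat * (1.0 - p_hat) / w.size)    # Monte Carlo standard error
    return p_hat, se
```

With $B=10^{5}$ draws the standard error is at most about $0.0016$, so the Monte Carlo error in an estimated default probability is small.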

4 Summary

In this paper we consider the estimation of the multi-period portfolio for an exponential utility function in a Bayesian setting. Since the portfolio weights are given as the product of two multivariate/matrix-variate random quantities, accessing the distribution of the weights is a challenging task. By choosing the non-informative and the conjugate prior, the posterior distributions of the weights have pleasing properties since the conditional distribution of the precision matrix for a given return vector is an inverted Wishart distribution. With this insight we could use this well understood distribution (cf. Muirhead (1982)) to derive stochastic representations for the weights, which give direct access to the posterior distribution. Furthermore, these representations also provide us with Bayesian estimates for the optimal portfolio weights together with their covariance matrix. In addition to this, we derive the posterior predictive distribution for the wealth, which makes it possible to calculate the quantiles of the portfolio wealth at each time point of the investment period and is therefore highly relevant for risk management purposes. The method is then applied to real data from the FTSE 100 covering the period of the Brexit referendum. With these data we determine the posterior distribution of the weights, the predictive wealths in each period, the lower wealth quantiles as well as the default probability in every time period.

It turns out that the use of stochastic representations to generate the posterior distribution numerically is computationally highly efficient: the representations rely on samples from well-known distributions and no MCMC methods are needed. In the empirical part of Section 3 it was demonstrated that these methods work well and are easy to implement. We have to emphasize several points. While the non-informative prior yields results which coincide with the common frequentist case and is as easy to apply as the classical case, the conjugate or informative prior is said to involve a potentially large degree of subjectivity, which is sometimes taken to imply that the frequentist approach or the non-informative prior would be objective. However, we have to choose the sample size in all of these cases, which is naturally a subjective choice with a huge effect on the performance of the portfolio, as we demonstrate in Section 3. This trade-off between parameter actuality and parameter stability has to be faced by the practitioner. One advantage of the conjugate prior is of course that we can incorporate our beliefs regarding the future behaviour of the asset returns in our model, which is possible neither in the frequentist nor in the non-informative case. This is clearly at the core of every investment decision and reflects natural decision making. Nevertheless, the hyperparameters have to be chosen carefully and a rigorous sensitivity analysis is left for future research.

There are still other open research questions regarding the multi-period portfolio choice with exponential utility function which are left for future research. The present approach can be extended to the case with predictable variables as discussed in Bodnar, Parolya, and Schmid (2015b) in the case of the known parameters of the asset return distribution. This, however, is much more difficult due to the more complicated structure of the optimal portfolio weights and the dependence structure of the asset returns. Furthermore, the multi-period optimal portfolios obtained by using other utility functions can be estimated following the approach suggested in the paper.

5 Appendix

5.1 Proofs of the theorems

In this part of the paper we present the proofs of the theoretical results. First, we note that the derived posterior distributions under the diffuse prior and under the conjugate prior in Proposition 2 have a similar structure. For that reason, we formulate and prove some lemmas from which the results in both cases of the diffuse prior and the conjugate prior follow.

Lemma 1.

Let

[TABLE]

where $\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $})=v_{y}(\mathbf{S}_{y}+(\mbox{\boldmath$ \nu $}-\mathbf{m}_{y})(\mbox{\boldmath$ \nu $}-\mathbf{m}_{y})^{\top})$ and let $\mathbf{M}$ be a $p\times k$ -dimensional matrix of constants. Then the stochastic representation of $\mathbf{M}\mbox{\boldmath$ \Omega $}^{-1}(\mbox{\boldmath$ \nu $}-\mathbf{a})$ is given by

[TABLE]

where $\eta\sim\chi^{2}_{k_{y}-k-1}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , and $\mbox{\boldmath$ \nu $}|\mathbf{y}\sim t_{k}\left(d_{y},\mathbf{m}_{y},\mathbf{S}_{y}/d_{y}\right)$ ; moreover, $\eta,\mathbf{z}_{0}$ and $\nu$ are mutually independent.

Proof of Lemma 1.

Since $\mbox{\boldmath$ \Omega $}^{*}\stackrel{{\scriptstyle d}}{{=}}\mbox{\boldmath$ \Omega $}|\mbox{\boldmath$ \nu $}=\mbox{\boldmath$ \nu $}^{*},\mathbf{y}\sim\mathcal{IW}_{k}(k_{y},\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $}^{*}))$ and, consequently, $\mbox{\boldmath$ \Omega $}^{*\,-1}\sim\mathcal{W}_{k}(k_{y}-k-1,\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $}^{*})^{-1})$ (cf. Theorem 3.4.1 in Gupta and Nagar (2000)), it holds that (see, e.g., Theorem 3.2.5 in Muirhead (1982))

[TABLE]

with $\widetilde{\mathbf{M}}=(\mathbf{M}^{\top},\mbox{\boldmath$ \nu $}^{*}-\mathbf{a})^{\top}$ and $\mathbf{V}^{*}=\tilde{\mathbf{M}}\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $}^{*})^{-1}\tilde{\mathbf{M}}^{\top}$ . Next, we partition $\mbox{\boldmath$ \Xi $}^{*}$ and $\mathbf{V}^{*}$ in the following way

[TABLE]

and

[TABLE]

The application of Theorem 3.2.10 in Muirhead (1982) yields

[TABLE]

Defining $\eta=\Xi_{22}^{*}/V_{22}$ and using Theorem 3.2.8 of Muirhead (1982) we get that $\eta\sim\chi^{2}_{k_{y}-k-1}$ . Since the $\chi^{2}_{k_{y}-k-1}$ -distribution does not depend on $\mbox{\boldmath$ \nu $}=\mbox{\boldmath$ \nu $}^{*}$ and $\mathbf{y}$ (on which the distribution of $\Xi_{22}^{*}$ depends by definition of $\mbox{\boldmath$ \Xi $}^{*}$ ), it is also the unconditional distribution of $\eta$ , and $\eta$ is independent of both $\mbox{\boldmath$ \nu $}$ and $\mathbf{y}$ . Thus, the stochastic representation of $\mathbf{M}\mbox{\boldmath$ \Omega $}^{-1}(\mbox{\boldmath$ \nu $}-\mathbf{a})$ is given by

[TABLE]

where $\eta\sim\chi^{2}_{k_{y}-k-1}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , and $\mbox{\boldmath$ \nu $}|\mathbf{y}\sim t_{k}\left(d_{y},\mathbf{m}_{y},\mathbf{S}_{y}/d_{y}\right)$ ; moreover, $\eta$ , $\mathbf{z}_{0}$ and $\nu$ are mutually independent. This completes the proof of the lemma. ∎

Proof of Theorem 1.

The results of Theorem 1 follow from Lemma 1 with $\mathbf{M}=C_{t}\mathbf{L}$ , $\mbox{\boldmath$ \Sigma $}=\mbox{\boldmath$ \Omega $}$ , $\mbox{\boldmath$ \nu $}=\mbox{\boldmath$ \mu $}$ , $\mathbf{a}=r_{f,t+1}\mathbf{1}$ and

(a)

$k_{y}=n+k+1$ , $d_{y}=n-k$ , $v_{y}=n$ , $\mathbf{m}_{y}=\overline{\mathbf{x}}_{t,d}$ , $\mathbf{S}_{y}=\mathbf{S}_{t,d}/n$ , and $\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $})=\mathbf{S}_{t,d}^{*}(\mbox{\boldmath$ \mu $})$ in the case of the diffuse prior;

(b)

$k_{y}=n+d_{0}+1$ , $d_{y}=n+d_{0}-2k$ , $v_{y}=n+r_{0}$ , $\mathbf{m}_{y}=\overline{\mathbf{x}}_{t,c}$ , $\mathbf{S}_{y}=\mathbf{S}_{t,c}/(n+r_{0})$ , and $\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $})=\mathbf{S}_{t,c}^{*}(\mbox{\boldmath$ \mu $})$ in the case of the conjugate prior.

∎

Lemma 2.

Under the conditions of Lemma 1, we get the following stochastic representation of $\mathbf{M}\mbox{\boldmath$ \Omega $}^{-1}(\mbox{\boldmath$ \nu $}-\mathbf{a})$ expressed as

[TABLE]

with

[TABLE]

where $\eta\sim\chi^{2}_{k_{y}-k-1}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , $Q\sim\mathcal{F}(k,d_{y})$ , and $\mathbf{u}$ uniformly distributed on the unit sphere in $R^{k}$ ; moreover, $\eta$ , $\mathbf{z}_{0}$ , $Q$ , and $\mathbf{u}$ are mutually independent.

Proof of Lemma 2.

The application of the Sherman-Morrison formula (see, e.g., p.125 in Meyer (2000)) yields

[TABLE]

Let

[TABLE]

Since $\mbox{\boldmath$ \nu $}|\mathbf{y}\sim t_{k}(d_{y},\mathbf{m}_{y},\mathbf{S}_{y}/d_{y})$ and since the multivariate $t$ -distribution belongs to the class of elliptically contoured distributions, we obtain that $\mathbf{u}$ and $Q$ are independent, and $\mathbf{u}$ is uniformly distributed on the unit sphere in $R^{k}$ (see Theorem 2.15 of Gupta, Varga, and Bodnar (2013)). Moreover, from the properties of the multivariate $t$ -distribution (see p. 19 of Kotz and Nadarajah (2004)), we get that $Q\sim\mathcal{F}(k,d_{y})$ , i.e., $Q$ has an $\mathcal{F}$ -distribution with $k$ and $d_{y}$ degrees of freedom.

Hence, the application of (11) and (12) leads to

[TABLE]

and

[TABLE]

Putting the above results together we obtain the statement of the lemma. ∎

Proof of Theorem 2.

The results of Theorem 2 are obtained by using Lemma 2 with $\mathbf{M}=C_{t}\mathbf{L}$ , $\mbox{\boldmath$ \Sigma $}=\mbox{\boldmath$ \Omega $}$ , $\mbox{\boldmath$ \nu $}=\mbox{\boldmath$ \mu $}$ , $\mathbf{a}=r_{f,t+1}\mathbf{1}$ and

(a)

$k_{y}=n+k+1$ , $d_{y}=n-k$ , $v_{y}=n$ , $\mathbf{m}_{y}-\mathbf{a}=\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1}$ , $\mathbf{S}_{y}=\mathbf{S}_{t,d}/n$ , and $\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $})=\mathbf{S}_{t,d}^{*}(\mbox{\boldmath$ \mu $})$ in the case of the diffuse prior;

(b)

$k_{y}=n+d_{0}+1$ , $d_{y}=n+d_{0}-2k$ , $v_{y}=n+r_{0}$ , $\mathbf{m}_{y}-\mathbf{a}=\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1}$ , $\mathbf{S}_{y}=\mathbf{S}_{t,c}/(n+r_{0})$ , and $\mathbf{S}_{y}^{*}(\mbox{\boldmath$ \nu $})=\mathbf{S}_{t,c}^{*}(\mbox{\boldmath$ \mu $})$ in the case of the conjugate prior.

∎

Proof of Theorem 3.

The proof of the theorem is based on the stochastic representations obtained in Theorem 2. Let $\mathbf{l}$ be an arbitrary $k$ -dimensional vector of constants.

(a)

Using that $\eta$ , $\mathbf{z}_{0}$ , $Q$ , and $\mathbf{u}$ are independent and that $\mathbb{E}(\mathbf{z}_{0})=\mathbf{0}$ , in the case of the diffuse prior we get

[TABLE]

with $\mathbb{E}(\eta)=n$ and

[TABLE]

where we use that $E(\mathbf{u})=\mathbf{0}$ and $E(\mathbf{u}\mathbf{u}^{T})=\frac{1}{k}\mathbf{I_{k}}$ (see, e.g. Gupta et al. (2013)) as well as the fact that if $Q\sim\mathcal{F}(k,n-k)$ , then ${\frac{k}{n-k}Q}/\left(1+\frac{k}{n-k}Q\right)\sim Beta\left(\frac{k}{2},\frac{n-k}{2}\right)$ . Hence,

[TABLE]

and, consequently, since $\mathbf{l}$ was an arbitrary vector, we get

[TABLE]

(b)

Similar computations as in part (a) lead to

[TABLE]

under the conjugate prior.

∎
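The distributional facts used in part (a) can be checked numerically. The following Monte Carlo sketch, with arbitrarily chosen illustrative values of $k$ and $n$, compares the sample moments of $\mathbf{u}$ with $\mathbf{0}$ and $\frac{1}{k}\mathbf{I}_{k}$ , and the sample mean of the transformed $F$ -variable with $k/n$ , the mean of the $Beta\left(\frac{k}{2},\frac{n-k}{2}\right)$ distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
k, n, B = 4, 60, 200_000            # illustrative values, not those of the study

# u = Z / ||Z|| is uniformly distributed on the unit sphere in R^k.
Z = rng.standard_normal((B, k))
u = Z / np.linalg.norm(Z, axis=1, keepdims=True)
mean_u = u.mean(axis=0)             # should be close to the zero vector
second_moment = u.T @ u / B         # should be close to I_k / k

# If Q ~ F(k, n - k), the transform below follows Beta(k/2, (n-k)/2).
Q = rng.f(k, n - k, size=B)
ratio = k / (n - k) * Q
beta_like = ratio / (1.0 + ratio)
beta_mean = beta_like.mean()        # should be close to k / n
```

The sample quantities agree with the stated values up to Monte Carlo error, which shrinks at the rate $1/\sqrt{B}$ .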

Lemma 3.

Under the assumption of Lemma 2 with $\mathbf{M}=\mathbf{b}^{\top}:1\times k$ , we get that

[TABLE]

where $c_{1}=\mathbf{b}^{\top}\mathbf{S}_{y}^{-1}\mathbf{b}$ , $c_{2}=(\mathbf{m}_{y}-\mathbf{a})^{\top}\mathbf{S}_{y}^{-1}(\mathbf{m}_{y}-\mathbf{a})$ , and $c_{12}=\mathbf{b}^{\top}\mathbf{S}_{y}^{-1}(\mathbf{m}_{y}-\mathbf{a})$ .

Proof of Lemma 3.

The proof of the lemma is based on the stochastic representations from Lemma 2. Since $\eta$ , $\mathbf{z}_{0}$ , $Q$ , and $\mathbf{u}$ are independent as well as $\mathbb{E}(\mathbf{z}_{0})=\mathbf{0}$ and $\mathbb{E}(\mathbf{z}_{0}\mathbf{z}_{0}^{\top})=\mathbf{I}_{p}$ , we obtain

[TABLE]

with $\mathbb{E}(\eta)=k_{y}-k-1$ and $\mathbb{E}(\eta^{2})=(k_{y}-k-1)(k_{y}-k+1)$ .

The application of $E(\mathbf{u}\mathbf{u}^{T})=\frac{1}{k}\mathbf{I}_{k}$ and the fact that all odd mixed moments of $\mathbf{u}$ are zero yield

[TABLE]

and

[TABLE]

Since $\frac{kQ/d_{y}}{1+kQ/d_{y}}$ has a beta distribution with parameters $k/2$ and $d_{y}/2$ , we obtain

[TABLE]

Furthermore, using $Q\sim\mathcal{F}(k,d_{y})$ , we get

[TABLE]

where $B(\cdot,\cdot)$ stands for the beta function (see, Mathai and Provost (1992, p. 256)).

Next, we compute $\mathbb{E}\left((\mathbf{b}^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{u})^{2}((\mathbf{m}_{y}-\mathbf{a})^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{u})^{2}|\mathbf{y}\right)$ . Let $Q_{N}\sim\chi^{2}_{k}$ be independent of $\mathbf{u}$ . Then $\sqrt{Q_{N}}\mathbf{u}$ has a multivariate standard normal distribution, i.e.

[TABLE]

where $c_{1}$ , $c_{2}$ , and $c_{12}$ are defined in the statement of Lemma 3. Hence,

[TABLE]

where the last equality follows from Isserlis’ theorem (cf. Isserlis (1918)).
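As a sketch of how this step works out, the fourth-order moment can be computed as follows. The auxiliary symbols $X$ , $Y$ , and $\mathbf{z}$ are introduced here only for illustration and are not part of the paper's notation; the computation uses Isserlis' theorem for zero-mean jointly Gaussian variables and $\mathbb{E}(Q_{N}^{2})=k(k+2)$ for $Q_{N}\sim\chi^{2}_{k}$ .

```latex
% Illustration only: X, Y and z are auxiliary symbols introduced for this sketch.
\begin{align*}
X &= \mathbf{b}^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{z}, \qquad
Y = (\mathbf{m}_{y}-\mathbf{a})^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{z}, \qquad
\mathbf{z}=\sqrt{Q_{N}}\,\mathbf{u}\sim\mathcal{N}_{k}(\mathbf{0},\mathbf{I}_{k}),\\
\mathbb{E}(X^{2}) &= c_{1}, \qquad \mathbb{E}(Y^{2}) = c_{2}, \qquad \mathbb{E}(XY) = c_{12},\\
\mathbb{E}(X^{2}Y^{2}) &= \mathbb{E}(X^{2})\,\mathbb{E}(Y^{2}) + 2\left(\mathbb{E}(XY)\right)^{2}
 = c_{1}c_{2} + 2c_{12}^{2},\\
\mathbb{E}(X^{2}Y^{2}) &= \mathbb{E}(Q_{N}^{2})\,
 \mathbb{E}\left((\mathbf{b}^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{u})^{2}
 ((\mathbf{m}_{y}-\mathbf{a})^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{u})^{2}\right),
 \qquad \mathbb{E}(Q_{N}^{2}) = k(k+2),\\
\mathbb{E}\left((\mathbf{b}^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{u})^{2}
 ((\mathbf{m}_{y}-\mathbf{a})^{\top}\mathbf{S}_{y}^{-1/2}\mathbf{u})^{2}\right)
 &= \frac{c_{1}c_{2} + 2c_{12}^{2}}{k(k+2)}.
\end{align*}
```

All expectations are understood conditionally on $\mathbf{y}$ , and the factorization in the fourth line uses the independence of $Q_{N}$ and $\mathbf{u}$ .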

Hence,

[TABLE]

and

[TABLE]

∎

Proof of Theorem 4.

The results of Theorem 4 are obtained by using Lemma 3 with $\mathbf{b}=C_{t}\mathbf{l}$ , $\mbox{\boldmath$ \Sigma $}=\mbox{\boldmath$ \Omega $}$ , $\mbox{\boldmath$ \nu $}=\mbox{\boldmath$ \mu $}$ , $\mathbf{a}=r_{f,t+1}\mathbf{1}$ and Theorem 3.

(a)

In the case of the diffuse prior, using $k_{y}=n+k+1$ , $d_{y}=n-k$ , $v_{y}=n$ , $\mathbf{m}_{y}-\mathbf{a}=\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1}$ , $\mathbf{S}_{y}=\mathbf{S}_{t,d}/n$ , $c_{1}=nC_{t}^{2}\mathbf{l}^{\top}\mathbf{S}_{t,d}^{-1}\mathbf{l}$ , $c_{2}=n(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})^{\top}\mathbf{S}_{t,d}^{-1}(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})$ , and $c_{12}=nC_{t}\mathbf{l}^{\top}\mathbf{S}_{t,d}^{-1}(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})$ we get

[TABLE]

where $b_{d}=n(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})^{\top}\mathbf{S}_{t,d}^{-1}(\overline{\mathbf{x}}_{t,d}-r_{f,t+1}\mathbf{1})$ . Since $\mathbf{l}$ is an arbitrary vector, the results in part (a) follow.

(b)

In the case of the conjugate prior, the application of $k_{y}=n+d_{0}+1$ , $d_{y}=n+d_{0}-2k$ , $v_{y}=n+r_{0}$ , $\mathbf{m}_{y}-\mathbf{a}=\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1}$ , $\mathbf{S}_{y}=\mathbf{S}_{t,c}/(n+r_{0})$ , $c_{1}=(n+r_{0})C_{t}^{2}\mathbf{l}^{\top}\mathbf{S}_{t,c}^{-1}\mathbf{l}$ , $c_{2}=(n+r_{0})(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})^{\top}\mathbf{S}_{t,c}^{-1}(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})$ , and $c_{12}=(n+r_{0})C_{t}\mathbf{l}^{\top}\mathbf{S}_{t,c}^{-1}(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})$ leads to

[TABLE]

where $b_{c}=(n+r_{0})(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})^{\top}\mathbf{S}_{t,c}^{-1}(\overline{\mathbf{x}}_{t,c}-r_{f,t+1}\mathbf{1})$ . Since $\mathbf{l}$ is an arbitrary vector, we get the statement of Theorem 4.(b).

∎

Proof of Theorem 5.

Let $\mathbf{l}$ be an arbitrary $k$ -dimensional vector. From Theorem 1 with $\mathbf{L}=\mathbf{l}^{\top}$ , we get the following stochastic representations of $\mathbf{L}\mathbf{w}_{t}$ under the diffuse prior and the conjugate prior expressed as

[TABLE]

where $\eta\sim\chi^{2}_{n}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , and $\mbox{\boldmath$ \mu $}|\mathbf{x}\sim t_{k}\left(n-k,\overline{\mathbf{x}}_{t,d},\mathbf{S}_{t,d}/(n(n-k))\right)$ , and

[TABLE]

where $\eta\sim\chi^{2}_{n+d_{0}-k}$ , $\mathbf{z}_{0}\sim\mathcal{N}_{p}(\mathbf{0},\mathbf{I}_{p})$ , and $\mbox{\boldmath$ \mu $}|\mathbf{x}\sim t_{k}\left(n+d_{0}-2k,\overline{\mathbf{x}}_{t,c},\mathbf{S}_{t,c}/((n+r_{0})(n+d_{0}-2k))\right)$ .

Moreover, since

[TABLE]

and

[TABLE]

as $n\longrightarrow\infty$ as well as

[TABLE]

and

[TABLE]

the application of the delta method (c.f., (DasGupta, 2008, Theorem 3.7)) proves that

[TABLE]

and

[TABLE]

as $n\longrightarrow\infty$ under the diffuse prior and the conjugate prior, respectively.

Finally, the results of Theorem 4 yield

[TABLE]

and, similarly,

[TABLE]

Since for each $\mathbf{l}$ the linear combination $\mathbf{l}^{\top}\mathbf{w}_{t}$ is asymptotically normally distributed, the vector of weights $\mathbf{w}_{t}$ is asymptotically normal as well (by the Cramér-Wold device). ∎

Proof of Theorem 6.

Since $\mathbf{x}_{t+1}|\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $}\sim\mathcal{N}_{k}(\mbox{\boldmath$ \mu $},\mbox{\boldmath$ \Sigma $})$ and it is conditionally independent of $\mathbf{x}_{t,n}$ , we get

[TABLE]

(a)

In the case of the diffuse prior, we observe that

[TABLE]

where $\xi\sim\chi^{2}_{n-k+1}$ and is independent of $\mu$ (see, e.g., Theorem 3.2.13 in Muirhead (1982)). Then the stochastic representation of $\widehat{W}_{t+1}$ is given by

[TABLE]

where $t_{2}\sim t_{1}(n-k+1,0,1)$ is independent of $\mu$ . Finally, from the properties of the multivariate $t$ -distribution, we obtain

[TABLE]

which leads to

[TABLE]

where $t_{1}$ and $t_{2}$ are independent with $t_{1}\sim t_{n-k}$ and $t_{2}\sim t_{n-k+1}$ .

(b)

Similarly, for the conjugate prior, it holds that

[TABLE]

where $\xi\sim\chi^{2}_{n+d_{0}-2k+1}$ and is independent of $\mu$ . Then the stochastic representation of $\widehat{W}_{t+1}$ is given by

[TABLE]

where $t_{2}\sim t_{n+d_{0}-2k+1}$ is independent of $\mu$ . From the properties of the multivariate $t$ -distribution, we get

[TABLE]

which leads to

[TABLE]

where $t_{1}$ and $t_{2}$ are independent with $t_{1}\sim t_{n+d_{0}-2k}$ and $t_{2}\sim t_{n+d_{0}-2k+1}$ .

∎
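The step from the chi-square variable $\xi$ to the $t$-distributed variable $t_{2}$ in both parts of the proof rests on the classical representation $z/\sqrt{\xi/m}\sim t_{m}$ for independent $z\sim\mathcal{N}(0,1)$ and $\xi\sim\chi^{2}_{m}$. The following Python sketch illustrates this fact by simulation and shows how a quantile can be read off from such draws, which is how a quantile-based risk measure would be obtained from a stochastic representation. It is not the wealth representation of Theorem 6 itself; the function name `draw_t_from_chi2`, the value $m=10$, and the sample size are illustrative assumptions.

```python
import numpy as np

def draw_t_from_chi2(m, size, seed=0):
    """Draw t_m variates as z / sqrt(xi / m), z ~ N(0, 1) independent of xi ~ chi^2_m."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(size)
    xi = rng.chisquare(df=m, size=size)
    return z / np.sqrt(xi / m)

m = 10  # illustrative; in part (a) of the proof the role of m is played by n - k + 1
draws = draw_t_from_chi2(m, size=200_000)
print("sample variance:", draws.var(), "theory m / (m - 2):", m / (m - 2))
print("empirical 5% quantile:", np.quantile(draws, 0.05))
```

Because the representation involves only standard normal and chi-square draws, no numerical evaluation of the $t$ density is needed to approximate its quantiles.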

5.2 Empirical Bayes estimation of the hyperparameters in the conjugate prior

In this section, we derive the empirical Bayes estimates for the hyperparameters $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ of the conjugate prior. Given the sample $\mathbf{x}_{\tau,n}$ , the empirical Bayes estimates for $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ are obtained by maximizing (see, e.g., Carlin and Louis (2000))

[TABLE]

with respect to $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ .

First, we calculate the integral in (15), ignoring the terms which do not depend on $\mathbf{m}_{0}$ and $\mathbf{S}_{0}$ , to get

[TABLE]

where the last identity is obtained by recognizing that under the integral with respect to $\Sigma$ we have a kernel of the density function of $\mathcal{IW}_{k}(n+d_{0}+1,\mathbf{V}_{\tau}(\mbox{\boldmath$ \mu $};\mathbf{m}_{0},\mathbf{S}_{0}))$ with $\bar{\mathbf{y}}_{\tau}(\mathbf{m}_{0})=(n\bar{\mathbf{x}}_{\tau}+r_{0}\mathbf{m}_{0})/(n+r_{0})$ and

[TABLE]

Let $\widetilde{\mathbf{V}}_{\tau}(\mathbf{m}_{0},\mathbf{S}_{0})=\mathbf{S}_{0}+(n-1)\mathbf{S}_{\tau}+nr_{0}(\mathbf{m}_{0}-\bar{\mathbf{y}}_{\tau}(\mathbf{m}_{0}))(\mathbf{m}_{0}-\bar{\mathbf{y}}_{\tau}(\mathbf{m}_{0}))^{\top}/(n+r_{0})$ . The application of Sylvester’s determinant theorem leads to

[TABLE]

and, hence,

[TABLE]

where we use Sylvester’s determinant theorem for the second time. From the last line, we conclude that $g(\mathbf{m}_{0},\mathbf{S}_{0})$ is maximized with respect to $\mathbf{m}_{0}$ at the point $\hat{\mathbf{m}}_{0}$ satisfying $\mathbf{m}_{0}=\bar{\mathbf{y}}_{\tau}(\mathbf{m}_{0})$ , independently of $\mathbf{S}_{0}$ , which leads to $\hat{\mathbf{m}}_{0}=\bar{\mathbf{x}}_{\tau}$ .
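For completeness, the fixed-point condition for $\mathbf{m}_{0}$ can be solved explicitly. Using only the definition $\bar{\mathbf{y}}_{\tau}(\mathbf{m}_{0})=(n\bar{\mathbf{x}}_{\tau}+r_{0}\mathbf{m}_{0})/(n+r_{0})$ given above, the chain of equivalences reads:

```latex
\mathbf{m}_{0}=\bar{\mathbf{y}}_{\tau}(\mathbf{m}_{0})
=\frac{n\bar{\mathbf{x}}_{\tau}+r_{0}\mathbf{m}_{0}}{n+r_{0}}
\;\Longleftrightarrow\;
(n+r_{0})\mathbf{m}_{0}=n\bar{\mathbf{x}}_{\tau}+r_{0}\mathbf{m}_{0}
\;\Longleftrightarrow\;
n\mathbf{m}_{0}=n\bar{\mathbf{x}}_{\tau}
\;\Longleftrightarrow\;
\mathbf{m}_{0}=\bar{\mathbf{x}}_{\tau}.
```

The solution does not involve $r_{0}$ or $\mathbf{S}_{0}$, which is consistent with the statement that the maximizer in $\mathbf{m}_{0}$ is obtained independently of $\mathbf{S}_{0}$.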

Taking the logarithm of $g(\mathbf{m}_{0},\mathbf{S}_{0})$ , differentiating with respect to $\mathbf{S}_{0}$ , setting the matrix derivative equal to the zero matrix, and substituting $\hat{\mathbf{m}}_{0}$ for $\mathbf{m}_{0}$ , we get the following matrix equation

[TABLE]

with the solution given by

[TABLE]
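The two determinant manipulations in this derivation rely on Sylvester’s determinant theorem, $\det(\mathbf{I}_{m}+\mathbf{A}\mathbf{B})=\det(\mathbf{I}_{n}+\mathbf{B}\mathbf{A})$. A minimal numerical check of its rank-one consequence, $\det(\mathbf{A}+c\mathbf{v}\mathbf{v}^{\top})=\det(\mathbf{A})(1+c\mathbf{v}^{\top}\mathbf{A}^{-1}\mathbf{v})$, is sketched below in Python. It is assumed here that this rank-one form is the one relevant for the derivation; the function name `rank_one_determinants`, the test matrix, and the constant $c$ are hypothetical and chosen only for illustration.

```python
import numpy as np

def rank_one_determinants(A, v, c):
    """Return both sides of det(A + c v v^T) = det(A) * (1 + c v^T A^{-1} v)."""
    lhs = np.linalg.det(A + c * np.outer(v, v))
    # Solve A x = v instead of forming the inverse explicitly.
    rhs = np.linalg.det(A) * (1.0 + c * (v @ np.linalg.solve(A, v)))
    return lhs, rhs

rng = np.random.default_rng(1)
k = 4
B = rng.standard_normal((k, k))
A = B @ B.T + k * np.eye(k)  # symmetric positive definite test matrix
v = rng.standard_normal(k)
lhs, rhs = rank_one_determinants(A, v, c=0.7)
print(lhs, rhs)  # the two values agree up to floating-point error
```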

References

Aguilar and West (2000)

Aguilar, O., and M. West (2000): “Bayesian dynamic factor models and portfolio allocation,” Journal of Business & Economic Statistics, 18, 338–357.

Avramov and Zhou (2010)

Avramov, D., and G. Zhou (2010): “Bayesian portfolio analysis,” Annual Review of Financial Economics, 2, 25–47.

Barry (1974)

Barry, C. (1974): “Portfolio analysis under uncertain means, variances, and covariances,” Journal of Finance, 29, 515–522.

Basak and Chabakauri (2010)

Basak, S., and G. Chabakauri (2010): “Dynamic mean-variance asset allocation,” Review of Financial Studies, 23, 2970–3016.

Bernardo and Smith (2000)

Bernardo, J. M., and A. F. M. Smith (2000): Bayesian Theory. Wiley.

Bodnar, Mazur, and Okhrin (2017)

Bodnar, T., S. Mazur, and Y. Okhrin (2017): “Bayesian estimation of the global minimum variance portfolio,” European Journal of Operational Research, 256, 292–307.

Bodnar and Okhrin (2011)

Bodnar, T., and Y. Okhrin (2011): “On the product of inverse Wishart and normal distributions with applications to discriminant analysis and portfolio theory,” Scandinavian Journal of Statistics, 38, 311–331.

Bodnar, Parolya, and Schmid (2015a)

Bodnar, T., N. Parolya, and W. Schmid (2015a): “A closed-form solution of the multi-period portfolio choice problem for a quadratic utility function,” Annals of Operations Research, 229, 121–158.

Bodnar, Parolya, and Schmid (2015b)

Bodnar, T., N. Parolya, and W. Schmid (2015b): “On the exact solution of the multi-period portfolio choice problem for an exponential utility under return predictability,” European Journal of Operational Research, 246, 528–542.

Bodnar and Schmid (2008)

Bodnar, T., and W. Schmid (2008): “A test for the weights of the global minimum variance portfolio in an elliptical model,” Metrika, 67(2), 127–143.

Bodnar and Schmid (2009)

Bodnar, T., and W. Schmid (2009): “Econometrical analysis of the sample efficient frontier,” The European Journal of Finance, 15, 317–335.

Bodnar and Schmid (2011)

Bodnar, T., and W. Schmid (2011): “On the exact distribution of the estimated expected utility portfolio weights: Theory and applications,” Statistics & Risk Modeling, 28, 319–342.

Brandt (2010)

Brandt, M. (2010): “Portfolio Choice Problems,” in Handbook of Financial Econometrics, ed. by Y. Ait-Sahalia, and L. Hansen, vol. 1, pp. 269–336. Tools and Techniques, North Holland.

Brandt and Santa-Clara (2006)

Brandt, M., and P. Santa-Clara (2006): “Dynamic portfolio selection by augmenting the asset space,” The Journal of Finance, 61, 2187–2217.

Brown (1976)

Brown, S. (1976): “Optimal portfolio choice under uncertainty: A Bayesian approach,” Ph.D. thesis, University of Chicago.

Çanakoğlu and Özekici (2009)

Çanakoğlu, E., and S. Özekici (2009): “Portfolio selection in stochastic markets with exponential utility functions,” Annals of Operations Research, 166, 281–297.

Carlin and Louis (2000)

Carlin, B. P., and T. A. Louis (2000): Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall/CRC.

DasGupta (2008)

DasGupta, A. (2008): Asymptotic Theory of Statistics and Probability, Springer Texts in Statistics. Springer.

Elton (1974)

Elton, E. J., and M. J. Gruber (1974): “On the optimality of some multiperiod portfolio selection criteria,” Journal of Business, 47, 231–243.

Frost and Savarino (1986)

Frost, P., and J. Savarino (1986): “An empirical Bayes approach to efficient portfolio selection,” Journal of Financial and Quantitative Analysis, 21, 293–305.

Gelman, Carlin, Stern, and Rubin (2014)

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin (2014): Bayesian Data Analysis, vol. 2. Chapman & Hall/CRC Boca Raton, FL, USA.

Gibbons, Ross, and Shanken (1989)

Gibbons, M. R., S. A. Ross, and J. Shanken (1989): “A test of the efficiency of a given portfolio,” Econometrica, 57, 1121–1152.

Givens and Hoeting (2012)

Givens, G. H., and J. A. Hoeting (2012): Computational Statistics. John Wiley & Sons.

Gupta and Nagar (2000)

Gupta, A., and D. Nagar (2000): Matrix Variate Distributions. Chapman and Hall/CRC, Boca Raton.

Gupta, Varga, and Bodnar (2013)

Gupta, A., T. Varga, and T. Bodnar (2013): Elliptically Contoured Models in Statistics and Portfolio Theory. Springer, second edn.

Isserlis (1918)

Isserlis, L. (1918): “On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables,” Biometrika, 12, 134–139.

Klein and Bawa (1976)

Klein, R., and V. Bawa (1976): “The effect of estimation risk on optimal portfolio choice,” Journal of Financial Economics, 3, 215–231.

Kotz and Nadarajah (2004)

Kotz, S., and S. Nadarajah (2004): Multivariate $t$ Distributions and Their Applications. Cambridge University Press, Cambridge, United Kingdom.

Li and Ng (2000)

Li, D., and W.-L. Ng (2000): “Optimal dynamic portfolio selection: Multiperiod mean-variance formulation,” Mathematical Finance, 10, 387–406.

Markowitz (1952)

Markowitz, H. (1952): “Portfolio Selection,” The Journal of Finance, 7, 77–91.

Markowitz (1959)

Markowitz, H. (1959): Portfolio Selection: Efficient Diversification of Investments. John Wiley, New York.

Mathai and Provost (1992)

Mathai, A. M., and S. B. Provost (1992): Quadratic Forms in Random Variables. Marcel Dekker, New York.

Meyer (2000)

Meyer, C. D. (2000): Matrix Analysis and Applied Linear Algebra. SIAM.

Mossin (1968)

Mossin, J. (1968): “Optimal multiperiod portfolio policies,” The Journal of Business, 41, 215–229.

Muirhead (1982)

Muirhead, R. J. (1982): Aspects of Multivariate Statistical Theory. Wiley, New York.

Okhrin and Schmid (2006)

Okhrin, Y., and W. Schmid (2006): “Distributional properties of portfolio weights,” Journal of Econometrics, 134, 235–256.

Pennacchi (2008)

Pennacchi, G. G. (2008): Theory of Asset Pricing. Pearson/Addison-Wesley Boston.

Rachev, Hsu, Bagasheva, and Fabozzi (2008)

Rachev, S. T., J. S. J. Hsu, B. S. Bagasheva, and F. J. Fabozzi (2008): Bayesian Methods in Finance. Wiley, New Jersey.

Samuelson (1969)

Samuelson, P. A. (1969): “Lifetime portfolio selection by dynamic stochastic programming,” Review of Economics and Statistics, 51, 239–246.

Sekerke (2015)

Sekerke, M. (2015): Bayesian Risk Management: A Guide to Model Risk and Sequential Learning in Financial Markets. Wiley, New Jersey.

Shanken (1992)

Shanken, J. (1992): “On the estimation of beta-pricing models,” Review of Financial Studies, 5, 1–33.

Shanken and Zhou (2007)

Shanken, J., and G. Zhou (2007): “Estimating and testing beta pricing models: Alternative methods and their performance in simulations,” Journal of Financial Economics, 84, 40–86.

Sun and Berger (2007)

Sun, D., and J. Berger (2007): “Objective Bayesian analysis for the multivariate normal model,” in Bayesian Statistics, ed. by J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith, and M. West, vol. 8, pp. 525–547. Oxford University Press.

Winkler (1973)

Winkler, R. L. (1973): “Bayesian models for forecasting future security prices,” Journal of Financial and Quantitative Analysis, 8, 387–405.

Winkler and Barry (1975)

Winkler, R. L., and C. B. Barry (1975): “A Bayesian model for portfolio selection and revision,” Journal of Finance, 30, 179–192.

Zellner (1971)

Zellner, A. (1971): An Introduction to Bayesian Inference in Econometrics. John Wiley, New York.
