The Income Fluctuation Problem and the Evolution of Wealth
Qingyin Ma, John Stachurski, Alexis Akira Toda

TL;DR
This paper studies a comprehensive household savings model with state-dependent returns and income, establishing conditions for solution existence, uniqueness, and properties of wealth distribution, including Pareto tails.
Contribution
It extends classic models by allowing multiple state-dependent, correlated processes and derives conditions for wealth distribution characteristics.
Findings
Solutions exist, are unique, and globally computable.
Wealth dynamics are stationary, ergodic, and geometrically mixing.
Wealth distribution exhibits Pareto tails.
Abstract
We analyze the household savings problem in a general setting where returns on assets, non-financial income and impatience are all state dependent and fluctuate over time. All three processes can be serially correlated and mutually dependent. Rewards can be bounded or unbounded and wealth can be arbitrarily large. Extending classic results from an earlier literature, we determine conditions under which (a) solutions exist, are unique and are globally computable, (b) the resulting wealth dynamics are stationary, ergodic and geometrically mixing, and (c) the wealth distribution has a Pareto tail. We show how these results can be used to extend recent studies of the wealth distribution. Our conditions have natural economic interpretations in terms of asymptotic growth rates for discounting and return on savings.
Keywords: Income fluctuation, optimality, stochastic stability, wealth distribution.
[Footnote 1: We thank the editors and two anonymous referees for many valuable comments and suggestions. This paper has also benefited from discussion with many colleagues. We particularly thank Fedor Iskhakov, Larry Liu and Chung Tran for their insightful feedback and suggestions. The second author gratefully acknowledges financial support from ARC grant FT160100423.]
Qingyin Ma (a), John Stachurski (b) and Alexis Akira Toda (c)
(a) International School of Economics and Management, Capital University of Economics and Business
(b) Research School of Economics, Australian National University
(c) Department of Economics, University of California San Diego
January 30, 2020
1. Introduction
It has been observed that, in the US and several other large economies, the wealth distribution is heavy tailed and wealth inequality has risen sharply over the last few decades. [Footnote 2: For example, in a study based on capital income data, Saez and Zucman (2016) find that, in the case of the US, the share of total household wealth held by the top 0.1% increased from 7 percent to 22 percent between 1978 and 2012. For a discussion of the heavy-tailed property of the wealth distribution, see Pareto (1896), Davies and Shorrocks (2000), Benhabib and Bisin (2018), Vermeulen (2018) or references therein.] This matters not only for its direct impact on taxation and redistribution policies, but also for potential flow-on effects for productivity growth, business cycles and fiscal policy, as well as for the political environment that shapes these and other economic outcomes. [Footnote 3: One analysis of the two-way interactions between inequality and political decision making can be found in Acemoglu and Robinson (2002). Glaeser et al. (2003) show how inequality can alter economic and social outcomes through subversion of institutions. The same study contains references on linkages between inequality and growth. Regarding fiscal policy, Brinca et al. (2016) find strong correlations between wealth inequality and the magnitude of fiscal multipliers, while Bhandari et al. (2018) study the connection between fiscal-monetary policy, business cycles and inequality. Ahn et al. (2018) discuss the impact of distributional properties on macroeconomic aggregates.]
At present, our understanding of these phenomena is hampered by the fact that standard tools of analysis, such as those used for heterogeneous agent models, are not well adapted to studying the wealth distribution as it stands. For example, while we have sound understanding of the household problem when returns on savings and rates of time discount are constant (see, e.g., Schechtman (1976), Schechtman and Escudero (1977), Deaton and Laroque (1992), Carroll (1997), or Açıkgöz (2018)), our knowledge is far more limited in settings where these values are stochastic. This is problematic, since injecting such features into the household problem is essential for accurately representing the joint distribution of income and wealth (e.g., Benhabib et al. (2015), Benhabib et al. (2017), Stachurski and Toda (2019)). [Footnote 4: Also related is the recent experimental study of Epper et al. (2018), which finds a strong positive connection between dispersion in subjective rates of time discounting across the population and realized dispersion in the wealth distribution. This in turn is consistent with earlier empirical studies such as Lawrance (1991).] Moreover, models with time-varying discount rates and returns on assets are at the forefront of recent quantitative analysis of wealth and inequality. [Footnote 5: For a recent quantitative study see, for example, Hubmer et al. (2018), where returns on savings and discount rates are both state dependent (as is labor income). Kaymak et al. (2018) find that asset return heterogeneity is required to match the upper tail of the wealth distribution.]
While it might be hoped that the analysis of the income fluctuation problem (or household consumption and savings problem) changes little when we shift from constant to state dependent asset returns and rates of time discount, this turns out not to be the case. Effectively modeling these features and the way they map to the wealth distribution requires significant advances in our understanding of choice and stochastic dynamics in the setting of optimal savings.
One difficulty is that state-dependent discounting takes us beyond the bounds of traditional dynamic programming theory. This matters little if there exists some constant $\bar\beta < 1$ such that the discount process $\{\beta_t\}$ satisfies $\beta_t \leq \bar\beta$ for all $t$ with probability one, since, in this case, a standard contraction mapping argument can still be applied (see, e.g., Miao (2006) or Cao (2020)). However, recent quantitative studies extend beyond such settings. For example, AR(1) specifications are increasingly common, in which case the support of $\beta_t$ is unbounded above at every point in time. [Footnote 6: See, for example, Hills and Nakata (2018), Hubmer et al. (2018) or Schorfheide et al. (2018).] Even if discretization is employed, the outcome $\beta_t \geq 1$ can occur with positive probability when the approximation is sufficiently fine. Moreover, such outcomes are not inconsistent with empirical and experimental evidence, at least for some households in some states of the world. [Footnote 7: See, for example, Loewenstein and Prelec (1991) and Loewenstein and Sicherman (1991).] Do there exist conditions on $\{\beta_t\}$ that allow for $\beta_t \geq 1$ in some states and yet imply existence of optimal policies and practical computational techniques?
Another source of complexity for the income fluctuation problem in the general setting considered here is that the set of possible values for household assets is typically unbounded above. For example, when returns on assets are stochastic, a sufficiently long sequence of favorable returns can compound one another to project a household to arbitrarily high levels of wealth. This model feature is desirable: We wish to analyze these kinds of outcomes rather than rule them out. Indeed, Benhabib et al. (2015) and other related studies argue convincingly that such outcomes are a key causal mechanism behind the heavy tail of the current distribution of wealth. [Footnote 8: One related study is Benhabib et al. (2011), who show that capital income risk is the driving force of the heavy-tail properties of the stationary wealth distribution. In Blanchard-Yaari style economies, Toda (2014), Toda and Walsh (2015) and Benhabib et al. (2016) show that idiosyncratic investment risk generates a double Pareto stationary wealth distribution. Gabaix et al. (2016) point out that a positive correlation of returns with wealth (“scale dependence”) in addition to persistent heterogeneity in returns (“type dependence”) can well explain the speed of changes in the tail inequality observed in the data.] However, if we accept this logic, then stationarity and ergodicity of the wealth process, which are fundamental both for estimation and for simulation-based numerical methods, must now be established in a setting where the wealth distribution has unbounded support. In such a scenario, what conditions on preferences and financial and labor income are necessary for these properties to hold?
A final and related example of the need for deeper analysis is as follows: To understand the upper tail of the wealth distribution, we must avoid unnecessarily truncating the upper tail of the set of possible asset values in quantitative work. While truncation is convenient because finite or compact state spaces are easier to handle computationally, we can attain greater accuracy in modeling the wealth distribution if truncation at the upper tail can be replaced locally by a parameterized savings function, such as a linear function (Gouin-Bonenfant and Toda, 2018). However, any such approximation must be justified by theory. What conditions can be imposed on primitives to generate such properties while still maintaining realistic assumptions for asset returns and non-financial income?
In this paper we address all of these questions, along with other key properties of the income fluctuation problem, such as continuity and monotonicity of the optimal consumption policy. Our setting admits capital income risk, labor earnings shocks and time-varying discount rates, driven by a combination of iid innovations and an exogenous Markov chain $\{Z_t\}$. The supports of the innovations can be unbounded, so we admit practical innovation sequences such as normal and lognormal. As a whole, this environment allows for a range of realistic features, such as stochastic volatility in returns on asset holdings, or correlation in the shocks impacting asset returns and non-financial income. The utility function can be unbounded both above and below, with no specific structure imposed beyond differentiability, concavity and the usual slope (Inada) conditions. [Footnote 9: While the assumption that the exogenous state process is a (finite state) Markov chain might appear restrictive, it fits most practical settings and avoids a host of technical issues that tend to obscure the key ideas. Moreover, the innovation shocks are not restricted to be discrete, and the same is true for assets and consumption.]
To begin, when considering optimality in the household problem, we require a condition on the state dependent discount process $\{\beta_t\}$ that generalizes the classical condition $\beta < 1$ from the constant case and, for reasons discussed above, permits $\beta_t \geq 1$ with positive probability. To this end, we introduce the restriction [Footnote 10: Here and below we set $\beta_0 \equiv 1$, so $\prod_{t=0}^n \beta_t = \prod_{t=1}^n \beta_t$.]
$$G_\beta < 1 \quad \text{where} \quad G_\beta := \lim_{n \to \infty} \left( \mathbb{E} \prod_{t=1}^n \beta_t \right)^{1/n}. \tag{1}$$
Condition (1) clearly generalizes the classical condition $\beta < 1$ for the constant discount case. In the stochastic case, $\ln G_\beta$ can be understood as the asymptotic growth rate of the probability weighted average discount factor. Indeed, if $B_n := \mathbb{E} \prod_{t=1}^n \beta_t$ is the average $n$-period discount factor, then, from the definition of $G_\beta$ and some straightforward analysis, we obtain $\ln(B_{n+1}/B_n) \to \ln G_\beta$, so the condition $G_\beta < 1$ implies that the asymptotic growth rate of the average $n$-period discount factor is negative, drifting down from its initial condition at the rate $\ln G_\beta$. This does not, of course, preclude the possibility that $\beta_t \geq 1$ at any given $t$.
We show that condition (1) is in fact a necessary condition in those settings where the classical condition $\beta < 1$ is necessary for finite lifetime values. In this sense it cannot be further weakened for the income fluctuation problem apart from special cases. At the same time, it admits the use of convenient specifications such as the discretized AR(1) process from Hubmer et al. (2018). In addition, we prove that $G_\beta$ can be represented as the spectral radius of a nonnegative matrix, and hence can be computed by numerical linear algebra (as discussed below).
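As a minimal numerical sketch of the spectral radius idea, suppose the exogenous state follows a finite Markov chain with transition matrix P, and let b(z) denote the conditional mean of the discount factor in state z. Conditioning on the path of the chain, the expected n-period discount factor started from z is the z-th entry of L^n 1 with L = P diag(b), so its asymptotic growth factor is the spectral radius of L. The matrix L, the function names and all numbers below are illustrative assumptions, not taken from the paper or its code repository.

```python
import numpy as np

def growth_factor(P, b):
    """Spectral radius of L = P @ diag(b), where b[z] = E[beta_t | Z_t = z]."""
    P = np.asarray(P, dtype=float)
    b = np.asarray(b, dtype=float)
    L = P * b[np.newaxis, :]          # L[z, z'] = P[z, z'] * b[z']
    return np.max(np.abs(np.linalg.eigvals(L)))

def growth_factor_by_iteration(P, b, z0=0, n=2000):
    """Approximate (E_{z0} prod_{t=1}^n beta_t)^(1/n) as ((L^n 1)[z0])^(1/n)."""
    L = np.asarray(P, dtype=float) * np.asarray(b, dtype=float)[np.newaxis, :]
    v = np.ones(L.shape[0])
    log_scale = 0.0
    for _ in range(n):
        v = L @ v
        scale = v.max()               # rescale to avoid under/overflow
        v = v / scale
        log_scale += np.log(scale)
    return np.exp((log_scale + np.log(v[z0])) / n)

# Hypothetical two-state example: the mean discount factor exceeds one
# in the second state, yet the long-run growth factor is below one.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
b = np.array([0.92, 1.02])

G = growth_factor(P, b)
G_check = growth_factor_by_iteration(P, b)
print(f"spectral radius: {G:.4f}, direct iteration: {G_check:.4f}")
```

In this example the discount factor is above one in a persistent state, but the computed growth factor is still below one, which is the kind of case the generalized condition is meant to admit.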
We also generalize the standard condition $\beta R < 1$, where $R$ is the gross interest rate in the constant case, which is used to ensure stability of the asset path and finiteness of lifetime valuations, as well as existence of stationary Markov policies (see, e.g., Deaton and Laroque (1992), Chamberlain and Wilson (2000) or Li and Stachurski (2014)). Analogous to (1), we introduce the generalized condition
$$G_{\beta R} < 1 \quad \text{where} \quad G_{\beta R} := \lim_{n \to \infty} \left( \mathbb{E} \prod_{t=1}^n \beta_t R_t \right)^{1/n}. \tag{2}$$
Here $\{R_t\}$ is a stochastic capital income process. Analogous to the case of $G_\beta$, the value $\ln G_{\beta R}$ can be understood as the asymptotic growth rate of average gross payoff on assets, discounted to present value.
We show that, when Conditions (1)–(2) hold and non-financial income satisfies two moment conditions, a unique optimal consumption policy exists. We also show that the policy can be computed by successive approximations and analyze its properties, such as monotonicity and asymptotic linearity. This asymptotic linearity can be used to successfully model wealth inequality by accurately representing asset path dynamics for very high wealth households (Gouin-Bonenfant and Toda, 2018).
One important feature of Conditions (1)–(2) is that they take into account the autocorrelation structure of preference shocks and asset returns. For example, if these processes depend only on iid innovations, then (1) reduces to $\mathbb{E} \beta_t < 1$ and (2) reduces to $\mathbb{E} \beta_t R_t < 1$. But returns on assets are typically not iid, since both mean returns and volatility are, in general, time varying, and preference shocks are typically modeled as correlated (see, e.g., Hubmer et al. (2018) or Schorfheide et al. (2018)). This dependence must be and is accounted for in (2), since long upswings in $\{\beta_t\}$ and $\{R_t\}$ can lead to explosive paths for valuations and assets.
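To see why the iid case collapses to one-period moments, the following short derivation may help. It assumes, as a simplification, that the pairs $(\beta_t, R_t)$ are iid across $t$ (they may still be correlated with each other within a period), so that expectations of products over time factor.

```latex
\begin{align*}
\mathbb{E}\prod_{t=1}^{n}\beta_t
  &= \prod_{t=1}^{n}\mathbb{E}\,\beta_t
   = (\mathbb{E}\,\beta_1)^n
  &&\Longrightarrow\quad
  G_\beta = \lim_{n\to\infty}\big[(\mathbb{E}\,\beta_1)^n\big]^{1/n}
          = \mathbb{E}\,\beta_1, \\
\mathbb{E}\prod_{t=1}^{n}\beta_t R_t
  &= (\mathbb{E}\,\beta_1 R_1)^n
  &&\Longrightarrow\quad
  G_{\beta R} = \mathbb{E}\,\beta_1 R_1 .
\end{align*}
```

When the processes are serially correlated the expectation of the product no longer factors, which is why the limit in the definition is needed.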
Next we study asymptotic stability, stationarity and ergodicity of wealth. Such properties are essential to existence of stationary equilibria in heterogeneous agent models (e.g., Huggett (1993), Aiyagari (1994) or Cao (2020)), as well as standard estimation, calibration and simulation techniques that connect time series averages with cross-sectional moments. [Footnote 11: A well-known example of a computational technique that uses ergodicity can be found in Krusell and Smith (1998). On the estimation side see, for example, Hansen and West (2002).] These properties require an additional restriction, placed on the asymptotic growth rate of mean returns. Analogous to (1) and (2), this is defined as
$$G_R := \lim_{n \to \infty} \left( \mathbb{E} \prod_{t=1}^n R_t \right)^{1/n}. \tag{3}$$
We show that if $G_R$ is sufficiently restricted and a degree of social mobility is present, then there exists a unique stationary distribution for the state process, the distributional path of the state process under the optimal path converges globally to the stationary distribution, and the stationary distribution is ergodic. We also show that, under some mild additional conditions, the rate of convergence of marginal distributions to the stationary distribution is geometric, and that a version of the Central Limit Theorem is valid. Finally, under some mild additional conditions, we prove that the stationary distribution of assets is Pareto tailed, consistent with the data.
Our study is related to Benhabib et al. (2015), who prove the existence of a heavy-tailed wealth distribution in an infinite horizon heterogeneous agent economy with capital income risk. In the process, they show that households facing a stochastic return on savings possess a unique optimal consumption policy characterized by the (boundary constraint-contingent) Euler equation, and that a unique and unbounded stationary distribution exists for wealth under this consumption policy. They assume isoelastic utility, constant discounting, and mutually independent, iid returns and labor income processes, both supported on bounded closed intervals with strictly positive lower bounds. We relax all of these assumptions. Apart from allowing more general utility and state dependent discounting, this permits such realistic features for household income as positive correlations between labor earnings and wealth returns (an extension that was suggested by Benhabib et al. (2015)), or time varying volatility in returns. [Footnote 12: Empirical motivation for these kinds of extensions can be found in numerous studies, including Guvenen and Smith (2014) and Fagereng et al. (2016a, b).]
Another related paper is Chamberlain and Wilson (2000), which studies an income fluctuation problem with stochastic income and asset returns and obtains many significant results on asymptotic properties of consumption. Their study imposes relatively few restrictions on the wealth return and labor income processes. Our paper extends their work by allowing for random discounting, as well as dropping their boundedness restriction on the utility, which prevents their work from being used in many standard settings such as constant relative risk aversion. We also develop a set of new results on stability and ergodicity, as well as asymptotic normality of the wealth process.
Our optimality theory draws on techniques found in Li and Stachurski (2014), who show that the time iteration operator is a contraction mapping with respect to a metric that evaluates consumption differences in terms of marginal utility, while assuming a constant discount factor and constant rate of return on assets. [Footnote 13: Coleman (1990) introduced the time iteration operator as a constructive method for solving stochastic growth models. It has since been used in Datta et al. (2002), Morand and Reffett (2003) and many other studies.] We show that these ideas extend to a setting where both returns and discount rates are stochastic and time varying. Our results on dynamics under the optimal policy have no counterparts in Li and Stachurski (2014).
In a similar vein, our work is related to several other papers that treat the standard income fluctuation problems with constant rates of return on assets and constant discount rates, such as Rabault (2002), Carroll (2004) and Kuhn (2013). While Carroll (2004) constructs a weighted supremum norm contraction and works with the Bellman operator, the other two papers focus on time iteration. In particular, Rabault (2002) exploits the monotonicity structure, while Kuhn (2013) applies a version of the Tarski fixed point theorem. Our techniques for studying optimality are close to those in Li and Stachurski (2014), as discussed above. [Footnote 14: Our paper is also related to Cao and Luo (2017), who study wealth inequality in a continuous-time framework with heterogeneous returns following a two-state Markov chain. While we do not pursue the connection here, the generality of our setup, including a persistent shock structure to wealth returns, might permit a study of the continuous-time limit that yields the tail results of Cao and Luo (2017) in a general framework.]
The rest of this paper is structured as follows. Section 2 formulates the problem and establishes optimality results. Sufficient conditions for the existence and uniqueness of optimal policies are discussed. Section 3 focuses on stochastic stability. Section 4 discusses our key conditions and how they can be checked. Section 5 provides a set of applications and Section 6 concludes. All proofs are deferred to the appendix. Code that generates our figures can be found at https://github.com/jstac/ifp_public.
2. The Income Fluctuation Problem and Optimality Results
This section formulates the income fluctuation problem we consider, establishes the existence, uniqueness and computability of a solution, and derives its properties.
2.1. Problem Statement
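We consider a general income fluctuation problem, where a household chooses a consumption-asset path $\{(c_t, a_t)\}$ to solve
$$\max \; \mathbb{E}_0 \left\{ \sum_{t=0}^\infty \left( \prod_{i=0}^t \beta_i \right) u(c_t) \right\} \quad \text{s.t.} \quad a_{t+1} = R_{t+1}(a_t - c_t) + Y_{t+1}, \quad 0 \leq c_t \leq a_t, \quad (a_0, Z_0) = (a, z) \text{ given.}$$
Here $u$ is the utility function, $\{\beta_t\}_{t \geq 0}$ is the discount factor process with $\beta_0 \equiv 1$, $\{R_t\}_{t \geq 1}$ is the gross rate of return on wealth, and $\{Y_t\}_{t \geq 1}$ is non-financial income. These stochastic processes obey
$$\beta_t = \beta(Z_t, \varepsilon_t), \quad R_t = R(Z_t, \zeta_t), \quad Y_t = Y(Z_t, \eta_t),$$
where $\beta$, $R$ and $Y$ are measurable nonnegative functions and $\{Z_t\}_{t \geq 0}$ is an irreducible time-homogeneous Markov chain taking values in a finite set $\mathsf{Z}$. Let $P(z, \hat z)$ be the probability of transitioning from $z$ to $\hat z$ in one step. The innovation processes $\{\varepsilon_t\}$, $\{\zeta_t\}$ and $\{\eta_t\}$ are iid and mutually independent, and their supports can be continuous and vector-valued.
The function $u$ maps $\mathbb{R}_+$ to $\{-\infty\} \cup \mathbb{R}$, is twice differentiable on $(0, \infty)$, satisfies $u' > 0$ and $u'' < 0$ everywhere on $(0, \infty)$, and satisfies $u'(c) \to \infty$ as $c \to 0$ and $u'(c) \to 0$ as $c \to \infty$. We define
$$\mathbb{E}_{a,z} := \mathbb{E}\left[\, \cdot \mid (a_0, Z_0) = (a, z) \right] \quad \text{and} \quad \mathbb{E}_z := \mathbb{E}\left[\, \cdot \mid Z_0 = z \right].$$
The next period value of a random variable $X$ is typically denoted $\hat X$. Expectation without a subscript refers to the stationary process, where $Z_0$ is drawn from its (necessarily unique) stationary distribution.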
2.2. Key Conditions
Our conditions for optimality are listed below. In what follows, $G_\beta$ is the asymptotic growth rate of the discount process as defined in (1).
Assumption 2.1.
The discount factor process satisfies $G_\beta < 1$.
Assumption 2.1 is a natural extension of the standard condition $\beta < 1$ from the constant discount case. If $\beta_t \equiv \beta$ for all $t$, then $G_\beta = \beta$, as follows immediately from the definition. It is weaker than the obvious sufficient condition $\beta_t \leq \bar\beta$ with probability one for some constant $\bar\beta < 1$, since in such a setting we have $G_\beta \leq \bar\beta < 1$. In fact it cannot be significantly weakened, as the next proposition shows.
Proposition 2.1 (Necessity of the discount condition).
Let $\beta_t$ and $u(Y_t)$ be positive with probability one for all $t$ and all initial states in $\mathsf{Z}$. If, in this setting, we have $G_\beta \geq 1$, then the objective in (2.1) is infinite at every initial state $(a, z)$.
The positivity assumed here may or may not hold in applications, but Proposition 2.1 shows that special conditions will have to be imposed on preferences if Assumption 2.1 fails. Put differently, allowing $G_\beta \geq 1$ is tantamount to allowing $\beta \geq 1$ in the case when the discount rate is constant.
Next, we need to ensure that the present discounted value of wealth does not grow too quickly, which requires a joint restriction on asset returns and discounting. When $\{\beta_t\}$ and $\{R_t\}$ are constant at values $\beta$ and $R$, the standard restriction from the existing literature is $\beta R < 1$. A generalization using $G_{\beta R}$ as defined in (2) is
Assumption 2.2.
The discount factor and return processes satisfy $G_{\beta R} < 1$.
Finally, we impose routine technical restrictions on non-financial income. The second restriction is needed to exploit first order conditions.
Assumption 2.3.
$\mathbb{E}\, Y < \infty$ and $\mathbb{E}\, u'(Y) < \infty$.
Next we provide one example where Assumptions 2.1–2.3 are easily verified. More complex examples are deferred to Sections 4 and 5.
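Example 2.1.
Suppose, as in Benhabib et al. (2015), that there is a constant discount factor $\beta < 1$, utility is CRRA with $\gamma \geq 1$, $\{R_t\}$ and $\{Y_t\}$ are iid, mutually independent, supported on bounded closed intervals of strictly positive real numbers, and, moreover,
$$\beta\, \mathbb{E} R^{1-\gamma} < 1 \quad \text{and} \quad \left( \beta\, \mathbb{E} R^{1-\gamma} \right)^{1/\gamma} \mathbb{E} R < 1. \tag{7}$$
Assumptions 2.1–2.3 are all satisfied in this case. To see this, observe that $G_\beta = \beta < 1$ in the constant discount case, so Assumption 2.1 holds. Since $x \mapsto x^{1-\gamma}$ is convex on $(0, \infty)$ when $\gamma \geq 1$, Jensen’s inequality implies that $\mathbb{E} R^{1-\gamma} \geq (\mathbb{E} R)^{1-\gamma}$. Multiplying both sides of the last inequality by $\beta (\mathbb{E} R)^{\gamma}$ yields
$$\beta\, \mathbb{E} R \;\leq\; \beta\, \mathbb{E} R^{1-\gamma} (\mathbb{E} R)^{\gamma} \;=\; \left[ \left( \beta\, \mathbb{E} R^{1-\gamma} \right)^{1/\gamma} \mathbb{E} R \right]^{\gamma}.$$
By the second condition of (7), the right-hand side is less than one. Since $G_{\beta R} = \beta\, \mathbb{E} R$ in this iid setting, Assumption 2.2 holds. Assumption 2.3 also holds because $Y_t$ is restricted to a compact subset of the positive reals.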
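The chain of inequalities in this example can be checked numerically for a concrete return distribution. The sketch below uses a hypothetical two-point distribution for the gross return and hypothetical values for the discount factor and the CRRA coefficient; none of these numbers come from the paper.

```python
import numpy as np

# Hypothetical primitives (illustrative assumptions only)
beta, gamma = 0.95, 2.0
R_vals = np.array([0.9, 1.1])     # two-point support of the gross return
R_probs = np.array([0.5, 0.5])

ER = R_probs @ R_vals                        # E R
ER_pow = R_probs @ R_vals ** (1.0 - gamma)   # E R^(1 - gamma)

# The two conditions assumed in the example
cond_first = beta * ER_pow
cond_second = (beta * ER_pow) ** (1.0 / gamma) * ER

# Jensen step: E R^(1-gamma) >= (E R)^(1-gamma) when gamma >= 1
jensen_holds = ER_pow >= ER ** (1.0 - gamma)

# Chain used for the return condition: beta * E R <= cond_second^gamma < 1
lhs = beta * ER
rhs = cond_second ** gamma

print(f"beta * E R^(1-gamma)       = {cond_first:.4f}")
print(f"second condition           = {cond_second:.4f}")
print(f"beta * E R = {lhs:.4f} <= {rhs:.4f}")
```

With these numbers both assumed conditions hold, and the bound on the discounted mean return follows from the second one, as in the argument above.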
2.3. Optimality: Definitions and Fundamental Properties
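To consider optimality, we temporarily assume that $a_0 > 0$ and set the asset space to $(0, \infty)$. [Footnote 15: Assumption 2.3 combined with $u'(0) = \infty$ implies that $\mathbb{P}\{Y_t > 0\} = 1$ for all $t \geq 1$. Hence, $\mathbb{P}\{a_t > 0\} = 1$ for all $t \geq 1$ and excluding zero from the asset space makes no difference to optimality.] The state space for $\{(a_t, Z_t)\}_{t \geq 0}$ is then $\mathsf{S}_0 := (0, \infty) \times \mathsf{Z}$. A feasible policy is a Borel measurable function $c \colon \mathsf{S}_0 \to \mathbb{R}$ with $0 \leq c(a, z) \leq a$ for all $(a, z) \in \mathsf{S}_0$. A feasible policy $c$ and initial condition $(a, z) \in \mathsf{S}_0$ generate an asset path $\{a_t\}_{t \geq 0}$ via (2.1) when $c_t = c(a_t, Z_t)$ and $(a_0, Z_0) = (a, z)$. The lifetime value of policy $c$ is
$$V_c(a, z) = \mathbb{E}_{a,z} \sum_{t=0}^\infty \left( \prod_{i=0}^t \beta_i \right) u\left[ c(a_t, Z_t) \right],$$
where $\{a_t\}$ is the asset path generated by $(c, (a, z))$. In the Appendix we show that $V_c$ is well-defined on $\mathsf{S}_0$. A feasible policy $c^*$ is called optimal if $V_c \leq V_{c^*}$ on $\mathsf{S}_0$ for any feasible policy $c$. A feasible policy is said to satisfy the first order optimality condition if
$$(u' \circ c)(a, z) \;\geq\; \mathbb{E}_z\, \hat\beta \hat R\, (u' \circ c)\left( \hat R\,[a - c(a, z)] + \hat Y,\; \hat Z \right)$$
for all $(a, z) \in \mathsf{S}_0$, and equality holds when $c(a, z) < a$. Noting that $u'$ is decreasing, the first order optimality condition can be compactly stated as
$$(u' \circ c)(a, z) = \max\left\{ \mathbb{E}_z\, \hat\beta \hat R\, (u' \circ c)\left( \hat R\,[a - c(a, z)] + \hat Y,\; \hat Z \right),\; u'(a) \right\} \tag{10}$$
for all $(a, z) \in \mathsf{S}_0$. A feasible policy is said to satisfy the transversality condition if, for all $(a, z) \in \mathsf{S}_0$,
$$\lim_{t \to \infty} \mathbb{E}_{a,z} \left( \prod_{i=0}^t \beta_i \right) (u' \circ c)(a_t, Z_t)\, a_t = 0.$$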
Theorem 2.1 (Sufficiency of first order and transversality conditions).
If Assumptions 2.1–2.3 hold, then every feasible policy satisfying the first order and transversality conditions is an optimal policy.
2.4. Existence and Computability of Optimal Consumption
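Let $\mathscr{C}$ be the space of continuous functions $c \colon \mathsf{S}_0 \to \mathbb{R}$ such that $c$ is increasing in the first argument, $0 < c(a, z) \leq a$ for all $(a, z) \in \mathsf{S}_0$, and
$$\sup_{(a,z) \in \mathsf{S}_0} \left| (u' \circ c)(a, z) - u'(a) \right| < \infty. \tag{12}$$
To compare two consumption policies, we pair $\mathscr{C}$ with the distance
$$\rho(c, d) := \| u' \circ c - u' \circ d \| := \sup_{(a,z) \in \mathsf{S}_0} \left| (u' \circ c)(a, z) - (u' \circ d)(a, z) \right|,$$
which evaluates the maximal difference in terms of marginal utility. While elements of $\mathscr{C}$ are not generally bounded, $\rho$ is a valid metric on $\mathscr{C}$. In particular, $\rho$ is finite on $\mathscr{C}$ since $\rho(c, d) \leq \| u' \circ c - u' \| + \| u' \circ d - u' \|$, and the last two terms are finite by (12). In Appendix B, we show that $(\mathscr{C}, \rho)$ is a complete metric space. The following proposition shows that, for any policy in $\mathscr{C}$, the first order optimality condition (10) implies the transversality condition.
Proposition 2.2 (Sufficiency of first order condition).
Let Assumptions 2.1–2.3 hold. If $c \in \mathscr{C}$ and the first order optimality condition (10) holds for all $(a, z) \in \mathsf{S}_0$, then $c$ satisfies the transversality condition. In particular, $c$ is an optimal policy.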
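We aim to characterize the optimal policy as the fixed point of the time iteration operator $T$ defined as follows: for fixed $c \in \mathscr{C}$ and $(a, z) \in \mathsf{S}_0$, the value of the image $Tc$ at $(a, z)$ is defined as the $\xi \in (0, a]$ that solves
$$u'(\xi) = \psi_c(\xi, a, z),$$
where $\psi_c$ is the function on
$$G := \left\{ (\xi, a, z) \in \mathbb{R}_+ \times (0, \infty) \times \mathsf{Z} \,:\, 0 < \xi \leq a \right\}$$
defined by
$$\psi_c(\xi, a, z) := \max\left\{ \mathbb{E}_z\, \hat\beta \hat R\, (u' \circ c)\left[ \hat R (a - \xi) + \hat Y,\; \hat Z \right],\; u'(a) \right\}.$$
The following theorem shows that the time iteration operator is an $n$-step contraction mapping on a complete metric space of candidate policies and its fixed point is the unique optimal policy.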
Theorem 2.2 (Existence, uniqueness and computability of optimal policies).
If Assumptions 2.1–2.3 hold, then there exists an $n$ in $\mathbb{N}$ such that $T^n$ is a contraction mapping on $(\mathscr{C}, \rho)$. In particular,
(1) $T$ has a unique fixed point $c^* \in \mathscr{C}$.
(2) The fixed point $c^*$ is the unique optimal policy in $\mathscr{C}$.
(3) For all $c \in \mathscr{C}$ we have $\rho(T^k c, c^*) \to 0$ as $k \to \infty$.
Part (3) shows that, under our conditions, the familiar time iteration algorithm is globally convergent, provided one starts with some policy in the candidate class $\mathscr{C}$.
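A minimal sketch of time iteration on a grid is given below. It is not the authors' implementation: it assumes CRRA utility, a two-state Markov chain, and degenerate innovations, so that the discount factor, return and income are functions of the state alone. The primitives, the asset grid, the bisection solver and the linear extrapolation above the grid are all illustrative choices.

```python
import numpy as np

# Hypothetical two-state primitives (assumptions for illustration only)
P = np.array([[0.8, 0.2], [0.2, 0.8]])   # transition matrix of the state
beta = np.array([0.90, 0.98])            # discount factor by state
R = np.array([1.00, 1.04])               # gross return by state
Y = np.array([0.5, 1.5])                 # non-financial income by state
gamma = 2.0                              # CRRA coefficient

grid = np.linspace(1e-2, 20.0, 80)       # asset grid, with grid[0] < min(Y)

def u_prime(c):
    return c ** (-gamma)

def policy_at(a_next, c_z):
    """Interpolate a policy on the grid; extrapolate linearly above it."""
    inside = np.interp(a_next, grid, c_z)
    slope = (c_z[-1] - c_z[-2]) / (grid[-1] - grid[-2])
    above = c_z[-1] + slope * (a_next - grid[-1])
    return np.where(a_next > grid[-1], above, inside)

def psi(xi, z, c):
    """Right-hand side of the Euler equation at consumption xi in state z."""
    expect = np.zeros_like(xi)
    for z_next in range(len(Y)):
        a_next = R[z_next] * (grid - xi) + Y[z_next]
        expect += (P[z, z_next] * beta[z_next] * R[z_next]
                   * u_prime(policy_at(a_next, c[:, z_next])))
    return np.maximum(expect, u_prime(grid))

def T(c, n_bisect=40):
    """One time iteration step, solved by bisection at every grid point."""
    c_new = np.empty_like(c)
    for z in range(len(Y)):
        lo, hi = np.full_like(grid, 1e-12), grid.copy()
        for _ in range(n_bisect):
            mid = 0.5 * (lo + hi)
            too_low = u_prime(mid) > psi(mid, z, c)  # root lies above mid
            lo = np.where(too_low, mid, lo)
            hi = np.where(too_low, hi, mid)
        c_new[:, z] = hi
    return c_new

def solve(tol=1e-6, max_iter=300):
    """Iterate T from c(a, z) = a until successive policies are close."""
    c = np.tile(grid[:, None], (1, len(Y)))
    for _ in range(max_iter):
        c_new = T(c)
        if np.max(np.abs(c_new - c)) < tol:
            return c_new
        c = c_new
    return c

c_star = solve()
```

The initial guess c(a, z) = a consumes all assets and is a natural member of the candidate class. The stopping rule here compares policies directly rather than through marginal utility, which is a convenience for the sketch.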
2.5. Properties of Optimal Consumption
In this section we study the properties of the optimal consumption function $c^*$ obtained in Theorem 2.2. Assumptions 2.1–2.3 are held to be true throughout. The following two propositions show the monotonicity of the consumption function, which is intuitive.
Proposition 2.3 (Monotonicity with respect to wealth).
The optimal consumption and savings functions $c^*(a, z)$ and $i^*(a, z) := a - c^*(a, z)$ are increasing in $a$.
Proposition 2.4 (Monotonicity with respect to income).
If $\{Y_{1t}\}$ and $\{Y_{2t}\}$ are two income processes satisfying $Y_{1t} \leq Y_{2t}$ for all $t$ and $c_1^*$ and $c_2^*$ are the corresponding optimal consumption functions, then $c_1^* \leq c_2^*$ pointwise on $\mathsf{S}_0$.
Under further assumptions we can show that the optimal policy is concave and asymptotically linear with respect to the wealth level.
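Proposition 2.5 (Concavity and asymptotic linearity of consumption function).
If for each $z \in \mathsf{Z}$ and $c \in \mathscr{C}$ that is concave in its first argument,
$$x \mapsto (u')^{-1} \left[ \mathbb{E}_z\, \hat\beta \hat R\, (u' \circ c)\left( \hat R x + \hat Y,\; \hat Z \right) \right] \text{ is concave on } \mathbb{R}_+, \tag{17}$$
then
(1) $a \mapsto c^*(a, z)$ is concave, and
(2) there exists $\alpha(z) \in [0, 1]$ such that $\lim_{a \to \infty} c^*(a, z)/a = \alpha(z)$.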
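Remark 2.1.
Condition (17) imposes some concavity structure on utility. It holds for the constant relative risk aversion (CRRA) utility function
$$u(c) = \begin{cases} \dfrac{c^{1-\gamma}}{1-\gamma} & \text{if } \gamma > 0 \text{ and } \gamma \neq 1, \\[4pt] \log c & \text{if } \gamma = 1, \end{cases} \tag{18}$$
as shown in Appendix B.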
Proposition 2.5 states that $c^*(a, z) \approx \alpha(z) a + b(z)$ for some function $b(z)$ when $a$ is large. This provides justification for linearly extrapolating the policy functions when computing them at high wealth levels.
Together, parts (1) and (2) of Proposition 2.5 imply the linear lower bound $c^*(a, z) \geq \alpha(z) a$, although they do not provide a concrete number for $\alpha(z)$. The following proposition establishes an explicit linear lower bound.
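Proposition 2.6 (Linear lower bound on consumption).
If there exists a nonnegative constant $\bar s$ such that
$$\bar s < 1 \quad \text{and} \quad u'(a) \geq \mathbb{E}_z\, \hat\beta \hat R\, u'\left( \hat R\, \bar s\, a \right) \text{ for all } (a, z) \in \mathsf{S}_0, \tag{19}$$
then $c^*(a, z) \geq (1 - \bar s)\, a$ for all $(a, z) \in \mathsf{S}_0$. [Footnote 16: We adopt the convention $0 \cdot \infty = 0$, so condition (19) does not rule out the case $\mathbb{P}\{R_t = 0\} > 0$. Indeed, as shown in the proofs, the conclusions still hold if we replace this condition by the weaker alternative $u'(a) \geq \mathbb{E}_z\, \hat\beta \hat R\, u'[\hat R\, \bar s\, a + (1 - \bar s) \hat Y]$ for all $(a, z) \in \mathsf{S}_0$.]
The second inequality in (19) restricts marginal utility derived from transferring wealth to the next period and then consuming versus consuming wealth today. The value $\bar s$ can be clarified once primitives are specified, as the next example illustrates.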
Example 2.2.
Suppose that utility is CRRA, as in (18). If we now take
$$\bar s := \max_{z \in \mathsf{Z}} \left( \mathbb{E}_z\, \hat\beta \hat R^{1-\gamma} \right)^{1/\gamma} \tag{20}$$
and $\bar s < 1$, then the conditions of Proposition 2.6 hold. In particular, the second inequality in (19) holds, as follows directly from the definition of $\bar s$ and $u$. In the case of Benhabib et al. (2015), where the discount rate is constant and returns are iid, the expression in (20) reduces to $(\beta\, \mathbb{E} R^{1-\gamma})^{1/\gamma}$. The requirement $\bar s < 1$ then reduces to $\beta\, \mathbb{E} R^{1-\gamma} < 1$, which is one of their assumptions (see Example 2.1).
3. Stationarity, Ergodicity, and Tail Behavior
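This section focuses on stationarity, ergodicity and tail behavior of wealth under the unique optimal policy $c^*$ obtained in Theorem 2.2. So that this policy exists, Assumptions 2.1–2.3 are always taken to be valid. We extend $c^*$ to $\mathsf{S} := \mathbb{R}_+ \times \mathsf{Z}$ by setting $c^*(0, z) = 0$ for all $z \in \mathsf{Z}$ and consider dynamics of $(a_t, Z_t)$ on $\mathsf{S}$, the law of motion for which is
$$a_{t+1} = R(Z_{t+1}, \zeta_{t+1}) \left[ a_t - c^*(a_t, Z_t) \right] + Y(Z_{t+1}, \eta_{t+1}), \qquad Z_{t+1} \sim P(Z_t, \cdot\,).$$
Let $Q$ be the joint stochastic kernel of $(a_t, Z_t)$ on $\mathsf{S}$. See Appendix A for this and related definitions.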
3.1. Stationarity
To obtain existence of a stationary distribution we need to restrict the asymptotic growth rate for asset returns defined in (3).
Assumption 3.1.
There exists a constant $\bar s$ such that (19) holds and $\bar s\, G_R < 1$.
Below is one straightforward example of a setting where this holds, with more complex applications deferred to Sections 4–5.
Example 3.1.
Assumption 3.1 holds in the setting of Benhabib et al. (2015). As shown in Example 2.2, with $\bar s := (\beta\, \mathbb{E} R^{1-\gamma})^{1/\gamma}$ and the assumptions of Benhabib et al. (2015) in force, the conditions of (19) hold. Moreover, in their iid setting we have $G_R = \mathbb{E} R$, so $\bar s\, G_R < 1$ reduces to $(\beta\, \mathbb{E} R^{1-\gamma})^{1/\gamma}\, \mathbb{E} R < 1$. This is one of their conditions, as discussed in Example 2.1.
By Proposition 2.6, the value $\bar s$ in Assumption 3.1 is an upper bound on the rate of savings. $G_R$ is an asymptotic growth rate for each unit of savings invested. If the product of these is less than one, then probability mass contained in the wealth distribution will not drift to $+\infty$, which allows us to obtain the following result. [Footnote 17: Assumption 3.1 is weaker than any restriction implying wealth is bounded from above, a common device for compactifying the state space and thereby obtaining a stationary distribution. Indeed, under many specifications of $\{R_t\}$ and $\{Y_t\}$ that fall within our framework, wealth of a given household can and will, over an infinite horizon, exceed any finite bound with probability one. See, for example, Benhabib et al. (2015), Proposition 6.]
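The product of the savings bound and the return growth rate can be evaluated numerically once primitives are fixed. The sketch below assumes CRRA utility, for which the savings bound is taken to be the largest value over states z of (E_z[beta' R'^(1-gamma)])^(1/gamma), and assumes returns and discount factors that depend only on a two-state Markov chain, so that the return growth rate is the spectral radius of P diag(R). All numbers and names are hypothetical.

```python
import numpy as np

# Hypothetical two-state primitives (assumptions for illustration only)
P = np.array([[0.8, 0.2], [0.2, 0.8]])   # transition matrix of the state
beta = np.array([0.90, 0.98])            # discount factor by state
R = np.array([1.00, 1.04])               # gross return by state
gamma = 2.0                              # CRRA coefficient

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

# Savings-rate bound under CRRA: max over z of
# (E_z[beta' * R'^(1 - gamma)])^(1 / gamma)
s_bar = np.max((P @ (beta * R ** (1.0 - gamma))) ** (1.0 / gamma))

# Asymptotic growth factor of mean returns: spectral radius of P diag(R)
G_R = spectral_radius(P * R[np.newaxis, :])

condition_holds = bool(s_bar < 1.0 and s_bar * G_R < 1.0)
print(f"s_bar = {s_bar:.4f}, G_R = {G_R:.4f}, product = {s_bar * G_R:.4f}")
```

Here the return growth factor exceeds one, so wealth invested and never consumed would grow without bound, yet the product with the savings bound is below one.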
Theorem 3.1 (Existence of a stationary distribution).
If Assumption 3.1 holds, then $Q$ admits at least one stationary distribution on $\mathsf{S}$.
Stationarity of the form obtained in Theorem 3.1 is required to establish existence of stationary recursive equilibria in heterogeneous agent models with idiosyncratic risk, such as Huggett (1993) or Aiyagari (1994). [Footnote 18: For models with aggregate shocks, such as Krusell and Smith (1998), a fully specified recursive equilibrium requires that households take the wealth distribution as one component of the state in their savings problem, and that stationarity holds for the entire joint distribution (defined over a product space encompassing both the wealth distribution and the exogenous state process). These problems fall outside the scope of Theorem 3.1, since $\{Z_t\}$ is finite-valued. For a careful treatment of stationary recursive equilibrium in Krusell–Smith type models, see Cao (2020).]
3.2. Ergodicity
While Assumption 3.1 implies existence of a stationary distribution, it is not in general sufficient for uniqueness or stability. For these additional properties to hold, we must impose sufficient mixing. In doing so, we consider the following two cases:
- (Y1) The support of is finite.
- (Y2) The process admits a density representation.
Condition (Y2) means that there exists a function from to such that
[TABLE]
for all Borel sets and all in .
**Assumption 3.2.**
There exists a in such that . Moreover, with defined as the greatest lower bound of the support of , either
- (Y1) holds and , or
- (Y2) holds and there exists a such that on .
Assumption 3.2 requires that there is a positive probability of receiving low labor income at some relatively persistent state of the world . This is a mixing condition that enforces social mobility. The reason is that is already assumed to be irreducible, so is eventually visited by each household. For any such household, there is a positive probability of low labor income over a long period. Wealth then declines. In other words, currently rich households or dynasties will not be rich forever. This guarantees sufficient social mobility between rich and poor, generating ergodicity.
To state our uniqueness and stability results, let be the -step stochastic kernel, let be total variation norm and let , where is a constant to be defined in the proof. For any integrable real-valued function on , let
[TABLE]
and
[TABLE]
where, here and in the theorem below, indicates expectation under stationarity.
**Theorem 3.2 (Uniqueness, stability, ergodicity and mixing).**
If Assumptions 3.1 and 3.2 hold, then
- (1) the stationary distribution of is unique and there exist constants and such that
[TABLE]
- (2) For all and real-valued function on such that ,
[TABLE]
- (3) is -geometrically mixing. Moreover, if and is bounded,
[TABLE]
Part 1 of Theorem 3.2 states that the stationary distribution is unique and asymptotically attracting at a geometric rate. Part 2 states that the state process is ergodic, and hence long-run sample moments for individual households coincide with cross-sectional moments. The notion of mixing discussed in Part 3 is defined in the appendix. It states that social mobility holds asymptotically and mixing occurs at a geometric rate, although the rate may be arbitrarily slow. This mixing is enough to provide a Central Limit Theorem for the state process, which is the second claim in Part 3.
3.3. Tail Behavior
Having established the stationarity and ergodicity of wealth, we now study the tail behavior of the wealth distribution. We show that the wealth distribution is either bounded or (unbounded and) heavy-tailed under mild conditions. To prove this result we introduce the following assumption.
**Assumption 3.3.**
The assumptions of Proposition 2.5 are satisfied, so the optimal policy is concave and asymptotically linear: . Furthermore, there exists such that and
[TABLE]
**Remark 3.1.**
Condition (23) implies that wealth grows with nonzero probability when it is large. Indeed, using the law of motion (21a) and noting that , if , then by (23) we have
[TABLE]
with positive probability if is large enough.
To state our result on tail behavior, we introduce the following notation. For any nonnegative function , define the matrix-valued function by
[TABLE]
Elements of are conditional moment generating functions of . In the statement below, denotes the Hadamard (entry-wise) product, and returns the spectral radius of a matrix. Also is a random variable with distribution .
**Theorem 3.3 (Tail behavior).**
Let Assumptions 3.1–3.3 hold and define
[TABLE]
Then is convex in . Assume that there exists in the interior of the domain of such that and let
[TABLE]
If has unbounded support, then it is heavy-tailed. In particular, for any ,
[TABLE]
**Remark 3.2.**
The assumption for some is weak. Because the -th element of is
[TABLE]
by the definition of in (25a) and condition (23), we always have as . Hence there exists such that if, for example, has compact support.
Condition (27) implies that for any , there exists a constant such that
[TABLE]
for large enough , so the upper tail of the wealth distribution is at least Pareto.
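The exponent in Theorem 3.3 is defined through the spectral radius of a Hadamard product of the transition matrix with a matrix of conditional moment generating functions. A minimal sketch of how such an exponent might be computed is given below. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two-state chain `P`, the stand-in variable `X` (conditionally Gaussian given the next state, so that its moment generating function has a closed form), and the parameter vectors `mu` and `sigma`.

```python
import numpy as np

def lam(s, P, mu, sigma):
    """Spectral radius of the Hadamard product P * M(s), where
    M(s)[z, z'] = E[exp(s * X) | next state z'] and X ~ N(mu[z'], sigma[z']**2).
    The Gaussian form of X is an illustrative assumption."""
    mgf = np.exp(s * mu + 0.5 * (s * sigma) ** 2)
    M = np.tile(mgf, (len(mu), 1))          # same row for every current state
    return np.max(np.abs(np.linalg.eigvals(P * M)))

def tail_exponent(P, mu, sigma, tol=1e-10):
    """inf{s > 0 : lam(s) > 1}, located by bracketing and bisection.
    Relies on lam(0) = 1 and convexity of lam, so {lam <= 1} is an interval."""
    hi = 1.0
    while lam(hi, P, mu, sigma) <= 1.0:     # expand until lam exceeds one
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("lam never exceeds one on the search range")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam(mid, P, mu, sigma) > 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# One-state (iid) case: lam(s) = exp(s*mu + s^2*sigma^2/2), root at -2*mu/sigma^2.
kappa_iid = tail_exponent(np.array([[1.0]]), np.array([-0.03]), np.array([0.2]))

# Hypothetical persistent two-state case.
P2 = np.array([[0.9, 0.1],
               [0.1, 0.9]])
mu2 = np.array([-0.05, 0.01])
sigma2 = np.array([0.1, 0.3])
kappa_markov = tail_exponent(P2, mu2, sigma2)
```

In the one-state case the root can be checked by hand, which gives a simple sanity test for the bisection before moving to a Markov specification.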
**Remark 3.3.**
Toda (2019) constructs an example of a Huggett (1993) economy with Pareto-tailed wealth distribution when discount factors are random. Theorem 3.3 is significantly more general as we allow for stochastic returns and income. Stachurski and Toda (2019) prove that with a constant discount factor, a constant asset return, and light-tailed income, the wealth distribution is always light-tailed. Theorem 3.3 shows that sufficient heterogeneity in discount factors or returns generates heavy tails.
**Example 3.2.**
The CRRA-iid setting of Benhabib et al. (2015) satisfies the assumptions of Theorem 3.3. When utility is CRRA, by Proposition 5 of Benhabib et al. (2015), condition (23) holds if with positive probability, where is given in Example 2.2. In the iid case, this condition reduces to , which holds under the conditions of Benhabib et al. (2015). [Footnote 19: Benhabib et al. (2015) assume that , so it suffices to show that or, equivalently, . By Jensen’s inequality and their restriction , the last bound is true whenever . But this must hold because, under their conditions, we have , as shown in Example 2.1.] Thus, Assumption 3.3 holds. The existence of with follows from Remark 3.2 and the assumption that has compact support.
4. Testing the Growth Conditions
The three key conditions in the paper are the restrictions on the growth rates , and , with the first two required for optimality and the last for stationarity (see Assumptions 2.1, 2.2 and 3.1 respectively). In this section we explore the restrictions implied by these conditions. We begin with the following result, which yields a straightforward method for computing these growth rates.
**Lemma 4.1 (Long-run growth rates and spectral radii).**
Let , where is a nonnegative measurable function and is an iid sequence with marginal distribution . In this setting we have
[TABLE]
and is the spectral radius of the matrix defined by
[TABLE]
The matrix is expressed as a function on in (29) but can be represented in traditional matrix notation by enumerating . [Footnote 20: Specifically, if , then where is, as before, the transition matrix for the exogenous state, and when .] In what follows, , and are defined analogously to .
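As a numerical illustration of Lemma 4.1, the sketch below builds the matrix as the transition matrix times a diagonal matrix of conditional means, and compares its spectral radius with the n-th root of the n-step expected product (computed in the sup norm). The two-state chain `P`, the vector of conditional means `phi_mean`, and the function names are hypothetical choices made for this example only.

```python
import numpy as np

def growth_rate(P, phi_mean):
    """Spectral radius of L = P @ diag(phi_mean), where phi_mean[j] is the
    mean of phi(z_j, eps) over the innovation eps."""
    L = P @ np.diag(phi_mean)
    return np.max(np.abs(np.linalg.eigvals(L)))

def n_step_root(P, phi_mean, n):
    """(max_z E_z[phi_1 * ... * phi_n])**(1/n), computed as ||L^n 1||_inf**(1/n)."""
    L = P @ np.diag(phi_mean)
    v = np.ones(len(phi_mean))
    for _ in range(n):
        v = L @ v                     # v[z] = E_z of the product so far
    return np.max(v) ** (1.0 / n)

# Hypothetical two-state chain and conditional means.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
phi_mean = np.array([0.92, 1.02])

rho = growth_rate(P, phi_mean)
approx = n_step_root(P, phi_mean, 500)
```

The two numbers should agree up to a factor that vanishes as `n` grows, which is the content of the limit in the lemma.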
What factors determine the long-run average growth rates embedded in our assumptions, such as or ? Lemma 4.1 tells us how to compute these values for a given specification of dynamics, but how should we understand them intuitively and what factors determine their size? To address these questions, let us consider an AR(1) discount factor process, which has been adopted in several recent quantitative studies (see, e.g., Hubmer et al. (2018) or Hills and Nakata (2018)). In particular, suppose that the state process follows a discretized version of
[TABLE]
and . (The discretization implies that is always positive.) To simplify interpretation, the process (30) is structured so that the stationary distribution of is . We use the method of Rouwenhorst (1995) to discretize and then calculate using Lemma 4.1, studying how is affected by the parameters in (30).
Since for all , the structure of (30) implies that is the long-run unconditional mean of . It can therefore be set to a standard calibrated value for the discount factor, such as from Krusell and Smith (1998). What we wish to understand is how the remaining parameters and affect the value of . While no closed-form expression is available in this case, Figure 1 sheds some light by providing a contour plot of over a set of pairs. The figure shows that grows with both the persistence term and volatility term . In particular, the condition fails when the persistence and volatility of the discount factor process are sufficiently high. This is because is the limit of and, for positive random variables, sequences of large outcomes have a strong compounding effect on their product. High volatility and high persistence reinforce this effect.
This discussion has focused on but similar intuition applies to both and . If and are both increasing functions of the state process, then these asymptotic growth rates also increase with greater persistence and volatility in the state process, as well as higher unconditional mean. The next section further illustrates these points.
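The computation described in this section can be sketched as follows. The code assumes that (30) is a Gaussian AR(1) parameterized by its stationary mean `mu`, stationary standard deviation `sigma` and autocorrelation `rho`, and that the discount factor equals the discretized state. The default `mu=0.96`, the grid size `n=11` and the function names are illustrative assumptions, not the paper's calibration.

```python
import numpy as np

def rouwenhorst(n, rho, sigma, mu=0.0):
    """Rouwenhorst discretization of an AR(1) with stationary mean mu,
    stationary standard deviation sigma and autocorrelation rho."""
    p = (1.0 + rho) / 2.0
    theta = np.array([[p, 1.0 - p],
                      [1.0 - p, p]])
    for m in range(3, n + 1):
        new = np.zeros((m, m))
        new[:-1, :-1] += p * theta
        new[:-1, 1:] += (1.0 - p) * theta
        new[1:, :-1] += (1.0 - p) * theta
        new[1:, 1:] += p * theta
        new[1:-1, :] /= 2.0           # interior rows were counted twice
        theta = new
    half_width = np.sqrt(n - 1) * sigma
    grid = np.linspace(mu - half_width, mu + half_width, n)
    return grid, theta

def discount_growth_rate(rho, sigma, mu=0.96, n=11):
    """Spectral radius of P @ diag(beta), with beta_t equal to the state."""
    beta_grid, P = rouwenhorst(n, rho, sigma, mu)
    if beta_grid.min() <= 0:
        raise ValueError("grid must keep the discount factor positive")
    L = P @ np.diag(beta_grid)
    return np.max(np.abs(np.linalg.eigvals(L)))

g_iid = discount_growth_rate(rho=0.0, sigma=0.02)         # no persistence
g_persistent = discount_growth_rate(rho=0.9, sigma=0.02)  # higher persistence
g_volatile = discount_growth_rate(rho=0.9, sigma=0.03)    # higher volatility
```

With zero persistence the growth rate collapses to the unconditional mean, and it rises as either persistence or volatility increases, in line with the pattern described for Figure 1.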
5. Application: Stochastic Volatility and Mean Persistence
We showed in Examples 2.1, 2.2 and 3.1 that, in the setting of Benhabib et al. (2015), where the discount factor is constant and returns and labor income are iid, Assumptions 2.1–2.3 and Assumption 3.1 are all satisfied. Hence, by Theorems 2.2 and 3.1, the household optimization problem has a unique optimal policy and the wealth process under this policy has a stationary solution. If, in addition, the support of is finite or has a positive density, say, then the conditions of Theorem 3.2 also hold and the stationary solution is ergodic, geometrically mixing and its time series averages are asymptotically normal.
Let us now bring the model closer to the data by relaxing the iid restrictions on financial and non-financial returns, introducing both mean persistence and time-varying volatility in returns on assets. [Footnote 21: The importance of these features for wealth dynamics was highlighted in Fagereng et al. (2016a).] In particular, we set
[TABLE]
where is iid and standard normal and and are finite-state Markov chains, discretized from
[TABLE]
Innovations are iid and standard normal. Using the data in Fagereng et al. (2016b) on Norwegian financial returns over 1993–2003, we estimate these AR(1) models to obtain , , , , and . Based on this calibration, the stationary mean and standard deviation of are around and , respectively.
To distinguish the effects of stochastic volatility and mean persistence, we consider two subsidiary models. The first reduces to its stationary mean , while the second reduces to its stationary mean . In summary,
[TABLE]
We set and . To test the stability properties of Model I, we explore a neighborhood of the calibrated values, while in Model II, we do likewise for pairs. In each scenario, other parameters are fixed to the benchmark. The results are shown in Figures 2 and 3.
In part (a) of each figure, we see that is increasing in the persistence and volatility parameters of the state process. The intuition behind this feature was explained in Section 4 for the case of and is similar here. (Note that in the present case, since is a constant, so has the same shape as in terms of contours.) The dots in the figures show that at the estimated parameter values.
Part (b) of each figure shows the set of parameters under which the model is globally stable and ergodic. The stability threshold is the boundary of the set of parameter pairs that produce , where is given by (20). For such pairs, Assumptions 2.2 and 3.1 both hold, so the conditions of Theorems 3.1–3.2 are satisfied. (We are continuing to suppose that is finite or has a positive density, so that Assumption 3.2 holds. Assumptions 2.1 and 2.3 are always valid in the current setting.) Observe that the estimated parameter values (the dots) lie inside the stable set.
6. Conclusion
We studied an updated version of the income fluctuation problem, the “common ancestor” of modern macroeconomic theory (Ljungqvist and Sargent (2012), p. 3). Working in a setting where returns on financial assets, non-financial income and impatience are all state dependent and fluctuate over time, we obtained conditions under which the household savings problem has a unique solution that can be computed by successive approximations and the wealth process under the optimal savings policy has a unique stationary distribution with Pareto right tail. We also obtained conditions under which wealth is ergodic and exhibits geometric mixing and asymptotic normality. We investigated the nature of our conditions and provided methods for testing them in applications. While our work was motivated by the desire to better understand the joint distribution of income and wealth, the income fluctuation problem also has applications in asset pricing, life-cycle choice, fiscal policy, monetary policy, optimal taxation, and social security. The ideas contained in this paper should be helpful for those fields after suitable modifications or extensions.
Appendix A Preliminaries
Given a topological space , let be the Borel -algebra and be the probability measures on . A stochastic kernel on is a map such that is -measurable for each and is a probability measure on for each . For all , and , we define and . Furthermore, for all , let . is called Feller if is continuous on whenever is bounded and continuous on . We call stationary for if .
A sequence is called tight if, for all , there exists a compact such that for all . A stochastic kernel is called bounded in probability if the sequence is tight for all . Given , we define the total variation norm . Given any measurable map , we say that is -geometrically mixing if there exist constants and such that, for all and , the corresponding Markov process satisfies .
Below we use to denote a fixed probability space on which all random variables are defined. is expectations with respect to . The state process and the innovation processes , and introduced in (5) live on this space. In what follows, is a stationary version of the chain, where is drawn from its unique stationary distribution—henceforth denoted . The marginal distributions of the innovations are denoted by , and respectively. We let be the natural filtration generated by and the three innovation processes. conditions on and is expectation under .
We first prove Lemma 4.1, since its implications will be used immediately below. In the proof, we consider the matrix as a linear operator on and identify vectors in with real-valued functions on .
Proof of Lemma 4.1.
A proof by induction confirms that, for any function ,
[TABLE]
where is the -th composition of the operator with itself (or, equivalently, the -th power of the matrix ). The positivity of and Theorem 9.1 of Krasnosel’skii et al. (2012) imply that when is any norm on and is everywhere positive on . With and , this becomes
[TABLE]
where the second equality is due to (32) and and the third is by the law of iterated expectations. ∎
**Lemma A.1.**
Let and be as defined in Lemma 4.1. If , then there exists an in and a such that whenever .
Proof.
Recalling from the proof of Lemma 4.1 that when is any norm on and is everywhere positive on , we can again take but now switch to , so that (33) becomes
[TABLE]
Since and , the claim in Lemma A.1 now follows. ∎
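To see the kind of bound Lemma A.1 delivers, the following sketch searches for the first horizon at which the n-step expected product falls below one in every state, for a hypothetical specification in which the one-step conditional mean exceeds one in some state but the spectral radius is below one. The chain `P`, the means `phi_mean` and the function name are assumptions made for illustration.

```python
import numpy as np

def first_contracting_horizon(P, phi_mean, max_n=10_000):
    """Return the smallest n with max_z E_z[phi_1 * ... * phi_n] < 1, together
    with that maximum, or None if no such n <= max_n is found."""
    L = P @ np.diag(phi_mean)
    v = np.ones(len(phi_mean))      # v[z] tracks E_z[phi_1 * ... * phi_n]
    for n in range(1, max_n + 1):
        v = L @ v
        if v.max() < 1.0:
            return n, v.max()
    return None

# Hypothetical two-state example.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
phi_mean = np.array([0.90, 1.05])   # exceeds one in the second state

spectral_radius = np.max(np.abs(np.linalg.eigvals(P @ np.diag(phi_mean))))
horizon = first_contracting_horizon(P, phi_mean)
```

This is the reason the contraction argument later in the appendix works with an n-step operator rather than a single step: a one-step bound can fail even when the long-run growth rate is below one.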
Appendix B Proof of Section 2 Results
Proof of Proposition 2.1.
Pick any and . Since for all is dominated by a feasible consumption path, monotonicity of and the law of iterated expectations give
[TABLE]
where and the monotone convergence theorem has been employed to pass the expectation through the sum. In view of (32) and , we then have
[TABLE]
By the assumed almost sure positivity of and the irreducibility of , the matrix is irreducible. Hence, by the Perron–Frobenius theorem, we can choose an everywhere positive eigenfunction such that . By the everywhere positivity of , the function is everywhere positive on , and hence we can choose such that is less than pointwise on . We then have
[TABLE]
By Lemma 4.1 we know that , and since and are positive, this expression is infinite. Returning to (35), we see that the value function is infinite at our arbitrarily chosen pair . ∎
For the rest of this section we suppose that Assumptions 2.1–2.3 hold.
**Lemma B.1.**
* and , are finite, as are the constants and .*
Proof.
That and are finite follows directly from Lemma A.1, with and respectively. Regarding , Assumption 2.3 states that . By the Law of Iterated Expectations, we can write this as . As is irreducible, we know that is positive everywhere on . Hence, must hold. The proof of is similar. ∎
**Lemma B.2.**
For the maximal asset path defined by
[TABLE]
we have, for each , that .
Proof.
Iterating backward on (36), we can show that . Taking expectation yields
[TABLE]
Then the Monotone Convergence Theorem and the Markov property imply that
[TABLE]
By Lemma B.1, we now have, for all ,
[TABLE]
Applying Lemma B.1 again gives , as was to be shown. ∎
**Proposition B.1.**
The value in (8) is well-defined in .
Proof.
By the assumptions on the utility function, there exists a constant such that , and hence . The last term is finite by Lemma A.1. ∎
Proof of Theorem 2.1.
The proof is a long but relatively straightforward extension of Theorem 1 of Benhabib et al. (2015) and thus omitted. A full proof is available from the authors upon request. ∎
**Proposition B.2.**
* is a complete metric space.*
Proof.
The proof is a straightforward extension of Proposition 4.1 of Li and Stachurski (2014) and thus omitted. A full proof is available from the authors upon request. ∎
Proof of Proposition 2.2.
Let be a policy in satisfying (10). To show that any asset path generated by satisfies the transversality condition (11), observe that, by condition (12), we have
[TABLE]
[TABLE]
Regarding the first term on the right hand side of (38), fix and observe that
[TABLE]
with probability one, where is the maximal path defined in (36). We then have
[TABLE]
By Lemma B.1, we have
[TABLE]
and the last expression converges to zero as by Lemma A.1. The second term in (39) also converges to zero by Lemma B.2. Hence as , which, combined with (38) and another application of Lemma B.2, gives our desired result. ∎
**Proposition B.3.**
For all and , there exists a unique that solves (14).
Proof.
Fix and . Because , the map is increasing. Since is strictly decreasing, the equation (14) can have at most one solution. Hence uniqueness holds.
Existence follows from the intermediate value theorem provided we can show that
- (a) is a continuous function,
- (b) such that , and
- (c) such that .
For part (a), it suffices to show that
[TABLE]
is continuous on . To this end, fix and . By (37) we have
[TABLE]
The last term is integrable, as follows easily from Lemma B.1. Hence the dominated convergence theorem applies. From this fact and the continuity of , we obtain . Hence, is continuous.
Part (b) clearly holds, since as and is increasing and always finite (since it is continuous as shown in the previous paragraph). Part (c) is also trivial (just set ). ∎
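The existence and uniqueness argument above suggests solving for the updated consumption value by bisection. The sketch below does this under stated assumptions: equation (14), which is not reproduced in this section, is taken to have the Euler form u'(ξ) = max{E[β R u'(c(R(a − ξ) + Y))], u'(a)}; utility is CRRA with parameter `gamma`; shocks are iid with finitely many outcomes; and the candidate policy `c` consumes all assets. All names and numbers are hypothetical.

```python
import numpy as np

def euler_residual(xi, a, c, beta, R, Y, prob, gamma):
    """u'(xi) - max{E[beta * R * u'(c(R * (a - xi) + Y))], u'(a)} with
    CRRA marginal utility u'(x) = x**(-gamma). Strictly decreasing in xi."""
    next_assets = R * (a - xi) + Y          # one entry per income outcome
    expected = np.sum(prob * beta * R * c(next_assets) ** (-gamma))
    return xi ** (-gamma) - max(expected, a ** (-gamma))

def solve_xi(a, c, beta, R, Y, prob, gamma, tol=1e-10):
    """Bisection on (0, a]: the residual is positive near 0 and <= 0 at a."""
    lo, hi = 1e-12, a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if euler_residual(mid, a, c, beta, R, Y, prob, gamma) > 0:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical primitives: two equally likely income outcomes.
beta, R, gamma = 0.96, 1.0, 2.0
Y = np.array([0.5, 1.5])
prob = np.array([0.5, 0.5])
consume_all = lambda x: x                   # candidate policy c(a) = a

xi_poor = solve_xi(0.5, consume_all, beta, R, Y, prob, gamma)   # constraint binds
xi_rich = solve_xi(5.0, consume_all, beta, R, Y, prob, gamma)   # interior solution
```

At low assets the maximum is attained by u'(a), so the solution equals total assets and the borrowing constraint binds. At high assets the solution is interior and satisfies the Euler equation with equality.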
**Proposition B.4.**
We have for all .
Proof.
Fix and let .
Step 1. We show that is continuous. To apply a standard fixed point parametric continuity result such as Theorem B.1.4 of Stachurski (2009), we first show that is jointly continuous on the set defined in (15). This will be true if is jointly continuous on . For any and in with , we need to show that . To that end, we define
[TABLE]
where , and as defined in (5). Then and are continuous in by the continuity of and nonnegative by (40).
By Fatou’s lemma and Theorem 1.1 of Feinberg et al. (2014),
[TABLE]
This implies that
[TABLE]
The function is then continuous, since the above inequality is equivalent to the statement . Hence, is continuous on , as was to be shown. Moreover, since takes values in the closed interval , and the correspondence is nonempty, compact-valued and continuous, Theorem B.1.4 of Stachurski (2009) then implies that is continuous on .
Step 2. We show that is increasing in . Suppose that for some and with , we have . Since is increasing in by assumption, is increasing in and decreasing in . Then . This is a contradiction.
Step 3. We have shown in Proposition B.3 that for all .
Step 4. We show that . Since , we have
[TABLE]
for all . The right hand side is easily shown to be finite via Lemma B.1. ∎
To prove Theorem 2.2, let be all continuous functions that is decreasing in its first argument and is bounded and nonnegative. Given , let be the function mapping into the that solves
[TABLE]
Moreover, consider the bijection defined by .
**Lemma B.3.**
The operator and satisfies on .
Proof.
Pick any and . Let , then solves
[TABLE]
We need to show that and evaluate to the same number at . In other words, we need to show that is the solution to
[TABLE]
But this is immediate from (42). Hence, we have shown that on . Since is a bijection, we have . Since in addition by Proposition B.4, we have . This concludes the proof. ∎
**Lemma B.4.**
* is order preserving on . That is, for all with .*
Proof.
Let be functions in with . Suppose to the contrary that there exists such that . Since functions in are decreasing in the first argument, we have
[TABLE]
This is a contradiction. Hence, is order preserving. ∎
**Lemma B.5.**
There exists an and such that is a contraction mapping of modulus on .
Proof.
Since is order preserving and is closed under the addition of nonnegative constants, based on Blackwell (1965), it remains to verify the existence of and such that for all and . By Lemma A.1 and Assumption 2.2, it suffices to show that for all and , we have
[TABLE]
Fix , , and let . By the definition of , we have
[TABLE]
Here, the first inequality is elementary and the second is due to the fact that and is order preserving. Hence, and (43) holds for . Suppose (43) holds for arbitrary . It remains to show that it holds for . For , define . By the induction hypothesis, the monotonicity of and the Markov property,
[TABLE]
Hence, (43) is verified by induction. This concludes the proof. ∎
Proof of Theorem 2.2.
Let and be as in Lemma B.5. In view of Propositions 2.2, B.2 and B.4, to show that is a contraction and verify claims (1)–(3) of Theorem 2.2, based on the Banach contraction mapping theorem, it suffices to show that for all . To this end, pick any . Note that the topological conjugacy result established in Lemma B.3 implies that . Hence, and . By the definition of and the contraction property established in Lemma B.5,
[TABLE]
Hence, is a contraction and claims (1)–(3) are verified. ∎
Our next goal is to prove Proposition 2.3. To begin with, we define
[TABLE]
**Lemma B.6.**
* is a closed subset of , and for all .*
Proof.
To see that is closed, for a given sequence in and with , we need to show that . This obviously holds since is increasing for all , and, in addition, implies that for all .
Fix . We now show that . Since by Proposition B.4, it remains to show that is increasing. Suppose the claim is false, then there exist and such that and . Since , and by Proposition B.4, we have and . However, based on the property of the time iteration operator, we then have
[TABLE]
which implies that . This is a contradiction. Hence, is increasing, and is a self-map on . ∎
Proof of Proposition 2.3.
Since maps elements of the closed subset into itself by Lemma B.6, Theorem 2.2 implies that . Hence, the stated claims hold. ∎
Proof of Proposition 2.4.
Let be the time iteration operator for the income process established in Proposition B.4. It suffices to show for all . To see this, note that by Lemma B.4, we have whenever . Therefore if for all , we obtain . Iterating this starting from any , by Theorem 2.2, it follows that , completing the proof.
To show that for any , take any and define . To show , suppose on the contrary that . Since is increasing in and (hence is decreasing), it follows from the definition of the time iteration operator in (14)–(16), , and the monotonicity of that
[TABLE]
which is a contradiction. ∎
To prove Proposition 2.5, we need several lemmas.
**Lemma B.7.**
For all , there exists a threshold such that if and only if . In particular, there exists a threshold such that if and only if .
Proof.
Recall that, for all , solves
[TABLE]
For each and , define
[TABLE]
To prove the first claim, by Lemma B.6, it suffices to show that implies . This obviously holds since in view of (44), the former implies that
[TABLE]
which then yields . The second claim follows immediately from the first claim and the fact that is the unique fixed point of in . ∎
Consider a subset defined by .
**Lemma B.8.**
* is a closed subset of and , and, for all .*
Proof.
The first claim is immediate because limits of concave functions are concave. To prove the second claim, fix . We have by Lemma B.6. It remains to show that is concave for all . Given , Lemma B.7 implies that for and that for . Since in addition is continuous and increasing, to show the concavity of with respect to , it suffices to show that is concave on .
Suppose there exist some , , and such that
[TABLE]
Let , where . Then by Lemma B.7 and noting that consumption is interior, we have
[TABLE]
Using condition (17) then yields
[TABLE]
which contradicts (46). Hence, is concave for all . ∎
Proof of Proposition 2.5.
By Theorem 2.2, is a contraction mapping with unique fixed point . Since is a closed subset of and by Lemma B.8, we know that . The first claim is verified. Regarding the second claim, note that implies that is increasing and concave for all . Hence, is a decreasing function for all . Since for all , is well-defined and . ∎
Proof of Remark 2.1.
For each in concave in its first argument, let , where . Then is concave. Based on the generalized Minkowski’s inequality (see, e.g., Hardy et al. (1952), page 146, theorem 198), we have
[TABLE]
Since , the above inequality implies that condition (17) holds. ∎
To prove Proposition 2.6, let be as in (19) and define
[TABLE]
**Lemma B.9.**
* is a closed subset of , and for all .*
Proof.
To see that is closed, for a given sequence in and with , we need to verify that . This obviously holds since for all and , and, on the other hand, implies that for all .
We next show that is a self-map on . Fix . We have since is a self-map on . It remains to show that satisfies for all . Suppose for some . Then
[TABLE]
Since and , this implies that
[TABLE]
which contradicts (19) since . As a result, for all and we conclude that . ∎
Proof of Proposition 2.6.
We have shown in Theorem 2.2 that is a contraction mapping on the complete metric space , with unique fixed point . Since in addition is a closed subset of and by Lemma B.9, we know that . The stated claim is verified. ∎
Appendix C Proof of Section 3 Results
As before, Assumptions 2.1–2.3 are in force. Notice that Assumption 2.2, Assumption 3.1 and Lemma A.1 imply existence of an in such that
[TABLE]
**Lemma C.1.**
For all , we have .
Proof.
Since , Proposition 2.6 implies that for all . For all , we have in general, where the integers and . Using these facts and (2.1), we have:
[TABLE]
with probability one. Taking expectations of the above while noting that by Assumption 3.1 and Lemma A.1, we have
[TABLE]
for all and . Here we have used in Lemma B.1 and the Markov property. Hence, for all , as was claimed. ∎
A function is called norm-like if all its sublevel sets (i.e., sets of the form ) are precompact in (i.e., any sequence in a given sublevel set has a subsequence that converges to a point of ).
Proof of Theorem 3.1.
Based on Lemma D.5.3 of Meyn and Tweedie (2009), a stochastic kernel is bounded in probability if and only if for all , there exists a norm-like function such that the -Markov process satisfies . Fix . Since is finite, is bounded in probability. Hence, there exists a norm-like function such that . Then defined by is a norm-like function on . The stochastic kernel is then bounded in probability since Lemma C.1 implies that . Regarding existence of stationary distribution, since is Feller (due to the finiteness of ), whenever , the product measure satisfies
[TABLE]
Since in addition is continuous, a simple application of the generalized Fatou’s lemma of Feinberg et al. (2014) (Theorem 1.1) shows that the stochastic kernel is Feller. Moreover, since is bounded in probability, based on the Krylov-Bogolubov theorem (see, e.g., Meyn and Tweedie (2009), Proposition 12.1.3 and Lemma D.5.3), admits at least one stationary distribution. ∎
**Lemma C.2.**
The borrowing constraint binds in finite time with positive probability. That is, for all , we have .
Proof.
The claim holds trivially when . Suppose the claim does not hold on (recall that ), then for some , i.e., the borrowing constraint never binds with probability one. Hence,
[TABLE]
for all . Then we have
[TABLE]
for all . Let and be defined by (48). Let . Based on the Markov property and Lemma B.1, as ,
[TABLE]
Similarly, as ,
[TABLE]
Letting , (C) then implies that , which contradicts the fact that . Thus, we must have for all . ∎
Our next goal is to prove Theorem 3.2. In the proofs we apply the theory of Meyn and Tweedie (2009). Important definitions (with their locations in the textbook) include: -irreducibility (Section 4.2), small set (page 102), strong aperiodicity (page 114), petite set (page 117), Harris chain (page 199), and positivity (page 230).
Recall that paired with its Euclidean topology is a second countable topological space (i.e., its topology has a countable base). Since and are respectively Borel subsets of and paired with the relative topologies, they are also second countable. Hence, satisfies (see, e.g., page 149, Theorem 4.44 of Aliprantis and Border (2006)). Recall (22). With slight abuse of notation, in proofs, we use to denote the density of in both cases (Y1) and (Y2) and write , where is the related measure. Specifically, is the Lebesgue measure when (Y2) holds. Moreover, let be the counting measure.
Recall and the greatest lower bound of the support of given by Assumption 3.2. Let . Then by Assumption 3.2.
**Lemma C.3.**
* for all .*
Proof.
Fix . If , the claim holds trivially by Lemma B.7. Now consider the case . Suppose . Then, by De Morgan’s law, we have
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Note that the set can be written as
[TABLE]
[TABLE]
Assumption 3.2 then implies that, for all ,
[TABLE]
Let and be defined by (48) and let . Similar to the proof of Lemma B.7, we can show that, with probability ,
[TABLE]
for some constant . Since and , Lemma B.1 implies that there exists such that
[TABLE]
As a result, we have with probability . This is a contradiction. Hence the stated claim is verified. ∎
Let be defined such that at .
**Lemma C.4.**
Let be an integrable map such that is decreasing for all . Then, for all and , the map is decreasing.
Proof.
Fix . When , (21a) implies that
[TABLE]
Since is decreasing, and by Proposition 2.3 and (21a), the optimal asset accumulation path is increasing in with probability one, we know that is decreasing for all . Thus, is decreasing. The claim holds for . Suppose this claim holds for arbitrary , it remains to show that it holds for . Note that
[TABLE]
Since is decreasing for all , based on the induction argument, is decreasing. The stated claim then follows. ∎
**Lemma C.5.**
The Markov process is -irreducible.
Proof.
Recall given by Assumption 3.2. Let be defined by if (Y1) holds and if (Y2) holds. We define the measure on by for . Clearly is a nontrivial measure. In particular, as is the counting measure. Moreover, since is the greatest lower bound of the support of , it must be the case that if (Y1) holds and that if (Y2) holds. As a result, when (Y1) holds and when (Y2) holds.
We first show that is -irreducible. Let be an element of such that . Fix . We need to show that visits set in finite time with positive probability.
Since is irreducible, for some integer . By Lemma C.1, there exists such that . By Lemma C.3, there exists such that . Lemma B.7 and Lemma C.4 then imply that for all . Hence, for and , we have
[TABLE]
based on the Markov property. By (21a), we have
[TABLE]
Note that, by Assumption 3.2, whenever . Since in addition , we have
[TABLE]
Let . Then (50) and (C) imply that
[TABLE]
Therefore, we have shown that any measurable subset with positive measure can be reached in finite time with positive probability, i.e., is -irreducible. Based on Proposition 4.2.2 of Meyn and Tweedie (2009), there exists a maximal probability measure on such that is -irreducible. ∎
Lemma C.6.
Let the function be defined as in (45). Then if (Y1) holds, while if (Y2) holds.
Proof.
Suppose (Y1) holds and . Then, by Lemma B.7, for all ,
[TABLE]
Hence, for all and ,
[TABLE]
where the last equality follows from (21a), which implies that with probability one. This contradicts Lemma C.3.
Suppose (Y2) holds and . By definition, for all and . Since with probability one, we have for all and . Via similar analysis to (C), Lemma B.7 implies that for all . Hence, for all and , we have . Again, this contradicts Lemma C.3. ∎
Lemma C.7.
The Markov process is strongly aperiodic.
Proof.
By the definition of strong aperiodicity, we need to show that there exists a -small set with , i.e., there exists a nontrivial measure on and a subset such that and
[TABLE]
For given by Assumption 3.2, let and let if (Y1) holds and if (Y2) holds. We now show that satisfies the above conditions. Define and note that on . Define the measure on by . If (Y1) holds, then as shown above, and, if (Y2) holds, Lemma C.6 implies that . Since in addition , it always holds that . Moreover, since on , we have and is a nontrivial measure.
For all and , Lemma B.7 implies that
[TABLE]
Hence, satisfies (53) and is strongly aperiodic. ∎
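The proof above exhibits a set and a nontrivial measure such that the one-step transition probability from every point of the set dominates that measure, with the measure charging the set itself. A minimal finite-state sketch of this one-step minorization is given below. The chain `P`, the set `C`, and the helper `minorizing_measure` are hypothetical and chosen only for illustration; the paper's state space is not finite.

```python
import numpy as np

def minorizing_measure(P, C):
    """Return nu with nu[j] = min over i in C of P[i, j].

    By construction P[i, :] >= nu for every i in C, so C is a one-step
    small set whenever nu has positive total mass.
    """
    return P[list(C), :].min(axis=0)

# Hypothetical three-state transition matrix.
P = np.array([[0.6, 0.4, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.5, 0.5]])

C = [0, 1]                      # candidate small set
nu = minorizing_measure(P, C)   # componentwise lower bound on rows 0 and 1
mass_on_C = nu[C].sum()         # strong aperiodicity needs this to be positive
```

Here the lower bound places positive mass on `C` itself, which is the finite-state analogue of the condition verified in the proof.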
Lemma C.8.
The set is a petite set for all .
Proof.
Fix and . Let . By Lemma C.3,
[TABLE]
We start by showing that there exists a nontrivial measure on such that
[TABLE]
In other words, is a -small set. Fix . For all , define
[TABLE]
Note that for all , Lemma B.7 implies that
[TABLE]
Since is decreasing for all , by Lemma C.4,
[TABLE]
Note that is a nontrivial measure on since (54) implies that . Furthermore, since is chosen arbitrarily, the above inequality implies that (55) holds. We have shown that is a -small set, and hence a petite set. Since a finite union of petite sets is petite for -irreducible chains (see, e.g., Proposition 5.5.5 of Meyn and Tweedie (2009)), the set must also be petite. ∎
Recall in Assumption 3.1, and in (48). Let .
Lemma C.9.
There exist constants , and a measurable map that is bounded on , such that, for sufficiently large and all , we have .
Proof.
Since by Proposition 2.6 and by Assumption 3.1 and Lemma A.1, by Lemma B.1 and the Markov property,
[TABLE]
Define . Note that . Choose , and such that . Then, for ,
[TABLE]
In particular, if , then and (C) implies that
[TABLE]
Let . Then the stated claim follows from (C)–(57) and the fact that is bounded on . ∎
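The lemma above is a geometric drift condition: outside a bounded set, the expected value of a test function contracts by a factor below one, and inside that set it is bounded. The sketch below verifies such an inequality exactly for a toy linear recursion a' = G a + Y with i.i.d. shocks and test function V(a) = a + 1. All of this is an assumption made for illustration: the recursion, the shock values, and the constants `q`, `L` and `d` are hypothetical and are not the paper's objects, whose condition concerns long-run average growth under the optimal policy rather than one-step i.i.d. growth.

```python
import numpy as np

# Hypothetical discrete shocks: gross growth G and income Y, independent.
g_vals, g_prob = np.array([0.7, 1.1]), np.array([0.6, 0.4])   # E[G] = 0.86
y_vals, y_prob = np.array([0.5, 1.5]), np.array([0.5, 0.5])   # E[Y] = 1.0

def expected_V_next(a):
    """E[V(G a + Y)] for V(a) = a + 1, computed by enumerating all outcomes."""
    total = 0.0
    for g, pg in zip(g_vals, g_prob):
        for y, py in zip(y_vals, y_prob):
            total += pg * py * (g * a + y + 1.0)
    return total

q = 0.9                              # contraction factor, chosen in (E[G], 1)
mean_g = float(g_vals @ g_prob)
mean_y = float(y_vals @ y_prob)
L = mean_y + 1.0 - q                 # additive constant valid for all a >= 0
d = L / (q - mean_g)                 # beyond d the additive constant is not needed

grid = np.linspace(0.0, 200.0, 401)
lhs = np.array([expected_V_next(a) for a in grid])
V = grid + 1.0
```

Note that growth exceeds one in some states of this toy example; the drift inequality still holds because average growth is below the chosen contraction factor.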
Proof of Theorem 3.2.
Claim (1) can be proved by applying Theorem 19.1.3 (or a combination of Proposition 5.4.5 and Theorem 15.0.1) of Meyn and Tweedie (2009). The required conditions in those theorems have been established by Lemmas C.5, C.7, C.8 and C.9 above. Regarding claim (2), Lemmas C.8 and C.9 imply that for all , where is petite. Since in addition is -irreducible by Lemma C.5, Theorem 19.1.2 of Meyn and Tweedie (2009) implies that is a positive Harris chain. Claim (2) then follows from Theorem 17.1.7 of Meyn and Tweedie (2009).
To verify claim (3), since we have shown that is positive Harris with stationary distribution , based on Theorem 16.1.5 and Theorem 17.5.4 of Meyn and Tweedie (2009), it suffices to show that is -uniformly ergodic. Let be the -skeleton of (see page 62 of Meyn and Tweedie (2009)). Then is -irreducible and aperiodic by Proposition 5.4.5 of Meyn and Tweedie (2009). Theorem 16.0.1 of Meyn and Tweedie (2009) and Lemmas C.8 and C.9 then imply that is -uniformly ergodic and that there exists such that , where for and, for all ,
[TABLE]
To show that is -uniformly ergodic, by Theorem 16.0.1 of Meyn and Tweedie (2009), it remains to verify: for . This obviously holds since, by the proof of Lemma C.9, there exist such that, for all ,
[TABLE]
Hence, is -uniformly ergodic and claim (3) follows. The proof is now complete. ∎
Proof of Theorem 3.3.
Take an arbitrarily large constant such that
[TABLE]
which is possible by Assumption 3.3 and the definition of in (25a). For this , since and is a finite set, we can take such that
[TABLE]
for all and . Multiplying both sides by , it follows from the law of motion (21a), , and the definition of in (25a) that for ,
[TABLE]
Let . Then for all and all ,
[TABLE]
Start the wealth accumulation process from . Consider the following process:
[TABLE]
where . We now show that with probability one for all by induction. Since , the case is trivial. Suppose the claim holds up to . Because and remains 0 once it becomes 0, without loss of generality we may assume are all positive. Hence . By the definition of , we have whenever . Therefore
[TABLE]
Hence applying (58), we get
[TABLE]
Now take any and let be a geometric random variable with mean that is independent of everything. Define
[TABLE]
where is as in (24). Since clearly and , we have . By Lemma 3.1 of Beare and Toda (2017), are convex, and hence continuous in the interior of their domains. Therefore and for small enough . Hence, for any , we can take small enough and large enough such that . By Lemma 3.1 of Beare and Toda (2017), there exists a unique such that . Theorem 3.4 of Beare and Toda (2017) then implies that
[TABLE]
for all . In particular, for any initial with ,
[TABLE]
Now suppose that we draw from the ergodic distribution. Then has the same distribution as , and so does . Therefore
[TABLE]
If the ergodic distribution of has unbounded support, then . As we have seen above, conditional on , we have for all . Therefore
[TABLE]
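The proof above bounds wealth from below by a multiplicative process observed at an independent geometric time and then applies the tail result of Beare and Toda (2017). In the simplest special case, where growth factors G are i.i.d. and the stopping probability is p, the tail exponent of the stopped process is the positive root of (1 - p) E[G^alpha] = 1. The following is a minimal sketch under that i.i.d. assumption, which is a simplification of the Markov-modulated setting in the proof; the function name and the numerical values are hypothetical.

```python
import numpy as np

def pareto_exponent(g_vals, g_prob, p, tol=1e-12):
    """Solve (1 - p) * E[G**alpha] = 1 for alpha > 0 by bisection.

    Assumes i.i.d. discrete growth G with P(G > 1) > 0 and a geometric
    stopping time with success probability p in (0, 1).
    """
    g_vals = np.asarray(g_vals, dtype=float)
    g_prob = np.asarray(g_prob, dtype=float)
    if not np.any((g_vals > 1.0) & (g_prob > 0.0)):
        raise ValueError("need P(G > 1) > 0 for a positive root to exist")

    def f(s):
        return (1.0 - p) * np.sum(g_prob * g_vals**s) - 1.0

    # f is convex with f(0) = -p < 0 and f(s) -> infinity, so it has
    # exactly one positive root; expand the bracket until the sign changes.
    lo, hi = 0.0, 1.0
    while f(hi) < 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: growth of 0.9 or 1.2 with equal odds, p = 0.05.
alpha = pareto_exponent([0.9, 1.2], [0.5, 0.5], p=0.05)
```

A smaller stopping probability lets the process compound for longer on average, which lowers the exponent and makes the tail heavier.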
- Acemoglu, D. and J. A. Robinson (2002): “The Political Economy of the Kuznets curve,” Review of Development Economics, 6, 183–203.
- Açıkgöz, Ö. T. (2018): “On the Existence and Uniqueness of Stationary Equilibrium in Bewley Economies with Production,” Journal of Economic Theory, 173, 18–55.
- Ahn, S., G. Kaplan, B. Moll, T. Winberry, and C. Wolf (2018): “When Inequality Matters for Macro and Macro Matters for Inequality,” NBER Macroeconomics Annual, 32, 1–75.
- Aiyagari, S. R. (1994): “Uninsured Idiosyncratic Risk and Aggregate Saving,” Quarterly Journal of Economics, 109, 659–684.
- Aliprantis, C. D. and K. C. Border (2006): Infinite Dimensional Analysis: A Hitchhiker’s Guide, Springer.
- Beare, B. K. and A. A. Toda (2017): “Geometrically Stopped Markovian Random Growth Processes and Pareto Tails,” Tech. rep., UC San Diego.
- Benhabib, J. and A. Bisin (2018): “Skewed Wealth Distributions: Theory and Empirics,” Journal of Economic Literature, 56, 1261–1291.
- Benhabib, J., A. Bisin, and M. Luo (2017): “Earnings Inequality and Other Determinants of Wealth Inequality,” American Economic Review: Papers and Proceedings, 107, 593–597.
