Super Generalized Central Limit Theorem: Limit distributions for sums of non-identical random variables with power-laws
Masaru Shintani, Ken Umeno

TL;DR
This paper proves that sums of non-identical power-law distributed variables converge to a unique stable distribution, explaining the universality of stable laws in natural and social phenomena.
Contribution
It establishes a super generalized central limit theorem for non-identical power-law variables, extending classical results to more complex, real-world data.
Findings
Sums of non-identical power-law variables converge to a stable distribution.
The theorem explains the universality of stable laws in diverse systems.
Application to stock market returns demonstrates practical relevance.
Abstract
In nature and in societies, power-laws are ubiquitous, so it is important to investigate their mathematical characteristics in the current era of big data. In this paper we prove that the superposition of non-identical stochastic processes with power-laws converges in density to a unique stable distribution. This property can be used to explain the universality of stable laws, for example the fact that sums of the logarithmic returns of non-identical stock price fluctuations follow stable distributions.

Table 1

| | | N | L | p-value (KS test) | p-value (AD test) |
|---|---|---|---|---|---|
| (const) | (const) | 10000 | 50000 | 0.122 | 0.074 |
| | | 1000 | 100000 | 0.561 | 0.413 |
| | | 1000 | 100000 | 0.865 | 0.546 |
| | | 1000 | 100000 | 0.226 | 0.308 |
| | | 1000 | 100000 | 0.741 | 0.497 |
| | | 1000 | 100000 | 0.659 | 0.301 |
| | | 1000 | 100000 | 0.916 | 0.529 |
| | | 10000 | 20000 | 0.768 | 0.548 |
| | | 10000 | 30000 | 0.108 | 0.099 |

Table 2

| random variables | | N | L | p-value (KS test) | p-value (AD test) |
|---|---|---|---|---|---|
| 3 | 1 | 2000 | 10000 | 0.136 | 0.110 |
| 3 | 1 | 1000 | 10000 | 0.289 | 0.190 |
| 3 | 1 | 1000 | 10000 | 0.305 | 0.081 |
| 3 | 1 | 2000 | 10000 | 0.145 | 0.093 |
| 3 | 1 | 1000 | 10000 | 0.371 | 0.286 |

Super Generalized Central Limit Theorem
–Limit distributions for sums of non-identical random variables with power-laws–
Masaru Shintani
Ken Umeno
Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Yoshida Honmachi Sakyo-ku, Kyoto, 606–8501
(Aug 22, 2017)
PACS numbers: 89.65.Gh, 02.50.-r, 02.70.Rr, 05.40.Fb
*Introduction—.* Many data in the world follow power-laws. Examples from recent studies include, but are not limited to, financial markets mandelbrot1997variation ; mantegna1994stochastic ; mantegna1995scaling ; gopikrishnan1998inverse ; gabaix2006institutional ; denys2016universality ; tanaka2016statistical , the distribution of people’s assets druagulescu2001exponential , the distribution of waiting times between earthquakes bak2002unified , and the dependence of the number of wars on their intensity roberts1998fractality . It is therefore important to investigate the general characteristics of power-laws.
In particular, for financial market data, Mandelbrot mandelbrot1997variation first argued that the distribution of cotton price fluctuations follows a stable law. Since the 1990’s, there has been a controversy as to whether the central limit theorem or the generalized central limit theorem (GCLT) Kolmogorov for sums of power-law distributed variables applies to the logarithmic returns of stock price fluctuations. Mantegna and Stanley argued that the logarithmic return follows a stable distribution with power-law index $\alpha < 2$ mantegna1994stochastic ; mantegna1995scaling , and later they rejected their own argument by introducing the cubic law ($\alpha \approx 3$) gopikrishnan1998inverse . Even recently, researchers gabaix2006institutional ; denys2016universality ; tanaka2016statistical have debated whether the distribution of logarithmic returns follows a power-law with $\alpha > 2$ or a stable law with $\alpha < 2$. On the other hand, very large data sets are needed to elucidate the true tail behavior of distributions weron2001levy . In this respect, the recent study tanaka2016statistical showed that the large, high-frequency arrowhead data of the Tokyo Stock Exchange (TSE) support stable laws with $\alpha < 2$.
In this study, we show that sums of the logarithmic returns of multiple stock price fluctuations follow stable laws, and that this behavior can be explained theoretically. We extend the GCLT to sums of independent non-identical stochastic processes. We call this extension the Super Generalized Central Limit Theorem (SGCLT).
*Summary of stable distributions and the GCLT—.* The probability density function of a random variable following a stable distribution nolan2003stable is defined through its characteristic function as:
[TABLE]
where is expressed as:
[TABLE]
The parameters $\alpha$, $\beta$, $\gamma$ and $\mu$ are real constants satisfying $0 < \alpha \le 2$, $-1 \le \beta \le 1$, $\gamma > 0$, and denote the index of the power-law in stable distributions, the skewness, the scale parameter and the location, respectively. When $\alpha = 2$ and $\beta = 0$, the probability density function is that of a normal distribution. Note that explicit forms of stable distributions are not known for general parameters $\alpha$ and $\beta$, except for a few cases such as the Cauchy distribution ($\alpha = 1$, $\beta = 0$).
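For reference, a commonly used form of the characteristic function of a stable law is sketched below. It follows the standard parametrization found in Nolan's notes and is not recovered from this text, so the symbols $\phi$, $\alpha$, $\beta$, $\gamma$, $\mu$ and the helper function $w(\alpha, t)$ are assumptions about notation. The paper's own convention may differ, in particular for $\alpha = 1$.

```latex
\phi(t) = \exp\left\{ i\mu t - \gamma^{\alpha} |t|^{\alpha}
          \left( 1 - i\beta\, \mathrm{sgn}(t)\, w(\alpha, t) \right) \right\},
\qquad
w(\alpha, t) =
\begin{cases}
  \tan\left( \dfrac{\pi\alpha}{2} \right), & \alpha \neq 1, \\[1ex]
  -\dfrac{2}{\pi} \log |t|, & \alpha = 1,
\end{cases}
\qquad
0 < \alpha \le 2,\; -1 \le \beta \le 1,\; \gamma > 0,\; \mu \in \mathbb{R}.
```

In this form, $\alpha = 2$ gives $\phi(t) = \exp(i\mu t - \gamma^{2} t^{2})$, a normal law with mean $\mu$ and variance $2\gamma^{2}$, and $\alpha = 1$, $\beta = 0$ gives $\phi(t) = \exp(i\mu t - \gamma |t|)$, the Cauchy law.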
A stable random variable satisfies the following property for the scale and the location parameters. A random variable follows , when
[TABLE]
where . When the random variables satisfy , the superposition of independent random variables that have different parameters except for is also in the stable distribution family as:
[TABLE]
where the parameters and are expressed as:
[TABLE]
This follows immediately from the fact that the characteristic function of a sum of independent random variables is the product of their characteristic functions:
[TABLE]
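As a worked illustration of this product argument, consider $n$ independent stable variables with a common index $\alpha \neq 1$, zero location, skewness $\beta_j$ and scale $\gamma_j$. The notation ($\beta_j$, $\gamma_j$, $\hat{\beta}$, $\hat{\gamma}$) and the standard parametrization used here are assumptions, and the normalization in the paper's own equation may differ.

```latex
\prod_{j=1}^{n} \exp\left\{ -\gamma_j^{\alpha} |t|^{\alpha}
  \left( 1 - i\beta_j\, \mathrm{sgn}(t) \tan\frac{\pi\alpha}{2} \right) \right\}
= \exp\left\{ -\hat{\gamma}^{\alpha} |t|^{\alpha}
  \left( 1 - i\hat{\beta}\, \mathrm{sgn}(t) \tan\frac{\pi\alpha}{2} \right) \right\},
\qquad
\hat{\gamma}^{\alpha} = \sum_{j=1}^{n} \gamma_j^{\alpha},
\quad
\hat{\beta} = \frac{\sum_{j=1}^{n} \beta_j \gamma_j^{\alpha}}
                   {\sum_{j=1}^{n} \gamma_j^{\alpha}}.
```

The right-hand side is again a stable characteristic function with the same $\alpha$. If the sum is additionally divided by $n^{1/\alpha}$, then $\hat{\gamma}^{\alpha}$ becomes the average $\frac{1}{n} \sum_j \gamma_j^{\alpha}$ while $\hat{\beta}$ is unchanged.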
We focus on the GCLT. Let of be a probability density function of a random variable for :
[TABLE]
with being real constants. Then, according to the GCLT Kolmogorov , the superposition of independent, identically distributed random variables converges in density to a unique stable distribution for , that is
[TABLE]
where is a characteristic function of as the expected value of , is the expectation value of , is an imaginary part of the argument, and parameters and are expressed as:
[TABLE]
with being the Gamma function. When , we obtain , and the superposition of the independent, identically distributed random variables converges in density to a normal distribution:
[TABLE]
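The relation between the tail coefficients of the density and the parameters of the limiting stable law can be sketched numerically. The snippet below is a minimal sketch, assuming the standard tail-matching relation for stable laws in the parametrization where a standard Cauchy variable has scale one. The function name `gclt_parameters` and the argument names `c_plus` and `c_minus` are hypothetical, and the formula is assumed to agree with the equation in the text.

```python
import math


def gclt_parameters(alpha, c_plus, c_minus):
    """Return (beta, gamma) of the limiting stable law.

    Assumes a density that behaves like c_plus * x**-(alpha + 1) for x -> +inf
    and c_minus * |x|**-(alpha + 1) for x -> -inf, with 0 < alpha < 2.
    """
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2)")
    total = c_plus + c_minus
    if c_plus < 0.0 or c_minus < 0.0 or total <= 0.0:
        raise ValueError("tail coefficients must be non-negative, not both zero")
    # Skewness: normalized difference of the two tail weights.
    beta = (c_plus - c_minus) / total
    # Scale: obtained by matching the tails of a stable density.
    denom = 2.0 * alpha * math.sin(math.pi * alpha / 2.0) * math.gamma(alpha)
    gamma = (math.pi * total / denom) ** (1.0 / alpha)
    return beta, gamma


if __name__ == "__main__":
    # Standard Cauchy density 1 / (pi * (1 + x^2)) has c_plus = c_minus = 1 / pi.
    print(gclt_parameters(1.0, 1.0 / math.pi, 1.0 / math.pi))
```

For the standard Cauchy density the sketch should return a skewness of zero and a scale very close to one, which serves as a quick sanity check.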
*Our generalization—.* We consider an extension of this theorem to sums of non-identical random variables. In what follows, we assume that the random variables satisfy the following two conditions.
(Condition 1): The random variables , obey respectively the distributions , , and satisfy , .
(Condition 2): The probability distribution function of the random variables satisfies in :
[TABLE]
where and are samples obtained by and . We emphasize that the probability distribution function may not be obtained even when we integrate over and .
The main claim of this paper is the following generalization of GCLT: The following superposition of non-identical random variables with power-laws converges in density to a unique stable distribution for , where
[TABLE]
with being a characteristic function of as the expected value of , and parameters are expressed as:
[TABLE]
Here denotes the expectation value of with respect to random parameter distributions and .
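A heuristic tail count suggests why averaged tail coefficients appear in the limiting parameters. The symbols $c_{+i}$ and $c_{-i}$ for the tail coefficients of the $i$-th variable, and $\beta^{*}$ and $\gamma^{*}$ for the limiting parameters, are assumed notation. The sketch ignores the centering term and uses the fact that a large value of a heavy-tailed sum is typically caused by one large summand. It is consistent with the standard GCLT tail-matching relation but is not a reproduction of the paper's equations.

```latex
\begin{gather*}
P\left( X_i > x \right) \simeq \frac{c_{+i}}{\alpha}\, x^{-\alpha}
\quad (x \to \infty)
\;\Longrightarrow\;
P\left( \sum_{i=1}^{N} X_i > N^{1/\alpha} x \right)
\approx \sum_{i=1}^{N} \frac{c_{+i}}{\alpha} \left( N^{1/\alpha} x \right)^{-\alpha}
= \left( \frac{1}{N} \sum_{i=1}^{N} c_{+i} \right) \frac{x^{-\alpha}}{\alpha}
\;\xrightarrow[N \to \infty]{}\; E[c_{+}]\, \frac{x^{-\alpha}}{\alpha},
\\[1ex]
\beta^{*} = \frac{E[c_{+}] - E[c_{-}]}{E[c_{+}] + E[c_{-}]},
\qquad
\gamma^{*} = \left\{ \frac{\pi \left( E[c_{+}] + E[c_{-}] \right)}
  {2\alpha \sin\left( \pi\alpha/2 \right) \Gamma(\alpha)} \right\}^{1/\alpha}.
\end{gather*}
```

The same count applies to the left tail with $c_{-i}$. In other words, the normalized sum behaves as if every variable had the averaged coefficients $E[c_{+}]$ and $E[c_{-}]$.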
*Proof—.* The following argument is not mathematically rigorous, but it gives an intuitive proof.
The probability distribution function of random variables satisfying the Conditions 1-2 is expressed as:
[TABLE]
where and satisfy and . The superposition is then defined as:
[TABLE]
where is a characteristic function of . On the other hand, let be with some , and be samples given by the same parent to for each . Then are independent, identically distributed for at a fixed index . Then, we define the superposition as follows:
[TABLE]
Here, we do not consider the convergence of in density for , but consider the superposition for , since the superposition will converge to the same limiting distribution of if converges in density.
We focus on the convergence in density of for and as follows. About the previous in , we express it as with the following ,
[TABLE]
Here, the superposition is described as:
[TABLE]
When , let be the superposition . Then, converges in density to for according to the GCLT (5), that is
[TABLE]
where and are
[TABLE]
Thus, with the stable property (2), we obtain the convergence of the superposition as follows:
[TABLE]
where and are:
[TABLE]
This proves the convergence in density of the superposition. Figure 1 illustrates the concept of this proof.
As above, the superposition of non-identical stochastic processes converges in density to a unique stable distribution. Since the limiting distribution of is the same as that of , also converges to . When , this statement does not hold because of dependence between and in , but we find that the limit distribution of the superposition generally converges in density to as can be seen in the following numerical examples.
*Numerical confirmation—.* We confirm the claim of the SGCLT (7) by numerical experiments.
To verify the main claim numerically, we use two kinds of tests: the two-sample Kolmogorov-Smirnov (KS) test stephens1974edf and the two-sample Anderson-Darling (AD) test anderson1952asymptotic , each at the 5% significance level. We generate two data sets by different methods and examine the p-values of both tests. Unless the null hypothesis is rejected, we judge that the two data sets follow the same distribution.
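The two-sample KS comparison described above can be sketched with the standard library alone. This is a minimal sketch, not the authors' implementation: the function name `ks_two_sample` is hypothetical, the p-value uses the asymptotic Kolmogorov series (reasonable for large samples only), and the AD test is not covered.

```python
import math


def ks_two_sample(xs, ys):
    """Return (D, p) for the two-sample Kolmogorov-Smirnov test.

    D is the largest gap between the two empirical distribution functions.
    p is the asymptotic p-value from the Kolmogorov series.
    """
    a, b = sorted(xs), sorted(ys)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])
        # Advance both samples past the current value so ties are handled together.
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the p-value is 1 to many digits.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


if __name__ == "__main__":
    same = [k / 100 for k in range(100)]
    shifted = [x + 0.5 for x in same]
    print(ks_two_sample(same, same))
    print(ks_two_sample(same, shifted))
```

With a 5% significance level, the null hypothesis that both samples come from the same distribution would be rejected when the returned p-value is below 0.05.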
For the first data set, we generate non-identical stochastic processes satisfying Conditions 1-2 and form their superposition in the same way as in (7). For the second data set, we generate random numbers that follow the stable distribution to which the first data set should converge according to (7). Note that we compare the superposition not with a cumulative distribution function but with random numbers obtained by the numerical method described below, since the cumulative distribution function of a stable distribution cannot be expressed explicitly except for a few cases.
For the first data, let us consider the chaotic dynamical system , where is defined umeno1998superposition as follows for :
[TABLE]
This mapping has a mixing property and an ergodic invariant density for almost all initial points . One of the authors (KU) obtained the following explicit asymmetric power-law distribution as an invariant density umeno1998superposition :
[TABLE]
This asymmetric distribution behaves as follows for :
[TABLE]
This is exactly the same expression with the condition of GCLT (4) for random variables in . Then, putting the variables and be distributed, we can obtain various different distributions with the same power-laws.
We regard the parameters and as random samples obtained from and , where and obey and , respectively. These are defined for with finite mean.
Then the parameters and are given as and , and , are also satisfied since are not 0 and samples from some random variables and with finite mean. As above, we can get some stochastic processes satisfying the Conditions 1-2.
For the second data, the random numbers generated with the following procedure follow a stable distribution chambers1976method . Let and be independent random numbers: uniformly distributed in , exponentially distributed with mean . In addition, let be as follows:
[TABLE]
for where . Then it follows that . We get arbitrary stable distributions by the use of the property (1) about the scale parameter and the location.
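A generator of this kind can be sketched as follows. The code uses the widely quoted form of the Chambers-Mallows-Stuck transformation, which is assumed to match the procedure cited above. The names `stable_from_vw` and `stable_rvs`, and the variables `v` (uniform angle) and `w` (unit-mean exponential), are hypothetical, and the parametrization is the standard one in which a standard Cauchy variable has scale one.

```python
import math
import random


def stable_from_vw(alpha, beta, v, w):
    """Map v in (-pi/2, pi/2) and w > 0 to a standard stable variate."""
    if alpha == 1.0:
        half_pi = math.pi / 2.0
        return (2.0 / math.pi) * (
            (half_pi + beta * v) * math.tan(v)
            - beta * math.log(half_pi * w * math.cos(v) / (half_pi + beta * v))
        )
    t = beta * math.tan(math.pi * alpha / 2.0)
    b = math.atan(t) / alpha
    s = (1.0 + t * t) ** (1.0 / (2.0 * alpha))
    return (
        s * math.sin(alpha * (v + b)) / math.cos(v) ** (1.0 / alpha)
        * (math.cos(v - alpha * (v + b)) / w) ** ((1.0 - alpha) / alpha)
    )


def stable_rvs(alpha, beta, size, gamma=1.0, mu=0.0, seed=None):
    """Draw `size` stable variates with scale gamma and location mu."""
    rng = random.Random(seed)
    # For alpha == 1 the scaling rule carries an extra shift in this parametrization.
    shift = (2.0 / math.pi) * beta * gamma * math.log(gamma) if alpha == 1.0 else 0.0
    out = []
    for _ in range(size):
        v = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
        w = rng.expovariate(1.0)
        while w == 0.0:  # guard against a zero draw before dividing by w
            w = rng.expovariate(1.0)
        out.append(gamma * stable_from_vw(alpha, beta, v, w) + mu + shift)
    return out


if __name__ == "__main__":
    print(stable_rvs(1.5, 0.5, 5, seed=0))
```

Two special cases make the transformation easy to check by hand: for $\alpha = 1$, $\beta = 0$ it reduces to $\tan v$ (a Cauchy variate), and for $\alpha = 2$ it reduces to $2 \sin v \sqrt{w}$, which is normal with variance 2.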
With the two data sets obtained in this way, we examine whether the superposition numerically converges in density to a stable distribution. Tables 1 and 2 show the p-values of the KS test and the AD test for each . The constant $L$ is the length of the sequence and $N$ is the number of sequences used for the superposition. The meaning of is the uniform distribution in . Figure 2 illustrates an example of correspondence when . “Crand” denotes random numbers that follow the standard Cauchy distribution. This case shows that the integral average of the probability distribution function with the Cauchy distribution is not uniquely determined.
As can be seen from Table 1 and 2, we cannot reject the null hypothesis in any case for . In other words, the distribution of superposition and the stable distribution are close enough in density according to our SGCLT.
Figure 3 shows that the superposition of non-identically distributed random variables converges.
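The kind of convergence reported here can be illustrated with a small simulation. This is a sketch under simplifying assumptions and not the experiment of the paper: it replaces the chaotic map by symmetric Pareto-type random variables with index 1 and non-identical scales, so that no centering term is needed and the predicted limit is a Cauchy law whose median absolute value equals its scale. All names (`superposition_sample`, `predicted_scale`, `median_abs`) are hypothetical.

```python
import math
import random


def superposition_sample(scales, rng):
    """One draw of S_N = (1/N) * sum_i s_i * Y_i with symmetric Pareto Y_i."""
    total = 0.0
    for s in scales:
        u = 1.0 - rng.random()                  # uniform on (0, 1]
        sign = 1.0 if rng.random() < 0.5 else -1.0
        total += s * sign / u                   # |Y| = 1/u has density y**-2 on y >= 1
    return total / len(scales)                  # N**(1/alpha) with alpha = 1


def predicted_scale(scales):
    """Scale of the limiting Cauchy law from the averaged tail coefficients."""
    # Each variable has c_plus = c_minus = s / 2, so c_plus + c_minus averages to mean(s).
    return math.pi * (sum(scales) / len(scales)) / 2.0


def median_abs(values):
    ordered = sorted(abs(v) for v in values)
    return ordered[len(ordered) // 2]


rng = random.Random(0)
scales = [rng.uniform(0.5, 1.5) for _ in range(200)]    # non-identical scales
draws = [superposition_sample(scales, rng) for _ in range(2000)]
observed = median_abs(draws)
expected = predicted_scale(scales)
print(observed, expected)
```

The two printed numbers should be close to each other, up to the sampling error of a median estimated from 2000 draws. A fuller check would compare the whole sample against stable random numbers with a two-sample test, as done in the text.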
*Conclusions—.* We have further generalized the GCLT to sums of independent non-identical stochastic processes with the same power-law index. The SGCLT can have more general applications, since various types of power-laws exist in nature. In particular, it supports the argument for the ubiquity of stable laws: by regarding the logarithmic returns of multiple stock price fluctuations as sums of non-identical random variables with power-laws, one can explain why they follow a stable distribution. Take stock market data as an example: when the distributions of the logarithmic returns of individual stock price fluctuations have almost the same power-law exponent and different scale parameters, we can obtain trends or indicators according to this SGCLT.
The authors thank Dr. Shin-itiro Goto (Kyoto University) for stimulating discussions.
References
- (1) B. Mandelbrot, Journal of Business 36, 394 (1963)
- (2) R. N. Mantegna, H. E. Stanley, Phys. Rev. Lett. 73, 2946 (1994)
- (3) R. N. Mantegna, H. E. Stanley, Nature 376, 46 (1995)
- (4) P. Gopikrishnan, M. Meyer, L. A. N. Amaral, H. E. Stanley, Eur. Phys. J. B 3, 139 (1998)
- (5) X. Gabaix, P. Gopikrishnan, V. Plerou, H. E. Stanley, The Quarterly Journal of Economics 121 (2), 461 (2006)
- (6) M. Denys, T. Gubiec, R. Kutner, M. Jagielski, H. E. Stanley, Phys. Rev. E 94, 042305 (2016)
- (7) M. Tanaka, IEICE Technical Report 116, 27 (2016) (in Japanese)
- (8) A. Drăgulescu, V. M. Yakovenko, Physica A: Statistical Mechanics and its Applications 299, 213 (2001)
