A Proof of the Herschel-Maxwell Theorem Using the Strong Law of Large Numbers
Somabha Mukherjee

Abstract
In this article, we use the strong law of large numbers to give a proof of the Herschel-Maxwell theorem, which characterizes the normal distribution as the distribution of the components of a spherically symmetric random vector, provided they are independent. We present shorter proofs under additional moment assumptions, and include a remark, which leads to another strikingly short proof of Maxwell's characterization using the central limit theorem.
Department of Statistics, Wharton School, University of Pennsylvania
KEY WORDS: Spherically symmetric; Normal distribution; Characteristic function; Strong law of large numbers; Central limit theorem.
Contents
- 1 Introduction
- 2 Some Basic Properties of a Spherically Symmetric Distribution
- 3 The Spherical Symmetry Characterization and its Proof
- 4 Shorter Proofs Under Additional Moment Assumptions
- 5 Conclusion
1 Introduction
The Herschel-Maxwell theorem is one of the many beautiful characterizations of the normal distribution. It states that if the distribution of a random vector with independent components is invariant under rotations, then the components must be identically distributed, with a common normal distribution.
As mentioned in [2], J.C. Maxwell addressed the following question: What is the distribution of velocities of the gas particles? The argument behind Maxwell’s claim that velocities are normally distributed hinged upon two very natural assumptions about the distribution function: independence and rotation invariance. Even before Maxwell, the astronomer J.F.W. Herschel addressed a similar issue while characterizing the errors in astronomical measurements. He assumed that the components of the two-dimensional errors in measurement are independent, and that the distribution of the error is independent of its direction.
In this paper, we give a proof of the Herschel-Maxwell theorem using the strong law of large numbers, and remark on another strikingly short proof of the theorem using the central limit theorem. The main tools of our analysis are characteristic functions and Haar’s Theorem for rotation-invariant measures on the surface of the unit sphere in Euclidean spaces.
2 Some Basic Properties of a Spherically Symmetric Distribution
Definition 2.1.
A random vector $\mathbf{X}$ taking values in $\mathbb{R}^n$ is said to have a spherically symmetric distribution if $\mathbf{X}$ and $\mathbf{Q}\mathbf{X}$ have the same distribution for every real, orthogonal $n \times n$ matrix $\mathbf{Q}$.
In the following two theorems, we state some basic properties of a spherically symmetric distribution.
Theorem 2.1.
The entries of a spherically symmetric random vector have the same distribution. Moreover, if that distribution has a finite mean, then the mean must be $0$, and if that distribution has finite second moment, then any two distinct entries of the random vector are uncorrelated.
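Theorem 2.2.
The random vector $\mathbf{X}$ in $\mathbb{R}^n$ has a spherically symmetric distribution if and only if its characteristic function $\varphi_{\mathbf{X}}$ satisfies $\varphi_{\mathbf{X}}(\mathbf{t}) = \varphi_{\mathbf{X}}(\|\mathbf{t}\|\,\mathbf{e}_1)$ for all $\mathbf{t} \in \mathbb{R}^n$, where $\mathbf{e}_1 = (1, 0, \ldots, 0)^\top$.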
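One way to read Theorem 2.2 is that the characteristic function of a spherically symmetric vector depends on its argument only through the Euclidean norm. The Python sketch below is an illustration added here, not part of the paper. It checks this property at a single point for a vector of i.i.d. standard normal entries, and shows that it fails for i.i.d. standard Laplace entries. The function names and the Laplace comparison are assumptions chosen for the example.

```python
import numpy as np

def gaussian_cf(t):
    """Characteristic function of a vector of i.i.d. N(0, 1) entries."""
    t = np.asarray(t, dtype=float)
    return float(np.exp(-0.5 * np.dot(t, t)))

def laplace_cf(t):
    """Characteristic function of a vector of i.i.d. standard Laplace entries."""
    t = np.asarray(t, dtype=float)
    return float(np.prod(1.0 / (1.0 + t ** 2)))

def depends_only_on_norm(cf, t, atol=1e-12):
    """Check cf(t) == cf(||t|| e_1) at the single point t (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    e1 = np.zeros_like(t)
    e1[0] = 1.0
    return bool(np.isclose(cf(t), cf(np.linalg.norm(t) * e1), atol=atol))

point = [1.0, 1.0]
print(depends_only_on_norm(gaussian_cf, point))  # Gaussian entries pass the check
print(depends_only_on_norm(laplace_cf, point))   # Laplace entries do not
```

A single-point check can only refute spherical symmetry, never establish it; the theorem requires the identity at every point.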
It follows immediately from Theorem 2.2 that if a random vector follows a spherically symmetric distribution, then so do all its subvectors. We now state and prove a partial “converse” of this fact, under an additional assumption, which will play a crucial role in our main proof.
Theorem 2.3.
Let $F$ be a distribution on $\mathbb{R}$ with the property that if $X_1$ and $X_2$ are independent observations from $F$, then $(X_1, X_2)$ has a spherically symmetric distribution. Then for every $n \geq 1$, if $X_1, \ldots, X_n$ are independent observations from $F$, $(X_1, \ldots, X_n)$ has a spherically symmetric distribution.
Proof.
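Let $\varphi$ denote the characteristic function of $F$, so that the characteristic function of $n$ independent observations from $F$ at $\mathbf{t} = (t_1, \ldots, t_n)$ is $\prod_{j=1}^{n} \varphi(t_j)$. In view of Theorem 2.2, it suffices to show that $\prod_{j=1}^{n} \varphi(t_j) = \varphi(\|\mathbf{t}\|)$ for all $\mathbf{t} \in \mathbb{R}^n$ and for all $n \geq 1$. By Theorem 2.2, this is true for $n = 1$ and $n = 2$. Assume that the proposition holds for some $n \geq 2$. Let $X_1, \ldots, X_{n+1}$ be independent observations from $F$. Then, by our induction hypothesis, we have:
$$\prod_{j=1}^{n+1} \varphi(t_j) = \varphi\Big(\sqrt{t_1^2 + \cdots + t_n^2}\Big)\,\varphi(t_{n+1}) = \varphi\Big(\sqrt{t_1^2 + \cdots + t_{n+1}^2}\Big)$$
for all $(t_1, \ldots, t_{n+1}) \in \mathbb{R}^{n+1}$, where the last equality is the case $n = 2$. We are done. ∎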
Theorem 2.4.
Let $X$ and $Y$ be two independent random variables. Suppose that $(X, Y)$ has a spherically symmetric distribution. Then, $P(X = 0)$ is either $0$ or $1$.
Proof.
Suppose, towards a contradiction, that $0 < P(X = 0) < 1$. By Theorem 2.1, $X$ and $Y$ have the same distribution. Since $X\cos\theta + Y\sin\theta$ has the same distribution as $X$ for every $\theta$, we have for all $0 \leq \theta \leq \frac{\pi}{2}$:
$$P\big(X\cos\theta + Y\sin\theta = 0,\ (X, Y) \neq (0, 0)\big) = P(X = 0) - P(X = 0)^2 > 0.$$
However, the sets $\{(x, y) \neq (0, 0) : x\cos\theta + y\sin\theta = 0\}$ $\left(0 \leq \theta \leq \frac{\pi}{2}\right)$ are pairwise disjoint, and they form an uncountable collection. This contradicts the fact that for any set in this collection, the probability of $(X, Y)$ belonging to that set is positive. ∎
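The contradiction in this proof can be seen concretely in a small discrete example. The sketch below is an added illustration under assumed inputs: a hypothetical distribution that is uniform on $\{-1, 0, 1\}$, so that the atom at zero lies strictly between $0$ and $1$. Rotation invariance would force $X\cos\theta + Y\sin\theta$ to carry the same mass at zero as $X$, and exact enumeration over the finite support shows that it does not.

```python
import itertools
import math

def atom_at_zero(support, probs, theta, tol=1e-12):
    """P(X cos(theta) + Y sin(theta) = 0) for i.i.d. X, Y on a finite support."""
    pairs = list(zip(support, probs))
    c, s = math.cos(theta), math.sin(theta)
    return sum(
        px * py
        for (x, px), (y, py) in itertools.product(pairs, repeat=2)
        if abs(x * c + y * s) < tol
    )

# Hypothetical example distribution: uniform on {-1, 0, 1}.
support = [-1.0, 0.0, 1.0]
probs = [1 / 3, 1 / 3, 1 / 3]

p_axis = atom_at_zero(support, probs, 0.0)             # equals P(X = 0)
p_rotated = atom_at_zero(support, probs, math.pi / 6)  # only (0, 0) contributes
print(p_axis, p_rotated)
```

The two probabilities differ, so the pair built from this distribution cannot be spherically symmetric, in line with the theorem.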
3 The Spherical Symmetry Characterization and its Proof
We will require a simple version of Haar’s Theorem for rotation-invariant measures on $S^{n-1}$, the surface of the unit sphere in $\mathbb{R}^n$. It is stated below.
Theorem 3.1.
Let $\mu$ be a rotation-invariant Borel probability measure on $S^{n-1}$, i.e. $\mu(\mathbf{Q}A) = \mu(A)$ for every Borel set $A \subseteq S^{n-1}$ and every orthogonal $n \times n$ matrix $\mathbf{Q}$. Then, $\mu$ is the uniform measure on $S^{n-1}$.
It follows from Theorem 3.1 that if $\mathbf{U}$ is a unit norm, spherically symmetric random vector in $\mathbb{R}^n$, then $\mathbf{U}$ has the uniform distribution on $S^{n-1}$. We are now ready to state and prove the main result of this paper.
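A familiar instance of this consequence is that a standard Gaussian vector, divided by its norm, is uniform on the unit sphere. The sketch below is an added illustration; the function name, seed, and sample sizes are arbitrary choices. It draws such points with NumPy and confirms only that they lie on the sphere. Uniformity itself comes from the theorem, not from the code.

```python
import numpy as np

def sample_uniform_sphere(n, size, rng):
    """Draw `size` points on the unit sphere in R^n by normalizing Gaussian vectors."""
    g = rng.standard_normal((size, n))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily
points = sample_uniform_sphere(3, 1000, rng)
print(points.shape)
print(np.allclose(np.linalg.norm(points, axis=1), 1.0))
```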
Theorem 3.2.
Let $X$ and $Y$ be two independent random variables. Suppose that $(X, Y)$ has a spherically symmetric distribution. Then, $X$ and $Y$ are identically distributed as a normal distribution with mean $0$ (and possibly zero variance).
Proof.
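By Theorem 2.1, $X$ and $Y$ have the same distribution, say $F$. By Theorem 2.4, $P(X = 0)$ is either $0$ or $1$. In the latter case, $X$ has the normal distribution with mean $0$ and variance $0$. So, assume that $P(X = 0) = 0$.
Generate a sequence $X_1, X_2, \ldots$ of independent random variables from the distribution $F$, and a sequence $Z_1, Z_2, \ldots$ of independent $N(0, 1)$ random variables. For each $n \geq 1$, call $\mathbf{X}^{(n)} = (X_1, \ldots, X_n)^\top$ and $\mathbf{Z}^{(n)} = (Z_1, \ldots, Z_n)^\top$. It follows from Theorem 2.3 that $\mathbf{X}^{(n)}$ has a spherically symmetric distribution, i.e. for every orthogonal $n \times n$ matrix $\mathbf{Q}$, $\mathbf{X}^{(n)}$ and $\mathbf{Q}\mathbf{X}^{(n)}$ have the same distribution. Since $P(X = 0) = 0$, we have $\|\mathbf{X}^{(n)}\| > 0$ almost surely. So,
$$\mathbf{Q}\,\frac{\mathbf{X}^{(n)}}{\|\mathbf{X}^{(n)}\|} = \frac{\mathbf{Q}\mathbf{X}^{(n)}}{\|\mathbf{Q}\mathbf{X}^{(n)}\|} \stackrel{d}{=} \frac{\mathbf{X}^{(n)}}{\|\mathbf{X}^{(n)}\|}$$
for every $n \geq 1$ and every orthogonal $n \times n$ matrix $\mathbf{Q}$. Thus, $\mathbf{X}^{(n)}/\|\mathbf{X}^{(n)}\|$ is a unit norm spherically symmetric random vector in $\mathbb{R}^n$, and hence, follows the uniform distribution on $S^{n-1}$. By the same argument, $\mathbf{Z}^{(n)}/\|\mathbf{Z}^{(n)}\|$ also follows the uniform distribution on $S^{n-1}$. Hence, $X_1/\|\mathbf{X}^{(n)}\| \stackrel{d}{=} Z_1/\|\mathbf{Z}^{(n)}\|$ for all $n \geq 1$. This, in turn, implies that
$$\frac{X_1}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} X_i^2}} \stackrel{d}{=} \frac{Z_1}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} Z_i^2}}$$
for all $n \geq 1$. By the Strong Law of Large Numbers, the right hand side converges almost surely to $Z_1$. Observe that $E X_1^2 < \infty$, since otherwise, by the Strong Law of Large Numbers for independent and identically distributed random variables with expectation $\infty$, it would follow that the left hand side converges almost surely to $0$, a contradiction. So, by the Strong Law of Large Numbers for finite mean, the left hand side converges almost surely to $X_1/\sqrt{E X_1^2}$. Hence, $X_1 \sim N(0, E X_1^2)$ and we are done. ∎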
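The Strong Law step in this proof can be illustrated numerically: for a long i.i.d. standard normal sample, the first coordinate of the normalized vector, rescaled by $\sqrt{n}$, is already very close to the first coordinate itself. The following is a minimal sketch added for illustration. The helper name, seed, and sample size are assumptions, and a single simulated run is only a sanity check, not evidence for the theorem.

```python
import numpy as np

def scaled_first_coordinate(sample):
    """Return sqrt(n) * x_1 / ||x|| for a 1-D sample x of length n."""
    x = np.asarray(sample, dtype=float)
    return float(np.sqrt(x.size) * x[0] / np.linalg.norm(x))

# Illustrative run with a fixed seed (seed and sample size are arbitrary choices).
rng = np.random.default_rng(0)
z = rng.standard_normal(200_000)
gap = abs(scaled_first_coordinate(z) - z[0])
print(f"|sqrt(n) Z_1 / ||Z|| - Z_1| = {gap:.2e}")
```

The gap shrinks because the average of the squared entries approaches the second moment, which equals $1$ for the standard normal.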
Remark 1.
A slight modification of the proof of Theorem 3.2 yields the following result:
Theorem 3.3.
Suppose that $X_1, X_2, \ldots$ is a sequence of random variables satisfying the following conditions:
- [math],
- [math],
- $(X_1, \ldots, X_n)$ is spherically symmetric for all $n \geq 1$, and
- [math] for all [math].
Then, $X_1, X_2, \ldots$ are identically distributed as a normal distribution with mean $0$ and positive variance.
The only observation needed before replicating the proof of Theorem 3.2 is that, under the above conditions, [math]. Theorem 3.3 is probably interesting only in that the independence of the $X_i$’s can be relaxed in exchange for some additional assumptions, while still arriving at the same normal characterization.
4 Shorter Proofs Under Additional Moment Assumptions
Theorem 3.2 has shorter proofs under additional assumptions of finiteness of the first and second moments of $X$. Suppose that we only have $E|X| < \infty$. Since $X$ has a symmetric distribution around $0$, this condition is equivalent to the existence of $E(X)$. In this case, the characteristic function $\varphi$ of $X$ is differentiable on $\mathbb{R}$.
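By an application of Theorem 2.2, the characteristic function $\varphi$ of $X$ satisfies $\varphi(s)\varphi(t) = \varphi\big(\sqrt{s^2 + t^2}\big)$ for all $s, t \in \mathbb{R}$. Since the distribution of $X$ is symmetric around $0$, $\varphi$ is a real valued, even function. We claim that $\varphi(t) > 0$ for all $t \in \mathbb{R}$. If not, then since $\varphi(0) = 1$, by the intermediate value theorem, there is a $t_0$ such that $\varphi(t_0) = 0$. Since $\varphi(t)^2 = \varphi(\sqrt{2}\,t)$ for all $t$, an easy induction gives $\varphi(t)^{2^k} = \varphi(2^{k/2}\,t)$ for all $t \in \mathbb{R}$ and all $k \geq 1$. This implies that $\varphi(2^{-k/2}\,t_0) = 0$ for all $k \geq 1$, which is not possible, since $\varphi$ is continuous at $0$ and $\varphi(0) = 1$. This proves our claim.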
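If we denote $\log\varphi$ by $\psi$, then we have $\psi(s) + \psi(t) = \psi\big(\sqrt{s^2 + t^2}\big)$ for all $s, t \in \mathbb{R}$. Taking partial derivative with respect to $s$ on both sides of the above identity, we get:
$$\psi'(s) = \psi'\Big(\sqrt{s^2 + t^2}\Big)\,\frac{s}{\sqrt{s^2 + t^2}}, \qquad \text{i.e.} \qquad \frac{\psi'(s)}{s} = \frac{\psi'\big(\sqrt{s^2 + t^2}\big)}{\sqrt{s^2 + t^2}} \quad (s \neq 0).$$
This implies that there is a constant $c$ such that $\psi'(t) = ct$ for all $t$. Solving this differential equation and remembering that $\psi$ is continuous at $0$ with $\psi(0) = 0$, we get $\psi(t) = ct^2/2$ for all $t$. Since $|\varphi(t)| \leq 1$ for all $t$, we must have $c \leq 0$. Now, $\varphi(t) = e^{ct^2/2}$ for all $t$ implies that $X \sim N(0, -c)$ and we are done.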
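The key fact in this argument is that the derivative of the log characteristic function, divided by its argument, must be constant. The sketch below is an added numerical illustration, not part of the paper. It estimates that ratio by central differences for a normal characteristic function, where it is constant, and for a standard Laplace characteristic function, where it is not. The function names, the variance $2.5$, and the step size are assumptions made for the example.

```python
import numpy as np

def psi_prime_over_t(phi, t, h=1e-5):
    """Central-difference estimate of (log phi)'(t) / t, for t != 0 and phi > 0."""
    return (np.log(phi(t + h)) - np.log(phi(t - h))) / (2.0 * h * t)

def normal_cf(t, variance=2.5):
    """Characteristic function of N(0, variance)."""
    return np.exp(-0.5 * variance * t ** 2)

def laplace_cf(t):
    """Characteristic function of the standard Laplace distribution."""
    return 1.0 / (1.0 + t ** 2)

points = np.array([0.5, 1.0, 2.0])
normal_ratios = psi_prime_over_t(normal_cf, points)    # constant, equal to -variance
laplace_ratios = psi_prime_over_t(laplace_cf, points)  # varies with t
print(normal_ratios)
print(laplace_ratios)
```

For the normal case the constant is minus the variance, matching the identification of the limit law at the end of the argument.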
If, further, we assume that $E X^2 < \infty$, the proof turns out to be surprisingly short, and is given below.
Lemma 4.1.
Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables, satisfying that $(X_1, \ldots, X_n)$ has a spherically symmetric distribution for all $n \geq 1$. For each $n \geq 1$, denote the partial sum $X_1 + \cdots + X_n$ by $S_n$. Then, $S_n/\sqrt{n}$ $(n \geq 1)$ are identically distributed as $X_1$.
Proof.
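For each $n \geq 1$, let $\mathbf{Q}_n$ denote an orthogonal $n \times n$ matrix whose first row is $(1/\sqrt{n}, \ldots, 1/\sqrt{n})$ and let $\mathbf{X}^{(n)} = (X_1, \ldots, X_n)^\top$. Since $\mathbf{X}^{(n)}$ and $\mathbf{Q}_n\mathbf{X}^{(n)}$ have the same distribution, their first entries, namely $X_1$ and $S_n/\sqrt{n}$, have the same distribution. ∎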
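The proof needs an orthogonal matrix with a constant first row. One standard way to obtain such a matrix is a Householder reflection that maps the first basis vector to the constant unit vector. The sketch below uses that construction as an assumption of this illustration; the paper does not specify how the matrix is built, and the function name is hypothetical.

```python
import numpy as np

def orthogonal_with_constant_first_row(n):
    """Return an n x n orthogonal matrix whose first row is (1/sqrt(n), ..., 1/sqrt(n))."""
    if n == 1:
        return np.ones((1, 1))
    u = np.full(n, 1.0 / np.sqrt(n))
    v = -u.copy()
    v[0] += 1.0  # v = e_1 - u
    # Householder reflection: symmetric, orthogonal, and maps e_1 to u,
    # so its first column (and hence first row) equals u.
    return np.eye(n) - 2.0 * np.outer(v, v) / np.dot(v, v)

n = 5
Q = orthogonal_with_constant_first_row(n)
x = np.arange(n, dtype=float)
# The first entry of Q x is the partial sum divided by sqrt(n).
print(np.isclose((Q @ x)[0], x.sum() / np.sqrt(n)))
```

Any other orthogonal completion of the first row would serve equally well, since only the first entry of the rotated vector is used.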
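Now, consider proving Theorem 3.2 under the assumption $E X^2 < \infty$. The case $E X^2 = 0$ is trivial, so assume that $E X^2 > 0$. As in the proof of Theorem 3.2, generate a sequence $X_1, X_2, \ldots$ of independent random variables from the common distribution of $X$ and $Y$, and for each $n \geq 1$, let $S_n = X_1 + \cdots + X_n$. By Theorem 2.3 and Lemma 4.1, $S_n/\sqrt{n} \stackrel{d}{=} X_1$ for all $n \geq 1$. By Theorem 2.1, $E X_1 = 0$. Hence, by the Central Limit Theorem, $S_n/\sqrt{n} \xrightarrow{d} N(0, E X_1^2)$. So, $X_1 \sim N(0, E X_1^2)$.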
Remark 2.
If $X_1, X_2, \ldots$ is an i.i.d. sequence of random variables with partial sums $S_n = X_1 + \cdots + X_n$, and if $S_n/\sqrt{n}$ converges in distribution to a limit, then $E X_1^2 < \infty$ (see the exercises in [3]). The finiteness of $E X^2$ is now an immediate consequence of this fact and Lemma 4.1, which in turn gives a second proof of Theorem 3.2.
5 Conclusion
Theorem 3.2 appears in [2] (Theorem ) and an early proof of it appears in [1]. The treatment in [1] is however not very rigorous on probabilistic grounds. Corollary 10 of [4] is a consequence of Theorem 3.2.
Our first proof of Theorem 3.2 can be divided into two broad ideas. The first idea is to derive the spherical symmetry property of any number of independent observations from a distribution based on the knowledge of the spherical symmetry of two independent observations from that distribution. The second idea is to use the outcome of the first idea along with the Strong Law of Large Numbers, to conclude the result. In the process, the unit norm spherical symmetry characterization of the uniform distribution on the surface of an $n$-dimensional sphere was crucially used. The main advantage of this proof is that it is free of any calculation trickery, and is purely conceptual.
It is possible to give a more “direct” proof of Theorem 3.2 by solving the functional equation $\varphi(s)\varphi(t) = \varphi\big(\sqrt{s^2 + t^2}\big)$ for all $s, t \in \mathbb{R}$, for a general characteristic function $\varphi$. However, this approach relies strongly on the independence assumption of the random variables, and cannot, for example, be used to prove Theorem 3.3.
Acknowledgement
The author thanks Professor J. Michael Steele for one of his assignment problems in the STAT 930 course offered by the University of Pennsylvania, which was the source of the idea behind the proof of Theorem 3.2.
References
[1] Bartlett, M.S. (1934), The vector representation of a sample, Math. Proc. Cambr. Phil. Soc. 30, pp. 327-340.
[2] Bryc, W. (2005), Normal Distribution characterizations with applications, Lecture Notes in Statistics 1995, Vol 100.
[3] Durrett, R. (2013), Probability: Theory and Examples, Cambridge University Press.
[4] Meckes, E.S. & Meckes, M.W. (2007), The Central Limit Problem For Random Vectors With Symmetries, J Theoret. Probab. 20, pp. 697-720.
