Lower cone distribution functions and set-valued quantiles form Galois connections
Cagin Ararat, Andreas H Hamel

TL;DR
This paper demonstrates that lower cone distribution functions and set-valued multivariate quantiles form a Galois connection, generalizing univariate results and characterizing capacity functionals of random set extensions.
Contribution
It introduces a Galois connection between convex sets and the interval [0,1] using these functions, extending univariate theory and linking to capacity functionals.
Findings
Establishes a Galois connection between convex sets and [0,1].
Generalizes univariate distribution function results.
Characterizes capacity functionals of random set extensions.
Abstract
It is shown that the recently introduced lower cone distribution function and the associated set-valued multivariate quantile generate a Galois connection between a complete lattice of closed convex sets and the interval [0,1]. This generalizes the (not so well-known) corresponding univariate result. It is also shown that an extension of the lower cone distribution function and the set-valued quantile characterize the capacity functional of a random set extension of the original multivariate variable along with its distribution.
Lower cone distribution functions and set-valued quantiles form Galois connections
In honour of Y. Kabanov on the occasion of his 70th birthday.
Çağın Ararat (Bilkent University, Department of Industrial Engineering, Ankara, Turkey, [email protected]@bilkent.edu.tr)
Andreas H. Hamel (Free University of Bozen, Faculty of Economics and Management, Bozen-Bolzano, Italy, [email protected]@unibz.it)
Abstract
It is shown that the recently introduced lower cone distribution function and the associated set-valued multivariate quantile generate a Galois connection between a complete lattice of closed convex sets and the interval [0,1]. This generalizes the corresponding univariate result. It is also shown that an extension of the lower cone distribution function and the set-valued quantile characterize the capacity functional of a random set extension of the original multivariate variable along with its distribution.
Keywords Galois connection, multivariate quantile, complete lattice, lower cone distribution function, random set
Mathematics Subject Classification 60A05, 62H05
1 Introduction
Several features of set-valued quantiles for multivariate random variables as introduced in [9] are investigated and extended. In particular, the lower cone distribution function from [9] is extended to a function on sets, and it is shown that this extension together with the set-valued quantile forms a Galois connection between $[0,1]$ and a complete lattice of closed convex sets ordered by $\supseteq$. In the univariate case, a similar result is known (see [4, Remark 3.1]), but it is apparently not widely known under this label. For example, in the recent work [6], the property constituting the Galois connection is stated (formula (2), p. 5), but the Galois connection is neither mentioned nor exploited.
Our approach turns two downsides of previous proposals for multivariate quantiles into upsides. First, by using the cone distribution function (instead of the joint distribution function even if the cone is $\mathbb{R}^d_+$), an arbitrary vector order can be dealt with, thus ‘the absence of a natural ordering of Euclidean spaces of dimension greater than one’ ([13, p. 214] where ‘natural’ apparently has to be understood as ‘total’ in order theoretic terms) is turned into a huge potential for applications in statistical and financial analysis where such an order relation very often is present by default, e.g., generated by a solvency cone. Secondly, the fact that an inverse of a monotone function usually ‘defines only a correspondence, that is, a multi-valued or set-valued mapping’ ([6, p. 5]) is exploited by understanding quantiles as functions mapping into complete lattices of sets which carry a rich (order) structure. It is shown that certain lattices, e.g., generated by the closure operators of the respective Galois connection, characterize features of the underlying random vector.
Moreover, it is shown that the set-valued quantiles characterize the distribution of a random set extension of the original random variable, thus the three objects “distribution of the random set,” “(extended) cone distribution function of the random vector,” and “lattice-valued quantile function” carry the same information. This is very much parallel to the univariate case (compare, for instance, [6, formula (4), p. 5]). The ordering cone enters the definition of the lattice of sets in which the random set extension of the original multivariate random variable takes its values.
2 Set-up
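The framework and notation of [9] and, when it comes to concepts from set-valued convex analysis, [8] are used. In particular, we consider a vector preorder $\leq_C$ on $\mathbb{R}^d$ which is generated by a nonempty closed convex cone $C \subseteq \mathbb{R}^d$ by means of
$$y \leq_C z \quad\Longleftrightarrow\quad z - y \in C$$
for $y, z \in \mathbb{R}^d$; $C$ is neither assumed to have a non-empty interior, nor be pointed, i.e., $C \cap (-C) = \{0\}$ is not assumed. Thus, the cases $C = \{0\}$ and $C = H^+(w) := \{z \in \mathbb{R}^d \mid w^\top z \geq 0\}$ for $w \in \mathbb{R}^d \setminus \{0\}$ are not excluded. In the latter case, $C$ is a (homogeneous) halfspace and $\leq_C$ a total preorder. The (positive) dual of the cone $C$ is
$$C^+ = \{w \in \mathbb{R}^d \mid \forall z \in C \colon w^\top z \geq 0\}.$$
The bipolar theorem yields $C = C^{++}$ under the given assumptions. The set
$$\mathcal{G}(\mathbb{R}^d, C) = \{D \subseteq \mathbb{R}^d \mid D = \operatorname{cl}\operatorname{co}(D + C)\}$$
comprises the closed convex subsets of $\mathbb{R}^d$ which are stable under addition of $C$; the sum is understood in the Minkowski sense with $D + \emptyset = \emptyset + D = \emptyset$ for all $D \subseteq \mathbb{R}^d$. The pair $(\mathcal{G}(\mathbb{R}^d, C), \supseteq)$ is an order complete lattice with the following formulas for inf and sup (see, for example, [8]) for sets $\mathcal{A} \subseteq \mathcal{G}(\mathbb{R}^d, C)$:
$$\inf \mathcal{A} = \operatorname{cl}\operatorname{co}\bigcup_{D \in \mathcal{A}} D, \qquad \sup \mathcal{A} = \bigcap_{D \in \mathcal{A}} D.$$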
3 Lower cone distribution functions and quantiles
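Let $(\Omega, \mathcal{F}, P)$ be a probability space and $L^0_d$ the space of all equivalence classes of random variables with values in $\mathbb{R}^d$. For $X \in L^0_d$, $w \in C^+ \setminus \{0\}$, the function $F_{X,w} \colon \mathbb{R}^d \to [0,1]$ defined by $F_{X,w}(z) = P(w^\top X \leq w^\top z)$ is called the $w$-distribution function of $X$. If $d = 1$, $C = \mathbb{R}_+$ and $w = 1$, then $F_{X,w}$ is the usual cumulative distribution function (cdf) of the univariate random variable $X$. The function $F_{X,C} \colon \mathbb{R}^d \to [0,1]$ defined by
$$F_{X,C}(z) = \inf_{w \in C^+ \setminus \{0\}} F_{X,w}(z)$$
is called the lower $C$-distribution function associated to $X$.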
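If $w \in C^+ \setminus \{0\}$ and $p \in [0,1]$, the set $Q^-_{X,w}(p) = \{z \in \mathbb{R}^d \mid F_{X,w}(z) \geq p\}$ is called the lower $w$-quantile, and the set
$$Q^-_{X,C}(p) = \bigcap_{w \in C^+ \setminus \{0\}} Q^-_{X,w}(p) \tag{1}$$
is called the lower $C$-quantile of $X$. Clearly, $Q^-_{X,C}(p) = \{z \in \mathbb{R}^d \mid F_{X,C}(z) \geq p\}$ for all $p \in [0,1]$.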
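The two definitions above can be tried out on data. The following is a minimal sketch, not the authors' implementation, under these assumptions: the dimension is 2, the probability is replaced by empirical frequencies of a four-point sample, the $w$-distribution function is read as the share of sample points $x$ with $w^\top x \leq w^\top z$, and the infimum over dual-cone directions is approximated by a finite grid of unit vectors. All function names are hypothetical.

```python
import math

def w_cdf(sample, w, z, tol=1e-12):
    """Empirical w-distribution function: share of sample points x with w.x <= w.z."""
    wz = w[0] * z[0] + w[1] * z[1]
    return sum(1 for x in sample if w[0] * x[0] + w[1] * x[1] <= wz + tol) / len(sample)

def unit_directions(angle_lo, angle_hi, n):
    """n unit vectors with polar angles evenly spaced on [angle_lo, angle_hi]."""
    step = (angle_hi - angle_lo) / (n - 1)
    return [(math.cos(angle_lo + k * step), math.sin(angle_lo + k * step)) for k in range(n)]

def lower_cone_cdf(sample, dirs, z):
    """Minimum of the w-distribution functions over the finite direction grid."""
    return min(w_cdf(sample, w, z) for w in dirs)

def lower_quantile(sample, dirs, p, candidates):
    """Candidate points whose lower cone cdf is at least p (a finite slice of the quantile set)."""
    return [z for z in candidates if lower_cone_cdf(sample, dirs, z) >= p]

sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Assumption: C is the nonnegative orthant, whose dual cone is again the orthant.
orthant_dual = unit_directions(0.0, math.pi / 2, 181)
# Assumption: C = {0}, so every direction belongs to the dual cone (Tukey depth case).
every_direction = unit_directions(0.0, 2 * math.pi, 361)

print(lower_cone_cdf(sample, orthant_dual, (1.0, 1.0)))
print(lower_cone_cdf(sample, every_direction, (1.0, 1.0)))
print(lower_quantile(sample, orthant_dual, 0.5, sample))
```

With the orthant as ordering cone, the largest sample point receives the value 1, as a cdf would give; with all directions admitted, the same point only receives 0.25, which is the depth-type behaviour. The direction grid is a numerical shortcut, so for general samples the result is an upper approximation of the true infimum.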
If $C = \{0\}$ and hence $C^+ = \mathbb{R}^d$, then $F_{X,C}$ is the Tukey depth function and $Q^-_{X,C}(p)$ is the corresponding depth region. In this case, it might happen that $F_{X,C}(z) \leq \frac{1}{2}$ (for continuous distributions, for example) for all $z \in \mathbb{R}^d$ which also shows that Tukey’s depth function in the case $d = 1$ is not a generalization of the univariate cdf which requires $C = \mathbb{R}_+$ and has values in $[0,1]$, in general.
A few elementary properties are collected in the following result.
Proposition 1.
(a) For each and , the set is a closed halfspace or empty or .
(b) For each , the set is closed, convex and satisfies .
(c) One has whenever . Moreover,
[TABLE]
(d) The function is quasiconcave, upper semicontinuous and monotone nondecreasing with respect to .
Proof. (a) This is a consequence of the monotonicity and the upper semicontinuity of as the cumulative distribution function of the univariate random variable .
(b) This follows from (a) since is the intersection of the closed halfspaces for (possibly empty or ).
(c) The sets are nested by definition. Hence holds in (2). For the contrary, assume . If is not an element of the right hand side of (2), then there is such that . This would imply , a contradiction.
(d) The first two properties follow since a function with convex and closed upper level sets is quasiconcave and upper semicontinuous while monotonicity is a straightforward consequence of the definitions of , , .
Proposition 1 yields which means that can be seen as a function mapping from into the complete lattice . Therefore, (2) can be written as where the supremum is understood in . In this sense, is left-continuous. To summarize, the quantile function is the non-decreasing, left-continuous -valued inverse of the lower -distribution function (in the sense of, e.g., [5, Definition 1]). This provides a complete analog to the univariate case. The left-continuity of yields , and this set can be non-empty. Since is the obvious choice, is well-defined on by (1). Even for the univariate case, it has been observed that ‘leaving out the probabilities 0 and 1 is artificial’ ([4, Remark 3.1]).
Proposition 1 (d) can be considered as an extension of [12, Proposition 1] which works for the Tukey depth function (but even for an arbitrary positive measure).
The next result, stated for notational convenience, prepares a continuity result for .
Lemma 1.
Let be convergent sequences in with limits , respectively. If , then,
[TABLE]
Proof. Suppose that so that . Let . There exists such that and for every . In particular, so that for every . Hence, The case can be treated by similar arguments.
Remark 1.
The condition in Lemma 1 cannot be omitted. As a counterexample, let and for every . Note that and for every . Hence, .
Proposition 2.
If the distribution of under is such that for each and each , then is continuous. In particular, is continuous whenever is a continuous random vector.
Proof. Let , where is the unit sphere in . Note that is a base for in the sense that every can be written in the form for some unique and unique , and we have for every . It follows that
[TABLE]
for every . Moreover is a compact set.
By Proposition 1 (d), it suffices to show that is lower semicontinuous. We fix and show that the lower level set is closed. To that end, let be a convergent sequence in with limit . Let . By (3), for every , there exists such that . Since is a sequence in the compact set , by the Bolzano-Weierstrass theorem, it has a convergent subsequence, say, with limit . Hence, , and applying Lemma 1 gives
[TABLE]
for every such that . By assumption, one has . Hence,
[TABLE]
Therefore, by the dominated convergence theorem,
[TABLE]
Hence, . Since is arbitrary, we conclude that . So . Hence, is a closed set.
There is an alternative way of writing : the result in Theorem 1 below is inspired by [12, Propositions 2 & 6]. The proof is prepared by the following lemma which should be known and is implicitly part of the proof of [16, Theorem 2.11].
Lemma 2.
Let . For every and every with there exists such that .
Proof. Let us fix and with and take with , which exists since . Then, for all . Let us define for each . Then, for every ,
[TABLE]
so that and . Since is right-continuous, it follows that
[TABLE]
so there is with
[TABLE]
Hence,
[TABLE]
which proves the claim with .
Theorem 1.
For all ,
[TABLE]
Proof. The two expressions on the right hand side clearly coincide since .
First, assume that . Then, there exist , such that and . It follows that which implies , so
[TABLE]
hence .
Therefore, .
Conversely, assume that
[TABLE]
Then, there exists such that . Lemma 2 yields the existence of satisfying . If
[TABLE]
would be true, then also and
[TABLE]
which is a contradiction. So, . This shows .
Remark 2.
Assume that there is such that for all . Then, the set is a base of (every can be represented uniquely as with , ). If this is the case, then intersections such as in formula (1) and the one in Theorem 1 need to run only over instead of since for all and .
If , , then is such a base, and the formula in Theorem 1 breaks down to
[TABLE]
while (1) becomes which re-produces well-known formulas for the univariate lower quantile, see [7, p. 207].
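For the univariate case mentioned here, the Galois property can be checked numerically. The sketch below assumes the standard lower quantile $q^-(p) = \inf\{z \mid F(z) \geq p\}$ with $F$ an empirical cdf, and tests the equivalence $p \leq F(z) \iff q^-(p) \leq z$ on a finite grid. The sample, the grids, and the function names are illustrative assumptions; exact fractions are used to avoid rounding effects in the comparisons.

```python
from fractions import Fraction

def ecdf(sample, z):
    """Empirical cdf F(z) = share of observations <= z, as an exact fraction."""
    return Fraction(sum(1 for x in sample if x <= z), len(sample))

def lower_quantile(sample, p):
    """q^-(p) = inf{z : F(z) >= p}; this is -infinity for p <= 0."""
    if p <= 0:
        return float("-inf")
    for x in sorted(sample):
        if ecdf(sample, x) >= p:
            return x
    raise ValueError("p must not exceed 1")

def galois_property_holds(sample, levels, points):
    """Check p <= F(z)  <=>  q^-(p) <= z for all grid levels p and grid points z."""
    return all((p <= ecdf(sample, z)) == (lower_quantile(sample, p) <= z)
               for p in levels for z in points)

sample = [3, 1, 4, 1, 5]
levels = [Fraction(k, 20) for k in range(21)]   # p = 0, 1/20, ..., 1
points = [Fraction(k, 2) for k in range(13)]    # z = 0, 1/2, ..., 6

print(lower_quantile(sample, Fraction(1, 2)))
print(galois_property_holds(sample, levels, points))
```

A grid check of this kind is only a sanity test, not a proof; it does make visible that the equivalence also holds at the boundary levels 0 and 1.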
4 Galois connections
Let denote the power set of (including ) and be an extended real-valued function. The function defined by
[TABLE]
is called the inf-extension of , where is understood.
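As a small illustration of the inf-extension, the sketch below evaluates it on finite sets only, which is a simplification of the power-set setting in the text. It assumes the usual convention that the infimum over the empty set is $+\infty$; the function `f` and the sets `A`, `B` are made up for the example.

```python
import math

def inf_extension(f, D):
    """Inf-extension of f at a finite set D; the empty set yields +infinity."""
    return min((f(z) for z in D), default=math.inf)

# Hypothetical scalar function on points of the plane.
f = lambda z: z[0] + 2 * z[1]

A = {(0, 1), (2, 0)}
B = {(1, 0), (3, 3)}

print(inf_extension(f, A))
print(inf_extension(f, B))
print(inf_extension(f, A | B))
print(inf_extension(f, set()))
```

On finite sets the value on a union is the minimum of the values on the pieces, and enlarging a set can only lower the value. Both observations hold for any `f` under this convention.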
First, we apply this concept to . The following result collects a few properties of which are basically inherited from .
Proposition 3.
(a) is monotone: almost surely implies for every .
(b) is monotone: implies for every .
(c) for every .
Proof. (a) follows from the monotonicity of . (b) follows by the construction of . (c) follows from the monotonicity of .
The following proposition prepares a new feature.
Proposition 4.
For every ,
[TABLE]
Proof. By the definition of , is certainly true. Since is monotone with respect to (see Proposition 1 (d)),
[TABLE]
hence and therefore, . This gives .
Next, take and with such that . Set . The quasiconcavity of yields
[TABLE]
This proves .
Finally, take a sequence in which converges to some . Then, the upper semicontinuity of produces
[TABLE]
hence which gives “.”
As a result of Proposition 4, it is enough to consider the inf-extension as a function on rather than the whole power set , which is done in the sequel.
Corollary 1.
(a) For every , one has
[TABLE]
(b) For and , one has
[TABLE]
Proof. (a) Since for all , “” certainly is true. Take . Then, there is with , hence which in turn implies . Proposition 4 now produces .
(b) Setting , for , and observing as well as
[TABLE]
(see Proposition 4) one gets (5) as a special case of (4).
Equation (4) means that preserves infima (meets) as a function from to (see also [3, 7.31 & 2.26]). This property has been called “inf-stability” in [2].
Proposition 5.
(a) For every and ,
[TABLE]
(b) The two compositions and are closure operators (extensive, increasing and idempotent).
(c) The set
[TABLE]
is a complete lattice with respect to .
(d) One has
[TABLE]
Proof. (a) is straightforward and can be checked by using the definitions of and . (b) follows from the theory of Galois connections (see [3, Chapter 7]). (c) follows from the Knaster-Tarski theorem since is a complete lattice and is the set of fixed points of the composition . (d) also follows from the theory of Galois connections [3].
Proposition 5 (a) establishes the fact that and form a Galois connection between the two complete lattices and where is the upper adjoint and the lower adjoint. This means that and determine each other; they carry the same information.
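The adjunction can be explored on a finite toy model. The sketch below is an assumption-laden simplification: it replaces the lattice of closed convex sets by all subsets of a four-element ground set (so convexity and closedness play no role), uses a made-up function `F` in place of the lower cone distribution function, and reads the Galois property as "$p$ is at most the inf-extension at $D$ if and only if $D$ is contained in the quantile at level $p$". It then lists the fixed points of the set-side composition.

```python
from fractions import Fraction
from itertools import combinations
import math

# Hypothetical values of a distribution-type function on four labelled points.
F = {"a": Fraction(1, 4), "b": Fraction(1, 2), "c": Fraction(1, 2), "d": Fraction(1)}

def quantile(p):
    """Upper level set of F at level p."""
    return frozenset(z for z, value in F.items() if value >= p)

def inf_extension(D):
    """Infimum of F over D, with +infinity for the empty set."""
    return min((F[z] for z in D), default=math.inf)

subsets = [frozenset(c) for r in range(len(F) + 1) for c in combinations(F, r)]
levels = [Fraction(k, 8) for k in range(9)]

# Adjunction: p <= inf_extension(D)  <=>  D is a subset of quantile(p).
adjunction_holds = all((p <= inf_extension(D)) == (D <= quantile(p))
                       for D in subsets for p in levels)

# The composition D -> quantile(inf_extension(D)) contains D and is idempotent.
closure = {D: quantile(inf_extension(D)) for D in subsets}
is_extensive = all(D <= closure[D] for D in subsets)
is_idempotent = all(closure[closure[D]] == closure[D] for D in subsets)
closed_sets = {D for D in subsets if closure[D] == D}

print(adjunction_holds, is_extensive, is_idempotent)
print(sorted(sorted(D) for D in closed_sets))
```

In this toy model the fixed points are exactly the distinct level sets of `F` together with the empty set, and they form a chain under inclusion.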
Remark 3.
The complete lattice can be generated in a different, but related way. Using the notation of [2], we set and define the -closure of by
[TABLE]
Now, one has
[TABLE]
Therefore, the set of all fixed points of the closure operator coincides with the complete lattice generated by the singleton via . The relation defined by
[TABLE]
is a total order which is extended to by
[TABLE]
The function can be understood as a ranking function for multivariate data points in . Such ranking functions are used in statistics, e.g., for outlier detection (see [14]), and also for decision making (see [10]). The function gives a corresponding ranking for subsets of .
A different (non-total) order relation can be constructed using the -distribution functions with instead of the lower -distribution function . We consider the family which induces the relation
[TABLE]
on which is non-total in general, as well as the set relation
[TABLE]
on . Since the infimum over can be taken first on the left hand side of the two scalar inequalities above and then on the right hand side, the relations and above turn out to be extensions of and , respectively: implies and implies .
Define the -closure of by
[TABLE]
It follows from [2, Proposition 2.2] that is a closure operator which generates the complete lattice
[TABLE]
with the relation . The next result characterizes the case where coincides with the identity operator, that is, .
Theorem 2.
The following are equivalent:
(a) For every , the cumulative distribution function is strictly increasing.
(b) For every , .
Proof. Suppose that (a) holds and let . Clearly, . To show the reverse inclusion, let . Fix . We have . By (5) one has
[TABLE]
where is understood. Hence, the strict monotonicity of implies
[TABLE]
Since this holds for every , one may conclude that . Hence, .
Conversely, suppose that (b) holds. To get a contradiction, assume that there exists such that is not strictly increasing. Hence, there exist such that and . It is clear that one can find with and . Let us define
[TABLE]
Clearly, and . We claim that . First, note that
[TABLE]
for each . Hence, if for , then
[TABLE]
On the other hand, if with for every , then
[TABLE]
Hence, the claim follows. However,
[TABLE]
which shows that . Since , we get a contradiction to (b). Hence, (a) holds.
Note that the condition (a) in Theorem 2 requires, for each , the continuous part of to be strictly increasing although may have jumps.
Remark 4.
It is easy to check that for each . Under the conditions of Theorem 2, we have . In general, may be a (much) smaller set than . However, if for some , then is the ray generated by so that for every . In this case, is also a total order and coincides with .
5 The simulation result
The main question in this section is how the quantile function characterizes the distribution. In the univariate case, one can show that the quantile, taken at a random variable uniformly distributed over , produces a random variable which has the cumulative distribution function that defines the quantile (compare, for example, [7, Lemma A.19], the “simulation lemma”). In our setting, quantiles are sets, so plugging in a random variable with values in produces a random set.
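The univariate simulation lemma referred to above can be illustrated by a short Monte Carlo experiment. This is a sketch under assumptions: a made-up discrete distribution on the values 0, 1, 2, the lower quantile taken as the smallest value whose cdf reaches the level, and a fixed seed for reproducibility.

```python
import random
from collections import Counter

# Hypothetical discrete distribution.
values = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

def cdf(z):
    """Cumulative distribution function of the discrete distribution."""
    return sum(p for v, p in zip(values, probs) if v <= z)

def lower_quantile(p):
    """Smallest value whose cdf reaches p."""
    for v in values:
        if cdf(v) >= p:
            return v
    return values[-1]  # guards against floating-point round-off in cdf(values[-1])

rng = random.Random(0)
n = 50_000
# Plug a standard uniform draw into the quantile function, n times.
counts = Counter(lower_quantile(rng.random()) for _ in range(n))
frequencies = {v: counts[v] / n for v in values}

print(frequencies)
```

The observed frequencies should be close to the probabilities 0.2, 0.5 and 0.3 that define the cdf, which is the content of the lemma in this simple case.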
Let be a standard uniform random variable and define a function by
[TABLE]
for every . To be able to talk about the distribution of under , we first view as a measurable function by equipping with the $\sigma$-algebra constructed below.
For each , let
[TABLE]
and
[TABLE]
which is the complement of in . Let us denote by the set of all compact subsets of . Note that the collection is a $\pi$-system on since and for every . Let be the $\sigma$-algebra generated by , called the Borel $\sigma$-algebra on ; the reader is referred to [11, Section 1.1] for a detailed discussion. Clearly, is also generated by .
We shall establish the measurability of with respect to .
Lemma 3.
The function is measurable with respect to and .
Proof. Let . Note that
[TABLE]
where the fourth equality and the well-definedness of the maximum follow from the upper semicontinuity of in Proposition 1(d), and the last equality follows since is measurable with respect to the Borel $\sigma$-algebra on and . Hence, by [1, Proposition I.2.3], it follows that for every , that is, is measurable.
Thanks to Lemma 3, is a random variable taking values in . Hence, its distribution under is the probability measure on defined by
[TABLE]
for every . Since is a $\pi$-system which generates the $\sigma$-algebra , the distribution of is determined by its values on this $\pi$-system; see [1, Proposition I.3.7], for instance. Since for every , the distribution of is also determined by the so-called capacity functional defined by
[TABLE]
for each .
Proposition 6.
The lower -distribution function and the capacity functional of the set-valued random variable determine each other.
Proof. Let . Following the calculation in the proof of Lemma 3, we have
[TABLE]
since has the standard uniform distribution. Hence, determines .
Conversely, let . The above calculation yields . Hence, determines .
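The relation between the distribution function and the capacity functional can be sketched numerically. The reading used here is an assumption based on the proof above: the random set is the quantile evaluated at a standard uniform variable, a point belongs to it exactly when the uniform draw does not exceed the distribution function value at that point, and hence the probability of hitting a finite set should equal the largest distribution function value on that set. The values in `F` are made up and the names are hypothetical.

```python
import random

# Hypothetical values of a lower cone distribution function on four points.
F = {(0, 0): 0.25, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 1.0}

def random_quantile_set(u):
    """Quantile set at level u, restricted to the finite ground set."""
    return {z for z, value in F.items() if value >= u}

def capacity_estimate(K, n=50_000, seed=0):
    """Monte Carlo estimate of the probability that the random set hits K."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if random_quantile_set(rng.random()) & K)
    return hits / n

singleton = {(0, 0)}
pair = {(0, 0), (1, 0)}

print(capacity_estimate(singleton), F[(0, 0)])
print(capacity_estimate(pair), max(F[z] for z in pair))
```

Evaluating the estimate on singletons recovers the distribution function values up to sampling error, which mirrors the direction of the proof in which the capacity functional determines the distribution function.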
Proposition 6 together with (6), (7) implies that the lower -quantile , the lower -distribution function , the inf-extension , the capacity functional , and the distribution determine each other.
References
- [1] Çınlar E, Probability and Stochastics. Springer Science and Business Media, 2011
- [2] Crespi G, Hamel AH, Rocca M, Schrage C, Set relations and approximate solutions in set optimization, arXiv:1812.03300, 2018
- [3] Davey BA, Priestley HA, Introduction to Lattices and Order. Cambridge University Press, 2nd edition, 2002
- [4] Doering A, Dewitt B, Self-adjoint operators as functions II: quantum probability, arXiv:1210.574v2, 2012 (2nd version Dec. 2013)
- [5] Drapeau S, Hamel AH, Kupper M, Complete duality for quasiconvex and convex set-valued functions, Set-Valued and Variational Analysis, 24(2), 253-275, 2016
- [6] Faugeras OP, Rüschendorf L, Markov morphisms: a combined copula and mass transportation approach to multivariate quantiles, Mathematica Applicanda, 45(1), 21-63, 2017
- [7] Föllmer H, Schied A, Stochastic Finance: An Introduction in Discrete Time. Walter de Gruyter, 3rd edition, 2011
- [8] Hamel AH, Heyde F, Löhne A, Rudloff B, Schrage C, Set optimization–a rather short introduction. In: Hamel, A.H., Heyde, F., Löhne, A., Rudloff, B., Schrage, C. (eds.), Set optimization and applications – the state of the art. From set relations to set-valued risk measures, Springer-Verlag Berlin 2015, pp. 65-141
