The Empirical Content of Binary Choice Models
Debopam Bhattacharya

TL;DR
This paper establishes nonparametric shape-restrictions for binary choice models based on economic rationality, enabling credible demand and welfare predictions without arbitrary assumptions.
Contribution
It introduces global, closed-form shape-restrictions for binary choice probabilities derived from utility maximization, applicable across various models and data sets.
Findings
Shape-restrictions are equivalent to Slutsky-like conditions under heterogeneity.
Restrictions are global, not dependent on observed budget-sets.
Provide simple bounds for demand and welfare predictions.
Keywords: Binary choice, general heterogeneity, income effect, utility maximization, integrability/rationalizability, Slutsky inequality, shape-restrictions. JEL Codes: C14, C25, D12.
University of Cambridge. The author would like to thank the Editor, three anonymous referees, Michael Floater, Arthur Lewbel, Oliver Linton and seminar participants at several institutions for helpful feedback. Financial support from the European Research Council via a Consolidator Grant EDWEL, Project number 681565 is gratefully acknowledged.
(September 17, 2020)
Abstract
An important goal of empirical demand analysis is choice and welfare prediction on counterfactual budget sets arising from potential policy-interventions. Such predictions are more credible when made without arbitrary functional-form/distributional assumptions, and instead based solely on economic rationality, i.e. that choice is consistent with utility maximization by a heterogeneous population. This paper investigates nonparametric economic rationality in the empirically important context of binary choice. We show that under general unobserved heterogeneity, economic rationality is equivalent to a pair of Slutsky-like shape-restrictions on choice-probability functions. The forms of these restrictions differ from Slutsky-inequalities for continuous goods. Unlike McFadden-Richter’s stochastic revealed preference, our shape-restrictions (a) are global, i.e. their forms do not depend on which and how many budget-sets are observed, (b) are closed-form, hence easy to impose on parametric/semi/non-parametric models in practical applications, and (c) provide computationally simple, theory-consistent bounds on demand and welfare predictions on counterfactual budget-sets.
1 Introduction
Many important economic decisions faced by individuals are binary in nature, including labour force participation, retirement, college enrolment, adoption of a new technology or health product, participation in a job-training program, etc. This paper concerns nonparametric analysis of binary choice under general unobserved heterogeneity and income effects. The paper has two goals. The first is to understand, theoretically, what nonparametric restrictions utility maximization by heterogeneous consumers imposes upon choice-probabilities, i.e. whether there are analogs of Slutsky restrictions for binary choice under general unobserved heterogeneity and income effects, and conversely, whether these restrictions are also sufficient for observed choice-probabilities to be rationalizable. This issue is important for logical coherency between theory and empirics and for prediction of demand and welfare in situations involving counterfactual, i.e. previously unobserved, budget sets. It is important in these exercises to allow for general unobserved heterogeneity because economic theory typically does not restrict its dimension or distribution, and does not specify how it enters utility functions. To date, closed-form Slutsky conditions for rationalizability of demand under general heterogeneity were available only for continuous choice. The present paper, to our knowledge, is the first to establish them for the leading case of discrete demand, viz. binary choice.
The second goal of the present paper is a practical one. It is motivated by the fact that in empirical applications of binary choice that require the estimation of elasticities, welfare calculations and demand predictions, researchers typically use parsimonious functional forms for conditional choice probabilities. This is because fully nonparametric estimation is often hindered by the curse of dimensionality, the sensitivity of estimates to the choice of tuning parameters, and insufficient price variation, especially in consumer data from developed countries. The question therefore arises as to whether the economic theory of consumer behavior can inform the choice of such functional forms. Answering this question is our second objective.
Since McFadden 1973, discrete choice models of economic behavior have been studied extensively in the econometric literature, mostly under restrictive assumptions on utility functions and unobserved heterogeneity including, inter alia, quasi-linear preferences implying absence of income effects and/or parametrically specified heterogeneity distributions (c.f. Train 2009 for a textbook treatment). Matzkin (1992) investigated the nonparametric identification of binary choice models with additive heterogeneity, where both the distribution of unobserved heterogeneity and the functional form of utilities were left unspecified. More recently, Bhattacharya (2015, 2018) has shown that in discrete choice settings, welfare distributions resulting from price changes are nonparametrically point-identified from choice probabilities without any substantive restriction on preference heterogeneity, and even when preference distribution and heterogeneity dimension are not identified.
In the present paper, we consider a setting of binary choice by a population of budget-constrained consumers with general, unobserved heterogeneity, producing an individual-level cross-sectional dataset that records prices, individual income and the choice made by the individual. (Footnote 1: As a referee has correctly commented, income plays a prominent role in this paper, unlike many existing empirical applications which ignore the role of income.) In this setting, we develop a characterization of utility maximization which takes the form of simple, closed-form shape restrictions on choice probability functions in the population. These nonparametric shape-restrictions can be consistently tested in the usual asymptotic econometric sense and are extremely easy to impose on specifications of choice-probabilities – akin to testing or imposing monotonicity of regression functions. Most importantly, they lead to computationally simple bounds for theory-consistent demand and welfare predictions on counterfactual budget sets – an important goal of empirical demand analysis. Interestingly, our shape-restrictions differ in form from the well-known Slutsky inequalities for continuous goods.
The above results are developed in a fully nonparametric context; nonetheless, they can help guide applied researchers intending to use simple parametric or semiparametric models. As a specific example, consider the popular probit/logit type model for binary choice of whether to buy a product or not. A standard specification is that the probability of buying depends (implicitly conditioning on other observed covariates) on its price $p$ and the decision-maker’s income $y$, e.g. $q(p,y)=F(\gamma_0+\gamma_1 p+\gamma_2 y)$, where $F(\cdot)$ is a distribution function. We will show below that these choice-probabilities are consistent with utility maximization by a heterogeneous population of consumers if and only if $\gamma_1\le 0$ and $\gamma_1+\gamma_2\le 0$. While the first inequality simply means that demand falls with own price (holding income fixed), the second inequality is less obvious, and constitutes an important empirical characterization of utility maximization.
For the case of continuous goods, Lewbel 2001 explored the question of when average demand, generated from maximization of heterogeneous individual preferences, satisfies standard properties of non-stochastic demand functions. More recently, for the case of two continuous goods (i.e. a good of interest plus the numeraire) under general heterogeneity, Dette, Hoderlein and Neumeyer 2016 have shown that constrained utility maximization implies quantiles of demand satisfy standard Slutsky negativity, and Hausman and Newey 2016 have shown that the two are in fact equivalent. The analog of the two goods setting in discrete choice is the case of binary alternatives. Accordingly, our main result (Theorem 1 below) may be viewed as the discrete choice counterpart of Hausman and Newey 2016, Theorem 1. Note however that quantiles are degenerate for binary outcomes, and indeed, the forms of our Slutsky-like shape restrictions are completely different from Dette et al and Hausman-Newey’s quantile-based conditions for continuous choice.
An alternative, algorithmic – as opposed to closed-form and analytic – approach to rationalizability of demand is the “revealed stochastic preference” (SRP, henceforth) method, which applies to very general choice settings where a heterogeneous population of consumers faces a finite number of budget sets, c.f. McFadden and Richter 1990, McFadden 2005. When budget sets are numerous or continuously distributed, as in household surveys with many income and/or price values, SRP is well-known to be operationally prohibitive, c.f. Anderson et al 1992, Page 54-5 and Kitamura and Stoye 2016, Sec 3.3. Furthermore, the SRP conditions are difficult to impose on parametric specifications commonly used in practical applications, they change entirely in form upon addition of new budget sets, and are cumbersome to use for demand prediction on counterfactual budgets, especially in welfare calculations that typically require simultaneous prediction of demand on a continuous range of budget-sets. In contrast, our approach yields rationality conditions which (a) are global, in that they characterize choice probability functions, and their forms remain invariant to which and how many budget sets are observed in a dataset, and (b) are closed-form, analytic shape-restrictions, hence easy to impose, standard to test, and simple to use for the important practical problem of counterfactual predictions of demand and welfare. As such, these shape-restrictions establish the analogs of Slutsky conditions – the cornerstone of classical demand analysis – for binary choice under general unobserved heterogeneity and income effects.
2 The Result
Consider a population of heterogeneous individuals, each choosing whether or not to buy an indivisible good. Let $N$ represent the quantity of numeraire which an individual consumes in addition to the binary good. If the individual has income $Y=y$, and faces a price $P=p$ for the indivisible good, then the budget constraint is $N+pQ=y$, where $Q\in\{0,1\}$ represents the binary choice. Individuals derive satisfaction from both the indivisible good and the numeraire. Upon buying, an individual derives utility from the good but has a lower amount of numeraire $y-p$ left; upon not buying, she enjoys utility from her outside option and a higher quantity of numeraire $y$. There is unobserved heterogeneity across consumers which affects their choice, and so on each budget set defined by a price $p$ and consumer income $y$, there is a (structural) probability of buying, denoted by $q(p,y)$; that is, if each member of the entire population were offered income $y$ and price $p$, then a fraction $q(p,y)$ would buy the good. For now, we implicitly condition our analysis on observed covariates, and later show how to incorporate them into the results. We will show that these choice probabilities will be consistent with utility maximization by a heterogeneous population if and only if the following Slutsky-like conditions hold (Footnote 2: Our main result does not need smoothness; we write the conditions with derivatives here to show the Slutsky-like form of the result.):
$$\frac{\partial q(p,y)}{\partial p}\le 0,\qquad \frac{\partial q(p,y)}{\partial p}+\frac{\partial q(p,y)}{\partial y}\le 0. \tag{1}$$
For establishing this result, it will be convenient to rewrite the choice probabilities in an equivalent way as $\bar q(y,y-p)\equiv q(p,y)$. Indeed, one can go back and forth between the two specifications because $\bar q(a,b)\equiv q(a-b,a)$ and $q(c,d)\equiv\bar q(d,d-c)$. The $\bar q(\cdot,\cdot)$ formulation is motivated by the fact that given the budget set $(p,y)$, an individual faces choice between the bundles $(0,y)$ and $(1,y-p)$; thus $\bar q(y,y-p)$ is an equivalent representation of choice probabilities as functions of the income left over upon choosing options 0 and 1, respectively. For ease of exposition, we will state our results in terms of $\bar q(\cdot,\cdot)$, and show that under smoothness they reduce to restriction (1) on $q(\cdot,\cdot)$.
The following theorem establishes conditions that are necessary and sufficient for the conditional choice probability function to be generated from utility maximization by a heterogeneous population, where no a priori restriction is imposed on the dimension and functional form of the distribution of unobserved heterogeneity or on the functional form of utilities.
To formally state the theorem, we introduce some notation. Let denote the support of ; let denote the support of , and for any let . Corresponding to the support of , denote the support of by , as short-hand for .
Theorem 1
For binary choice under general heterogeneity, the following two statements are equivalent:
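(I) The structural choice probability function $\bar q(a,b)$ satisfies that (A) (i) $\bar q(\cdot,b)$ is non-increasing, and (ii) $\bar q(a,\cdot)$ is non-decreasing; (B) $\bar q(\cdot,b)$ is continuous; (C) corresponding to any fixed value $b$, there exist a small enough real number $a_L(b)$, satisfying $\lim_{a\downarrow a_L(b)}\bar q(a,b)=1$, and a large enough real number $a_H(b)$, satisfying $\lim_{a\uparrow a_H(b)}\bar q(a,b)=0$.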
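(II) There exists a pair of utility functions $W_0(\cdot,\eta)$ and $W_1(\cdot,\eta)$, where the first argument denotes the amount of numeraire, and $\eta$ denotes unobserved heterogeneity, and a distribution $G(\cdot)$ of $\eta$ such that
$$\bar q(a,b)=\int 1\{W_0(a,\eta)\le W_1(b,\eta)\}\,dG(\eta),$$
where (A’) for each fixed $\eta$, (i) $W_0(\cdot,\eta)$ is continuous and strictly increasing, and (ii) $W_1(\cdot,\eta)$ is non-decreasing; (B’) for any $(a,b)$, it holds that $\int 1\{W_0(a,\eta)=W_1(b,\eta)\}\,dG(\eta)=0$; (C’) corresponding to any fixed $b$, there exist a small enough real number $a_L(b)$ and a large enough real number $a_H(b)$, satisfying $\lim_{a\downarrow a_L(b)}\int 1\{W_0(a,\eta)\le W_1(b,\eta)\}\,dG(\eta)=1$ and $\lim_{a\uparrow a_H(b)}\int 1\{W_0(a,\eta)\le W_1(b,\eta)\}\,dG(\eta)=0$.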
Proof. In Appendix
The key step in the proof is showing that (I) implies (II). This is done by constructing the utility functions $W_0(a,\eta)\equiv a$ and $W_1(b,\eta)\equiv\bar q^{-1}(\eta,b)$, with $\bar q^{-1}(\cdot,b)$ denoting a suitably defined inverse of the function $\bar q(\cdot,b)$ with respect to its first argument, and the random variable $\eta\sim\text{Uniform}(0,1)$. Under conditions A, B, C of Theorem 1, this construction is then shown to imply that $\Pr[W_0(a,\eta)\le W_1(b,\eta)]=\bar q(a,b)$. The formal proof appears in the Appendix.
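The construction can be illustrated numerically. The sketch below assumes a hypothetical logit-type choice probability $\bar q(a,b)$, where $a$ is income left after option 0 and $b$ is income left after option 1, takes the utility of option 0 to be $a$ itself, takes the utility of option 1 to be the inverse of $\bar q$ in its first argument evaluated at a Uniform(0,1) draw, and checks by simulation that the share choosing option 1 reproduces $\bar q(a,b)$. The coefficients and function names are illustrative assumptions, not taken from the paper.

```python
import math
import random

# Hypothetical coefficients (assumptions for illustration only); they satisfy
# gamma_1 <= 0 and gamma_1 + gamma_2 < 0, so q_bar falls in a and rises in b.
G0, G1, G2 = 0.5, -1.0, 0.4


def q_bar(a, b):
    """Probability of option 1 given income a after option 0 and b after option 1."""
    index = G0 + (G1 + G2) * a - G1 * b
    return 1.0 / (1.0 + math.exp(-index))


def q_bar_inverse(v, b):
    """Solve q_bar(a, b) = v for a (inverse in the first argument)."""
    logit_v = math.log(v / (1.0 - v))
    return (logit_v - G0 + G1 * b) / (G1 + G2)


def simulated_choice_probability(a, b, draws=200_000, seed=0):
    """Share of simulated consumers for whom W0(a, eta) <= W1(b, eta)."""
    rng = random.Random(seed)
    buyers = 0
    for _ in range(draws):
        # Scalar heterogeneity, Uniform(0, 1); clamped away from 0 and 1
        # so that the logit transform is always defined.
        eta = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        w0 = a                        # utility of option 0: numeraire only
        w1 = q_bar_inverse(eta, b)    # utility of option 1: inverse of q_bar
        if w0 <= w1:
            buyers += 1
    return buyers / draws
```

Because $\bar q$ is decreasing in $a$, the event $a\le\bar q^{-1}(\eta,b)$ coincides with $\eta\le\bar q(a,b)$, which has probability $\bar q(a,b)$ under the uniform draw; the simulation only confirms this identity up to sampling error.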
Interpretation of conditions: Intuitively, conditions (A/A’) mean that having more numeraire ceteris paribus is (weakly) better for every consumer, i.e. preferences are increasing in the amount of income left over after any choice. Condition (B/B’) – the “no-tie” assumption – is standard in discrete choice models, and intuitively means that there is a continuum of tastes. Condition (C) adds to condition (A); it says that holding fixed the income left over upon choosing option 1, if the income left over upon choosing option 0 is, hypothetically, made small enough, then everyone, i.e. all $\eta$, will choose option 1. In particular, $\lim_{a\downarrow a_L(b)}\bar q(a,b)=1$ means that starting from a situation with $y-p=b$, we are lowering $p$ and $y$ by equal amounts, keeping $y-p$, i.e. the income left over upon choosing option 1, fixed at $b$ while $y$, the income left over upon choosing option 0, is lowered toward $a_L(b)$, i.e., $y\downarrow a_L(b)$. A symmetric interpretation applies to $a_H(b)$. The following examples illustrate Condition C.
Example 1 (High and Low Price): Suppose $\{0,1\}$ denote respectively not buying and buying a binary good. Suppose preferences are such that at any income $y$, if price takes a high enough value $p_H$, e.g. close to the highest income in the population, no one would buy the good; conversely, when price takes a low enough value $p_L$, e.g. the good is free ($p_L=0$) or there is a high enough reward for choosing option 1 (i.e. $p_L<0$) as in conditional cash transfer programs for school-attendance, everyone (i.e. all $\eta$) will choose option 1. Then starting from $(p,y)$, raising $p$ towards $p_H$ while simultaneously increasing $y$ by an equal amount, keeping $y-p$, the income left upon buying, fixed at $b$, we have that $\bar q(y,b)\to 0$; similarly, letting $p\downarrow p_L$ and $y\downarrow b+p_L$ while keeping fixed $y-p=b$, we have that $\bar q(y,b)\to 1$. Thus $a_H(b)=b+p_H$, and $a_L(b)=b+p_L$.
Example 2 (Labour supply): Suppose $\{0,1\}$ denote not working and working, respectively, $y$ is non-labour income (e.g. spousal earning or interest income from investment), and $p$ is the negative of net wage received upon working, so that $p<0$. Here it is natural to assume that if non-labour income is zero, then an individual must work at any positive net wage for subsistence, so that $\bar q(0,b)=1$, and thus $a_L(b)=0$ for any positive $b$. Similarly, if net wage is zero, then no one with positive non-labour income will work, i.e. $\bar q(b,b)=0$, and thus $a_H(b)=b$.
Remark 1
Conditions (C/C’), which simplify the proof of the Theorem, can be dropped. In the appendix, we provide an alternative version of the theorem without conditions (C/C’), but with a slightly stronger continuity requirement (B/B’) and a significantly longer proof.
Remark 2
Note that assumptions (A)-(C) place no restriction on income effects, including their sign.
In statement (II) in Theorem 1, the functions will correspond to the utility from choosing alternative and being left with a quantity of the numeraire, and with denoting unobserved heterogeneity. This notation allows for the case where different vectors of unobservables enter the two utilities, i.e. where the utilities are given by and , respectively, with ; simply set , , . In the proof of the above theorem, when showing (II) implies (I), will be allowed to have *any arbitrary and unknown *dimension and distribution; in showing (I) implies (II) we will construct a scalar heterogeneity distribution that will rationalize the choice probabilities (see further discussion on this point under the heading ”Observational Equivalence” in the next section).
3 Further Discussion
A. Slutsky Form: To see the analogy between the shape restrictions in Theorem 1 and the traditional Slutsky inequality constraints with smooth demand, rewrite the choice probability on a budget set $(p,y)$ in the standard form as a function of price and income, viz. $q(p,y)$, i.e., $q(p,y)\equiv\bar q(y,y-p)$. Then, under continuous differentiability, the shape restrictions (A) from Theorem 1 are equivalent to
$$\frac{\partial q(p,y)}{\partial p}\le 0, \tag{2}$$
$$\frac{\partial q(p,y)}{\partial p}+\frac{\partial q(p,y)}{\partial y}\le 0, \tag{3}$$
for all $(p,y)$. (Footnote 3: I am grateful to a referee for suggesting this way of showing the equivalence.) The forms of these inequalities are distinct from textbook Slutsky conditions for nonstochastic demand $q^{*}(p,y)$ for a continuous good, which are given by
$$\frac{\partial q^{*}(p,y)}{\partial p}+q^{*}(p,y)\,\frac{\partial q^{*}(p,y)}{\partial y}\le 0. \tag{4}$$
For a continuous good and under general unobserved heterogeneity, Dette, Hoderlein and Neumeyer 2016 (building on earlier work of Hoderlein 2011), and Hausman and Newey 2016 show that (4) also holds with $q^{*}(p,y)$ denoting any quantile of the demand distribution for fixed $(p,y)$. Thus, for binary choice with general heterogeneity, the forms of the Slutsky inequality (2) and (3) are different from the continuous choice counterpart (4). (Footnote 4: Bhattacharya, 2015 (see also Lee and Bhattacharya, 2018) noted that (2) (resp, (3)) is necessary for the CDF of equivalent variation (resp., compensating variation) resulting from price-changes to be non-decreasing.) In particular, the inequalities (2) and (3) are linear in $q(\cdot,\cdot)$ (and its derivatives), unlike (4), and hence easier to impose on nonparametric estimates of $q(\cdot,\cdot)$ using, say, shape-preserving sieves that guarantee that $\partial q(p,y)/\partial p\le 0$, and $\partial q(p,y)/\partial p+\partial q(p,y)/\partial y\le 0$ for all $(p,y)$.
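The link between the monotonicity conditions (A) and the derivative form can be sketched with the chain rule. Here $q(p,y)$ is the probability of buying at price $p$ and income $y$, and $\bar q(a,b)$ is the same probability written in terms of $a=y$ (income left after option 0) and $b=y-p$ (income left after option 1); the derivation below is a worked restatement under continuous differentiability, not a substitute for the formal proof.

```latex
\begin{aligned}
q(p,y) &= \bar q(a,b), \qquad a = y,\quad b = y-p,\\[4pt]
\frac{\partial q}{\partial p}
  &= -\frac{\partial \bar q}{\partial b},
  &&\text{so } \frac{\partial q}{\partial p}\le 0
  \iff \frac{\partial \bar q}{\partial b}\ge 0,\\[4pt]
\frac{\partial q}{\partial y}
  &= \frac{\partial \bar q}{\partial a}+\frac{\partial \bar q}{\partial b},\\[4pt]
\frac{\partial q}{\partial p}+\frac{\partial q}{\partial y}
  &= \frac{\partial \bar q}{\partial a},
  &&\text{so } \frac{\partial q}{\partial p}+\frac{\partial q}{\partial y}\le 0
  \iff \frac{\partial \bar q}{\partial a}\le 0 .
\end{aligned}
```

The first inequality is thus monotonicity of $\bar q$ in the income left after buying, and the second is monotonicity in the income left after not buying.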
Remark 3
It is tempting to think of (2) and (3) as (4) with the level replaced by 0 and 1 corresponding to either of the two possible individual choices. However, this interpretation is incorrect, since is average demand, and takes values strictly inside . In other words, is neither a quantile, nor individual demand at price and , and generically (e.g. in a probit model) does not take the values of 0 and 1. Thus (2) and (3) cannot be rewritten as
[TABLE]
and, as such, are different from the continuous choice counterpart (4).
Remark 4
Our rationality conditions (A) take the form of simple monotonicity restrictions on the regression function . There are several papers in the Statistics literature on testing monotonicity of nonparametrically estimated regressions, e.g. Ghosal et al 2000, Hall and Heckman 2000, Chetverikov 2012, etc. which can therefore be used here.
B. Observational Equivalence: The construction in our proof that (I) implies (II) shows that a rationalizable binary choice model with general heterogeneity of unspecified dimension is observationally equivalent to one where a scalar heterogeneity enters the utility function of one of the alternatives in a monotonic way, and the utility of the other alternative is non-stochastic. (Footnote 5: For quantile demand in the continuous case, a result of similar spirit is discussed in Hausman-Newey, 2016, Page 1228-9, following Theorem 1. In general, a result holding for the continuous case with two goods does not necessarily imply that it also holds for the binary case. For example, welfare related results are different for the binary and the two-good continuous case, c.f. Hausman-Newey 2016, and Bhattacharya 2015, and so are Slutsky negativity conditions, as discussed above.) An intuitive explanation of this equivalence is that in the binary case, choice probabilities are determined solely by the marginal distribution of reservation price (given income) for alternative 1, and not the relative ranking of individual consumers in terms of their preferences within that distribution. So, as income varies, choice probabilities change only insofar as the marginal distribution of the reservation price changes, irrespective of how individual consumers’ relative positions change within that distribution.
It is worth pointing out here that a binary choice model with additive scalar heterogeneity – the so-called ARUM model – is restrictive, and not observationally equivalent to a binary choice model with general heterogeneity. To see this, suppose choice probabilities are generated via the ARUM model, viz.
[TABLE]
Assuming smoothness and strict monotonicity of , and , and thus of , it follows that
[TABLE]
for every and . This equality is obviously not true for a general smooth and strictly monotone satisfying conditions (A)-(C) of Theorem 1.
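One way to see why additive scalar heterogeneity is restrictive is sketched below. The sketch assumes the ARUM takes the form $\bar q(a,b)=F\big(W_1(b)-W_0(a)\big)$, with $a$ and $b$ the incomes left after options 0 and 1, $F$ the CDF of the difference of the additive errors with density $f$, and all functions smooth and strictly monotone; these symbols are labels chosen for this illustration, and the resulting equality is one restriction of this type, not necessarily the exact display used in the text.

```latex
\begin{aligned}
\frac{\partial \bar q(a,b)}{\partial a} &= -f\big(W_1(b)-W_0(a)\big)\,W_0'(a),
\qquad
\frac{\partial \bar q(a,b)}{\partial b} = f\big(W_1(b)-W_0(a)\big)\,W_1'(b),\\[4pt]
-\frac{\partial \bar q(a,b)/\partial a}{\partial \bar q(a,b)/\partial b}
 &= \frac{W_0'(a)}{W_1'(b)},\\[4pt]
\frac{\partial^2}{\partial a\,\partial b}
 \ln\!\left(-\frac{\partial \bar q(a,b)/\partial a}{\partial \bar q(a,b)/\partial b}\right)
 &= \frac{\partial^2}{\partial a\,\partial b}\Big[\ln W_0'(a)-\ln W_1'(b)\Big] = 0 .
\end{aligned}
```

Under this assumed form, the ratio of the two partial derivatives must factor into a function of $a$ alone times a function of $b$ alone, which a general $\bar q$ satisfying (A)-(C) need not do.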
Remark 5
The construction of in our proof of (II) (I) is unrelated to the almost sure representation of a continuous random variable as with , where and denote the CDF and quantile function of , and is distributed . Indeed, if we were to apply this so-called ”probability-integral transform” to for a fixed , we will have , where the scalar-valued uniform process will vary with , unlike in the proof of our theorem above, and therefore cannot represent unobserved heterogeneity in consumer preferences. In other words, our constructed will not equal the data generating process almost surely, but the probability that will equal the probability that for all .
C. Giffen Goods: Our rationalizability condition (2) says that own price effect on average demand is negative. This condition has no counterpart in the continuous case, appears to rule out Giffen behavior and may, therefore, appear restrictive. We now show that that is not the case: indeed, Giffen goods cannot arise in binary choice models if utilities are non-satiated in the numeraire. To see this, let the utility of options [math] and be given by and as in Theorem 1 above. Now note that if option 1 is Giffen for an type consumer with income , then for some prices she buys at price but does not buy at . Therefore,
[TABLE]
which is a contradiction, since is strictly increasing. In contrast, consider a continuous good with utilities , where denotes the quantity of the continuous good, and is increasing in both arguments. Now it is possible that is bought at price and is bought at price with and . That is, we can have
[TABLE]
if is preferred sufficiently over . The intuitive reason for this difference between the discrete and the continuous case is that in the former, the only non-zero option is 1. Indeed, in the continuous case, it is also not possible that for any common if .
Also, note that although Giffen behavior cannot arise in binary choice, there is no restriction on the sign of the income effect. Indeed, (2) and (3) are compatible with both $\partial q(p,y)/\partial y\ge 0$ and $\partial q(p,y)/\partial y\le 0$.
D. Parametric and Semiparametric Models: For a probit/logit specification of the buying decision, viz.
$$q(p,y)=F(\gamma_0+\gamma_1 p+\gamma_2 y),$$
where $F(\cdot)$ is a strictly increasing CDF, the shape restrictions of Theorem 1 amount to requiring $\gamma_1\le 0$ and $\gamma_1+\gamma_2\le 0$. While the first inequality is intuitive, and simply says that own price effect is negative, the second condition is not a priori obvious, and shows the additional restriction implied by budget-constrained utility maximization. Now, applying Theorem 1, we obtain
[TABLE]
where ,666We implicitly assume that for fixed , the function varies with somewhere on , and thus . implying the rationalizing utility functions
[TABLE]
Remark 6
Note that since the restrictions $\gamma_1\le 0$ and $\gamma_1+\gamma_2\le 0$ are linear in parameters, it is computationally straightforward to maximize a globally concave likelihood, such as probit or logit, subject to these constraints.
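A minimal sketch of such a constrained fit is given below for a logit with index $\gamma_0+\gamma_1 p+\gamma_2 y$. It assumes the reparameterization $\alpha=-(\gamma_1+\gamma_2)$, $\beta=-\gamma_1$, under which the index becomes $\gamma_0-\alpha y+\beta(y-p)$ and the two restrictions reduce to $\alpha\ge 0$, $\beta\ge 0$; projected gradient ascent then amounts to clipping at zero. The data are simulated, and the function names, step size and iteration count are illustrative choices rather than a recommended estimator.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def avg_loglik(theta, data):
    """Average logit log-likelihood at theta = (c, alpha, beta)."""
    c, alpha, beta = theta
    total = 0.0
    for p, y, d in data:
        prob = sigmoid(c - alpha * y + beta * (y - p))
        prob = min(max(prob, 1e-12), 1.0 - 1e-12)
        total += d * math.log(prob) + (1 - d) * math.log(1.0 - prob)
    return total / len(data)


def fit_constrained_logit(data, step=0.1, iters=2000):
    """Projected gradient ascent with alpha >= 0 and beta >= 0 enforced by clipping."""
    c, alpha, beta = 0.0, 0.0, 0.0
    n = len(data)
    for _ in range(iters):
        g_c = g_alpha = g_beta = 0.0
        for p, y, d in data:
            resid = d - sigmoid(c - alpha * y + beta * (y - p))
            g_c += resid
            g_alpha += resid * (-y)
            g_beta += resid * (y - p)
        c += step * g_c / n
        alpha = max(0.0, alpha + step * g_alpha / n)   # projection onto alpha >= 0
        beta = max(0.0, beta + step * g_beta / n)      # projection onto beta >= 0
    # Map back to the original coefficients (gamma_0, gamma_1, gamma_2).
    return c, -beta, beta - alpha


# Simulated data from a hypothetical logit with gamma = (0.5, -1.0, 0.4).
rng = random.Random(1)
data = []
for _ in range(400):
    p = rng.uniform(0.0, 2.0)
    y = rng.uniform(0.5, 3.0)
    d = 1 if rng.random() < sigmoid(0.5 - 1.0 * p + 0.4 * y) else 0
    data.append((p, y, d))

g0, g1, g2 = fit_constrained_logit(data)
```

Because the log-likelihood is concave and the constraint set is convex, any local maximum found this way is the constrained global maximum; in practice a standard constrained optimizer would serve the same purpose.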
The above discussion also applies to semiparametric binary choice models (c.f. Manski 1975, Han 1987, Klein and Spady 1993) where one need not specify the exact functional form of . For example, the methods of Cavanagh and Sherman (1998) and Bhattacharya (2008), which only utilize the strict monotonicity of the CDF , can be applied to estimate the binary choice model, subject to our sign restriction and standard scale-normalization, viz. and , i.e. using the specification that is a strictly increasing function of the linear index with .
E. Random Coefficients: An alternative parametric specification in this context is a random coefficient structure, popular in IO applications. It takes the form
[TABLE]
where and are now random variables with joint distribution , indexed by an unknown parameter vector , and is a specified CDF (e.g. a probit or logit). Theorem 1 then implies that the distribution must be such that the choice probability function satisfies and . One way to guarantee this would be to specify the support of and of to lie in . Using Theorem 1, a utility structure that would rationalize such a model is:
[TABLE]
where , and is .777Note that an alternative preference distribution producing the same choice probabilities is given by , , , , , , w.p.1. This shows that the rationalizing preference distribution may not be unique.
It also follows from the above discussion that not every distribution of random coefficients will lead to rationalizable choice-probability functions. In particular, the commonly used assumption that is bivariate normal (so that the support of and of do not lie in ), can lead to choice probability functions that would violate the shape restrictions of Theorem 1, and thus are not rationalizable.888As a numerical illustration, consider a random coefficient probit model
where , and , implying each of the probabilities of and exceeds 0.9999. Yet it can be verified numerically that e.g.
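A check of this kind can be sketched as follows. The example uses hypothetical parameter values (not those of the illustration above): income is held fixed and absorbed into the intercept, the price coefficient is normal with mean $-4$ and standard deviation $1$, and the closed form $E[\Phi(\gamma_0+\gamma_1 p)]=\Phi\big((\gamma_0+\mu_1 p)/\sqrt{1+\sigma_1^2p^2}\big)$ for a normal coefficient independent of the probit error is used. The price coefficient is negative with probability above 0.9999, yet the implied choice probability rises with price over part of its range.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


# Hypothetical parameters, chosen only for illustration.
GAMMA0 = -3.0               # intercept (income held fixed and absorbed here)
MU1, SIGMA1 = -4.0, 1.0     # price coefficient gamma_1 ~ N(MU1, SIGMA1^2)


def prob_price_coef_negative():
    """Pr(gamma_1 < 0) under the assumed normal distribution."""
    return normal_cdf(-MU1 / SIGMA1)


def choice_prob(p):
    """E[Phi(GAMMA0 + gamma_1 * p)] in closed form for normal gamma_1."""
    return normal_cdf((GAMMA0 + MU1 * p) / math.sqrt(1.0 + (SIGMA1 * p) ** 2))


# Demand at a higher price exceeds demand at a lower price,
# which violates the own-price restriction.
violation = choice_prob(3.0) > choice_prob(2.0)
```

The violation here occurs in the tail, where choice probabilities are small, which is why it is easy to miss when only the sign of the mean coefficient is inspected.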
F. Observed Covariates: One can accommodate observed covariates in our theorem. For example, let denote a vector of observed covariates, and let denote the choice probability when , and . If for each fixed , satisfies the same properties as (I) A-C in the statement of Theorem 1, then letting
[TABLE]
we can rationalize the choice probabilities by setting and , where .
G. Endogeneity: Our results in Theorem 1 are stated in terms of structural choice probabilities . If budget sets are independent of unobserved heterogeneity (conditional on observed covariates), then these structural choice probabilities are equal to the observed conditional choice probabilities, i.e.,
[TABLE]
Early results on rationalizability of demand under heterogeneity, including McFadden and Richter 1990 and Lewbel 2001 worked under such independence. If the independence condition is violated (even conditional on observed covariates), then Theorem 1 continues to remain valid as stated, since it concerns the structural choice probability , but consistent estimation of will be more involved. In applications, if endogeneity of budget sets is a potential concern, then it would be advisable to estimate structural choice-probabilities using methods for estimating average structural functions. A specific example is the method of control functions, c.f. Blundell and Powell 2003, 2004 and Imbens and Newey 2009, which require that , where is an estimable “control function” – typically a first stage residual from a regression of endogenous covariates on instruments. The structural choice probability function can then be recovered (under regularity conditions) as the integral of the conditional choice probability given and realizations of the control variable over the marginal distribution of . Hoderlein 2011, Hoderlein and Stoye 2014, Hausman and Newey 2016, and Kitamura and Stoye 2018 have previously discussed using control functions to estimate demand nonparametrically.
4 Empirical Implications
A practical implication of Theorem 1 is that it can be used to bound predicted choice probabilities on counterfactual, i.e. previously unobserved, budget-sets, e.g. those arising from a potential policy intervention. Such predictions are more reliable when made nonparametrically, i.e. without arbitrary functional-form/distributional assumptions on unobservables, and instead based solely on economic rationality. We now show how to obtain these nonparametric bounds using Theorem 1.
Counterfactual Demand Bounds: Let denote the domain of definition of . Let denote the set of observed in the data, with corresponding choice probabilities , satisfying condition (A) of our Theorem. Suppose we are required to predict the probability of buying at a counterfactual (i.e. previously unobserved) price and income with . Then Theorem 1 implies the following bounds on this choice probability:
[TABLE]
The above calculation is extremely simple; for example, the lower bound requires collecting those observed budget sets in the data that satisfy (a one-line command in STATA), evaluating choice probabilities on them, and sorting these values.
Note also that for all , we have that .
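The bound calculation can be sketched in a few lines. The sketch assumes the data are summarized as triples of price, income and choice probability, and it works with $a=y$ and $b=y-p$ (income left after not buying and after buying): since the choice probability is non-increasing in $a$ and non-decreasing in $b$, observed budget sets with $a\ge a'$ and $b\le b'$ give lower bounds at a counterfactual $(a',b')$, and those with $a\le a'$ and $b\ge b'$ give upper bounds. The function name and the defaults of 0 and 1 when no observed budget set qualifies are choices made for this illustration.

```python
def demand_bounds(observed, p_new, y_new):
    """Bounds on the choice probability at a counterfactual (p_new, y_new).

    observed: iterable of (price, income, choice_probability) triples that
    are assumed to satisfy the monotonicity conditions (A).
    """
    a_new, b_new = y_new, y_new - p_new
    lower, upper = 0.0, 1.0
    for p, y, q in observed:
        a, b = y, y - p
        if a >= a_new and b <= b_new:
            # More income without buying, less income with buying:
            # buying is less attractive here, so q is a lower bound.
            lower = max(lower, q)
        if a <= a_new and b >= b_new:
            # The reverse ordering makes buying more attractive: upper bound.
            upper = min(upper, q)
    return lower, upper


# Hypothetical observed budget sets and choice probabilities.
observed = [(1.0, 2.0, 0.6), (2.0, 2.0, 0.3), (1.0, 3.0, 0.5)]
bounds_a = demand_bounds(observed, 1.5, 2.0)
bounds_b = demand_bounds(observed, 0.5, 1.0)
```

With these inputs, the counterfactual price 1.5 at income 2.0 is bracketed by the two observed budget sets at the same income, while the second counterfactual has no observed budget set that delivers an informative upper bound.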
Proposition 1
The bounds (9) and (12) are sharp.
Proof of Proposition. Define . Set for any belonging to the interval defined by the bounds in (9) and (12). Then the elements of the set satisfy the shape restrictions (A) of Theorem 1 on . In particular, if satisfies , then
[TABLE]
on the other hand, if satisfies , then
[TABLE]
Next, note that conditions (B) and (C) of our theorem have no empirical content vis-a-vis the countably finite set of values , in that there are no set of values which can imply a violation of conditions (B) and (C). Therefore, the choice probabilities corresponding to are compatible with a choice probability function on a domain containing and satisfying conditions (A)-(C) of Theorem 1 (for an explicit construction of such a function, see discussion on discrete support of in the paragraph preceding Theorem 1 above). Therefore, applying Theorem 1, we conclude that there exist utility functions and with that satisfy the restrictions (A’)-(C’) of Theorem 1, and for all ; in particular,
[TABLE]
Welfare bounds: Given bounds on choice probabilities, one can obtain lower and upper bounds on economically interesting functionals thereof, such as average welfare. For example, the average compensating variation – i.e. utility preserving income compensation – corresponding to a price increase from to at income is given by (c.f. Bhattacharya 2015). This requires prediction of demand on a continuum of budget sets, viz. . Now, it follows from our discussion immediately above, and by Theorem 1, that pointwise sharp bounds on are given by
[TABLE]
This implies that average CV at is bounded below by , and above by .
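As a rough numerical illustration, the welfare bounds can be obtained by integrating the pointwise demand bounds over the price path. The sketch below assumes, following our reading of Bhattacharya (2015), that average CV for a price rise from p0 to p1 at income y is the integral over p in [p0, p1] of the choice probability at price p and income y + p - p0. It also assumes the coordinates a0 = income and a1 = income minus price, in which the choice probability is non-increasing in a0 and non-decreasing in a1. The function names, the midpoint rule, and the sample data are hypothetical.

```python
def pointwise_bounds(observed, a0, a1):
    """Monotonicity bounds on q at (a0, a1); trivial bounds if nothing qualifies."""
    lower = max((q for b0, b1, q in observed if b0 >= a0 and b1 <= a1), default=0.0)
    upper = min((q for b0, b1, q in observed if b0 <= a0 and b1 >= a1), default=1.0)
    return lower, upper


def average_cv_bounds(observed, p0, p1, y, n_grid=10_000):
    """Integrate the pointwise bounds over p in [p0, p1] with a midpoint rule.

    Assumed path of budgets: price p, income y + p - p0, so that
    a0 = y + p - p0 varies with p while a1 = y - p0 stays fixed.
    """
    step = (p1 - p0) / n_grid
    lower_total = 0.0
    upper_total = 0.0
    for i in range(n_grid):
        p = p0 + (i + 0.5) * step
        lower, upper = pointwise_bounds(observed, a0=y + p - p0, a1=y - p0)
        lower_total += lower * step
        upper_total += upper * step
    return lower_total, upper_total


# One hypothetical observed budget: a0 = 11, a1 = 8, choice probability 0.5.
cv_lower, cv_upper = average_cv_bounds([(11.0, 8.0, 0.5)], p0=2.0, p1=4.0, y=10.0)
print(cv_lower, cv_upper)  # approximately 0.5 and 1.5
```

Because the bounding functions are step functions in p, the integrals could also be computed exactly by summing rectangle areas between the jump points; the grid is used here only for brevity.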
As for sharpness, let be defined analogously to above. Then the lower bound on average CV becomes . Now, by definition,
[TABLE]
is non-increasing in and non-decreasing in , and when . Furthermore, for fixed value of , as varies over the interval , the function can assume at most finitely many values (viz. , ), and therefore, must necessarily be piecewise flat in , with at most finitely many discontinuity points. Therefore, one can construct a function (see footnote below for an illustration) that (1) is continuous in the first argument, (2) equals (and therefore ) on , (3) equals everywhere else on the domain except in arbitrarily small (semi-closed) intervals around the points of discontinuity of , and (4) satisfies the same shape restrictions as ; also, (5) can be trivially made to satisfy the limit conditions (C) of Theorem 1 by defining the limit points , lower than the lowest and larger than the highest values respectively attained by in corresponding to any fixed value of . Using (1), (4) and (5) and applying Theorem 1, we can rationalize – which equals at all the observed data points, i.e. corresponding to – via a pair of utility functions and a uniformly distributed unobserved heterogeneity, and at the same time, , is arbitrarily close to , since they differ only on at most finitely many intervals of arbitrarily small length. Therefore, is a sharp lower bound for average CV.
Footnote 9: As a simple illustration, consider a fixed , and suppose the point , and for some real numbers belonging to the interval where the first argument of takes its values as varies over . Now suppose the lower bound function satisfies
L\left(a_{0},a_{1}\right)=\begin{cases}q\left(k,a_{1}\right) & \text{if } l\leq a_{0}\leq k\\ L\left(k^{+},a_{1}\right) & \text{if } k<a_{0}\leq u\end{cases}
with . That is, equals at the point in , is non-increasing in the first argument and is (right) discontinuous at with . Choose and define the function as
Q\left(a_{0},a_{1}\right)=\begin{cases}L\left(k,a_{1}\right) & \text{if } l\leq a_{0}\leq k\\ L\left(k,a_{1}\right)\left[1-\frac{a_{0}-k}{\delta}\right]+L\left(k^{+},a_{1}\right)\frac{a_{0}-k}{\delta} & \text{if } k<a_{0}\leq k+\delta\\ L\left(k^{+},a_{1}\right) & \text{if } k+\delta<a_{0}\leq u\end{cases}
Then (1) is continuous in the first argument, since as , and as , (2) at the point , , (3) equals except on the semi-open interval of length , (4) is non-increasing, and . is non-decreasing since is non-increasing, and . is non-decreasing. Finally, equals the area of the triangle with base and height thus equalling which can be made arbitrarily close to 0 by choosing arbitrarily close to 0.
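The construction in this footnote can be checked numerically. The sketch below holds the second argument fixed, builds a step function L with a single downward jump at k and the linearly smoothed function Q from the display above, and compares the integral of Q - L with the triangle area delta times the jump height divided by 2. The numerical values of l, k, u, delta and the two levels of L are hypothetical, as are the helper names.

```python
def make_step(k, high, low):
    """Step function L(., a1) for fixed a1: equals high up to k, then drops to low."""
    return lambda a0: high if a0 <= k else low


def make_ramp(k, high, low, delta):
    """Smoothed function Q(., a1): linear from high to low on (k, k + delta]."""
    def ramp(a0):
        if a0 <= k:
            return high
        if a0 <= k + delta:
            w = (a0 - k) / delta
            return high * (1.0 - w) + low * w
        return low
    return ramp


def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


l, k, u = 0.0, 1.0, 2.0          # hypothetical interval and jump point
high, low, delta = 0.6, 0.4, 0.1  # hypothetical levels and ramp width
L = make_step(k, high, low)
Q = make_ramp(k, high, low, delta)

gap = integrate(lambda a0: Q(a0) - L(a0), l, u)
print(gap, delta * (high - low) / 2)  # both approximately 0.01
```

Shrinking `delta` shrinks the gap proportionally, which is the sense in which the continuous function Q can be made arbitrarily close to the bound L in integral.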
A symmetric line of argument implies that is the sharp upper bound.
5 Connection with Revealed Stochastic Preference
The welfare calculation above requires prediction of demand on a continuum of budget sets indexed by , which is operationally difficult – if not practically impossible – to implement using the finite-dimensional, matrix-equation-based SRP approach. But in simple cases where there is a small, finite number of budget sets, and it is easy to verify the SRP conditions, a natural question is whether our shape restrictions (A) of Theorem 1 are compatible with the SRP-based criterion for rationalizability; conditions (B) and (C) of Theorem 1 are of course irrelevant in such cases. Below, we show that our shape restrictions (A) are in fact necessary for the SRP criterion to be satisfied.
Proposition 2
The shape restrictions (A) in Theorem 1 are necessary for McFadden-Richter’s SRP conditions to hold.
Proof. Consider two price and income combinations and . Suppose WLOG that , i.e., . Let , denote choice probabilities of alternative 1 on the two budgets, respectively. Assume, if possible, that our shape restriction A(ii) is violated, so that . We will show that this implies violation of McFadden-Richter’s SRP condition. Toward that end, consider three bundles and . Under nonsatiation in numeraire, there are 3 possible preference profiles in the population, given by (i) , (ii) and (iii) ; assume the population proportions of these three profiles are , respectively. Then McFadden-Richter’s SRP condition is that the matrix equation
[TABLE]
has a solution in the unit positive simplex. But if our hypothesis holds, i.e. , then (29) implies i.e. , a violation.
Next, consider the two price and income combinations and with and , say. Let , denote choice probabilities of alternative 1 on the two budgets, respectively. Now suppose our shape restriction A(i) is violated, so that . Consider the three bundles , and . Under nonsatiation, there are 3 possible preference profiles in the population, given by (i) , (ii) and (iii) ; assume the population proportions of these three profiles are , respectively. Then SRP requires a solution in the unit positive simplex to
[TABLE]
But implies that implying , which is a violation of lying in the unit positive simplex.
With more budget sets, the corresponding higher dimensional matrix equations analogous to (29) and (38) quickly become operationally impractical and cumbersome, as is well-known in the literature (see introduction). In contrast, our shape-restrictions, by being global conditions on the functions, remain invariant to which and how many budget sets are considered. Furthermore, we already know via Theorem 1 above, that these shape restrictions are also sufficient for rationalizability for any collection – finite or infinite – of budget sets.
Footnote 10: It does not seem possible to show directly, i.e. without using Theorem 1, that our shape restrictions are also sufficient for existence of admissible solutions to the analog of (29) and (38) corresponding to every arbitrary collection of budget sets. But given Theorem 1, this exercise is probably of limited interest.
Appendix
1. Proof of Theorem 1
Proof. That (II) implies (I) is straightforward. In particular, letting denote the inverse of , we have that
[TABLE]
whence (B’) implies (B), (C’) implies (C), and (A’) implies (A).
We now show that (I) implies (II).
Note that (C) implies that for any and , the set is non-empty; for any fixed and for , define
[TABLE]
which takes values in . Also, by condition (A), must be non-decreasing.
Footnote 11: Here we are implicitly assuming that equals (or contains) . If however the support of price and income are discrete, then can be a strict subset of . Then is not defined at the points ‘in between’ the points of support, and therefore, in (39) is not well-defined. To cover this case, one can extend to a continuous function defined on a rectangle containing such that (i) equals on , (ii) satisfies the same shape restrictions on that are satisfied by on , and (iii) satisfies the limit conditions C of Theorem 1. In the online appendix, we provide an explicit construction of such a function. The proof of Theorem 1 then holds with , and equalling their corresponding extensions in the case where have discrete support.
Now, consider a random variable . Define and . We will now show that and will rationalize the choice-probabilities , and satisfy properties (A’)-(C’) of our theorem.
To do so, first note that for any fixed , the function is a continuous CDF by conditions A(i), B and C of the theorem, and is, by definition, the corresponding th quantile. Standard properties of quantiles, c.f. Pfeiffer 1990, Sec 11a, Page 266-7, then imply the following three results (for completeness, we state and prove these results formally as a Claim below this proof):
Result (i): for any and , we must have that (Pfeiffer 1990, page 267, property 6);
Result (ii): for any , and , we have (Pfeiffer 1990 page 266 property 1);
Result (iii): for any , the function is one-to-one on (Consequence of Result (i)).
Now, for , it follows from Result (ii) that
[TABLE]
Therefore, the utility functions and with heterogeneity rationalize the choice probabilities , and satisfy all the properties specified in panel (II) of Theorem 1. In particular, is non-decreasing in (see right after eqn. (39)), so (A’ii) holds; trivially satisfies (A’i). Next, for with , we cannot have that by Result (iii); therefore,
[TABLE]
which implies property (B’). Finally,
[TABLE]
By an analogous argument, , thus satisfying (C’).
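The rationalization argument can be illustrated numerically. The exact form of definition (39) and of the two utility functions is not reproduced in this text, so the sketch below makes explicit assumptions: the choice probability is a logistic function of a1 - a0 (which is non-increasing in a0, non-decreasing in a1, continuous, and has limits 1 and 0 in a0), the quantile-type inverse is gamma(a1, u) = sup{a0 : q(a0, a1) >= u}, one utility is the identity in its numeraire argument, the other is gamma, and heterogeneity V is uniform on (0, 1). All function names are hypothetical.

```python
import math


def q(a0, a1):
    """Assumed choice probability: non-increasing in a0, non-decreasing in a1."""
    return 1.0 / (1.0 + math.exp(a0 - a1))


def gamma(a1, u, half_width=50.0, iters=80):
    """Bisection approximation of sup{a0 : q(a0, a1) >= u} for u in (0, 1)."""
    lo, hi = a1 - half_width, a1 + half_width  # q(lo, a1) ~ 1 and q(hi, a1) ~ 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if q(mid, a1) >= u:
            lo = mid   # mid still belongs to the set, so the sup is at least mid
        else:
            hi = mid
    return lo


def implied_choice_probability(a0, a1, n=2000):
    """Share of V on a uniform midpoint grid of (0, 1) with gamma(a1, V) >= a0."""
    hits = sum(1 for i in range(n) if gamma(a1, (i + 0.5) / n) >= a0)
    return hits / n


a0, a1 = 0.3, 1.0
print(implied_choice_probability(a0, a1), q(a0, a1))  # the two values nearly coincide
```

The agreement reflects the equivalence used in the proof: under these assumptions, gamma(a1, u) >= a0 holds exactly when u <= q(a0, a1), so a uniform V reproduces the choice probability.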
2. Proof of Results (i), (ii) and (iii) in Theorem 1
Claim: Suppose satisfies conditions (A), (B), (C) of Theorem 1, and is as defined in (39). Then (i) for any and , we must have that ; (ii) for any , and any , we have that ; (iii) for any , the function is one-to-one on .
Proof. Claim (i): Pick . For , we cannot have that , since takes values in . So let , and suppose if possible that . Note that because if , then . Therefore, implies by the continuity condition (B) that there must exist such that for all . But by condition (A) and the definition of as the supremum in (39), we must have that for all , and in particular for , which contradicts .
Next, for , we cannot have that , since takes values in . So let and suppose . Conditions (B) and (C) imply via the intermediate value theorem that , such that . But by hypothesis, , so (A) implies that , which, together with , contradicts being the supremum in (39). Therefore, for all , and Claim (i) is proved.
Claim (ii): To prove claim (ii), note that for any , and any ,
[TABLE]
Also, by definition of as the supremum in (39), we have by (A) that
[TABLE]
Therefore, from (42) and (43), we have that , which proves claim (ii).
Claim (iii): To prove claim (iii), note that for with , we cannot have that ; otherwise,
[TABLE]
contradicting .
References
Anderson, S.P., De Palma, A. and Thisse, J.F. (1992): Discrete choice theory of product differentiation. MIT Press.
Bhattacharya, D. (2015): Nonparametric welfare analysis for discrete choice. Econometrica, 83(2), 617-649.
Bhattacharya, D. (2018): Empirical welfare analysis for discrete choice: Some general results. Quantitative Economics, 9(2), 571-615.
Bhattacharya, D. (2008): A Permutation-Based Estimator for Monotone Index Models. Econometric Theory 24(3), 795-807.
Blundell, R. and Powell, J.L. (2003): Endogeneity in nonparametric and semiparametric regression models. Econometric Society Monographs 36, 312-357.
Blundell, R.W. and Powell, J.L. (2004): Endogeneity in semiparametric binary response models. The Review of Economic Studies, 71(3), 655-679.
Cavanagh, C. and Sherman, R.P. (1998): Rank estimators for monotonic index models. Journal of Econometrics, 84(2), 351-382.
Chetverikov, D. (2012): Testing regression monotonicity in econometric models. Econometric Theory, 1-48.
Costantini, P. and Fontanella, F. (1990): Shape-preserving bivariate interpolation. SIAM Journal on Numerical Analysis 27(2), 488-506.
Dette, H., Hoderlein, S. and Neumeyer, N. (2016): Testing multivariate economic restrictions using quantiles: the example of Slutsky negative semidefiniteness. Journal of Econometrics 191(1), 129-144.
Ghosal, S., Sen, A. and Van Der Vaart, A.W. (2000): Testing monotonicity of regression. The Annals of Statistics 28(4), 1054-1082.
Hall, P. and Heckman, N.E. (2000): Testing for monotonicity of a regression mean by calibrating for linear functions. The Annals of Statistics 28(1), 20-39.
Han, A.K. (1987): Non-parametric analysis of a generalized regression model: the maximum rank correlation estimator. Journal of Econometrics, 35(2-3), 303-316.
Hausman, J.A. and Newey, W.K. (2016): Individual heterogeneity and average welfare. Econometrica, 84(3), 1225-1248.
Hoderlein, S. (2011): How many consumers are rational? Journal of Econometrics 164(2), 294-309.
Hoderlein, S. and Stoye, J. (2014): Revealed preferences in a heterogeneous population. Review of Economics and Statistics, 96(2), 197-213.
Imbens, G.W. and Newey, W.K. (2009): Identification and estimation of triangular simultaneous equations models without additivity. Econometrica, 77(5), 1481-1512.
Kitamura, Y. and Stoye, J. (2016): Nonparametric analysis of random utility models. Econometrica, 86(6), 1883-1909.
Klein, R.W. and Spady, R.H. (1993): An efficient semiparametric estimator for binary response models. Econometrica, 387-421.
Lee, Y.Y. and Bhattacharya, D. (2019): Applied welfare analysis for discrete choice with interval-data on income. Journal of Econometrics 211(2), 361-387.
Lewbel, A. (2001): Demand Systems with and without Errors. American Economic Review, 611-618.
McFadden, D. (1973): Conditional logit analysis of qualitative choice behavior.
McFadden, D. and Richter, M.K. (1990): Stochastic rationality and revealed stochastic preference. Preferences, Uncertainty, and Optimality, Essays in Honor of Leo Hurwicz, Westview Press, 161-186.
McFadden, D. (2005): Revealed Stochastic Preference: A Synthesis. Economic Theory 26(2), 245-264.
Manski, C.F. (1975): Maximum score estimation of the stochastic utility model of choice. Journal of Econometrics, 3(3), 205-228.
Matzkin, R.L. (1992): Nonparametric and distribution-free estimation of the binary threshold crossing and the binary choice models. Econometrica, 239-270.
Pfeiffer, P.E. (1990): Probability for applications. Springer Science & Business Media.
Train, K.E. (2009): Discrete choice methods with simulation. Cambridge University Press.
Online Appendix
Abstract: This online appendix contains: (i) the construction of the continuous extension of the choice probability function to a domain containing , as mentioned in Footnote 11 in the proof of Theorem 1, and (ii) a version of Theorem 1 (called Theorem 2) with proof that does not require the limit conditions C/C’ of Theorem 1, but involves a slight strengthening of the continuity conditions B/B’.
1. Construction of Continuous Extension of Choice Probability Function
In the proof of Theorem 1, the definition of in (39) implicitly assumes that equals (or contains) . If however the support of price and income are discrete, then can be a strict subset of . Then is not defined at the points ‘in between’ the points of support, and therefore, in (39) is not well-defined. To cover this case, one can extend to a continuous function defined on a rectangle containing such that (i) equals on , (ii) satisfies the same shape restrictions on that are satisfied by on , and (iii) satisfies the limit conditions C of Theorem 1. The proof of Theorem 1 then holds with , and equalling their corresponding extensions in the case where have discrete support. Here we provide an explicit construction that achieves this extension.
Footnote 12: Alternatively, one can construct as a smooth, tensor-product polynomial spline with coefficients chosen to satisfy the shape restrictions and a high enough degree to guarantee that passes through the interpolating points , along the lines of Costantini and Fontanella 1990.
Suppose the support of is the discrete set , with and . Suppose the choice probability , which is defined for , satisfies the shape constraints (A) of Theorem 1, i.e. is non-increasing in the first and non-decreasing in the second argument. We want to construct an extension of , denoted by , which is (i) defined for all with and , (ii) equals for , and (iii) satisfies all three conditions A, B, C of Theorem 1. The construction proceeds in three steps.
Step 1: First, we extend to the rectangular grid
[TABLE]
To do this, define as:
[TABLE]
where is arbitrary, and for any
[TABLE]
Note that , which is well defined on all of , satisfies the shape constraints (A) of Theorem 1. This is because the set is decreasing in for fixed , and increasing in for fixed , so is decreasing in the first and increasing in the second argument; an analogous argument works for . Furthermore, if , then
[TABLE]
whence the shape restrictions on imply that , and hence . Note, however, that does not satisfy the continuity condition (B) and the limit conditions (C) of Theorem 1.
Step 2: The second step is to extend to a continuous function on the entire rectangle , satisfying the shape constraints (A) of Theorem 1, while also satisfying the interpolation conditions for . This is done using bilinear shape-preserving interpolation as follows.
Recall , and define with to be the ordered values of the set . We can have if for some , it holds that . For each , , and for , let
[TABLE]
where is defined in (44).
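The shape-preserving property of bilinear interpolation used in Step 2 can be illustrated as follows. Formula (47) is not reproduced in this text, so the sketch uses the standard bilinear formula on a rectangular grid as an assumption. It checks that values which are non-increasing in the first grid coordinate and non-decreasing in the second yield an interpolant with the same monotonicity that agrees with the grid values at the nodes. The grid, the values, and the function names are hypothetical.

```python
from bisect import bisect_right


def bilinear(xs, ys, z, x, y):
    """Standard bilinear interpolation of z[i][j] given at (xs[i], ys[j])."""
    # Locate the grid cell containing (x, y), clamping to the last cell.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    j = min(max(bisect_right(ys, y) - 1, 0), len(ys) - 2)
    s = (x - xs[i]) / (xs[i + 1] - xs[i])
    t = (y - ys[j]) / (ys[j + 1] - ys[j])
    return ((1 - s) * (1 - t) * z[i][j] + s * (1 - t) * z[i + 1][j]
            + (1 - s) * t * z[i][j + 1] + s * t * z[i + 1][j + 1])


def is_shape_preserving(xs, ys, z, n=50, tol=1e-12):
    """Check monotonicity of the interpolant on an n-by-n sample of the rectangle."""
    gx = [xs[0] + (xs[-1] - xs[0]) * a / n for a in range(n + 1)]
    gy = [ys[0] + (ys[-1] - ys[0]) * b / n for b in range(n + 1)]
    vals = [[bilinear(xs, ys, z, x, y) for y in gy] for x in gx]
    down_in_x = all(vals[a][b] >= vals[a + 1][b] - tol
                    for a in range(n) for b in range(n + 1))
    up_in_y = all(vals[a][b] <= vals[a][b + 1] + tol
                  for a in range(n + 1) for b in range(n))
    return down_in_x and up_in_y


# Hypothetical grid values: rows fall in the first argument, columns rise in the second.
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0]
z = [[0.5, 0.8], [0.4, 0.6], [0.1, 0.3]]
print(bilinear(xs, ys, z, 1.0, 1.0), is_shape_preserving(xs, ys, z))
```

Within each cell the interpolant is linear in each argument separately, so its slope in one argument is a weighted average of differences of adjacent grid values, and the sign of those differences carries over to the whole rectangle.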
Step 3: The last step in the construction is to extend beyond to ensure that the limit conditions (C) of Theorem 1 are satisfied. To do this, choose any pair of real numbers s.t. and . Let
[TABLE]
For any , define
[TABLE]
That is for , is the negatively sloped straight line joining to , and for , is the negatively sloped straight line joining to .
Proof that equals for and satisfies conditions (A), (B), (C) of Theorem 1: To see the first assertion, observe that at the grid points , , we get from (47) that , so that . We have already seen that for , . Now, since implies , putting these two conclusions together, we get that for , it holds that .
As for the continuity condition (B) of Theorem 1, observe that holding fixed , as , we have that whence from (47), it follows that
[TABLE]
On the other hand, for the same and for , we have that which at equals [math], whence from (47) with replaced by and replaced by , we get
[TABLE]
which equals (49). Therefore, for fixed , is simply a piecewise linear function of joined at the end-points , and therefore continuous in for . For , continuity is obvious from (48) and the fact that and . An analogous argument shows that is also continuous in for fixed (this property is not needed to prove Theorem 1 but is used in Theorem 2, the alternative version of Theorem 1 without the limiting condition, which appears below).
The limiting conditions (C) of Theorem 1 are satisfied, since (48) implies that and for each .
Finally, to see that the shape restrictions (A) of Theorem 1 hold on , note from (47) that the coefficient of in equals
[TABLE]
Similarly, the coefficient of in equals
[TABLE]
From (48) it follows that the shape restrictions also hold on and on , and thus condition (A) of Theorem 1 holds on all of .
Thus satisfies all three conditions of Theorem 1.
2. Main Result without condition (C/C’)
The following is a version of Theorem 1 that does not require the technical conditions C and C’ of Theorem 1, but involves a slight strengthening of the technical condition B. The proof of this version is considerably longer than that of Theorem 1. The proof works by constructing an extension of which satisfies properties (A)-(C) of Theorem 1 although itself does not satisfy property (C).
Footnote 13: The case where have discrete support is handled in exactly the same way as in Theorem 1 with two small modifications: (a) Step 3 in the construction immediately above is not required, and (b) continuity of in the second argument is guaranteed by the construction in Step 2.
Suppose the support of price and income in the population is . Correspondingly, the support of is . Pick any . Corresponding to , the support of is therefore
[TABLE]
Note that by definition, and are non-decreasing and continuous. Let .
Theorem 2
For binary choice under general heterogeneity, the following two statements are equivalent:
(I) The choice probabilities , defined above, satisfy that (A) is non-increasing, and is non-decreasing; (B) is continuous.
(II) There exists a pair of utility functions and , where the first argument denotes the amount of numeraire, and denotes unobserved heterogeneity, and a distribution of such that for any and correspondingly ,
[TABLE]
where (A’) for each fixed , and are non-decreasing; (B’) for each fixed , and are continuous, and for any , it holds that is continuous in .
Discussion of assumptions: Relative to Theorem 1, conditions (C/C’) are omitted, and condition (B/B’) is strengthened to continuity in both arguments. Note that under monotonicity in any one argument, the joint continuity of is equivalent to coordinatewise continuity; cf. Kruse and Deely 1969.
To prove Theorem 2, we will utilize several lemmas.
Lemma 2 (Apostol, 1974, Ex 4.19)
Suppose , is continuous on . For , define , and . Then and are continuous on .
Proof of Lemma 2. Fix any .
First, suppose . Choose . Now by continuity of , there must exist s.t. for any , we have that . Therefore, . Therefore, , implying continuity of at .
Next, suppose the sup is at , i.e. . By continuity, for any , there exists , s.t. for all , we have that . For , , since , by assumption. But by definition. Therefore, for all , we have that . Next, for all , implying
[TABLE]
Thus for all , we have that . Therefore, is continuous at .
An exactly similar proof works for .
Lemma 3 (Taylor, 1955, Chap 15.7, Theorem VII)
Suppose the function is continuous, and the function is continuous w.r.t. the -norm. Then the function defined as is continuous on .
Proof of Lemma 3. Pick any , and . Continuity of implies that there exists s.t. , whenever . Now, continuity of implies that given the above , there exists s.t. whenever . Choose . Then whenever , we have that and , and thus , and therefore,
[TABLE]
Construction: The following construction will be used to prove the theorem. Pick . Recall the definitions and . Let , be any pair of real numbers satisfying and . For any and , respectively, define
[TABLE]
Note that as decreases with fixed, or increases with fixed, the set expands, and therefore the sup over it weakly increases; thus is non-increasing and is non-decreasing. Similarly, is non-increasing and is non-decreasing. Now, define the function as follows. For any ,
[TABLE]
Claim 1
Suppose satisfies (A) and (B) of Theorem 2. Then the function defined in (50) satisfies the following properties:
(1) is non-increasing, and is non-decreasing for all
(2) is continuous in each argument, holding the other argument fixed.
(3) For any , there exist real numbers and such that and .
Proof. Property (3) is obvious because and , by construction. To show (1) and (2), fix . Since satisfies (A) and (B) on , we only need to establish the properties over the range and .
Property (1): First, we show that the shape restrictions hold for . We have already noted that and are both non-increasing; further since and , we have that is non-increasing in for , and is non-increasing in for . Thus is non-increasing in for all and .
Next, pick , and consider monotonicity of . Let with , implying and . Now there are 10 cases to consider, labelled (a)-(j) below, depending on the ordering of and , and where lies. Case (a) , then
[TABLE]
Case (b) , i.e. , and so , and therefore, . Case (c): , and Case (d) , the proofs are exactly analogous to respectively (a) and (b) above.
So we are left with the following cases, where Cases (e)-(g) correspond to , and (h)-(j) to .
For Case (e) , since , by continuity of and the intermediate value theorem, there exists s.t. . Therefore,
[TABLE]
where holds because and condition (A) of Theorem 1, and holds by definition of . Next, suppose Case (f) , then by continuity of and the intermediate value theorem, there exists s.t. ; and by continuity of and the intermediate value theorem, there exists s.t. , with . Then
[TABLE]
Next, for Case (g) , using continuity of and the intermediate value theorem, we have for some so that
[TABLE]
Next, consider Case (h) . Since , by continuity and the intermediate value theorem, we have that for some , whence we have
[TABLE]
Next, if Case (i) , we have that .
Finally, for the Case (j) , the same argument as in (g) applies.
This establishes the requisite shape restrictions, i.e. Property (1).
Property (2): First, consider continuity of . Note that is obviously continuous at for ; next, at , , while at ,
[TABLE]
and thus is continuous at and at . Finally, if , then we can have only if in which case and thus implying
[TABLE]
By Lemma 3, is continuous in , and therefore, by Lemma 2, is continuous in for fixed . Thus we have that is continuous on all of . An exactly analogous argument works for .
Finally, consider continuity in for fixed . If (a) , then , and therefore,
[TABLE]
which does not depend on and therefore trivially continuous in . So consider (b) , so that . Therefore, at , equals
[TABLE]
The last equality follows because if and only if . Now, since is continuous, and so is , the function is continuous in (see Lemma 3 above), and therefore, it follows from Lemma 2 that is continuous in . In particular, as , approaches and so (5) tends to (52).
Finally, for any , (recall , so that ), we have that
[TABLE]
which is continuous in by Lemmas 2 and 3. Exactly analogous arguments hold for (a’) and (b’) respectively. Thus, we have that is continuous at each .
Lemma 4
Suppose the function satisfies on its domain that (1) is non-increasing, and is non-decreasing; (2) is continuous, and (3) for any , and . For any fixed , define for each ,
[TABLE]
Then we must have that , for any .
Proof of Lemma 4. Since satisfies the same properties as of Theorem 1 (A)-(C), the proof of this lemma is identical to the proof of Lemma 1 used to prove Theorem 1.
Proof of Theorem 2. That (II) implies (I) is straightforward, since
[TABLE]
whence (B’) implies (B), and (A’) implies (A).
We now show that (I) implies (II). To do so, recall the definition of in (54). Now, consider a random variable . Define and . We will now show that for and correspondingly, , the functions and will rationalize the choice-probabilities .
To prove this, note that for any , and ,
[TABLE]
Also, by definition of as the supremum in (54), we have that
[TABLE]
Therefore, by (55) and (56), we have that . Thus, for , it follows that
[TABLE]
Recall that for and correspondingly , we have that by definition. Therefore, it follows from (57) that the utility functions and with heterogeneity rationalize the choice probability function on its domain.
Next, note that whenever . To see this, suppose and yet . Choose s.t. . Then by conclusion (i) of the previous lemma and by definition (54) of , we must have . But since , this contradicts conclusion (1) of Claim 1.
Next, it follows from (A) and (B) that is continuous. To see this, fix , and suppose to the contrary that is discontinuous at ; suppose there exists such that for any , for all satisfying . For any such satisfying , it follows from the definition of that there exists s.t.
[TABLE]
Inequality follows because since , and if with , then that contradicts the definition of as the sup. Therefore, it follows from (58) that
[TABLE]
which contradicts that is continuous in its second argument for fixed value of its first argument (see property (2) in Claim 1 above), since can be made arbitrarily close to by choosing small enough.
Finally, is obviously continuous and strictly increasing in , thus (A’) holds. Moreover, (B) ensures that (B’) is satisfied.
References
Apostol, T.M. (1974): Mathematical Analysis. Addison-Wesley.
Kruse, R.L. and Deely, J.J. (1969): Joint continuity of monotonic functions. The American Mathematical Monthly, 76(1), 74-76.
Taylor, A.E. (1955): Advanced Calculus. Ginn and Company.
