Direction Selection in Stochastic Directional Distance Functions

Kevin Layer; Andrew L. Johnson; Robin C. Sickles; Gary D. Ferrier

arXiv:1904.01524 · stat.AP · April 4, 2019 · Eur. J. Oper. Res.


TL;DR

This paper investigates how the choice of direction in stochastic directional distance functions affects estimation accuracy, demonstrating that data-driven, orthogonal directions improve functional estimates in production and cost modeling.

Contribution

It introduces a data-driven approach for selecting the direction used to estimate an SDDF, extends the approach to a shape constrained nonparametric estimator, and reports improved estimation accuracy in simulations.

Findings

1. Orthogonal directions yield better estimates in simulations.
2. Shape constrained nonparametric methods outperform parametric ones.
3. Practitioners should choose directions with non-zero components for all variables.

Abstract

Researchers rely on the distance function to model multiple product production using multiple inputs. A stochastic directional distance function (SDDF) allows for noise in potentially all input and output variables. Yet, when estimated, the direction selected will affect the functional estimates because deviations from the estimated function are minimized in the specified direction. The set of identified parameters of a parametric SDDF can be narrowed via data-driven approaches to restrict the directions considered. We demonstrate a similar narrowing of the identified parameter set for a shape constrained nonparametric method, where the shape constraints impose standard features of a cost function such as monotonicity and convexity. Our Monte Carlo simulation studies reveal significant improvements, as measured by out of sample radial mean squared error, in functional estimates when…

Tables

Table 1: Average MSE over 100 simulations for the Linear Estimator compared to the true function with a DGP using random noise directions

Note: Displayed are measured values multiplied by $10^{3}$.
	Avg MSE: Comparison
	to the True Function
	DDF Angle $θ_{t}$
MSE Dir Angle $θ_{MSE}$	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	2.09	0.75	0.56	1.16	3.68
$π / 8$	1.36	0.46	0.32	0.63	1.89
$π / 4$	1.25	0.41	0.28	0.51	1.48
$3 π / 8$	1.59	0.50	0.32	0.57	1.60
$π / 2$	3.06	0.91	0.55	0.92	2.44

Table 2: Average MSE over 100 simulations for the Linear Estimator compared to an out-of-sample testing set with a DGP using random noise directions

Note: Displayed are measured values multiplied by $10^{3}$.
	Avg MSE: Comparison
	to Out-of-Sample
	DDF Angle $θ_{t}$
MSE Dir Angle $θ_{MSE}$	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	28.28	29.43	31.29	34.23	40.67
$π / 8$	18.03	17.79	18.19	19.09	21.32
$π / 4$	16.38	15.55	15.45	15.77	16.90
$3 π / 8$	20.50	18.67	18.04	17.90	18.46
$π / 2$	38.63	33.07	30.68	29.29	28.70

Table 3: Experiment 1: Values of the radial MSE relative to the true function. The angle used in the CNLS-d estimator varies and the noise direction is randomly selected. In the DGP, the standard deviation of the noise distribution, $λ$, is 0.1.

	CNLS-d Direction Angle
	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
Average MSE across simulations	13.90	4.65	3.32	4.49	13.93
Note: Displayed are measured values multiplied by $10^{4}$.

Table 4: Experiment 2: Values of radial MSE relative to the true function varying the DGP noise direction and the CNLS-d estimator direction. In the DGP, the standard deviation of the noise distribution, $λ$, is 0.1.

Note: Displayed are measured values multiplied by $10^{4}$.
	CNLS-d Direction Angle
Noise Direction Angle	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	2.69	3.03	4.49	8.86	25.47
$π / 8$	7.49	3.44	4.00	8.07	28.83
$π / 4$	20.28	5.79	4.30	5.80	19.06
$3 π / 8$	25.58	7.80	4.18	3.51	6.84
$π / 2$	25.90	9.09	4.73	3.10	2.57

Table 5: Experiment 3, Less Noise: Values of radial MSE relative to the true function varying the DGP noise direction and the CNLS-d direction. In the DGP, the standard deviation of the noise distribution, $λ$, is 0.05.

Note: Displayed are measured values multiplied by $10^{4}$.
	CNLS-d Direction Angle
Noise Direction Angle	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	0.92	0.82	0.96	1.53	5.12
$π / 8$	1.83	1.09	1.09	1.47	5.45
$π / 4$	3.70	1.41	1.29	1.43	3.93
$3 π / 8$	5.75	1.68	1.27	1.18	1.86
$π / 2$	4.61	1.40	0.95	0.79	0.90

Table 6: Experiment 4: Values of radial MSE relative to the true function varying the CNLS-d direction and the mean of the normal distribution used in the DGP.

Note: Displayed are measured values multiplied by $10^{4}$.
Mean of the	CNLS-d Direction angle
Normal Distribution ( $\bar{θ}$ )	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$π / 8$	3.19	2.21	3.89	10.28	46.47
$π / 4$	8.44	2.92	1.98	3.17	9.00
$3 π / 8$	45.64	10.25	4.02	2.43	3.07

Table 7: Summary Statistics of the Hospital Data Set

	2007
	(523 observations)
	Cost ($)	MajDiag	MajTher	MinDiag	MinTher
Mean	146M	162	4083	3499	7299
Skewness	3.51	2.89	2.63	5.19	3.28
25-percentile	24M	9	277	108	512
50-percentile	72M	73	1688	938	3108
75-percentile	182M	207	5443	4082	9628
	2008
	(511 observations)
	Cost ($)	MajDiag	MajTher	MinDiag	MinTher
Mean	163M	175	4433	3688	7657
Skewness	4.19	3.80	2.97	4.87	2.82
25-percentile	28M	10	325	120	545
50-percentile	83M	76	1809	1013	3350
75-percentile	189M	246	5984	4569	10781
	2009
	(458 observations)
	Cost ($)	MajDiag	MajTher	MinDiag	MinTher
Mean	175M	161	4471	3615	7905
Skewness	3.39	3.78	2.43	4.68	2.41
25-percentile	31M	12	420	148	713
50-percentile	91M	69	1737	1136	3458
75-percentile	220M	230	6402	4694	10989

Table 8: Results of the radial MSE values for different directions by year

Note: Displayed are the measured values multiplied by $10^{3}$.
Direction	Year
$(g^{y_{1}}, g^{y_{2}}, g^{y_{3}}, g^{y_{4}}, g^{c})$	2007	2008	2009
(0.45, 0.45, 0.45, 0.45, 0.45)	2.10	1.30	1.50
(0.35, 0.35, 0.35, 0.35, 0.71)	2.15	1.65	1.29
Median Direction	1.79	1.55	1.34

Table 9: US Hospital K-fold Average MSE in Cost to the Cost Function Estimates for the Three Functional Specifications by Year

Note: The MSE values displayed are the measured values multiplied by $10^{3}$.
	Quadratic	CNLS-d	Lower Bound
Year	Regression	(Median Direction)	Estimator
2007	3.43	2.44	2.35
2008	2.76	1.93	1.48
2009	2.43	1.80	1.53

Table 10: Most Productive Scale Size measured in cost ($M) conditional on Minor Therapeutic procedures (MinTher) and Major Therapeutic procedures (MajTher), with Minor Diagnostic procedures (MinDiag) and Major Diagnostic procedures (MajDiag) held constant at the 50th percentile

Note: The values displayed are in $M
Ratio	Quadratic Regression			CNLS-d (median)			CNLS-d (equal)
MajTher/MinTher	2007	2008	2009	2007	2008	2009	2007	2008	2009
20%	13	379	252	210	61	88	224	137	106
30%	17	861	640	146	66	83	134	129	148
40%	272	377	1090	107	56	77	127	85	135
50%	870	249	1552	112	64	85	124	126	134
60%	360	210	276	90	70	120	88	96	142
70%	205	182	187	111	66	184	132	104	104
80%	151	170	150	174	69	286	221	110	111

Table 11: Marginal Cost of Minor Therapeutic Procedures

Note: The values displayed are in $k
Percentile		Quadratic Regression			CNLS-d (median)			CNLS-d (equal)
MinTher	MajTher	2007	2008	2009	2007	2008	2009	2007	2008	2009
25	25	8.9	6.5	13.2	0.03	0.03	0.03	0.2	0.02	0.1
25	50	8.9	6.5	13.2	0.05	0.1	0.1	0.04	0.1	0.04
25	75	8.9	6.5	13.2	0.2	0.04	0.03	0.1	0.02	0.02
50	25	8.1	6.1	12.4	6.9	5.5	7.4	5.9	6.3	7.8
50	50	8.1	6.1	12.4	4.3	4.9	7.8	2.1	3.7	7.4
50	75	8.1	6.1	12.4	0.2	0.4	0.03	0.1	0.02	0.02
75	25	6.0	5.0	10.4	9.6	13.5	14.0	9.5	10.9	14.1
75	50	6.0	5.0	10.4	9.6	13.5	14.3	9.6	10.9	13.8
75	75	6.0	5.0	10.4	5.7	10.1	6.4	4.6	8.7	6.4

Table 12: Marginal Cost of Major Therapeutic Procedures

Note: The values displayed are in $k
Percentile		Quadratic Regression			CNLS-d (median)			CNLS-d (equal)
MinTher	MajTher	2007	2008	2009	2007	2008	2009	2007	2008	2009
25	25	10.5	11.5	9.8	0.1	0.04	0.1	0.2	0.03	0.1
25	50	11.7	13.0	10.8	11.3	11.8	15.7	10.5	10.3	14.6
25	75	15.1	17.2	14.5	19.8	22.1	24.6	19.8	21.8	24.0
50	25	10.5	11.5	9.8	0.4	0.2	0.5	0.1	0.1	0.4
50	50	11.7	13.0	10.8	3.7	7.7	1.7	6.9	7.1	3.7
50	75	15.1	17.2	14.5	19.8	22.0	24.6	19.8	21.8	24.0
75	25	10.5	11.5	9.8	0.2	0.03	0.1	0.0	0.1	0.1
75	50	11.7	13.0	10.8	0.2	0.2	0.4	0.8	0.1	0.3
75	75	15.1	17.2	14.5	18.3	12.4	19.8	16.2	11.0	15.2

Table 13: Average MSE over 100 simulations for the Linear Estimator compared to the true function with a DGP using random noise directions

Note: Displayed are the measured values multiplied by $10^{3}$
		Average MSE: Estimator
		compared to the true function
		DDF Direction Angle $θ_{t}$
Noise Dir Angle $θ_{f}$	MSE Dir Ang $θ_{MSE}$	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	$0$	0.55	1.59	3.49	6.35	12.06
$0$	$π / 8$	0.32	0.86	1.81	3.17	5.70
$0$	$π / 4$	0.27	0.69	1.42	2.44	4.23
$0$	$3 π / 8$	0.32	0.77	1.54	2.58	4.36
$0$	$π / 2$	0.54	1.21	2.37	3.86	6.28
$π / 8$	$0$	3.22	1.00	2.66	7.79	22.92
$π / 8$	$π / 8$	2.16	0.59	1.39	3.80	9.98
$π / 8$	$π / 4$	2.04	0.50	1.10	2.88	7.09
$π / 8$	$3 π / 8$	2.67	0.59	1.21	3.02	7.02
$π / 8$	$π / 2$	5.40	1.03	1.88	4.45	9.68
$π / 4$	$0$	8.95	2.92	1.18	2.95	15.94
$π / 4$	$π / 8$	6.46	1.93	0.70	1.53	7.21
$π / 4$	$π / 4$	6.49	1.81	0.61	1.20	5.24
$π / 4$	$3 π / 8$	9.10	2.35	0.74	1.31	5.30
$π / 4$	$π / 2$	20.84	4.70	1.32	2.03	7.48
$3 π / 8$	$0$	9.65	4.44	1.90	1.13	5.70
$3 π / 8$	$π / 8$	6.99	3.00	1.22	0.65	2.83
$3 π / 8$	$π / 4$	7.05	2.86	1.11	0.55	2.17
$3 π / 8$	$3 π / 8$	9.92	3.76	1.40	0.64	2.30
$3 π / 8$	$π / 2$	22.76	7.71	2.66	1.09	3.45
$π / 2$	$0$	6.15	3.76	2.29	1.16	0.50
$π / 2$	$π / 8$	4.25	2.50	1.49	0.73	0.29
$π / 2$	$π / 4$	4.11	2.36	1.37	0.66	0.25
$π / 2$	$3 π / 8$	5.52	3.06	1.74	0.81	0.29
$π / 2$	$π / 2$	11.62	6.10	3.33	1.50	0.49

Table 14: Average MSE over 100 simulations for the Linear Estimator compared to an out-of-sample testing set with a DGP using fixed noise directions

Note: Displayed are the measured values multiplied by $10^{3}$
		Average MSE: Estimator
		compared to testing set data
		DDF Direction Angle $θ_{t}$
Noise Dir Angle $θ_{f}$	MSE Dir Ang $θ_{MSE}$	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	$0$	30.02	31.22	33.23	36.21	42.08
$0$	$π / 8$	17.53	17.13	17.46	18.24	20.01
$0$	$π / 4$	14.95	13.99	13.86	14.10	14.92
$0$	$3 π / 8$	17.51	15.70	15.15	15.03	15.42
$0$	$π / 2$	29.93	25.30	23.55	22.64	22.32
$π / 8$	$0$	49.89	52.78	58.59	68.39	91.28
$π / 8$	$π / 8$	32.41	30.88	31.71	34.14	40.37
$π / 8$	$π / 4$	29.93	26.38	25.69	26.27	28.92
$π / 8$	$3 π / 8$	38.15	31.00	28.66	27.92	28.88
$π / 8$	$π / 2$	74.19	53.30	45.83	41.93	40.19
$π / 4$	$0$	51.54	53.79	59.55	70.76	101.99
$π / 4$	$π / 8$	36.65	34.53	35.21	38.14	47.22
$π / 4$	$π / 4$	36.39	31.60	30.32	30.83	34.75
$π / 4$	$3 π / 8$	50.32	39.87	35.91	34.32	35.52
$π / 4$	$π / 2$	112.21	76.31	62.47	54.76	50.83
$3 π / 8$	$0$	39.37	41.09	45.01	52.54	73.64
$3 π / 8$	$π / 8$	28.28	27.35	28.14	30.56	37.89
$3 π / 8$	$π / 4$	28.30	25.72	25.22	26.01	29.73
$3 π / 8$	$3 π / 8$	39.47	33.40	31.11	30.42	32.19
$3 π / 8$	$π / 2$	89.14	66.84	57.41	51.96	49.51
$π / 2$	$0$	22.47	22.94	23.97	25.85	30.66
$π / 2$	$π / 8$	15.44	15.16	15.36	15.99	17.91
$π / 2$	$π / 4$	14.89	14.17	14.01	14.21	15.27
$π / 2$	$3 π / 8$	19.88	18.27	17.59	17.35	17.88
$π / 2$	$π / 2$	41.52	36.04	33.31	31.51	30.54

Table 15: Experiment 3, More Noise: Values of radial MSE relative to the true function varying the DGP noise direction and the CNLS-d direction. In the DGP, the standard deviation of the noise distribution, $λ$, is 0.2.

Note: Displayed are measured values multiplied by $10^{4}$
	CNLS-d Direction Angle
Noise Direction Angle	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$0$	8.15	15.62	37.66	82.16	183.39
$π / 8$	50.60	11.59	20.68	67.88	206.46
$π / 4$	145.21	29.40	11.89	33.89	149.24
$3 π / 8$	220.24	69.87	22.28	11.66	53.66
$π / 2$	165.84	72.13	33.27	14.25	7.41

Table 16: Experiment 5: Values of radial MSE relative to the true function varying the CNLS-d direction and type of direction used for the DGP.

Note: Displayed are measured values multiplied by $10^{4}$.
	CNLS-d Direction Angle
Distribution	$0$	$π / 8$	$π / 4$	$3 π / 8$	$π / 2$
$N o r m a l$	8.45	3.04	1.96	3.01	8.60
$G a m m a_{1}$	29.34	6.92	3.27	2.54	3.39
$G a m m a_{2}$	6.62	9.69	19.19	72.55	598.97

Table 17: Experiment 6: Values of radial MSE relative to the true function varying the CNLS-d direction in a 3-dimensional case

Note: Displayed are measured values multiplied by $10^{4}$.
CNLS-d Direction
$(g^{y_{1}}, g^{y_{2}}, g^{y_{3}})$	Average of radial MSE
$(0, 0, 1)$	9.07
$(0, 0.45, 0.89)$	5.23
$(0, 0.71, 0.71)$	5.04
$(0, 0.89, 0.45)$	5.53
$(0, 1, 0)$	9.62
$(0.33, 0.67, 0.67)$	4.24
$(0.41, 0.41, 0.82)$	4.29
$(0.41, 0.82, 0.41)$	4.35
$(0.45, 0, 0.89)$	5.12
$(0.45, 0.89, 0)$	5.44
$(0.58, 0.58, 0.58)$	4.21
$(0.67, 0.33, 0.67)$	4.15
$(0.67, 0.67, 0.33)$	4.18
$(0.71, 0, 0.71)$	4.89
$(0.71, 0.71, 0)$	4.91
$(0.82, 0.41, 0.41)$	4.23
$(0.89, 0, 0.45)$	5.20
$(0.89, 0.45, 0)$	5.18
$(1, 0, 0)$	8.58
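
The directions listed in Table 17 (and the two fixed directions in Table 8) are consistent with integer ratios scaled to unit Euclidean length, which matches the normalization that the squared direction components sum to one. The following is a minimal sketch of that scaling; the function name `unit_direction` is hypothetical and not taken from the paper.

```python
import math


def unit_direction(raw):
    """Scale a raw direction vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in raw))
    if norm == 0.0:
        raise ValueError("a direction needs at least one non-zero component")
    return tuple(v / norm for v in raw)


# The ratio (1, 2, 2) rounds to the Table 17 row (0.33, 0.67, 0.67).
print(tuple(round(v, 2) for v in unit_direction((1, 2, 2))))
# The ratio (1, 1, 1, 1, 2) rounds to the Table 8 row (0.35, 0.35, 0.35, 0.35, 0.71).
print(tuple(round(v, 2) for v in unit_direction((1, 1, 1, 1, 2))))
```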

Table 18: Most Productive Scale Size (c)

Percentile				Quadratic Regression			CNLS-d (median)			CNLS-d (equal)			LL Kernel
MinDiag	MinTher	MajDiag	MajTher	2007	2008	2009	2007	2008	2009	2007	2008	2009	2007	2008	2009
25	25	25	25	529	234	823	105	58	87	109	93	103	228	301	885
25	25	25	50	118	118	122	406	81	369	412	91	112	220	143	216
25	25	25	75	102	110	93	1214	82	895	1220	92	104	209	120	189
25	25	50	25	79	693	560	177	94	166	131	93	140	226	136	162
25	25	50	50	104	139	141	98	60	94	95	79	84	233	217	210
25	25	50	75	105	114	103	165	80	340	179	90	114	219	139	208
25	25	75	25	56	414	335	179	108	554	124	96	384	158	133	27
25	25	75	50	77	245	176	149	93	194	126	91	132	292	117	185
25	25	75	75	94	133	115	93	61	107	105	71	85	226	197	197
25	50	25	25	15	42	63	330	53	78	327	123	127	400	271	1062
25	50	25	50	1074	234	1333	119	55	92	108	91	89	1027	306	215
25	50	25	75	137	131	138	209	78	331	264	98	110	222	153	206
25	50	50	25	248	330	381	80	57	83	93	74	84	373	304	336
25	50	50	50	332	273	1349	70	64	86	127	78	80	903	251	1152
25	50	50	75	133	134	141	177	76	304	182	95	109	233	261	231
25	50	75	25	108	492	718	126	91	144	143	89	129	204	187	79
25	50	75	50	122	694	1068	128	87	137	144	95	112	331	159	239
25	50	75	75	118	154	152	91	59	104	110	77	93	239	232	229
25	75	25	25	11	13	13	915	53	80	1015	125	130	246	98	921
25	75	25	50	11	231	149	192	52	78	197	130	136	537	433	1129
25	75	25	75	1139	223	1542	112	55	75	101	91	115	1015	287	215
25	75	50	25	18	16	5	133	52	79	181	114	118	293	111	887
25	75	50	50	13	311	217	135	51	77	181	111	125	528	466	1091
25	75	50	75	1155	230	1563	109	61	75	99	90	114	1062	272	214
25	75	75	25	64	220	275	81	57	85	94	82	84	300	199	274
25	75	75	50	304	483	484	79	56	93	85	73	83	478	400	437
25	75	75	75	333	265	1532	77	64	96	115	78	79	963	249	153
50	25	25	25	143	189	126	165	115	149	173	157	183	132	139	123
50	25	25	50	119	124	105	126	68	143	110	88	98	287	116	197
50	25	25	75	103	111	90	289	82	424	265	92	104	218	116	185
50	25	50	25	84	740	157	136	72	258	140	91	277	128	209	93
50	25	50	50	106	146	124	96	59	100	101	85	113	245	244	202
50	25	50	75	106	114	100	173	80	292	212	90	113	229	135	211
50	25	75	25	58	431	217	205	97	452	119	95	440	161	128	7
50	25	75	50	79	247	160	140	82	192	114	91	154	150	114	150
50	25	75	75	95	133	111	93	61	106	104	79	106	233	207	197
50	50	25	25	10	142	207	99	51	75	107	111	112	462	363	1031
50	50	25	50	1156	232	1319	109	61	81	114	89	80	1134	367	264
50	50	25	75	138	131	135	208	78	240	267	98	110	233	150	212
50	50	50	25	357	387	450	87	56	90	91	80	90	419	324	194
50	50	50	50	307	272	1329	76	63	84	88	77	78	218	269	652
50	50	50	75	134	135	137	185	76	258	183	95	108	236	170	232
50	50	75	25	110	508	702	125	90	143	124	89	139	209	178	30
50	50	75	50	123	646	1044	128	77	147	119	94	132	333	340	178
50	50	75	75	119	155	149	91	59	103	110	77	103	240	236	236
50	75	25	25	18	15	6	274	53	80	282	124	130	291	117	933
50	75	25	50	14	245	142	191	52	77	188	129	126	566	456	1134
50	75	25	75	1155	224	1523	111	55	75	101	91	115	1050	348	247
50	75	50	25	18	13	10	132	52	79	172	114	118	316	140	894
50	75	50	50	17	325	209	135	51	76	164	111	124	537	502	680
50	75	50	75	1170	230	1544	109	61	83	106	90	114	1106	308	252
50	75	75	25	85	232	264	81	57	84	94	82	84	321	205	245
50	75	75	50	323	493	471	79	56	92	85	81	82	499	406	306
50	75	75	75	335	266	1514	77	64	95	115	78	79	966	252	1192
75	25	25	25	75	101	29	548	309	213	620	177	136	20	27	34
75	25	25	50	100	118	50	129	74	176	142	137	128	100	46	73
75	25	25	75	102	112	81	101	78	133	104	79	99	242	73	160
75	25	50	25	74	142	39	244	95	190	322	92	285	49	57	42
75	25	50	50	95	140	59	112	120	154	189	112	386	123	58	81
75	25	50	75	106	116	83	107	76	131	110	78	99	259	79	179
75	25	75	25	60	534	65	163	75	260	178	84	355	79	115	17
75	25	75	50	80	213	82	139	72	237	179	81	280	129	147	135
75	25	75	75	96	137	96	91	69	111	99	84	114	242	117	188
75	50	25	25	233	593	136	232	128	157	229	254	130	109	542	677
75	50	25	50	185	196	138	171	93	156	145	145	121	154	146	135
75	50	25	75	137	132	115	107	75	118	101	76	106	243	111	182
75	50	50	25	175	670	149	133	85	132	133	98	118	135	278	412
75	50	50	50	169	223	149	120	98	141	121	108	136	179	171	139
75	50	50	75	133	137	118	104	74	117	107	75	95	300	128	211
75	50	75	25	101	607	182	106	71	258	156	80	351	139	161	59
75	50	75	50	117	359	177	102	69	236	150	77	279	171	291	160
75	50	75	75	120	159	131	97	67	108	97	90	111	274	253	210
75	75	25	25	11	57	144	92	52	85	101	92	83	380	220	872
75	75	25	50	12	371	377	88	51	83	105	90	90	642	551	1051
75	75	25	75	740	219	452	101	53	81	86	88	88	253	359	237
75	75	50	25	13	136	202	88	51	84	105	90	92	404	284	866
75	75	50	50	19	439	428	93	50	82	109	89	90	631	630	1048
75	75	50	75	549	225	459	106	59	89	91	88	96	263	349	286
75	75	75	25	296	328	415	79	48	89	85	86	89	363	233	191
75	75	75	50	507	564	593	85	48	88	83	77	97	385	423	191
75	75	75	75	285	261	472	82	56	92	88	76	101	236	264	713
Note: The values displayed are in $M

Table 19: Marginal Cost of Minor Therapeutic Procedures

Percentile				Quadratic Regression			CNLS-d (median)			CNLS-d (equal)			LL Kernel
MinDiag	MinTher	MajDiag	MajTher	2007	2008	2009	2007	2008	2009	2007	2008	2009	2007	2008	2009
25	25	25	25	8.9	6.5	13.2	1.3	1.5	2.5	0.3	0.3	1.5	6.3	10.4	4.7
25	25	25	50	8.9	6.5	13.2	0.1	0.1	0.0	0.3	0.1	0.1	6.1	9.9	3.8
25	25	25	75	8.9	6.5	13.2	0.1	0.0	0.0	0.1	0.0	0.0	5.1	7.8	4.2
25	25	50	25	8.9	6.5	13.2	0.0	0.1	0.4	0.0	0.0	0.6	6.5	10.7	5.9
25	25	50	50	8.9	6.5	13.2	0.1	0.1	0.2	0.0	0.1	0.5	6.4	10.2	5.1
25	25	50	75	8.9	6.5	13.2	0.2	0.0	0.0	0.1	0.0	0.0	5.5	8.0	4.6
25	25	75	25	8.9	6.5	13.2	0.0	0.1	0.1	0.1	0.0	0.6	6.8	10.0	8.2
25	25	75	50	8.9	6.5	13.2	0.0	0.1	0.0	0.0	0.0	0.2	6.8	9.6	7.6
25	25	75	75	8.9	6.5	13.2	0.0	0.0	0.0	0.0	0.0	0.1	5.9	7.8	6.4
25	50	25	25	8.1	6.1	12.4	7.3	8.7	10.3	8.0	8.1	9.6	5.0	10.7	6.2
25	50	25	50	8.1	6.1	12.4	2.8	7.1	8.3	4.9	4.5	8.0	4.8	9.5	4.8
25	50	25	75	8.1	6.1	12.4	1.4	0.2	0.0	0.1	0.0	0.0	4.3	7.0	3.6
25	50	50	25	8.1	6.1	12.4	6.9	5.8	7.7	5.9	5.9	6.0	5.3	10.5	7.8
25	50	50	50	8.1	6.1	12.4	4.1	5.5	7.2	2.3	3.4	6.5	5.2	9.8	6.3
25	50	50	75	8.1	6.1	12.4	0.2	0.0	0.0	0.1	0.0	0.0	4.7	6.9	4.1
25	50	75	25	8.1	6.1	12.4	0.4	1.6	1.2	1.4	0.2	1.7	6.0	9.6	10.6
25	50	75	50	8.1	6.1	12.4	0.5	1.8	0.7	1.4	0.3	0.9	5.9	9.0	9.2
25	50	75	75	8.1	6.1	12.4	0.0	0.0	0.1	0.0	0.0	0.1	5.0	6.7	6.7
25	75	25	25	6.0	5.0	10.4	9.6	13.5	14.0	9.5	11.0	14.2	4.7	8.0	16.0
25	75	25	50	6.0	5.0	10.4	9.6	13.5	14.1	9.6	11.0	14.2	3.8	7.6	14.9
25	75	25	75	6.0	5.0	10.4	5.7	10.1	5.7	4.6	8.6	6.9	3.7	6.3	9.5
25	75	50	25	6.0	5.0	10.4	9.6	13.5	14.1	9.5	10.9	13.8	4.5	7.1	16.5
25	75	50	50	6.0	5.0	10.4	9.6	13.5	14.3	9.6	10.9	13.8	4.0	6.9	15.4
25	75	50	75	6.0	5.0	10.4	5.7	9.6	5.7	4.6	8.1	6.4	3.5	5.8	9.7
25	75	75	25	6.0	5.0	10.4	8.8	12.5	13.1	8.1	10.4	12.2	4.6	7.2	18.4
25	75	75	50	6.0	5.0	10.4	8.8	12.5	13.1	7.8	10.4	12.2	4.3	6.1	17.9
25	75	75	75	6.0	5.0	10.4	4.3	8.9	4.3	2.7	5.8	4.3	3.6	3.6	13.2
50	25	25	25	8.9	6.5	13.2	0.0	0.4	0.1	0.1	0.3	0.2	6.6	10.0	4.9
50	25	25	50	8.9	6.5	13.2	0.1	0.0	0.1	0.1	0.1	0.1	6.4	9.6	4.0
50	25	25	75	8.9	6.5	13.2	0.1	0.0	0.0	0.1	0.0	0.0	5.3	7.9	4.4
50	25	50	25	8.9	6.5	13.2	0.0	0.0	0.0	0.2	0.0	0.1	6.8	10.4	6.1
50	25	50	50	8.9	6.5	13.2	0.0	0.1	0.1	0.0	0.1	0.0	6.7	10.0	5.4
50	25	50	75	8.9	6.5	13.2	0.2	0.0	0.0	0.1	0.0	0.0	5.8	7.9	5.0
50	25	75	25	8.9	6.5	13.2	0.0	0.1	0.0	0.2	0.0	0.1	7.0	9.8	8.6
50	25	75	50	8.9	6.5	13.2	0.0	0.1	0.0	0.1	0.0	0.1	7.1	9.5	7.8
50	25	75	75	8.9	6.5	13.2	0.0	0.0	0.0	0.0	0.0	0.0	6.0	8.2	6.7
50	50	25	25	8.1	6.1	12.4	8.0	8.6	9.7	7.6	6.8	9.9	5.3	10.3	6.6
50	50	25	50	8.1	6.1	12.4	3.9	7.1	7.2	4.9	4.3	7.8	5.1	9.5	5.2
50	50	25	75	8.1	6.1	12.4	1.4	0.4	0.0	0.1	0.0	0.0	4.6	7.2	4.0
50	50	50	25	8.1	6.1	12.4	6.9	5.5	7.4	5.9	6.3	7.8	5.6	10.4	8.0
50	50	50	50	8.1	6.1	12.4	4.3	4.9	7.8	2.1	3.7	7.4	5.5	9.8	6.6
50	50	50	75	8.1	6.1	12.4	0.2	0.4	0.0	0.1	0.0	0.0	4.8	7.2	4.2
50	50	75	25	8.1	6.1	12.4	0.5	1.6	0.8	0.7	0.1	1.0	6.4	9.6	10.2
50	50	75	50	8.1	6.1	12.4	0.5	1.8	0.7	0.6	0.3	0.7	6.3	9.1	9.1
50	50	75	75	8.1	6.1	12.4	0.1	0.0	0.1	0.0	0.0	0.1	5.3	7.1	7.2
50	75	25	25	6.0	5.0	10.4	9.6	13.5	14.0	9.5	11.0	14.2	4.7	7.9	15.9
50	75	25	50	6.0	5.0	10.4	9.6	13.5	14.1	9.6	11.0	14.2	3.9	7.6	13.5
50	75	25	75	6.0	5.0	10.4	5.7	10.1	6.4	4.6	8.7	7.6	3.4	6.4	9.1
50	75	50	25	6.0	5.0	10.4	9.6	13.5	14.0	9.5	10.9	14.1	4.6	7.7	16.7
50	75	50	50	6.0	5.0	10.4	9.6	13.5	14.3	9.6	10.9	13.8	4.1	6.9	15.7
50	75	50	75	6.0	5.0	10.4	5.7	10.1	6.4	4.6	8.7	6.4	3.6	6.0	9.2
50	75	75	25	6.0	5.0	10.4	8.8	12.5	13.1	8.1	10.1	12.2	4.8	7.5	18.4
50	75	75	50	6.0	5.0	10.4	8.8	12.5	13.1	8.2	10.1	12.2	4.3	6.3	17.6
50	75	75	75	6.0	5.0	10.4	4.3	8.9	4.3	2.9	5.8	4.3	3.4	4.4	13.2
75	25	25	25	8.9	6.5	13.2	0.0	0.0	0.3	0.1	0.0	0.3	6.9	9.1	6.9
75	25	25	50	8.9	6.5	13.2	0.2	0.2	0.0	0.5	0.1	0.1	6.6	9.0	6.6
75	25	25	75	8.9	6.5	13.2	0.1	0.1	0.4	0.0	0.0	0.0	6.0	7.9	5.7
75	25	50	25	8.9	6.5	13.2	0.0	0.0	0.3	0.1	0.1	0.1	7.1	9.3	7.8
75	25	50	50	8.9	6.5	13.2	0.2	0.1	0.3	0.3	0.1	0.0	7.0	9.0	7.5
75	25	50	75	8.9	6.5	13.2	0.1	0.1	0.1	0.0	0.0	0.0	6.2	8.0	5.8
75	25	75	25	8.9	6.5	13.2	0.1	0.2	0.3	0.1	0.1	0.2	7.3	8.6	9.5
75	25	75	50	8.9	6.5	13.2	0.1	0.2	0.3	0.2	0.1	0.2	7.1	8.6	8.8
75	25	75	75	8.9	6.5	13.2	0.0	0.1	0.2	0.0	0.0	0.2	6.3	8.1	8.1
75	50	25	25	8.1	6.1	12.4	3.1	2.3	2.9	2.6	1.2	4.0	6.0	9.6	8.4
75	50	25	50	8.1	6.1	12.4	3.0	0.5	3.3	1.7	0.9	1.8	5.9	9.5	7.4
75	50	25	75	8.1	6.1	12.4	0.1	0.1	0.8	0.0	0.2	0.0	5.3	7.9	5.6
75	50	50	25	8.1	6.1	12.4	2.6	2.6	0.4	1.5	2.4	0.5	6.2	9.9	9.2
75	50	50	50	8.1	6.1	12.4	2.1	0.1	0.3	0.8	0.1	0.5	6.2	9.5	8.6
75	50	50	75	8.1	6.1	12.4	0.1	0.1	0.7	0.0	0.2	0.0	5.5	7.7	6.4
75	50	75	25	8.1	6.1	12.4	0.4	0.2	0.5	0.2	0.1	0.8	6.8	8.9	10.8
75	50	75	50	8.1	6.1	12.4	0.4	0.2	0.4	0.2	0.1	0.8	6.7	8.8	10.0
75	50	75	75	8.1	6.1	12.4	0.1	0.1	0.3	0.0	0.0	0.3	5.7	7.8	7.7
75	75	25	25	6.0	5.0	10.4	9.6	13.1	14.4	9.6	11.0	12.6	5.5	8.6	14.8
75	75	25	50	6.0	5.0	10.4	9.6	13.0	14.4	9.6	11.0	12.6	4.8	8.3	14.2
75	75	25	75	6.0	5.0	10.4	4.1	9.0	7.4	3.6	5.6	6.6	3.9	6.9	8.4
75	75	50	25	6.0	5.0	10.4	9.6	13.1	14.4	9.6	11.1	12.5	5.6	8.5	15.5
75	75	50	50	6.0	5.0	10.4	9.6	13.0	14.1	9.6	11.1	12.5	4.9	8.1	15.4
75	75	50	75	6.0	5.0	10.4	4.1	7.6	7.5	3.6	5.6	6.9	3.7	6.6	9.4
75	75	75	25	6.0	5.0	10.4	7.1	8.2	9.5	7.9	6.8	10.7	6.5	8.3	18.1
75	75	75	50	6.0	5.0	10.4	7.1	8.2	9.5	7.9	6.8	10.7	5.6	7.7	17.8
75	75	75	75	6.0	5.0	10.4	4.5	7.7	7.5	3.1	5.3	6.4	4.0	5.3	12.8
Note: The values displayed are in $k

Table 20: Marginal Cost of Major Therapeutic Procedures

Percentile				Quadratic Regression			CNLS-d (median)			CNLS-d (equal)			LL Kernel
MinDiag	MinTher	MajDiag	MajTher	2007	2008	2009	2007	2008	2009	2007	2008	2009	2007	2008	2009
25	25	25	25	10.5	11.5	9.8	1.7	2.8	4.4	3.3	1.8	4.9	18.4	14.3	22.9
25	25	25	50	11.7	13.0	10.8	17.3	17.0	20.0	16.8	15.2	18.5	17.5	11.2	18.5
25	25	25	75	15.1	17.2	14.5	19.4	22.3	24.6	19.8	21.8	24.0	15.2	10.2	12.7
25	25	50	25	10.5	11.5	9.8	0.0	0.0	0.2	0.1	0.0	0.1	17.4	14.6	21.7
25	25	50	50	11.7	13.0	10.8	9.6	10.6	13.5	11.2	10.3	13.4	16.8	12.2	18.1
25	25	50	75	15.1	17.2	14.5	19.8	22.2	24.6	19.8	21.8	24.0	15.8	10.5	13.9
25	25	75	25	10.5	11.5	9.8	0.1	0.9	0.1	0.1	0.0	0.2	17.4	14.9	17.2
25	25	75	50	11.7	13.0	10.8	1.3	1.7	1.3	2.9	0.1	5.1	17.3	14.0	17.2
25	25	75	75	15.1	17.2	14.5	16.1	18.0	23.5	16.8	16.5	23.8	16.9	11.8	14.2
25	50	25	25	10.5	11.5	9.8	0.1	0.1	0.2	0.1	0.1	0.5	17.9	12.6	20.0
25	50	25	50	11.7	13.0	10.8	12.9	7.9	8.2	10.0	7.0	6.2	17.1	9.7	16.8
25	50	25	75	15.1	17.2	14.5	19.8	22.3	24.6	19.8	21.8	24.0	15.0	8.7	12.1
25	50	50	25	10.5	11.5	9.8	0.4	0.2	0.4	0.1	0.5	0.3	17.3	13.3	18.9
25	50	50	50	11.7	13.0	10.8	5.2	5.2	1.4	10.5	8.1	5.8	16.6	10.8	16.5
25	50	50	75	15.1	17.2	14.5	19.8	22.2	24.6	19.8	21.8	24.0	15.4	8.5	12.6
25	50	75	25	10.5	11.5	9.8	0.1	0.3	0.1	0.2	0.1	0.2	17.3	14.1	15.7
25	50	75	50	11.7	13.0	10.8	0.1	0.5	0.9	0.2	0.2	4.6	16.9	13.0	16.3
25	50	75	75	15.1	17.2	14.5	16.1	18.0	22.8	16.8	16.5	23.8	15.9	10.2	14.3
25	75	25	25	10.5	11.5	9.8	0.0	0.0	0.1	0.2	0.1	0.1	17.1	9.3	9.9
25	75	25	50	11.7	13.0	10.8	1.6	0.0	0.3	0.7	0.1	0.1	15.4	7.0	9.8
25	75	25	75	15.1	17.2	14.5	18.3	12.4	20.9	16.2	10.9	14.3	15.3	6.2	6.7
25	75	50	25	10.5	11.5	9.8	0.2	0.0	0.2	0.0	0.1	0.2	16.3	9.2	9.4
25	75	50	50	11.7	13.0	10.8	0.2	0.3	0.4	0.8	0.1	0.3	15.6	6.6	8.4
25	75	50	75	15.1	17.2	14.5	18.3	12.8	20.9	16.2	11.3	15.2	15.2	6.2	5.8
25	75	75	25	10.5	11.5	9.8	0.1	0.1	0.1	0.2	0.1	0.1	15.7	11.1	9.8
25	75	75	50	11.7	13.0	10.8	0.1	0.1	0.1	0.6	0.1	0.1	15.6	8.4	9.7
25	75	75	75	15.1	17.2	14.5	15.5	10.4	19.7	17.2	10.8	16.7	14.7	6.9	8.0
50	25	25	25	10.5	11.5	9.8	0.3	0.1	0.0	0.2	0.3	2.7	18.6	14.1	21.4
50	25	25	50	11.7	13.0	10.8	17.8	17.7	17.0	16.3	15.4	19.3	18.0	11.4	17.2
50	25	25	75	15.1	17.2	14.5	19.2	22.3	24.6	19.8	21.8	24.0	15.1	11.1	13.1
50	25	50	25	10.5	11.5	9.8	0.1	0.0	0.1	0.2	0.0	0.1	17.6	14.7	20.4
50	25	50	50	11.7	13.0	10.8	11.3	11.8	15.7	10.5	10.3	14.6	17.2	12.5	17.2
50	25	50	75	15.1	17.2	14.5	19.8	22.1	24.6	19.8	21.8	24.0	15.9	10.7	13.5
50	25	75	25	10.5	11.5	9.8	0.1	1.0	0.3	0.1	0.0	1.3	17.3	15.1	16.8
50	25	75	50	11.7	13.0	10.8	0.9	1.5	2.1	0.5	0.2	1.3	17.2	14.5	16.9
50	25	75	75	15.1	17.2	14.5	16.1	18.0	23.5	16.8	16.5	23.6	16.7	13.0	14.3
50	50	25	25	10.5	11.5	9.8	0.2	0.1	0.2	0.1	0.4	0.5	18.2	12.8	18.6
50	50	25	50	11.7	13.0	10.8	11.1	7.9	10.0	9.4	7.7	5.5	17.6	10.3	15.8
50	50	25	75	15.1	17.2	14.5	19.8	22.2	24.6	19.8	21.8	24.0	14.9	9.4	12.1
50	50	50	25	10.5	11.5	9.8	0.4	0.2	0.5	0.1	0.1	0.4	17.5	13.6	17.6
50	50	50	50	11.7	13.0	10.8	3.7	7.7	1.7	6.9	7.1	3.7	16.9	11.5	15.6
50	50	50	75	15.1	17.2	14.5	19.8	22.0	24.6	19.8	21.8	24.0	15.2	9.5	12.8
50	50	75	25	10.5	11.5	9.8	0.1	0.3	0.2	0.0	0.0	0.4	17.4	14.6	14.8
50	50	75	50	11.7	13.0	10.8	0.1	0.5	0.3	0.1	0.2	1.0	17.0	13.6	15.3
50	50	75	75	15.1	17.2	14.5	17.4	18.0	22.8	16.8	16.5	23.8	16.0	11.5	14.8
50	75	25	25	10.5	11.5	9.8	0.0	0.0	0.1	0.2	0.1	0.0	17.0	9.3	9.5
50	75	25	50	11.7	13.0	10.8	1.6	0.0	0.3	0.7	0.1	0.1	15.7	7.6	6.3
50	75	25	75	15.1	17.2	14.5	18.3	12.4	19.8	16.2	11.0	13.4	14.4	6.6	6.5
50	75	50	25	10.5	11.5	9.8	0.2	0.0	0.1	0.0	0.1	0.1	16.6	10.0	8.9
50	75	50	50	11.7	13.0	10.8	0.2	0.2	0.4	0.8	0.1	0.3	15.8	8.2	8.1
50	75	50	75	15.1	17.2	14.5	18.3	12.4	19.8	16.2	11.0	15.2	14.8	6.7	5.3
50	75	75	25	10.5	11.5	9.8	0.1	0.1	0.1	0.2	0.1	0.1	16.1	11.6	9.6
50	75	75	50	11.7	13.0	10.8	0.1	0.1	0.1	0.3	0.1	0.1	15.5	9.2	9.1
50	75	75	75	15.1	17.2	14.5	15.5	10.4	19.7	16.3	10.8	16.7	14.6	8.3	7.2
75	25	25	25	10.5	11.5	9.8	0.3	0.0	0.3	0.3	0.0	1.5	18.9	14.7	15.0
75	25	25	50	11.7	13.0	10.8	2.4	9.7	4.0	6.7	6.2	4.3	18.0	13.9	13.9
75	25	25	75	15.1	17.2	14.5	19.6	19.5	24.7	19.3	18.3	24.4	15.7	13.2	11.0
75	25	50	25	10.5	11.5	9.8	0.1	0.0	0.2	0.3	0.0	0.2	18.0	15.4	14.8
75	25	50	50	11.7	13.0	10.8	3.9	5.5	0.8	4.5	2.8	3.3	17.6	14.7	13.8
75	25	50	75	15.1	17.2	14.5	19.6	19.5	24.7	19.3	18.9	24.4	16.4	13.4	11.9
75	25	75	25	10.5	11.5	9.8	0.1	0.1	0.3	0.0	0.1	0.4	17.1	16.1	14.0
75	25	75	50	11.7	13.0	10.8	0.1	0.1	0.3	0.2	0.1	0.4	17.1	16.5	14.3
75	25	75	75	15.1	17.2	14.5	19.5	11.6	23.4	19.1	18.5	20.8	17.5	15.4	16.5
75	50	25	25	10.5	11.5	9.8	0.3	0.1	0.5	1.7	0.1	0.7	18.6	13.9	13.2
75	50	25	50	11.7	13.0	10.8	0.9	7.4	1.4	3.1	4.5	3.5	17.8	13.3	12.5
75	50	25	75	15.1	17.2	14.5	19.6	19.5	24.7	19.3	19.2	24.4	15.0	12.4	10.9
75	50	50	25	10.5	11.5	9.8	0.5	0.1	0.2	0.7	0.1	0.1	17.8	15.0	12.8
75	50	50	50	11.7	13.0	10.8	0.7	5.5	0.8	2.5	2.8	3.3	17.3	14.2	12.2
75	50	50	75	15.1	17.2	14.5	19.6	19.5	24.7	19.3	19.8	24.4	15.7	12.3	12.3
75	50	75	25	10.5	11.5	9.8	0.1	0.1	0.2	0.2	0.1	0.2	17.3	16.0	12.1
75	50	75	50	11.7	13.0	10.8	0.1	0.1	0.3	0.2	0.1	0.2	17.0	16.1	12.9
75	50	75	75	15.1	17.2	14.5	19.2	11.6	24.2	19.1	18.5	20.0	16.5	15.0	15.5
75	75	25	25	10.5	11.5	9.8	0.1	0.2	0.1	0.1	0.2	0.3	16.8	11.6	6.9
75	75	25	50	11.7	13.0	10.8	0.1	0.4	0.1	0.1	0.2	0.3	16.2	10.4	6.5
75	75	25	75	15.1	17.2	14.5	18.6	12.6	15.4	15.9	15.0	14.1	14.6	10.1	4.5
75	75	50	25	10.5	11.5	9.8	0.1	0.2	0.1	0.1	0.1	0.1	16.5	12.4	7.2
75	75	50	50	11.7	13.0	10.8	0.1	0.4	0.1	0.1	0.1	0.1	15.8	11.5	7.2
75	75	50	75	15.1	17.2	14.5	18.6	13.4	15.9	15.9	15.0	13.6	14.4	10.6	5.3
75	75	75	25	10.5	11.5	9.8	0.1	0.1	0.1	0.1	0.1	0.2	15.7	14.1	8.5
75	75	75	50	11.7	13.0	10.8	0.1	0.2	0.1	0.1	0.1	0.2	15.3	13.1	7.4
75	75	75	75	15.1	17.2	14.5	13.5	7.2	14.2	12.2	11.7	12.1	14.6	12.7	7.9
Note: The values displayed are in $k

Equations

$$\begin{pmatrix} y_{i} \\ x_{i} \end{pmatrix} = \begin{pmatrix} \tilde{y}_{i} \\ \tilde{x}_{i} \end{pmatrix} + \begin{pmatrix} \epsilon_{i}^{y} \\ \epsilon_{i}^{x} \end{pmatrix}.$$

$$\begin{pmatrix} y_{i} \\ x_{i} \end{pmatrix} = \begin{pmatrix} \tilde{y}_{i} \\ \tilde{x}_{i} \end{pmatrix} + e_{i} \begin{pmatrix} g_{i}^{y} \\ g_{i}^{x} \end{pmatrix}.$$

$$\begin{pmatrix} \epsilon_{i}^{y} \\ \epsilon_{i}^{x} \end{pmatrix} = e_{i} \begin{pmatrix} g_{i}^{y} \\ g_{i}^{x} \end{pmatrix}.$$

$$e_{i} = \sqrt{\sum_{j=1}^{d} (\epsilon_{ij}^{x})^{2} + \sum_{j=1}^{Q} (\epsilon_{ij}^{y})^{2}},$$

$$\sum_{j=1}^{d} (g_{ij}^{x})^{2} + \sum_{j=1}^{Q} (g_{ij}^{y})^{2} = 1.$$
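
The decomposition above splits each noise vector into a scalar magnitude $e_{i}$ and a unit-length direction $(g_{i}^{x}, g_{i}^{y})$. A minimal sketch follows, assuming $e_{i}$ is the Euclidean norm of the stacked noise vector, which is what the unit-length normalization implies. The function name `decompose_noise` is hypothetical.

```python
import math


def decompose_noise(eps_x, eps_y):
    """Split a noise vector into a magnitude e and a unit direction (g_x, g_y)."""
    # Assumption: e is the Euclidean norm of the stacked input and output noise.
    e = math.sqrt(sum(v * v for v in eps_x) + sum(v * v for v in eps_y))
    if e == 0.0:
        raise ValueError("a zero noise vector has no direction")
    g_x = [v / e for v in eps_x]
    g_y = [v / e for v in eps_y]
    return e, g_x, g_y


e, g_x, g_y = decompose_noise([0.3], [0.4])
# By construction the squared direction components sum to one.
print(e, g_x, g_y, sum(v * v for v in g_x + g_y))
```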

$$T = \{(\tilde{x}, \tilde{y}) \in \mathbb{R}_{+}^{d+Q} \mid \tilde{x} \text{ can produce } \tilde{y}\}.$$

$$D_{T}(\tilde{x}, \tilde{y}; g^{x}, g^{y}) = \max \{\delta \in \mathbb{R} : (\tilde{x} - \delta g^{x}, \tilde{y} + \delta g^{y}) \in T\},$$

$$D_{T}(\tilde{x}, \tilde{y}; g^{x}, g^{y}) \geq 0 \text{ if and only if } (\tilde{x}, \tilde{y}) \in T.$$

$$D_{T}(x_{i}, y_{i}, g^{x}, g^{y}) = \epsilon_{i} \quad \forall i.$$
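
To make the definition of the directional distance function concrete, the sketch below evaluates it by bisection for a toy technology. The technology $y \leq \sqrt{x}$, the function names, and the bisection approach are all illustrative assumptions and are not taken from the paper. The sketch relies on free disposability, so that feasibility is monotone in $\delta$ for a non-negative direction.

```python
import math


def in_technology(x, y):
    # Hypothetical one-input, one-output technology: y <= sqrt(x).
    return x >= 0.0 and y <= math.sqrt(x)


def directional_distance(x, y, g_x, g_y, member=in_technology, tol=1e-10):
    """Largest delta such that (x - delta*g_x, y + delta*g_y) stays in the technology."""
    if g_x < 0.0 or g_y < 0.0 or (g_x == 0.0 and g_y == 0.0):
        raise ValueError("direction must be non-negative and non-zero")

    def feasible(delta):
        return member(x - delta * g_x, y + delta * g_y)

    # Bracket the boundary: lo is feasible, hi is infeasible.
    if feasible(0.0):
        lo, hi = 0.0, 1.0
        while feasible(hi):
            lo, hi = hi, 2.0 * hi
    else:
        lo, hi = -1.0, 0.0
        while not feasible(lo):
            hi, lo = lo, 2.0 * lo
    # Bisect until the bracket is narrower than the tolerance.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


print(directional_distance(4.0, 1.0, 0.0, 1.0))  # point inside T, non-negative distance
print(directional_distance(4.0, 3.0, 0.0, 1.0))  # point outside T, negative distance
```

The sign of the result mirrors the property stated above: the distance is non-negative exactly when the point belongs to the technology.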

$$
\begin{aligned}
\min_{\alpha, \beta, \gamma, \epsilon} \quad & \sum_{i=1}^{n} \epsilon_{i}^{2} \\
\text{s.t.} \quad & \gamma' y_{i} = \alpha + \beta' x_{i} - \epsilon_{i}, \\
& \beta' g^{x} + \gamma' g^{y} = 1,
\end{aligned}
$$
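
For one input and one output, the parametric problem above reduces to ordinary least squares once the normalization constraint is substituted in. With $\gamma = (1 - \beta g^{x}) / g^{y}$ the residual becomes $\epsilon_{i} = \alpha + \beta z_{i} - w_{i}$, where $z_{i} = x_{i} + (g^{x}/g^{y}) y_{i}$ and $w_{i} = y_{i}/g^{y}$. The sketch below is an illustration under that scalar assumption and requires $g^{y} \neq 0$; the function name `fit_linear_ddf` is hypothetical.

```python
def fit_linear_ddf(xs, ys, g_x, g_y):
    """Fit eps_i = alpha + beta*x_i - gamma*y_i subject to beta*g_x + gamma*g_y = 1."""
    if g_y == 0.0:
        raise ValueError("this sketch eliminates gamma and needs g_y != 0")
    # Transformed regressor and response after substituting the constraint.
    z = [x + (g_x / g_y) * y for x, y in zip(xs, ys)]
    w = [y / g_y for y in ys]
    n = len(z)
    z_bar = sum(z) / n
    w_bar = sum(w) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zw = sum((zi - z_bar) * (wi - w_bar) for zi, wi in zip(z, w))
    # Closed-form simple regression of w on z with an intercept.
    beta = s_zw / s_zz
    alpha = w_bar - beta * z_bar
    gamma = (1.0 - beta * g_x) / g_y
    return alpha, beta, gamma


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # noiseless points on y = 1 + 2x
for direction in [(0.0, 1.0), (0.6, 0.8)]:
    print(direction, fit_linear_ddf(xs, ys, *direction))
```

On noiseless data every direction recovers the same line up to the scaling imposed by the normalization. With noise, the fitted coefficients depend on the direction, which is the sensitivity the tables above quantify.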

$$
\begin{aligned}
\min_{\alpha, \beta, \gamma, \epsilon} \quad & \sum_{i=1}^{n} \epsilon_{i}^{2} \\
\text{s.t.} \quad & \gamma_{i}' y_{i} = \alpha_{i} + \beta_{i}' x_{i} - \epsilon_{i}, \\
& \alpha_{i} + \beta_{i}' x_{i} - \gamma_{i}' y_{i} \leq \alpha_{j} + \beta_{j}' x_{i} - \gamma_{j}' y_{i}, \\
& \beta_{i} \geq 0, \\
& \beta_{i}' g^{x} + \gamma_{i}' g^{y} = 1, \\
& \gamma_{i} \geq 0,
\end{aligned}
$$

$$\epsilon_{0} = \frac{1}{2} \left[ \frac{1}{n-1} \sum_{i=1}^{n} (\tilde{y}_{i} - \bar{y})^{2} + \frac{1}{n-1} \sum_{i=1}^{n} (\tilde{c}_{i} - \bar{c})^{2} \right],$$

$$\begin{pmatrix} c_{i} \\ y_{i} \end{pmatrix} = \begin{pmatrix} \tilde{c}_{i} \\ \tilde{y}_{i} \end{pmatrix} + \begin{pmatrix} \epsilon_{c_{i}} \\ \epsilon_{y_{i}} \end{pmatrix}, \quad i = 1, \dots, n.$$

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( (\hat{y}_{ts_{i}} - y_{ts_{i}})^{2} + (\hat{c}_{ts_{i}} - c_{ts_{i}})^{2} \right).$$
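
The MSE above sums squared deviations in both the output and the cost coordinate of each testing-set point. A minimal sketch of that computation follows; the function name and the toy numbers are illustrative only and do not come from the paper.

```python
def out_of_sample_mse(y_hat, c_hat, y_test, c_test):
    """Mean over test points of the squared output error plus the squared cost error."""
    n = len(y_test)
    total = sum(
        (yh - y) ** 2 + (ch - c) ** 2
        for yh, ch, y, c in zip(y_hat, c_hat, y_test, c_test)
    )
    return total / n


# Two test points: errors are (1, 0) and (0, -2), so the result is (1 + 4) / 2.
print(out_of_sample_mse([1.0, 2.0], [1.0, 1.0], [0.0, 2.0], [1.0, 3.0]))
```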

$$(y_{i1}, \dots, y_{iQ}, c_{i}), \quad i = 1, \dots, n.$$

$\breve{y}_{ij}$

$\breve{c}_{i}$

$$
\begin{aligned}
\min_{\gamma, \epsilon} \quad & \sum_{i=1}^{n} \epsilon_{i}^{2} \\
\text{s.t.} \quad & -\epsilon_{j} + \epsilon_{i} - \gamma_{i}' (y_{i} - y_{j}) \leq 0, \\
& \gamma_{i}' g^{y} = 1, \\
& \gamma_{i} \geq 0,
\end{aligned}
$$

$\hat{y}_{i}$

$\alpha_{i}$

$$y_{qi} = \tilde{y}_{qi} + \epsilon_{qi}, \quad q = 1, \dots, Q, \quad i = 1, \dots, n,$$

$$
\begin{aligned}
\tilde{y}_{1i} &= \cos(\theta_{i}), \quad i = 1, \dots, n \\
\tilde{y}_{2i} &= \sin(\theta_{i}), \quad i = 1, \dots, n,
\end{aligned}
$$

$$
\begin{aligned}
\epsilon_{1i} &= l \cos(\theta_{\epsilon_{i}}), \quad i = 1, \dots, n \\
\epsilon_{2i} &= l \sin(\theta_{\epsilon_{i}}), \quad i = 1, \dots, n,
\end{aligned}
$$

\overset{y}{˘}_{ij}

\overset{y}{˘}_{ij}

\overset{c}{˘}_{i}

g^{y_{1}} ⋮ g^{y_{Q}} g^{c} = median (\overset{y}{˘}_{i 1}) ⋮ median (\overset{y}{˘}_{i Q}) 1 - median (\overset{c}{˘}_{i}) .

Full text

Direction Selection in Stochastic Directional Distance Functions

Kevin Layer

Department of Industrial and Systems Engineering, Texas A&M University, College Station, TX, USA.

Andrew L. Johnson

School of Information Science and Technology, Osaka University, Suita, Japan.

Robin C. Sickles

Department of Economics, Rice University, Houston, TX, USA.

Gary D. Ferrier

Department of Economics, University of Arkansas, Fayetteville, AR, USA.

Abstract

Researchers rely on the distance function to model multiple product production using multiple inputs. A stochastic directional distance function (SDDF) allows for noise in potentially all input and output variables. Yet, when estimated, the direction selected will affect the functional estimates because deviations from the estimated function are minimized in the specified direction. Specifically, the parameters of the parametric SDDF are point identified when the direction is specified; we show that the parameters of the parametric SDDF are set identified when multiple directions are considered. Further, the set of identified parameters can be narrowed via data-driven approaches to restrict the directions considered. We demonstrate a similar narrowing of the identified parameter set for a shape constrained nonparametric method, where the shape constraints impose standard features of a cost function such as monotonicity and convexity.

Our Monte Carlo simulation studies reveal significant improvements, as measured by out of sample radial mean squared error, in functional estimates when we use a directional distance function with an appropriately selected direction and the errors are uncorrelated across variables. We show that these benefits increase as the correlation in error terms across variables increases. This correlation is a type of endogeneity that is common in production settings. From our Monte Carlo simulations we conclude that selecting a direction that is approximately orthogonal to the estimated function in the central region of the data gives significantly better estimates relative to the directions commonly used in the literature. For practitioners, our results imply that selecting a direction vector that has non-zero components for all variables that may have measurement error provides a significant improvement in the estimator’s performance. We illustrate these results using cost and production data from samples of approximately 500 US hospitals per year operating in 2007, 2008, and 2009, respectively, and find that the shape constrained nonparametric methods provide a significant increase in flexibility over second order local approximation parametric methods.

Keywords: Nonparametric regression, Shape Constraints, Data Envelopment Analysis, Hospital production.

Journal: European Journal of Operational Research

1 Introduction

The focus of this paper is direction selection in stochastic directional distance functions (SDDF).[1] The stochastic distance function (SDF) was introduced by Lovell et al. (1994) and was used in a series of early empirical studies by Coelli and Perelman (1999, 2000) and Sickles et al. (2002). The parameters of a parametric distance function are point identified; however, if the direction in the DDF is not specified, then the parameters of a parametric DDF are set identified.[2] A set of axiomatic properties related to production and cost functions, such as monotonicity and convexity in the case of a cost function, is well established in the production literature (Shephard (1970), Chambers (1988)). Although the stochastic distance function literature acknowledges the axiomatic properties necessary for duality, it does not impose them globally. Instead, authors typically impose them only on a particular point in the data (e.g., Atkinson et al. (2003)). Recognizing these issues, we provide an axiomatic nonparametric estimator of the SDDF and a method to restrict the pool of directions from which to choose for the SDDF, thereby reducing the size of the identified parameter set.

Most empirical studies that use establishment or hospital level data to estimate production or cost functions either assume a specific parametric form or ignore noise, or both (Hollingsworth, 2003). In contrast, we use an axiomatic nonparametric SDDF estimator and the proposed method to determine a set of acceptable directions to estimate a cost function that maintains global axiomatic properties for the US hospital industry. Furthermore, we demonstrate the importance of global axiomatic properties for the estimation of most productive scale size and marginal costs.

[1] Here we use the term stochastic in reference to a model with a noise term. While the DDF is typically used to measure efficiency, in this paper we use a nonparametric shape constrained SDDF to model the conditional mean behavior of production.

[2] Let $\phi$ be what is known (e.g., via assumptions and restrictions) about the data generating process (DGP). Let $\theta$ represent the parameters to be identified, let $\Theta$ denote all possible values of $\theta$, and let $\theta_{0}$ be the true but unknown value of $\theta$. Then the vector $\theta$ of unknown parameters is point identified if it is uniquely determined from $\phi$. However, $\theta$ is set identified if some of the possible values of $\theta$ are observationally equivalent to $\theta_{0}$ (Lewbel (forthcoming)).

A few papers have attempted to implement the directional distance function in a stochastic setting (see, for example, Färe et al. (2005), Färe et al. (2010), and Färe and Vardanyan (2016)). The latter two papers discuss the challenges of selecting a parametric functional form that does not violate the axioms typically assumed in production economics. Based on their observations, Färe and Vardanyan (2016) use a quadratic functional specification.[3] Yet several papers show a loss of flexibility in parametric functional forms, such as the translog or the quadratic functional form, when shape constraints are imposed (e.g., Diewert and Wales (1987)). Also important to implementation, the selection of the direction vector in the SDDF has been discussed in Färe et al. (2017) and Atkinson and Tsionas (2016), among others. These papers focus on selecting the direction corresponding to a particular interpretation of the inefficiency measure, based on the distance to the economically efficient point. In contrast, we consider Kuosmanen and Johnson (2017)’s multi-step efficiency analysis and focus on the first step, estimating a conditional mean function. Our goal is to select the direction that best recovers the underlying technology while acknowledging that the data are likely to contain noise in potentially all variables.[4]

[3] As Kuosmanen and Johnson (2017) note, the translog function used for multi-output production cannot satisfy the standard assumptions for the production technology $T$ globally for any parameter values. The quadratic functional form does not have this shortcoming.

[4] For researchers interested in productivity measurement and productivity variation (e.g., Syverson (2011)), the results from this paper can be used directly. For authors interested in efficiency analysis, the insights from this paper could be used to improve the estimates from the first stage of Kuosmanen and Johnson (2017)’s three-step procedure where efficiency is estimated in the third step.

To model multi-product production, Kuosmanen and Johnson (2017) have proposed the use of axiomatic nonparametric methods to estimate the SDDF, which they name Directional Convex Nonparametric Least Squares (CNLS-d), a type of sieve estimator. Their methods not only relax the standard functional form assumptions for production, cost, or distance functions, but also improve interpretability and finite sample efficiency over nonparametric methods such as kernel regression (Yagi et al. (2018)). A variety of models can be interpreted as special cases of Kuosmanen and Johnson (2017); among these is a set of models that specify the direction (e.g., Johnson and Kuosmanen (2011), Kuosmanen and Kortelainen (2012)). All CNLS models are sieve estimators and fall into the category of partially identified or set identified estimators discussed in Manski (2003) and Tamer (2010). The guidance our paper provides in selecting a direction will reduce the size of the set identified for CNLS-d and other DDF estimators with flexible direction specifications.

Much of the production function literature concerns endogeneity issues; see, for example, Olley and Pakes (1996), Levinsohn and Petrin (2003), and Ackerberg et al. (2015). These methods are often referred to as proxy variable approaches. The argument for endogeneity is typically that decisions regarding variable inputs such as labor are made with some knowledge of the factors included in the unobserved residuals. Recently, these methods have been reinterpreted as instrumental variable approaches (Wooldridge (2009)), or control function approaches (Ackerberg et al. (2015)). Unfortunately, the assumptions on the particular timing of input decisions are not innocuous. Indeed every firm must adjust its inputs in exactly the same way, otherwise the moment restrictions needed for point identification are violated. For an alternative in the stochastic frontier setting, see Kutlu (2018).

Kuosmanen and Johnson (2017) have shown that a production function estimated using a stochastic distance function under a constant returns-to-scale assumption is robust to endogeneity issues because the normalization by one of the inputs or outputs causes the errors-in-variables to cancel each other. In this paper we consider the more general case of a convex technology that does not necessarily satisfy constant returns-to-scale, and show that when errors across variables are highly correlated, a specific type of endogeneity, the SDDF improves estimation performance significantly over the typical alternative of ignoring the endogeneity.

When considering alternative directions in the DDF, we show that the direction that performs the best is often related to the particular performance measure used. We use an out-of-sample mean squared error (MSE) that is measured radially to address this issue. This measure is motivated by the results of our Monte Carlo simulations and is natural for a function that satisfies monotonicity and convexity, assuring the true function and the estimated function are close in the areas where most data are observed.

We analyze US hospital data and characterize the most productive scale size and marginal costs for the US hospital sector. We demonstrate that out-of-sample MSE is reduced significantly by relaxing parametric functional form restrictions. We also observe the advantage of imposing axioms that allow the estimated function to still be interpretable. Concerning the direction selection, we find, for this data set, that the exact direction selected is not very critical in terms of MSE performance, but some commonly used directions should be avoided.

The remainder of this paper is organized as follows. Section 2 introduces the statistical model and the production model. Section 3 describes the estimators used for the analysis. Section 4 outlines our reasons for the MSE measure we propose. Section 5 highlights the importance of the direction selection through Monte Carlo experiments. Section 6 describes our direction selection method. Section 7 demonstrates the benefits of using non-parametric shape-constrained estimators with an appropriately selected direction for US hospital data. Section 8 concludes.

2 Models

2.1 Statistical Model

We consider a statistical model that allows for measurement error in potentially all of the input and output variables. Let $\bm{\tilde{x}}_{i}\in\bm{X}\subset\mathbb{R}_{+}^{d},d\geq 1$, be a vector of random input variables of length $d$ and $\bm{\tilde{y}}_{i}\in\bm{Y}\subset\mathbb{R}_{+}^{Q}$, $Q\geq 1$, be a vector of random output variables of length $Q$, where $i$ indexes observations. Let $\bm{\epsilon}^{x}_{i}\in\mathbb{R}^{d}$, $d\geq 1$, be a vector of random error variables of length $d$ and $\bm{\epsilon}^{y}_{i}\in\mathbb{R}^{Q}$, $Q\geq 1$, be a vector of random error variables of length $Q$. One way of modeling the errors-in-variables (EIV) is:

[TABLE]

Equation (1) is only identified when multiple measurements exist for the same vector of regressors or when a subsample of observations exists in which the regressors are measured exactly (Carroll et al. (2006)). Carroll et al. (2006) discussed a standard regression setting, not a multi-input/multi-output production process. Thus, repeated measurement requires all but one of the netputs to be identical across at least two observations.[5] Neither of these conditions is likely to hold for typical production data sets; therefore, we develop an alternative approach to identification.

[5] Here we use the term netputs to describe the union of the input and output vectors.

As our starting point, we use the alternative, but equivalent, representation of the EIV model proposed by Kuosmanen and Johnson (2017):

[TABLE]

Clearly, the representations of Carroll et al. (2006) and Kuosmanen and Johnson (2017) are equivalent if:

[TABLE]

We define the following normalization:

[TABLE]

which implies:

[TABLE]

We refer to $(\bm{g}_{i}^{x},\bm{g}_{i}^{y})$ as the true noise direction and in the most general case we allow the direction to be observation specific.[6] The estimation methods to consider noise in potentially all inputs will depend on our assumptions about the production technology, which are discussed in the following subsection.

[6] When the noise direction is observation specific and random, all inputs and outputs potentially contain noise and therefore are endogenous variables. If some components of the $(\bm{g}^{x},\bm{g}^{y})$ vector are zero, this implies the associated variables are exogenous and measured with certainty. See Kuosmanen and Johnson (2017) for more details.

2.2 Production Model

Researchers use production function models, cost function models, or distance function models to characterize production technologies. Considering a general production process with multiple inputs used to produce multiple outputs, we define the production possibility set as:

[TABLE]

Following Shephard (1970), we adopt the following standard assumptions to assure that $T$ represents a production technology:

1. T is closed;

2. T is convex;

3. Free disposability of inputs and outputs; i.e., if $\left(\bm{\tilde{x}}^{l},\bm{\tilde{y}}^{l}\right)\in T$ and $\left(\bm{\tilde{x}}^{k},-\bm{\tilde{y}}^{k}\right)\geq\left(\bm{\tilde{x}}^{l},-\bm{\tilde{y}}^{l}\right)$, then $\left(\bm{\tilde{x}}^{k},\bm{\tilde{y}}^{k}\right)\in T$.

For an alternative representation, see, for example, Frisch (1964).

Developing methods to estimate characteristics of the production technology while imposing these standard axioms was a popular and fruitful topic from the early 1950s until the early 1980s, generating such classic papers as Koopmans (1951), Shephard (1953, 1970), Afriat (1972), Charnes et al. (1978),[7] and Varian (1984). Unfortunately, these methods are deterministic in the sense that they rely on a strong assumption that the data do not contain any measurement errors, omitted variables, or other sources of random noise. Furthermore, for some research communities linear programs were seen as harder to implement than parametric regression, which could be calculated via normal equations. Thus, most econometricians and applied economists have chosen to use parametric models, sacrificing flexibility for ease of estimation and the inclusion of noise in the model.

[7] Data Envelopment Analysis is perhaps one of the largest success stories and has become an extremely popular method in the OR toolbox for studying efficiency.

Here we focus our attention on the distance function because it allows the joint production of multiple outputs using multiple inputs. The production function and cost functions can be seen as special cases of the distance function in which there is either a single output or a single input (cost), respectively. Further, motivated by our discussion of EIV models above, we consider a directional distance function which allows for measurement error in potentially all variables. We try to relax both the parametric and deterministic assumptions common in earlier approaches to modeling multi-output/multi-input technologies. We do this by building on an emerging literature that revisits the axiomatic nonparametric approach incorporating standard statistical structures including noise (Kuosmanen (2008); Kuosmanen and Johnson (2010)).

2.2.1 The Deterministic Directional Distance Function (DDF)

Luenberger (1992) and Chambers et al. (1996, 1998) introduced the directional distance function, defined for a technology T as:

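$\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)=\max\left\{\delta\in\mathbb{R}:\left(\bm{\tilde{x}}-\delta\bm{g}^{x},\bm{\tilde{y}}+\delta\bm{g}^{y}\right)\in T\right\},$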

where $\bm{\tilde{x}}$ and $\bm{\tilde{y}}$ are the observed input and output vectors, such that $\bm{\tilde{x}}\in\mathbb{R}_{+}^{d}$ and $\bm{\tilde{y}}\in\mathbb{R}_{+}^{Q}$ are assumed to be observed without noise and fully describe the resources used in production and the goods or services generated from production. $\bm{g}^{x}\in\mathbb{R}_{+}^{d}$ is the direction vector in the input space, $\bm{g}^{y}\in\mathbb{R}_{+}^{Q}$ is the direction vector in the output space, and $\left(\bm{g}^{x},\bm{g}^{y}\right)\in\mathbb{R}_{+}^{d+Q}$ defines the direction from the point $\left(\bm{\tilde{x}},\bm{\tilde{y}}\right)$ in which the distance function is measured.[8] $\delta$ is commonly interpreted as a measure of inefficiency by quantifying the number of bundles of size $\left(\bm{g}^{x},\bm{g}^{y}\right)$ needed to move the observed point $\left(\bm{\tilde{x}},\bm{\tilde{y}}\right)$ to the boundary of the technology in a deterministic setting.

[8] We assume $\left(\bm{g}^{x},\bm{g}^{y}\right)\neq\bm{0}$; i.e., at least one of the components of either $\bm{g}^{x}$ or $\bm{g}^{y}$ is non-zero.

Chambers et al. (1998) explained how the directional distance function characterizes the technology T for a given direction vector $\left(\bm{g}^{x},\bm{g}^{y}\right)$ ; specifically:

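$\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)\geq 0\ \text{if and only if}\ \left(\bm{\tilde{x}},\bm{\tilde{y}}\right)\in T.$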

If T satisfies the assumptions stated in Section 2.2, then the directional distance function $\overrightarrow{D}_{T}:\mathbb{R}_{+}^{d}\times\mathbb{R}_{+}^{Q}\times\mathbb{R}_{+}^{d}\times\mathbb{R}_{+}^{Q}\to\mathbb{R}_{+}$ has the following properties (see Chambers et al. (1998)):

(a) $\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)$ is upper semicontinuous in $\bm{\tilde{x}}$ and $\bm{\tilde{y}}$ (jointly);

(b) $\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\lambda\ \bm{g}^{x},\lambda\ \bm{g}^{y}\right)=\left(1/\lambda\right)\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right),\lambda>0$;

(c) $\bm{\tilde{y}^{\prime}}\geq\bm{\tilde{y}}\Rightarrow\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}^{\prime}};\bm{g}^{x},\bm{g}^{y}\right)\leq\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)$;

(d) $\bm{\tilde{x}^{\prime}}\geq\bm{\tilde{x}}\Rightarrow\overrightarrow{D}_{T}\left(\bm{\tilde{x}^{\prime}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)\geq\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)$;

(e) If T is convex, then $\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)$ is concave in $\bm{\tilde{x}}\ \text{and}\ \bm{\tilde{y}}$.

An additional property of the DDF is the translation invariance:

(f) $\overrightarrow{D}_{T}\left(\bm{\tilde{x}}-\alpha\bm{g}^{x},\bm{\tilde{y}}+\alpha\bm{g}^{y};\bm{g}^{x},\bm{g}^{y}\right)=\overrightarrow{D}_{T}\left(\bm{\tilde{x}},\bm{\tilde{y}};\bm{g}^{x},\bm{g}^{y}\right)-\alpha$.
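The definition of the DDF and properties (b) and (f) can be checked numerically. The following is a minimal sketch, assuming a hypothetical single-input, single-output technology $T=\{(x,y):x\geq 0,\ y\leq\sqrt{x}\}$ that is not taken from the paper; the function names, the bisection bracket, and the tolerance are likewise illustrative choices. Because moving along $(-g^{x},g^{y})$ eventually leaves this $T$, membership is monotone in $\delta$ and the maximum can be located by bisection.

```python
import math

def in_technology(x, y):
    # Hypothetical technology (an assumption for illustration): y <= sqrt(x), x >= 0.
    return x >= 0.0 and y <= math.sqrt(x)

def ddf(x, y, gx, gy, bound=1e3, tol=1e-10):
    """Largest delta such that (x - delta*gx, y + delta*gy) lies in T (bisection)."""
    lo, hi = -bound, bound
    # The bracket must start feasible at lo and infeasible at hi.
    if not in_technology(x - lo * gx, y + lo * gy):
        raise ValueError("lower end of the bracket is not in T")
    if in_technology(x - hi * gx, y + hi * gy):
        raise ValueError("upper end of the bracket is still in T")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_technology(x - mid * gx, y + mid * gy):
            lo = mid
        else:
            hi = mid
    return lo

d = ddf(4.0, 1.0, 1.0, 1.0)                    # interior point, so d >= 0
d_scaled = ddf(4.0, 1.0, 2.0, 2.0)             # property (b) with lambda = 2
d_shifted = ddf(4.0 - 0.3, 1.0 + 0.3, 1.0, 1.0)  # property (f) with alpha = 0.3
d_outside = ddf(1.0, 2.0, 1.0, 1.0)            # point outside T, so d < 0
print(d, d_scaled, d_shifted, d_outside)
```

For this technology the first value solves $\sqrt{4-\delta}=1+\delta$, so it can be compared with the closed form $(-3+\sqrt{21})/2$.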

Several theoretical contributions have been made to extend the deterministic DDF, see for example Färe and Grosskopf (2010), Aparicio et al. (2017), Kapelko and Oude Lansink (2017), and Roshdi et al. (2018). The deterministic DDF has been used in several recent applications, including Baležentis and De Witte (2015), Adler and Volta (2016), and Fukuyama and Matousek (2018).

2.2.2 The Stochastic Directional Distance Function

The properties of the deterministic DDF also apply for the stochastic DDF (Färe et al. (2017)). Here we focus on estimating a stochastic DDF considering a residual which is mean zero.[9]

[9] Two models are possible: 1) a mean zero residual, indicating that the residual contains only noise, used to pursue a productivity analysis, or 2) a composed residual with both inefficiency and noise. Our direction selection analysis is used in the first step of Kuosmanen and Johnson’s three step procedure in which a conditional mean is estimated. This is represented in Figure 1.

Using the statistical model in Section 2.1 and the functional representation of technology in Section 2.2, we restate Proposition 2 in Kuosmanen and Johnson (2017) as:

Proposition 1.

If the observed data are generated according to the statistical model described in Section 2.1, then the value of the DDF in the observed data point $(\bm{x}_{i},\bm{y}_{i})$ is equal to the realization of the random variable $\epsilon_{i}$ with mean zero; specifically

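$\overrightarrow{D}_{T}\left(\bm{x}_{i},\bm{y}_{i},\bm{g}^{x},\bm{g}^{y}\right)=\epsilon_{i}\quad\forall i.$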

In the stochastic distance function literature, the translation property, (f) above, is commonly invoked to move an arbitrarily chosen netput variable out of the distance function to the left-hand side of the equation, yielding an equation that looks like a standard regression model; see, for example, Lovell et al. (1994) and Kuosmanen and Johnson (2017). Instead, we write the SDDF with all of the outputs on one side to emphasize that all netputs are treated symmetrically.

Under the assumption of constant returns to scale, normalizing by one of the netputs causes the noise terms to cancel for the regressors, thus eliminating the issue of endogeneity (e.g., Coelli (2000), Kuosmanen and Johnson (2017)). However, since we relax the constant returns to scale assumption, endogeneity can still be an issue.[10]

[10] If the endogeneity is caused by correlations in the errors across variables, it can be addressed by selecting an appropriate direction for the directional distance function. This is the direction we explore in the Monte Carlo simulation below in Section 4.1.

Färe et al. (2017), among others, have recognized that the selection of the direction vector affects the parameter estimates of the production function. In Appendix A.1, for the linear parametric DDF defined below, we prove that alternative directions lead to distinct parameter estimates.

3 Estimation

We now describe the estimation of the DDF under a specific parametric functional form and under nonparametric shape constrained methods.

3.1 Parametric Estimation and the DDF

Consider data composed of $n$ observations where the inputs are defined by $\bm{x_{i}},\ i=1,...,n$ and the outputs by $\bm{y_{i}},\ i=1,...,n$ . The estimator minimizes the squared residuals for a DDF with an arbitrary prespecified direction $\left(-\bm{g}^{x},\bm{g}^{y}\right)$ . For a linear production function, we formulate the estimator as:

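$\min_{\alpha,\bm{\beta},\bm{\gamma},\bm{\epsilon}}\ \sum_{i=1}^{n}\epsilon_{i}^{2}$

$\text{s.t.}\quad\bm{\gamma}^{\prime}\bm{y}_{i}=\alpha+\bm{\beta}^{\prime}\bm{x}_{i}-\epsilon_{i},$

(9a)

$\bm{\beta}^{\prime}\bm{g}^{x}+\bm{\gamma}^{\prime}\bm{g}^{y}=1,$

(9b)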

where $\alpha$ is the intercept, $\bm{\beta}$ and $\bm{\gamma}$ are the vectors of the marginal effects of the inputs and outputs, respectively, and the $\epsilon_{i},\,i=1,...,n$ are the residuals.

Equation (9b) enforces the translation property described in Chambers et al. (1998); i.e., scaling the netput vector by $\delta$ in the direction $(-\bm{g}^{x},\bm{g}^{y})$ causes the distance function to decrease by $\delta$ . The combination of Equation (9a) and Equation (9b) ensures that the residual is computed along the direction $(-\bm{g}^{x},\bm{g}^{y})$ . Intuitively this is because the $\bm{\beta}$ and $\bm{\gamma}$ are rescaled proportionally to the direction $(-\bm{g}^{x},\bm{g}^{y})$ in Equation (9b). For a formal proof, see Kuosmanen and Johnson (2017), Proposition 2.
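The linear estimator can be sketched in a few lines for the case of one input and one output. This is an illustrative sketch under stated assumptions, not the authors' implementation (the paper reports using a quadratic solver in MATLAB): writing $\epsilon=Z\theta$ with $\theta=(\alpha,\beta,\gamma)$ and rows $z_{i}=(1,x_{i},-y_{i})$, the normalization becomes $a^{\prime}\theta=1$ with $a=(0,g^{x},g^{y})$, and the equality-constrained least squares solution is $\theta=(Z^{\prime}Z)^{-1}a/\left(a^{\prime}(Z^{\prime}Z)^{-1}a\right)$, provided $Z^{\prime}Z$ is invertible. The function name `linear_ddf_fit` and the simulated data are hypothetical.

```python
import numpy as np

def linear_ddf_fit(x, y, gx, gy):
    """Minimize sum(eps^2) s.t. gamma*y = alpha + beta*x - eps and beta*gx + gamma*gy = 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # eps_i = alpha + beta*x_i - gamma*y_i, i.e. eps = Z @ theta with theta = (alpha, beta, gamma)
    Z = np.column_stack([np.ones_like(x), x, -y])
    a = np.array([0.0, gx, gy])          # normalization: a @ theta = 1
    w = np.linalg.solve(Z.T @ Z, a)      # (Z'Z)^{-1} a
    theta = w / (a @ w)                  # rescale so the normalization holds
    return theta, Z @ theta

# Illustrative data (assumed, not from the paper): y = 2 + 0.5 x plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.5, size=200)

for gx, gy in [(0.0, 1.0), (0.6, 0.8), (1.0, 0.0)]:
    theta, eps = linear_ddf_fit(x, y, gx, gy)
    alpha, beta, gamma = theta
    # Implied slope of y on x is beta / gamma; it varies with the direction.
    print((gx, gy), "slope:", beta / gamma, "sum of squared residuals:", eps @ eps)
```

With the direction $(g^{x},g^{y})=(0,1)$ the normalization forces $\gamma=1$ and the problem reduces to ordinary least squares of $y$ on $x$, which gives a convenient check of the sketch.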

3.2 The CNLS-d Estimator

Convex Nonparametric Least Squares (CNLS) is a non-parametric estimator that imposes the axiomatic properties, such as monotonicity and concavity, on the production technology. The estimator CNLS-d is the directional distance function generalization of CNLS (Hildreth (1954), Kuosmanen (2008)). While CNLS allows for just a single output, CNLS-d permits multiple outputs. In CNLS the direction along which residuals are computed is specified a priori and is typically measured in terms of the unique output, $\bm{y}$ . This corresponds to the assumption that noise is only present in $\bm{y}$ and that all other variables, $\bm{\tilde{x}}$ , do not contain noise. CNLS-d allows the residual to be measured in an arbitrary prespecified direction. If all components of the direction vector are non-zero, this corresponds to an assumption that noise is present in all inputs.

Using the same input-output data defined in Section 2.1, the CNLS-d estimator is given by:

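$\min_{\bm{\alpha},\bm{\beta},\bm{\gamma},\bm{\epsilon}}\ \sum_{i=1}^{n}\epsilon_{i}^{2}$

$\text{s.t.}\quad\bm{\gamma}_{i}^{\prime}\bm{y}_{i}=\alpha_{i}+\bm{\beta}_{i}^{\prime}\bm{x}_{i}-\epsilon_{i},$

(10a)

$\alpha_{i}+\bm{\beta}_{i}^{\prime}\bm{x}_{i}-\bm{\gamma}_{i}^{\prime}\bm{y}_{i}\leq\alpha_{j}+\bm{\beta}_{j}^{\prime}\bm{x}_{i}-\bm{\gamma}_{j}^{\prime}\bm{y}_{i},$

(10b)

$\bm{\beta}_{i}\geq 0,$

(10c)

$\bm{\beta}_{i}^{\prime}\bm{g}^{x}+\bm{\gamma}_{i}^{\prime}\bm{g}^{y}=1,$

(10d)

$\bm{\gamma}_{i}\geq 0,$

(10e)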

where $\alpha_{i},\,i=1,...,n$ is the vector of the intercept terms, $\bm{\beta_{i}},\,i=1,..,n$ and $\bm{\gamma_{i}},\,i=1,..,n$ are the matrices of the marginal effects of the inputs and the outputs, respectively, and $\epsilon_{i},\,i=1,...,n$ is the vector of the residuals (Kuosmanen and Johnson, 2017).

Equation (10a) is similar to (9a) with the notable difference that $(\alpha_{i},\bm{\beta_{i}},\bm{\gamma_{i}})$ are indexed by $i$, indicating that each observation has its own hyperplane defined by the triplet $(\alpha_{i},\bm{\beta_{i}},\bm{\gamma_{i}})$. Equation (10b), which corresponds to the Afriat inequalities, imposes concavity. Given Equation (10b), Equation (10c) imposes the monotonicity of the estimated frontier relative to the inputs. Equation (10d) enforces the translation property described in Chambers et al. (1998) and has the same interpretation as Equation (9b). Similar to Equation (10c), the combination of Equation (10b) and Equation (10e) imposes the monotonicity of the DDF relative to the outputs. In Equation (10), we specify the CNLS-d estimator with a single common direction, $\left(-\bm{g}^{x},\bm{g}^{y}\right)$.[11]

[11] Alternatively, some researchers may be interested in using observation specific directions or perhaps group specific directions (Daraio and Simar (2016)). In Appendix A.3, we derive the conditions under which multiple directions can be used in CNLS-d while still maintaining the axiomatic property of global convexity of the production technology. Consider two groups each with their own direction used in the directional distance function. Essentially, the convexity constraint holds as long as the noise is orthogonal to the difference of the two directions used in the estimation. A simple example of this situation is all the noise being in one dimension and the difference between the two directions for this dimension is zero. However, this condition is restrictive when noise is potentially present in all variables. Thus, specifying multiple directions in CNLS-d while maintaining the axiomatic properties of the estimator, specifically, the convexity of the production possibility set, is still an open research question.
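The role of the constraints described above can be illustrated by checking them for a candidate set of hyperplanes rather than by solving the full quadratic program. The sketch below is an assumption-laden illustration for one input and one output: the function `cnls_d_feasible` is hypothetical, and the candidate hyperplanes are tangents to the concave function $y=\sqrt{x}$, rescaled so that the normalization holds. Since tangents to a concave function lie above it, the Afriat inequalities hold with zero residuals; reassigning the hyperplanes to the wrong observations breaks them.

```python
import numpy as np

def cnls_d_feasible(x, y, alpha, beta, gamma, gx, gy, tol=1e-9):
    """Check concavity (Afriat), monotonicity and normalization for d = Q = 1."""
    # value[i, j] = alpha_j + beta_j * x_i - gamma_j * y_i: hyperplane j at observation i
    value = alpha[None, :] + np.outer(x, beta) - np.outer(y, gamma)
    eps = np.diag(value).copy()                  # residual from each observation's own hyperplane
    concave = np.all(eps[:, None] <= value + tol)        # own hyperplane attains the row minimum
    monotone = np.all(beta >= -tol) and np.all(gamma >= -tol)
    normalized = np.allclose(beta * gx + gamma * gy, 1.0)
    return bool(concave and monotone and normalized), eps

# Noise-free points on y = sqrt(x) and their tangent lines y = t_i + s_i * x (illustrative).
x = np.arange(1.0, 6.0)
y = np.sqrt(x)
s = 1.0 / (2.0 * np.sqrt(x))     # tangent slopes
t = np.sqrt(x) / 2.0             # tangent intercepts
gx, gy = 0.6, 0.8                # an arbitrary direction with non-zero components
scale = 1.0 / (s * gx + gy)      # rescale so that beta*gx + gamma*gy = 1
alpha, beta, gamma = scale * t, scale * s, scale

ok, eps = cnls_d_feasible(x, y, alpha, beta, gamma, gx, gy)
# Assigning each observation the hyperplane of another observation violates the Afriat inequalities.
ok_reversed, _ = cnls_d_feasible(x, y, alpha[::-1], beta[::-1], gamma[::-1], gx, gy)
print(ok, ok_reversed)
```

In an actual estimation the triplets $(\alpha_{i},\beta_{i},\gamma_{i})$ would be decision variables of a quadratic program with these constraints; the check above only verifies a given solution.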

4 Measuring MSE under Alternative Directions

4.1 Illustrative Example

Data Generation Process

For our illustrative example, we use a simple linear cost function and a directional distance linear parametric estimator. We consider two noise generation processes: a random noise direction and a fixed noise direction. Here we discuss the random noise direction case, but direct the reader to Appendix B for a discussion of the fixed noise direction case.

For our example we consider a single output cost function where the observations $\left(y_{i},c_{i}\right),i=1,\ldots,n$ , are created by the Data Generation Process (DGP) outlined in Algorithm 1:

Algorithm 1

1. Output, $\tilde{y}_{i}$, is drawn from the continuous uniform distribution $U\left[0,1\right]$.

2. Cost is calculated as $\tilde{c}_{i}=\beta_{0}\ \tilde{y}_{i}$, where $\beta_{0}=1$.

3. The noise terms, $\epsilon_{y_{i}},\epsilon_{c_{i}}$, are constructed as follows:

(a) $\epsilon_{0}$ is calculated as:

$\epsilon_{0}=\frac{1}{2}\left[\;\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\tilde{y}_{i}-\bar{y}\right)^{2}}+\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\tilde{c}_{i}-\bar{c}\right)^{2}}\;\right],$

(11)

where $\bar{y}=\frac{1}{n}\sum_{i=1}^{n}{\tilde{y}_{i}}$ and $\bar{c}=\frac{1}{n}\sum_{i=1}^{n}{\tilde{c}_{i}}$ are the means of the output and cost without noise, respectively.

(b) The scalar length of the noise is rescaled by the vector, $v_{q\epsilon_{i}}$, in each dimension. These scaling factors are calculated as $v_{q\epsilon_{i}}=\frac{v^{*}_{q\epsilon_{i}}}{\lvert\lvert\bm{v^{*}_{\epsilon_{i}}}\rvert\rvert_{2}},q=\{1,2\}$, where $v^{*}_{q\epsilon_{i}}$ are drawn from a continuous uniform distribution $U[-1,1]$.

(c) $\left(\epsilon_{y_{i}},\epsilon_{c_{i}}\right)=l_{\epsilon_{i}}\,\bm{v_{\epsilon_{i}}},i=1,\ldots,n$, where $l_{\epsilon_{i}}$ is a scalar length drawn from the normal distribution, $N\left(0,\lambda\,\epsilon_{0}\right)$, where $\lambda$ is a prespecified initial value for the standard deviation and $\bm{v_{\epsilon_{i}}}=\left[v_{1\epsilon_{i}},v_{2\epsilon_{i}}\right]$ is a normalized direction vector.

4. The observations with noise are obtained by appending the noise terms to the generated data:

$\binom{y_{i}}{c_{i}}=\binom{\tilde{y}_{i}}{\tilde{c}_{i}}+\binom{\epsilon_{y_{i}}}{\epsilon_{c_{i}}},i=1,\ldots,n.$

(12)

Figure 3 illustrates the results for two cases of the data generating process; in the first case the direction of the noise is random, while in the second case the direction of the noise is fixed.
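The random noise direction case of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `simulate_linear_cost_data` is hypothetical, and $\lambda\,\epsilon_{0}$ is interpreted as the standard deviation of the noise length, which is an assumption based on the description of $\lambda$ in step 3(c).

```python
import numpy as np

def simulate_linear_cost_data(n, lam, beta0=1.0, rng=None):
    """Generate (y_true, c_true, y, c) following Algorithm 1 with random noise directions."""
    rng = np.random.default_rng() if rng is None else rng
    # Steps 1 and 2: noise-free output and cost.
    y_true = rng.uniform(0.0, 1.0, size=n)
    c_true = beta0 * y_true
    # Step 3(a): average of the sample standard deviations, Equation (11).
    eps0 = 0.5 * (np.std(y_true, ddof=1) + np.std(c_true, ddof=1))
    # Step 3(b): random direction per observation, normalized to unit Euclidean length.
    v_star = rng.uniform(-1.0, 1.0, size=(n, 2))
    v = v_star / np.linalg.norm(v_star, axis=1, keepdims=True)
    # Step 3(c): scalar noise length; lam * eps0 is treated as the standard deviation (assumption).
    length = rng.normal(0.0, lam * eps0, size=n)
    noise = length[:, None] * v
    # Step 4: observed data, Equation (12).
    y = y_true + noise[:, 0]
    c = c_true + noise[:, 1]
    return y_true, c_true, y, c

y_true, c_true, y, c = simulate_linear_cost_data(100, 0.6, rng=np.random.default_rng(1))
print(y[:3], c[:3])
```

Passing a seeded generator makes the draws reproducible, which is convenient when the same training and testing sets are reused across directions.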

Evaluating the Parametric Estimator’s Performance

We use two criteria to assess the performance of the parametric estimator: 1) Mean Squared Error (MSE) comparing the true function to the estimated function, and 2) MSE comparing the estimated function to a testing data set. While we can calculate both metrics for our Monte Carlo simulations, only the second metric can be used with our application data below.

To calculate deviations, we use the MSE direction $\left(g_{MSE}^{y},g_{MSE}^{c}\right)$ . For any particular point of the testing set, $\left(y_{ts_{i}},c_{ts_{i}}\right),i=1,\ldots,n$ , we determine the estimates, $\left(\hat{y}_{ts_{i}},\hat{c}_{ts_{i}}\right),i=1,\ldots,n$ , defined as the intersection of the estimated function characterized by the coefficients $\left(\hat{\alpha},\hat{\beta}\right)$ and the line passing through $\left(y_{ts_{i}},c_{ts_{i}}\right),i=1,\ldots,n$ , and direction vector $\left(g_{MSE}^{y},g_{MSE}^{c}\right)$ . We evaluate the value of the MSE as:

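$MSE=\frac{1}{n}\sum_{i=1}^{n}\left(\left(\hat{y}_{ts_{i}}-y_{ts_{i}}\right)^{2}+\left(\hat{c}_{ts_{i}}-c_{ts_{i}}\right)^{2}\right).$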

To compare the true function to the estimated function, we use the Linear Function Data Generation Process, Algorithm 1, steps 1 and 2, to construct our testing data set $\left(y_{ts_{i}},c_{ts_{i}}\right),i=1,\ldots,n$ . To evaluate the estimated function without knowing the true function the testing set is built using the full Linear Function Data Generation Process.

Figure 4 illustrates the MSE computations.
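The directional deviation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the estimated function is taken to be the line $c=\hat{\alpha}+\hat{\beta}y$ , the movement along the direction follows the usual directional distance convention (output changes by $+t\,g^{y}$ and cost by $-t\,g^{c}$ ), and the MSE is taken as the average squared Euclidean length of the step. The name `directional_mse` is hypothetical.

```python
def directional_mse(alpha, beta, points, g):
    """Average squared distance from each (y, c) point to the assumed line
    c = alpha + beta * y, measured along the direction g = (g_y, g_c).

    Assumption: a step of length t moves a point to (y + t*g_y, c - t*g_c).
    """
    g_y, g_c = g
    denom = g_c + beta * g_y
    if abs(denom) < 1e-12:
        raise ValueError("direction is parallel to the estimated line")
    norm_sq = g_y ** 2 + g_c ** 2
    total = 0.0
    for y, c in points:
        # Solve c - t*g_c = alpha + beta*(y + t*g_y) for the step t.
        t = (c - alpha - beta * y) / denom
        total += t * t * norm_sq
    return total / len(points)
```

For a line with slope one, the direction with angle $\pi/4$ is orthogonal to the line under this convention, so the measured deviation is the shortest distance to the line.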

Additional Information Describing the Simulations

We apply the DGP described above to generate a training set, $\left(y_{tr_{i}},c_{tr_{i}}\right),i=1,\ldots,n_{tr}$ , and a testing set, $\left(y_{ts_{i}},c_{ts_{i}}\right),i=1,\ldots,n_{ts}$ , in which noise is introduced to the observations in random directions. We set the noise scaling coefficient to $\lambda=0.6$ and the number of observations to $n_{tr}=n_{ts}=100$ . We run $100$ repetitions of the simulation for each experiment on a computer with an Intel Core i7 860 2.80 GHz CPU and 8 GB of RAM. We use the quadratic solver in MATLAB 2017a.

For the estimator, we define the direction vector used in the parametric DDF as a function of an angular variable $\theta$ , which allows us to investigate alternative directions. Specifically, the direction vector used in the DDF is $\left(g^{y},g^{c}\right)=\left(\cos(\theta_{t}),\sin(\theta_{t})\right)$ . We examine the set of directions corresponding to the angles $\theta_{t}\in\left\{0,\ \pi/8,\ \pi/4,\ 3\pi/8,\ \pi/2\right\}$ .
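As a small illustration, the grid of candidate directions can be generated directly from the angles. The function name `direction_grid` is hypothetical; the angles are those listed above.

```python
import math

def direction_grid(angles):
    """Map each angle theta to the direction vector (cos(theta), sin(theta))."""
    return [(math.cos(theta), math.sin(theta)) for theta in angles]

ANGLES = [0.0, math.pi / 8, math.pi / 4, 3 * math.pi / 8, math.pi / 2]
DIRECTIONS = direction_grid(ANGLES)
```

Each vector has unit length, so the directions differ only in how they split the deviation between output and cost.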

Results: Random Noise Directions

Table 1 and Table 2 report results for the two performance criteria introduced above and illustrated in Figure 4: the MSE relative to the true function and the MSE relative to a testing data set, respectively. Table 1 shows that the direction corresponding to the angle $\pi/4$ , $\left(g^{y}=0.707,g^{c}=0.707\right)$ , produces the smallest MSE values (shown in bold in the table) regardless of the direction used for the MSE computation. However, the estimator’s quality diminishes if we select the extreme directions corresponding to the angles $0$ and $\pi/2$ . Table 2 reports performance on a testing set; here the direction corresponding to the smallest MSE value (shown in bold) is always the one matching the direction used in the MSE computation. In applications, using a testing set is necessary because the true function is unknown. Table 2 shows that the benefits of matching the estimation direction to the MSE evaluation direction outweigh the benefits of selecting a direction based on the properties of the function being estimated.

For the out-of-sample testing set, the direction that provides the smallest MSE value is the direction used for the MSE computation. Because the functional estimate is optimized for the direction specified in the SDDF, it is perhaps expected that using the same direction that will be used in the MSE evaluation would produce a relatively low MSE compared to other directions. However, when the functional estimate is compared to the true function, the MSE values are around ten times smaller than the out-of-sample testing case. In out-of-sample testing the presence of noise in the observations causes a deviation regardless of the quality of the estimator or the number of observations. The DDF direction corresponding to the smallest MSE is the direction orthogonal to the true function (i.e., $\pi/4$ for our DGP). This direction provides the shortest distance from the observations to the true function. We conclude that, in this experiment, it is preferable to select a direction orthogonal to the true function (see Section 5 for further experiments).

From the fixed noise direction experiments (see Appendix B.1), we observe that using a direction for the estimator that matches the direction used for the noise generation significantly reduces the MSE measured against the true function. From this, we infer that when endogeneity is severe, using a direction that matches the characteristics of this endogeneity significantly improves the fit of the estimator; i.e., the MSE is $50\%$ smaller for the matching direction than for the second best direction in $70\%$ of the cases (see Section 5 for the details).

Finally, we need to solve the problem of evaluating alternative directions when the true function is unknown so that we can evaluate alternative directions in the application data. Below, we describe our proposed alternative measure of fit.

4.2 Radial MSE Measure

MSE is typically measured by the average sum of squared errors in the dimension of a single variable, such as cost or output. As explained in Section 4.1, when we compare out-of-sample performance, we find that the best direction to use in estimating an SDDF is the direction used for MSE evaluation, regardless of the direction of noise in the DGP or any other characteristics of the DGP. To avoid this relationship between the direction of estimation and the direction of evaluation, we propose a radial MSE measure.

We begin by normalizing the data to a unit cube and consider a case of $Q$ outputs and $n$ observations, where the original observations are:

[TABLE]

The normalized observations are:

[TABLE]

Our radial MSE measure is the distance from the testing set observation to the estimated function measured along a ray from the testing set observations to the center $C$ . Having normalized the data, the center for the radial measure is $C=\left[\breve{y}_{1},\ldots,\breve{y}_{Q},\breve{c}\right]=\left[\overbrace{0,\ldots,0}^{Q},1\right].$

The radial MSE measure is the average of the distance from each testing set observation to the estimated function measured radially. Figure 5 illustrates this measure. For a convex function, a radial measure reduces the bias in the measure for extreme values in the domain.
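A minimal sketch of the radial measure follows, under several assumptions that are not taken from the paper: the estimated function is available as a callable `f` mapping normalized outputs to normalized cost, the center $C$ lies above the function, distances are squared before averaging, and a point lying above the function is measured along the same line through $C$ but moving away from $C$ . The intersection is found by bisection; the names `radial_sq_distance` and `radial_mse` are hypothetical.

```python
def radial_sq_distance(f, y, c, tol=1e-12):
    """Squared distance from the normalized point (y, c) to the surface
    c = f(y), measured along the line through the point and C = (0,...,0,1)."""
    p = list(y) + [c]
    center = [0.0] * len(y) + [1.0]
    d = [ci - pi for ci, pi in zip(center, p)]

    def gap(t):
        q = [pi + t * di for pi, di in zip(p, d)]
        return q[-1] - f(q[:-1])  # > 0 above the surface, < 0 below it

    if abs(gap(0.0)) <= tol:
        return 0.0
    if gap(0.0) < 0:
        lo, hi = 0.0, 1.0  # point below the surface: move toward C
        if gap(hi) <= 0:
            raise ValueError("center C must lie above the estimated function")
    else:
        lo, hi = -1.0, 0.0  # point above the surface: move away from C
        while gap(lo) > 0:
            lo *= 2.0
            if lo < -1e6:
                raise ValueError("no intersection found along the line")
    for _ in range(200):  # bisection keeps gap(lo) <= 0 <= gap(hi)
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return t * t * sum(di * di for di in d)


def radial_mse(f, ys, cs):
    """Average squared radial distance over a testing set."""
    return sum(radial_sq_distance(f, y, c) for y, c in zip(ys, cs)) / len(cs)
```

Because every observation is measured toward the same center, the evaluation direction no longer coincides with any single estimation direction.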

5 Monte Carlo Simulations

We next examine how different DGPs affect the optimal direction for the DDF estimator based on a set of Monte Carlo simulations. We consider both random noise directions for each observation and a fixed noise direction representing a high endogeneity case. We consider the effects of the different variance levels for the noise and changes in the underlying distribution of the production data. Using the simplest case of two outputs and a fixed cost level for all observed units allows us to separate the effects of the data and of the function.

5.1 CNLS-d Formulation for Cost Isoquant Estimation

Before describing our experiments, we first outline the CNLS-d formulation for estimating the iso-cost level set. It is based on the following optimization problem:

[TABLE]

Note that all observations, $\bm{y}_{i}$ , have a common cost level. This allows us to focus on a 2-dimensional estimation problem. For results related to 3-dimensional estimation problems, see Appendix B.2, Experiment 6.

We can recover the fitted values, $\hat{y}_{i}$ , and the coefficient, $\alpha_{i},\ i=1,\ldots,n$ , using:

[TABLE]

5.2 Experiments

We conducted several experiments to investigate the optimal direction for the DDF estimator. Four experiments’ results are shown in the main text of the paper with two additional experiments described in the appendix.

Experiment 1 - Base case: A two output circular isoquant with uniformly distributed angle parameters and random noise direction

For the base case, we consider a fixed cost level and approximate a two output isoquant; i.e., $Q=2$ . Indexing the outputs by $q$ and observations by $i$ , we generate the output variables as:

[TABLE]

where $\bm{\tilde{y}_{i}}$ is the observation on the isoquant and $\bm{\epsilon_{i}}$ is the noise. We generate the output levels $\tilde{y}_{qi},\ q=1,\ldots,Q\ ,i=1,\ldots,n$ as:

[TABLE]

where $\theta_{i},\ i=1,\ldots,n$ , is drawn randomly from a continuous uniform distribution, $U\left[0,\frac{\pi}{2}\right]$ . The noise terms, $\epsilon_{qi},\ q=1,\ldots,Q,\ i=1,\ldots,n$ , have the following expressions:

[TABLE]

where the length $l$ is drawn from the normal distribution $N\left(0,\lambda\right)$ and the angle $\theta_{\epsilon_{i}}$ , which is observation specific and characterizes the noise direction for each observation, is drawn from a continuous uniform distribution $U\left[-\frac{\pi}{2},\frac{\pi}{2}\right]$ . The values considered for the directions in the CNLS-d estimator are $\theta_{\textit{CNLS-d}}\in\{0,\frac{\pi}{8},\frac{\pi}{4},\frac{3\pi}{8},\frac{\pi}{2}\}$ . The standard deviation of the normal distribution is $\lambda=0.1$ . We perform the experiment $100$ times for each parameter setting.

Table 3 reports the radial MSE values from a testing set of $n$ observations lying on the true function.

As shown in Table 3, the angle corresponding to the smallest MSE (shown in bold) is $\frac{\pi}{4}$ , which gives a direction orthogonal to the center of the true function. The MSE values differ significantly across directions, increasing at similar rates as the direction angle deviates from $\frac{\pi}{4}$ in either direction.
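The data generating process of this experiment can be sketched as follows. The sketch assumes a unit-radius quarter circle, $\tilde{y}_{i}=(\cos\theta_{i},\sin\theta_{i})$ , and noise components $\left(l\cos\theta_{\epsilon_{i}},l\sin\theta_{\epsilon_{i}}\right)$ ; these component formulas, the function name `simulate_isoquant`, and the `noise_angle` argument are illustrative assumptions rather than the authors' code.

```python
import math
import random

def simulate_isoquant(n, lam=0.1, noise_angle=None, seed=0):
    """Return (clean, noisy) lists of two-output observations.

    clean: points on an assumed unit-radius circular isoquant, with the
           angle drawn from U[0, pi/2].
    noisy: clean points plus noise of length l ~ N(0, lam) along an angle
           drawn from U[-pi/2, pi/2], or along noise_angle if it is given.
    """
    rng = random.Random(seed)
    clean, noisy = [], []
    for _ in range(n):
        theta = rng.uniform(0.0, math.pi / 2)
        y_tilde = (math.cos(theta), math.sin(theta))
        length = rng.gauss(0.0, lam)  # lam is the standard deviation
        if noise_angle is None:
            angle = rng.uniform(-math.pi / 2, math.pi / 2)
        else:
            angle = noise_angle
        eps = (length * math.cos(angle), length * math.sin(angle))
        clean.append(y_tilde)
        noisy.append((y_tilde[0] + eps[0], y_tilde[1] + eps[1]))
    return clean, noisy
```

Passing a fixed `noise_angle` gives every observation the same noise direction, which is one way to mimic a high endogeneity design.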

Experiment 2 - The base case with fixed noise directions

In this experiment, $\theta_{\epsilon_{i}}$ , which characterizes the noise direction for each observation, is constant for all observations, $\theta_{\epsilon}$ . The values used for $\theta_{\epsilon}$ and the directions in the CNLS-d estimator are the same, $0,\frac{\pi}{8},\frac{\pi}{4},\frac{3\pi}{8},\frac{\pi}{2}$ . The standard deviation of the normal distribution is again $\lambda=0.1$ . We perform the experiment $100$ times for each parameter setting. Table 4 reports the results.

Each row of Table 4 corresponds to a different noise direction in the DGP. The bold numbers identify the directions in the CNLS-d estimator that obtain the smallest MSE for each noise direction. The bold values appear on the diagonal (from the upper-left to the lower-right of Table 4); that is, the best direction in CNLS-d coincides with the noise direction. This confirms our previous insight from the parametric estimator and fixed noise direction case described in Appendix B.1. This result indicates that selecting the direction in the SDDF that matches the underlying noise direction in the DGP results in improved functional estimates.

Experiment 3 - Base case with fixed noise direction and different noise levels

In Experiment 3, we vary the noise term by changing the $\lambda$ coefficient. Table 5 reports the results for $\lambda=0.05$ .

In Table 5 (Experiment 3, with $\lambda=0.05$ ), we do not observe the diagonal pattern seen in Experiment 2, and the best direction for the CNLS-d estimator does not match the direction selected for the noise. This leads us to hypothesize that when the noise level is small, data characteristics, such as the distribution of the regressors or the shape of the function, affect the estimation, whereas when the noise level is large, the regressors’ relative variability becomes a more dominant factor in determining the best direction for the CNLS-d estimator.

However, with $\lambda=0.2$ the results of Experiment 3 are consistent with those from Experiment 2; i.e., the best direction always coincides with the noise direction selected. The results of Experiment 3 with $\lambda=0.2$ are reported in Appendix B, Table 15.

Experiment 4 - Base case with different distributions for the initial observations on the true function

In Experiment 4, we seek to understand how changing the DGP for the angle, $\theta_{i},\ i=1,\ldots,n$ , affects the optimal direction. We consider three normal distributions with different parameters: $N\left[\frac{\pi}{8},\frac{\pi}{16}\right]$ , $N\left[\frac{\pi}{4},\frac{\pi}{16}\right]$ and $N\left[\frac{3\pi}{8},\frac{\pi}{16}\right]$ . We truncate the tails of each distribution so that the generated angles fall in the range $\left[0,\pi/2\right]$ . Noise is specified as in Experiment 1. Table 6 reports the results of this experiment.

In Table 6, we observe that selecting a direction in the SDDF to match $\bar{\theta}$ , the mean of the distribution for the angle variable used in the DGP, corresponds to the smallest MSE value. This result suggests that the estimator’s performance improves when we select a direction that points to the “center” of the data.
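One simple way to draw the truncated angles of this experiment is rejection sampling, sketched below. The sketch assumes that the second parameter of each normal distribution is a standard deviation and that truncation is done by discarding draws outside $\left[0,\pi/2\right]$ ; the paper does not state how the truncation is implemented, and the name `sample_truncated_angles` is hypothetical.

```python
import math
import random

def sample_truncated_angles(n, mean, sd=math.pi / 16,
                            low=0.0, high=math.pi / 2, seed=0):
    """Draw n angles from N(mean, sd), keeping only draws in [low, high]."""
    rng = random.Random(seed)
    angles = []
    while len(angles) < n:
        theta = rng.gauss(mean, sd)
        if low <= theta <= high:  # reject draws in the truncated tails
            angles.append(theta)
    return angles
```

With a mean of $\pi/4$ the truncation is symmetric, so the sample mean stays close to $\pi/4$ ; with means of $\pi/8$ or $3\pi/8$ the truncation is one-sided and shifts the sample mean slightly toward the interior.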

Appendix B.2 presents additional experiments, varying the distribution of the observations and considering three outputs with a fixed cost level. These experiments lend further support to the strategy of selecting a direction pointing to the “center” of the data.

6 Proposed Approach to Direction Selection

Based on Monte Carlo simulations, we found that the optimal direction depends on the shape of the function and the distribution of the observed data. This in itself is not surprising. However, when the data generation process has a unimodal distribution, a direction that aims towards the “center” of the data and is perpendicular to the true function at that point tends to outperform other directions. To apply this finding to a data set with $Q$ outputs and $n$ observations, $(y_{i1},\ldots,y_{iQ},c_{i}),\ i=1,\ldots,n$ , we suggest selecting the direction for the DDF as follows:

Normalize the data:

$\displaystyle\breve{y}_{ij}$ $\displaystyle=$ $\displaystyle\frac{y_{ij}-\min_{k}{y_{kj}}}{\max_{k}{y_{kj}}-\min_{k}{y_{kj}}},\ j=1,\ldots,Q,\ i,k=1,\ldots,n$

(24)

$\displaystyle\breve{c}_{i}$ $\displaystyle=$ $\displaystyle\frac{c_{i}-\min_{k}{c_{k}}}{\max_{k}{c_{k}}-\min_{k}{c_{k}}},\ i,k=1,\ldots,n$

(25)

Select the direction:

$\displaystyle\begin{bmatrix}g^{y_{1}}\\ \vdots\\ g^{y_{Q}}\\ g^{c}\end{bmatrix}=\begin{bmatrix}\text{median}\left(\breve{y}_{i1}\right)\\ \vdots\\ \text{median}\left(\breve{y}_{iQ}\right)\\ 1-\text{median}\left(\breve{c}_{i}\right)\end{bmatrix}.$

(26)

This provides a method for direction selection that can be used in applications when the true direction is unknown. [Footnote 12: A cost function is convex with respect to the point $[\breve{y}_{1},\ldots,\breve{y}_{Q},\breve{c}]=[0,\ldots,0,1]$ . Therefore, to have a ray that points from the point $[0,\ldots,0,1]$ to the median of the data, the directional vector $[\text{median}(\breve{y}_{i1}),\ldots,\text{median}(\breve{y}_{iQ}),1-\text{median}(\breve{c}_{i})]$ is needed.] We test the proposed method by estimating a cost function for a US hospital data set.
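The two steps of the proposed selection rule, min-max normalization followed by taking medians, can be sketched as below. The function name `median_direction` and the list-based data layout are assumptions made for illustration.

```python
from statistics import median

def median_direction(outputs, costs):
    """Direction vector (g^{y_1}, ..., g^{y_Q}, g^c) from the median rule.

    outputs: list of n rows, each holding Q output values.
    costs:   list of n cost values.
    """
    def normalize(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            raise ValueError("a variable with zero range cannot be normalized")
        return [(v - lo) / (hi - lo) for v in values]

    columns = list(zip(*outputs))  # one tuple per output variable
    # Output components: median of each normalized output.
    direction = [median(normalize(col)) for col in columns]
    # Cost component: one minus the median of normalized cost.
    direction.append(1.0 - median(normalize(costs)))
    return direction
```

For example, `median_direction([[0, 10], [5, 20], [10, 30]], [100, 200, 400])` gives output components of 0.5 each and a cost component of about 0.667, since the median normalized cost is 1/3.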

7 Cost Function Estimation of the US Hospital Sector

We analyze the cost variation across US hospitals using a conditional mean estimate of the cost function. We estimate a multi-output cost function for the US hospital sector by implementing our data-driven method for selecting the direction vector for the DDF. We report most productive scale size and marginal cost estimates.

7.1 Description of the Data Set

We obtain cost data from the American Hospital Association’s (AHA) Annual Survey Databases from 2007 to 2009. The costs reported include payroll, employee benefits, depreciation, interest, supply expenses and other expenses. We estimate a cost function, which can be interpreted as a distance function with a single input when hospitals face the same input prices. [Footnote 13: Unfortunately, we do not observe input prices. We chose to estimate a cost function and make the assumption of common input prices rather than impose an arbitrary division of the cost.] We obtain hospital output data from the Healthcare Cost and Utilization Project (HCUP) National Inpatient Sample (NIS) core file, which captures data annually for all discharges for a 20% sample of US community hospitals. The hospital sample changes every year. For each patient discharged, all procedures received are recorded as International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9-CM) codes. The typical hospital in the US relies on these detailed codes to quantify the medical services it provides (Zuckerman et al. (1994)). We map the codes to four categories of procedures: “Minor Diagnostic,” “Minor Therapeutic,” “Major Diagnostic,” and “Major Therapeutic,” which are standard output categories in the literature (Pope and Johnson (2013)). The number of procedures in each category is summed for each hospital by year to construct the output variables. The total number of hospitals sampled is around 1,000 per year from 2007 to 2009. [Footnote 14: The NIS survey is a stratified systematic random sample. The strata criteria are urban or rural location, teaching status, ownership, and bed size. This stratification ensures a more representative sample of discharges than a simple random sample would yield. For details see https://www.hcup-us.ahrq.gov/tech_assist/sampledesign/508_compliance/508course.htm#{463754B8-A305-47E3-B7EE-A43953AA9478}.] However, mapping between the two databases is only possible for approximately 50% of the hospitals in the HCUP data, resulting in approximately 450 to 525 observations available each year.

7.2 Pre-Analysis of the Data Set

7.2.1 Testing the Relevance of the Regressors

We begin by testing the statistical significance of our four output variables, $\bm{y}=\left(y_{1},y_{2},y_{3},y_{4}\right)$ , for predicting cost. While the variables selected have been used in previous studies, we use these tests to evaluate whether this variable specification can be rejected for the current data set of U.S. hospitals from 2007-2009.

The null hypothesis stated for the $q$ th output is:

[TABLE]

against: [Footnote 15: The notation $\bm{y}-\left\{y_{q}\right\}$ denotes the vector $\bm{y}$ excluding the $q$ th component.]

[TABLE]

We implement the test with a Local Constant Least Squares (LCLS) estimator described in Henderson and Parmeter (2015), calculating bandwidths using least-squares cross-validation. We use 399 wild bootstraps. We found that all output variables were highly statistically significant for all years.

7.3 Results

CNLS-d and Different Directions

We analyze each year of data as a separate cross-section because, as noted above, the HCUP does not track the same set of hospitals across years. To illuminate the direction’s effect on the functional estimates, we graph “Cost” as a function of “Major Diagnostic Procedures” and “Major Therapeutic Procedures” holding “Minor Diagnostic Procedures” and “Minor Therapeutic Procedures” constant at their median values. Figure 6 illustrates the estimates for three different directions: one with only a cost component, one with only a component in Major Therapeutic Procedures, and one that comes from our median approach. Visual inspection indicates that different directions produce significantly different estimates, highlighting the importance of direction selection.

We compare the estimator’s performance when using different directions. Table 8 reports the MSE for three sample directions in each year. We define our direction vector as $(g^{y_{1}},g^{y_{2}},g^{y_{3}},g^{y_{4}},g^{c})$ . [Footnote 16: We focus on types of directions found to be competitive in our Monte Carlo simulations.]

We pick two directions, one with equal components in all dimensions, and a second direction that has a cost component that is double the value of the output components. The median vector is $(0.014,0.041,0.033,0.038,0.998)$ , which is very close to the cost-only direction. The MSE varies by 15-30% over the different directions. We observe that there is no clear dominant direction; however, the median direction performs reasonably well in all cases. We conclude that as long as a direction with non-zero components for all variables that could contain noise is selected, then the precise direction selected is not critical to obtaining improved estimation results.

Comparison with other estimators

We compare three methods to estimate a cost function: 1) a quadratic functional form (without the cross-product terms), Färe et al. (2010); 2) CNLS-d with the direction selection method proposed in Section 6; and 3) a lower bound estimate calculated using a local linear kernel regression with a Gaussian kernel and leave-one-out cross-validation for bandwidth selection, Li and Racine (2007). [Footnote 17: For CNLS-d, we select a value for an upper bound through a tuning process, $\text{Ubound}=0.5$ , and impose the upper bound on the slope coefficients estimated (Lim, 2014).] We select these estimators because a quadratic functional form to model production has been used in recent productivity and efficiency analysis of healthcare. See, for example, Ferrier et al. (2018). The local linear kernel is selected because it is an extremely flexible nonparametric estimator and provides a lower bound for the performance of a functional estimate. However, note that the local linear kernel does not satisfy standard properties of a cost function; i.e., cost is monotonic in output and marginal costs are increasing as output increases.

We use the criterion of K-fold average MSE with $k=5$ to compare the approaches. This means we split the data equally into 5 parts. We use 4 of the 5 parts for estimation (training) and evaluate the performance of the estimator on the 5th part (testing). We do this for all 5 parts and average the results. The values presented in Table 9 correspond to the average across folds.
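The K-fold procedure can be sketched generically as follows. The sketch assumes contiguous folds without shuffling and an ordinary squared prediction error; the paper does not specify either detail, and the `fit` and `predict` callables are hypothetical stand-ins for any of the three estimators.

```python
def kfold_mse(xs, ys, fit, predict, k=5):
    """Average out-of-fold mean squared error over k contiguous folds.

    xs, ys:  lists of regressors and responses of equal length.
    fit:     callable (train_x, train_y) -> model.
    predict: callable (model, x) -> predicted response.
    """
    n = len(xs)
    fold_mses = []
    for j in range(k):
        start, stop = j * n // k, (j + 1) * n // k
        # Train on everything outside the current fold.
        model = fit(xs[:start] + xs[stop:], ys[:start] + ys[stop:])
        errors = [(predict(model, x) - y) ** 2
                  for x, y in zip(xs[start:stop], ys[start:stop])]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k
```

With a trivial model that predicts the training mean, responses 1 through 10, and five folds, the function returns 12.75, which can be checked by hand fold by fold.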

While the average MSEs for all years are lowest for the lower bound estimator, CNLS-d performs relatively well: it is close to the lower bound in terms of fitting performance while imposing standard axioms of a cost function. As is true of most production data, the hospital data are very noisy. The shape restrictions imposed in CNLS-d improve the interpretability. The CNLS-d estimator outperforms the parametric approach, indicating the general benefits of nonparametric estimators.

Description of Functional Estimates - MPSS and Marginal Costs

We report the most productive scale size (MPSS) and the marginal costs for the quadratic parametric estimator, the CNLS-d estimator with our proposed direction selection method, and an alternative. [Footnote 18: Here, most productive scale size is measured on each ray from the origin (fixing the output ratios) and is defined as the cost level that maximizes the ratio of aggregate output to cost. Marginal cost is measured on each ray from the origin (fixing the output ratios) and is defined as the cost to increase aggregate output by one unit.] These metrics are determined on the averaged K-fold estimations for each estimation method. For the MPSS, we present the cost levels obtained for different ratios of Minor Therapeutic procedures (MinTher) and Major Therapeutic procedures (MajTher), with the minor and major diagnostics held constant at their median levels.
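The two ray-based metrics can be illustrated with a simple grid search. The sketch assumes that aggregate output along a ray is measured by the scale factor $s$ applied to a fixed output mix, which is one possible reading of the definitions and not necessarily the authors' aggregation; `cost_fn`, `mpss_on_ray`, and `marginal_cost_on_ray` are hypothetical names.

```python
def mpss_on_ray(cost_fn, ray, s_max, steps=10000):
    """Grid search for the scale s in (0, s_max] that maximizes s / cost.

    cost_fn: callable mapping a list of outputs to cost.
    ray:     fixed output mix; outputs along the ray are s * ray.
    Returns (best scale, cost level at that scale).
    """
    best_s, best_ratio = None, float("-inf")
    for i in range(1, steps + 1):
        s = s_max * i / steps
        cost = cost_fn([s * r for r in ray])
        if cost <= 0:
            continue  # the ratio is undefined for non-positive cost
        ratio = s / cost
        if ratio > best_ratio:
            best_s, best_ratio = s, ratio
    if best_s is None:
        raise ValueError("cost was non-positive everywhere on the grid")
    return best_s, cost_fn([best_s * r for r in ray])


def marginal_cost_on_ray(cost_fn, ray, s):
    """Cost of raising the assumed aggregate output s by one unit."""
    return cost_fn([(s + 1.0) * r for r in ray]) - cost_fn([s * r for r in ray])
```

For a stylized cost function equal to one plus squared total output and a ray of (0.5, 0.5), the search returns a scale of 1 and a cost level of 2.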

MPSS results are presented in Table 10 and the values for CNLS-d (Median Direction) are illustrated in Figure 7. We observe small variations across both years and estimators. The differences across years are in part due to the sample changing across years. Most hospitals are small and operate close to the MPSS. However, there are several large hospitals that are operating significantly above MPSS. Hospitals might choose to operate at larger scales and provide a large array of services allowing consumers to fulfill multiple healthcare needs.

For marginal costs, we present the values for different percentiles of MinTher and MajTher, with the minor and major diagnostics held constant at their median levels. A more exhaustive comparison across all outputs is presented in Appendix C. Marginal cost information can be used by hospital decision makers to select the types of improvements that are likely to result in higher productivity with minimal cost increase. For example, consider a hospital that is in the $50^{th}$ percentile of the data set for all four outputs in 2008 and whose manager has the option to expand operations for either minor or major therapeutic procedures. Results reported in Tables 11 and 12 indicate that an increase of one minor therapeutic procedure would result in a $\$4.9\text{k}$ increase in cost. Alternatively, an increase of one major therapeutic procedure would result in a $\$7.7\text{k}$ increase in cost. A decision maker would want to consider the revenue generated by the different procedures; however, these estimates provide insights regarding the incremental cost of additional major and minor therapeutic procedures.

CNLS-d is the most flexible of the estimators and allows MPSS values to fluctuate significantly across percentiles. CNLS-d does not smooth variation; rather, it minimizes the distance from each observation to the shape constrained estimator. Results for the local linear kernel estimator are also presented in Appendix C. Even though the local linear kernel bandwidths are selected via cross-validation, relatively large values are selected due to the relatively noisy data and the highly skewed distribution of output. These large bandwidths and the parametric nature of the quadratic function make these two estimators relatively less flexible compared to CNLS-d. A feature of performance that is captured only by CNLS-d is that hospitals specializing in either minor or major therapeutics maximize productivity at larger scales of operation, as illustrated in Figure 7.

The marginal cost results for Minor Therapeutic procedures are presented in Table 11 and Figure 8 (left) and the marginal cost results for Major Therapeutic procedures are reported in Table 12 and Figure 8 (right). As was the case for MPSS (see Table 10), CNLS-d is more flexible and its marginal cost estimates vary significantly across percentiles. CNLS-d with different directions provides very similar marginal cost estimates. However, the CNLS-d estimates differ significantly from the marginal cost estimates obtained with the parametric estimator. For CNLS-d, the marginal cost results are in line with the theory that marginal costs are increasing with scale. This property can also be violated if using a non-parametric estimator without any shape constraints imposed. For example, this can be seen in the marginal costs of minor therapeutic procedures for the parametric (quadratic) regression estimator, Figure 8.

Our data set, which combines AHA cost data with AHRQ output data for a broad sample of hospitals from across the US, is unique to the best of our knowledge. However, the marginal cost estimates are broadly in line with marginal cost estimates for US hospitals for similar time periods. Gowrisankaran et al. (2015) studied a considerably smaller set of Northern Virginia hospitals observed in 2006 that, on average, were larger than hospitals in our data set. Due to the differences in the measures of output, the marginal cost levels are not directly comparable. However, conditional on the size variation, the variation in marginal costs is similar to the variation we observe for the parametric (quadratic) regression specification applied to our data. Boussemart et al. (2015) analyzed data on nearly 150 hospitals located in Florida observed in 2005. The authors use a different output specification and a translog model; however, their distribution of hospital size is similar to our data set and we observe similar variances in marginal costs with the parametric (quadratic) regression specification applied to our data.

8 Conclusions

This paper investigated the improvement in functional estimates when specifying a particular direction in CNLS-d. Based on Monte Carlo experiments, two primary findings emerged from our analysis. First, directions close to the average orthogonal direction to the true function performed well. Second, when the data are noisy, selecting a direction that matches the noise direction of the DGP improves estimator performance. Our simulations indicate that CNLS-d with a direction orthogonal to the data is preferable if the noise level is not too large and that a direction that matches the noise direction of the DGP is preferred if the noise level is large. Thus, if users know the shape of the data or the characteristics of the noise, they can use CNLS-d with a direction orthogonal to the data if the noise coefficient is small. If the noise coefficient is large, they can instead select a direction close to the true noise direction, with non-zero components in all variables that potentially have noise. Our application to US hospital data shows that CNLS-d performs similarly across different directions that all include non-zero components of the direction vector for variables that potentially have noise in their measurement.

In future research, we propose developing an alternative estimator that incorporates multiple directions in CNLS-d while maintaining the concavity axiom. This would permit treating subgroups within the data, allowing different assumptions to be made across subgroups (e.g., for-profit vs. not-for-profit hospitals).

Appendix A Properties of Directional Distance Functions and CNLS-d

A.1 Direction Selection in Directional Distance Functions

In this appendix we prove that the direction vector affects the functional estimates. Let $\bm{g}^{x,y}=(\bm{g}^{x},\bm{g}^{y})$ , then we can state the following theorem:

Theorem 1.

Suppose that two direction vectors exist, $\bm{g}^{x,y}_{a}$ and $\bm{g}^{x,y}_{b}$ , such that $\bm{g}^{x,y}_{a}\neq\bm{g}^{x,y}_{b}$ . Then the directional distance function estimates using these two different directions are not equal, $D(\bm{X},\bm{Y};\bm{g}^{x,y}_{a})\neq D(\bm{X},\bm{Y};\bm{g}^{x,y}_{b})$ .

Proof.

Rewrite Problem (10) from Section 3.2 as

[TABLE]

Observe that all decision variables appear in the objective function and that the objective function is a quadratic function while the constraints define a convex solution space; i.e., this optimization problem has a unique solution (Bertsekas (1999)). If we solve Problem (27) with $\bm{g}^{x,y}_{a}$ , then the resulting solution vector is $(\bm{\alpha}_{a},\bm{\beta}_{a},\bm{\gamma}_{a})$ . Changing the direction vector from $\bm{g}^{x,y}_{a}$ to $\bm{g}^{x,y}_{b}$ the normalization constraint $\bm{\beta_{i}}^{\prime}\,\bm{g}^{x}_{b}+\bm{\gamma_{i}}^{\prime}\,\bm{g}^{y}_{b}=1$ no longer holds for $\bm{\beta}_{a}$ and $\bm{\gamma}_{a}$ . However, the previous argument holds for the uniqueness of $(\bm{\alpha}_{b},\bm{\beta}_{b},\bm{\gamma}_{b})$ . Thus, $(\bm{\alpha}_{a},\bm{\beta}_{a},\bm{\gamma}_{a})\neq(\bm{\alpha}_{b},\bm{\beta}_{b},\bm{\gamma}_{b})$ .

∎
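A simple numerical illustration of the dependence on direction, outside the CNLS-d setting, is the least squares fit of a straight line with deviations measured along one axis or the other. The sketch below is not Problem (27); it only contrasts the two axis-aligned directions for a linear function, and the name `slope_by_direction` is hypothetical.

```python
def slope_by_direction(ys, cs, direction):
    """Slope dc/dy of the least squares line fitted to (y, c) data.

    direction="cost":   squared deviations are measured along the c axis.
    direction="output": squared deviations are measured along the y axis.
    """
    n = len(ys)
    y_bar, c_bar = sum(ys) / n, sum(cs) / n
    s_yc = sum((y - y_bar) * (c - c_bar) for y, c in zip(ys, cs))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_cc = sum((c - c_bar) ** 2 for c in cs)
    if direction == "cost":
        return s_yc / s_yy  # regression of c on y
    if direction == "output":
        return s_cc / s_yc  # regression of y on c, expressed as dc/dy
    raise ValueError("direction must be 'cost' or 'output'")
```

For the data $y=(1,2,3,4)$ and $c=(1.1,1.9,3.2,3.8)$ , the cost direction gives a slope of 0.94 while the output direction gives approximately 0.957, so the two directions yield different fitted functions from the same observations.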

A.2 Details of CNLS-d

An alternative expression for CNLS-d (cf. equations (16)-(16c) from Section 5.1) is given by:

[TABLE]

It is possible to recover $\alpha_{i},i=1,\ldots,n$ , and the final estimates using the following relations:

[TABLE]

A.3 Different Directions for Different Groups in CNLS-d

Consider the case where all observations have the same input level and produce two outputs, and we estimate the isoquant. Define two groups of observations $G_{1}$ and $G_{2}$ such that $\lvert G_{1}\cup G_{2}\rvert=n$ and $G_{1}\cap G_{2}=\emptyset$ . [Footnote 19: The notation $\lvert\cdot\rvert$ denotes the cardinality of the set.] Using the notation in A.1, the direction vector for the first group of observations, $G_{1}$ , is $\bm{g}^{y_{G_{1}}}$ and that for the second group, $G_{2}$ , is $\bm{g}^{y_{G_{2}}}$ .

For either a fixed input vector, $\bm{X}$ , or a fixed cost level, $c$ , formulate the iso-cost estimator for $G_{1}$ and $G_{2}$ with different direction vectors as:

[TABLE]

Note that using more than one direction for CNLS-d can lead to violations of convexity. Only under very limiting conditions can we allow for multiple directions in CNLS-d and guarantee that the resulting estimated function will maintain convexity. The following theorem formalizes the conditions.

Theorem 2.

If a CNLS-d estimator is calculated using two groups of observations with different direction vectors as shown in Equation (32) and the following condition holds regarding the direction vectors and the noise direction:

[TABLE]

where

[TABLE]

then the resulting CNLS-d estimate is a concave function.

Proof.

Consider the Afriat inequalities in the context of cost isoquant estimation. One of the conditions of Equation (16) is:

[TABLE]

Knowing that $\epsilon_{i}\,\frac{\bm{g^{y_{k(i)}}}}{\|\bm{g^{y_{k(i)}}}\|}=\bm{\hat{y}_{i}}-\bm{y_{i}}$ means that $\epsilon_{i}=\left(\bm{\hat{y}_{i}}-\bm{y_{i}}\right)^{\prime}\,\frac{\bm{g^{y_{k(i)}}}}{\|\bm{g^{y_{k(i)}}}\|}$ .

Substituting $\epsilon_{i}$ and $\epsilon_{j}$ in the inequalities (34) obtains:

[TABLE]

Next, consider the case where both observations have the same direction. Then the expression is:

[TABLE]

If Equation (36) is satisfied, we know that the CNLS-d constraints hold. By comparison observe that the condition listed below is a sufficient condition for Equation (36) being satisfied when Equation (35) holds:

[TABLE]

which, after simplifying, becomes:

[TABLE]

∎

Thus Theorem 2 is proved and a sufficient condition is found that, if verified, ensures the concavity property of the estimator even when multiple directions are used in the estimation of the directional distance function.

The following corollary, concerning the convex case, is directly inferred from Theorem 2:

Corollary 1.

If a CNLS-d estimator is calculated using two groups of observations with different direction vectors as shown in Equation (32), and the following condition holds regarding the direction vectors and the noise direction:

[TABLE]

where

[TABLE]

then the resulting CNLS-d estimate is a convex function.

Proof.

Reverse the inequality sign in Equation (34):

[TABLE]

and follow the logic of the proof of Theorem 2 to obtain Corollary 1 and Equation (38).

∎

Theorem 2 clarifies that if the directions for each respective group are orthogonal to each other, then condition (33) is satisfied. This means that if the direction for group 1 has a single nonzero component in the output 1 dimension and the direction for group 2 has a single nonzero component in the output 2 dimension, then we will not observe violations of the convexity property.

We state a second Corollary that follows from Theorem 2, which is useful when there are more than two groups each with their own estimation direction in CNLS-d.

Corollary 2.

Let $n\in\mathbb{N}$ be the total number of observations and let $Q$ be the number of outputs considered. Let $\bm{Y}=\{\bm{y}_{i}\in\mathbb{R}^{Q}_{+},i=1,\ldots,n\}$ be the set of observed outputs. Let $P_{g}$ be a partition of $\bm{Y}$ of cardinality $N_{g}\in\mathbb{N}$. Let $\bm{g}^{y}=\{\bm{g}^{y_{k}},k=1,\ldots,N_{g}\}$ be the set of directions, one for each group of the partition. If a CNLS-d estimator is calculated using the directions from $\bm{g}^{y}$ based on partition $P_{g}$, and the following condition holds regarding the direction vectors and the noise direction:

[TABLE]

where, for each $i=1,\ldots,n$, $k(i)$ is the index of the part of the partition $P_{g}$ to which $\bm{y}_{i}$ belongs, then the resulting CNLS-d estimate is a concave function.

Proof.

The proof of Theorem 2 applies unchanged because the condition does not change. The condition still concerns pairs of observations; the only difference is that the partition of observations now has more than two groups, which does not affect the argument.

∎

Corollary 2 extends the statement of Theorem 2 to provide sufficient conditions to avoid violations of the shape constraints in a scenario where there are more than two groups each with their own estimation direction in CNLS-d estimation.

Simulations to investigate the frequency with which multiple directions lead to violations

We run simulations to investigate the effects of using multiple directions. We use the same DGP as stated in Section 5, Example 1. However, we define two groups and assign different directions for each one of them:

[TABLE]

and,

[TABLE]

where $\bm{g^{y_{G_{1}}}}=\left[\cos(\pi/8),\sin(\pi/8)\right]$ and $\bm{g^{y_{G_{2}}}}=\left[\cos(3\pi/8),\sin(3\pi/8)\right]$ .

We run a total of $100$ simulations. For comparison, in each simulation we also record the estimates obtained when a single direction, based on either $\pi/8$ or $3\pi/8$, is used for all observations. We identify violations of monotonicity and concavity by sorting the estimates by $y_{1}$ and examining all adjacent pairs and triplets, which gives 99 pairs and 98 triplets because each simulation has 100 observations.

As expected, there are no violations when we use a single direction for the estimation. However, when we use two directions, violations are observed. For monotonicity, we observe no violations for pairs of observations that are part of the same group. However, for pairs with one member from each group, we observe violations of monotonicity for 6% of the pairs. We use the triplets to analyze concavity. When the members of the triplet are from the same group, we observe violations of concavity for 2% of the triplets. When one member of the triplet is from a different group, the violations of concavity increase to 45%. These results show that, in at least one instance where the conditions of Theorem 2 do not hold, violations of the maintained assumptions are frequent.
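The adjacent-pair and adjacent-triplet check described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `count_shape_violations`, the tolerance `tol`, and the reading of the shape constraints as "$y_{2}$ non-increasing and concave in $y_{1}$" are assumptions made for this example.

```python
import numpy as np

def count_shape_violations(y1, y2, tol=1e-9):
    """Count shape violations among adjacent pairs and triplets.

    Assumption (for illustration): the estimated isoquant is read as
    y2 = f(y1), with f non-increasing and concave. Points are sorted
    by y1 and are assumed to have distinct y1 values.
    """
    order = np.argsort(y1)
    y1s = np.asarray(y1, dtype=float)[order]
    y2s = np.asarray(y2, dtype=float)[order]

    dy1 = np.diff(y1s)
    dy2 = np.diff(y2s)

    # Monotonicity: y2 should not increase between adjacent points.
    mono_violations = int(np.sum(dy2 > tol))

    # Concavity: slopes of consecutive segments should be non-increasing.
    slopes = dy2 / dy1
    conc_violations = int(np.sum(np.diff(slopes) > tol))

    n_pairs = len(dy2)           # n - 1 adjacent pairs
    n_triplets = len(slopes) - 1  # n - 2 adjacent triplets
    return mono_violations, n_pairs, conc_violations, n_triplets
```

With 100 estimated points the function inspects 99 pairs and 98 triplets, matching the counts in the text. Splitting the counts by group membership, as the reported percentages do, would additionally require a group label for each observation.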

Appendix B Additional Experiments

B.1 Experiments Related to Section 4.1 - with the Linear Estimator.

Measuring MSE Example, Section 4.1 - Noise Generated in a Common and Prespecified Direction $\theta_{f}$

This section describes the simulations and the results for the fixed noise direction case referenced in Section 4.1.

The Data Generation Process (DGP) for observations $\left(\bm{y}_{i},c_{i}\right),i=1,\ldots,n$ , is as follows:

The output, $\tilde{y}_{i}$ , is drawn from the continuous uniform distribution $U\left[0,1\right]$

The cost is calculated as $\tilde{c}_{i}=\beta_{0}\ \tilde{y}_{i}$ , where $\beta_{0}=1$ .

In the case of fixed direction, the noise term is determined as:

(a) $l_{\epsilon_{i}}$ is the scalar length that is drawn from a normal distribution, $N\left(0,\lambda\,\epsilon_{0}\right)$, $\lambda$ is prespecified and an initial value for the standard deviation, $\epsilon_{0}$, is calculated as in Equation (11) in Section 4.1:

$\epsilon_{0}=\frac{1}{2}\left[\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\tilde{y}_{i}-\bar{y}\right)^{2}}+\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\tilde{c}_{i}-\bar{c}\right)^{2}}\right],$ (44)

where $\bar{y}=\frac{1}{n}\sum_{i=1}^{n}{\tilde{y}_{i}}$ and $\bar{c}=\frac{1}{n}\sum_{i=1}^{n}{\tilde{c}_{i}}$ are the mean of the output and the mean of the cost without noise, respectively.

(b) $\bm{v_{f}}=\left[\cos(\theta_{f}),\sin(\theta_{f})\right]$ is the fixed noise direction that is inferred from the prespecified angle $\theta_{f}$.

(c) $\left(\epsilon_{y_{i}},\epsilon_{c_{i}}\right)=l_{\epsilon_{i}}\,\bm{v_{f}},\ i=1,\ldots,n.$

The observations with noise are obtained by adding the noise term:

$\binom{y_{i}}{c_{i}}=\binom{\tilde{y}_{i}}{\tilde{c}_{i}}+\binom{\epsilon_{y_{i}}}{\epsilon_{c_{i}}},\ i=1,\ldots,n.$ (45)

Apply the DGP described above to generate a training set, $\left(y_{tr_{i}},c_{tr_{i}}\right),\ i=1,\ldots,n_{tr}$, and a testing set $\left(y_{ts_{i}},c_{ts_{i}}\right),\ i=1,\ldots,n_{ts}$. Consider $100$ repetitions of the simulation and set the number of observations in each set to $n_{tr}=n_{ts}=100$. Set the scaling coefficient for the noise to $\lambda=0.6$. We consider several DGPs, generating data for each of the following noise direction angles: $\theta_{f}\in\left\{0,\pi/8,\pi/4,3\pi/8,\pi/2\right\}$.
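The fixed-direction DGP can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name and arguments are hypothetical, and the second argument of $N\left(0,\lambda\,\epsilon_{0}\right)$ is treated as a standard deviation, following the text's description of $\epsilon_{0}$ as an initial value for the standard deviation.

```python
import numpy as np

def generate_fixed_direction_sample(n, theta_f, lam=0.6, beta0=1.0, rng=None):
    """Draw n noisy (output, cost) observations with noise along angle theta_f."""
    rng = np.random.default_rng() if rng is None else rng

    # Noise-free observations on the true function c = beta0 * y.
    y_true = rng.uniform(0.0, 1.0, size=n)
    c_true = beta0 * y_true

    # Equation (44): average of the two sample standard deviations.
    eps0 = 0.5 * (np.std(y_true, ddof=1) + np.std(c_true, ddof=1))

    # Assumption: lam * eps0 is the standard deviation of the noise length.
    length = rng.normal(0.0, lam * eps0, size=n)

    # Common noise direction and Equation (45).
    v_f = np.array([np.cos(theta_f), np.sin(theta_f)])
    noise = length[:, None] * v_f
    y_obs = y_true + noise[:, 0]
    c_obs = c_true + noise[:, 1]
    return y_obs, c_obs, y_true, c_true
```

Calling the function twice with the same `theta_f` gives a training set and a testing set; looping over the five angles and 100 repetitions reproduces the layout of the experiment, though not necessarily the exact random draws.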

We test the set of directions corresponding to the angles $\theta_{t}\in\left\{0,\pi/8,\pi/4,3\pi/8,\pi/2\right\}$. In all cases, the smallest MSE is obtained when the direction used in the DDF, $\theta_{t}$, matches the direction of the noise, $\theta_{f}$.

Results: Fixed Noise Direction

Table 13 reports the MSE computed by comparing the estimated function to the true function and Table 14 reports the MSE computed by comparing the estimated function to the testing set.

In Table 13, the direction for the DDF corresponding to the smallest MSE always matches the noise direction in the DGP. Further, in more than $70\%$ of the cases tested, using the correctly specified direction reduces MSE by more than $50\%$ relative to the next best direction tested, a larger gain than in the random direction case in Table 1 of Section 4.1. In other words, when endogeneity is severe, the benefits of using a DDF with a well-selected direction are potentially large.

Table 14 is consistent with the results observed in the random noise case, in Table 2 of Section 4.1. The DDF directions corresponding to the smallest MSE values are those matching the directions used for the MSE computation. Thus, the proposed radial MSE measure addresses the challenge of measuring performance in applications with a testing dataset.

Monte Carlo Simulations - Experiments, Section 5.2 - Experiment 3. Base case with fixed noise direction and different noise levels

This section summarizes the results of Experiment 3 with $\lambda=0.2$ .

B.2 Experiments related to Section 5.2 - with CNLS-d

Here we supplement Section 5.2 with additional experiments, continuing the experiment numbering established there.

Experiment 5: Base case with different distributions for the initial observations on the true function

In Experiment 5, we extend the analysis performed in Experiment 4. We consider additional distributions in the DGP for the angle, $\theta_{i},\ i=1,\ldots,n$, and examine how they affect the optimal direction. Unlike Experiment 4, we do not consider only normal distributions; instead we consider a normal distribution, $N\left(\frac{\pi}{4},\frac{\pi}{16}\right)$, and two gamma distributions, $\Gamma\left(3,\frac{\pi}{2}\right)$ and $\Gamma\left(.5,\frac{\pi}{24}\right)$. For the gamma distributions, the first parameter corresponds to the shape coefficient and the second to the scale coefficient. The distributions are referred to below as $Normal$, $Gamma_{1}$ and $Gamma_{2}$, respectively. We truncate the tails of each distribution so that the generated angles fall within the range $\left[0,\pi/2\right]$. Noise is specified as in Experiment 1. Figure 10 illustrates the distributions of the angles $\theta_{i}$ and highlights their median values. Table 16 reports the results of this experiment.

Two main conclusions can be drawn from the results in Table 16. First, the smaller the variance of the data distribution, the greater the importance of direction selection. Looking at the differences between the two gamma distributions, $Gamma_{1}$ has a larger tail than $Gamma_{2}$, which means the observations for $Gamma_{2}$ have a smaller variance. Table 16 indicates that the MSE increases rapidly with deviations from the optimal direction when the variance of the observations is smaller, as with $Gamma_{2}$ compared to $Gamma_{1}$. Second, among the directions tested, $\theta_{i}$, MSE is minimized for the direction closest to the direction corresponding to the median of the distribution. This second point supports the selection approach proposed in Section 6.

Experiment 6: Adaptation of the Base Case to a 3-Dimensional Case

We adapt the DGP from Experiment 1, the base case. We consider a fixed input level and approximate a three output isoquant, $Q=3$ . Indexing the outputs by $q$ and observations by $i$ , we define the outputs,

[TABLE]

where $\bm{\tilde{y}_{i}}$ is the observation on the isoquant and $\bm{\epsilon_{i}}$ is the noise. The output levels $\tilde{y}_{qi},\ q=1,\ldots,Q,\ i=1,\ldots,n$ are generated:

[TABLE]

where $l_{qi},\ q=1,\ldots,Q,\ i=1,\ldots,n$ , are drawn randomly from a continuous uniform distribution, $U\left[0,1\right]$ .

The noise terms $\bm{\epsilon}_{i},\ i=1,\ldots,n$, are adapted to the 3-dimensional isoquant:

[TABLE]

where the length $l_{\epsilon_{i}}$ is drawn from the normal distribution $N\left(0,\lambda\right)$ , and $v_{qi}=\frac{v^{*}_{qi}}{\lvert\lvert\bm{v}_{i}\rvert\rvert_{2}},\ q=1,\ldots,Q,\ i=1,\ldots,n$ for which $v^{*}_{qi}$ are drawn from a continuous uniform distribution $U\left[-1,1\right]$ .

In Experiment 6, 19 directions are considered for the CNLS-d estimators. The directions are determined using the following steps:

1. enumerate all 3-component vectors in $\mathbb{R}^{3}$ with elements from the set $\{0,0.5,1\}$, excluding $\left(0,0,0\right)$;

2. normalize the direction vectors by dividing them by their respective Euclidean norms;

3. eliminate duplicates.
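The three steps above can be checked with a short script. This is a sketch for illustration only; the function name `candidate_directions` and the rounding used to detect duplicates are assumptions of this example, not part of the paper.

```python
import itertools
import numpy as np

def candidate_directions(levels=(0.0, 0.5, 1.0), dim=3, decimals=10):
    """Enumerate, normalize, and de-duplicate candidate direction vectors."""
    # Step 1: all vectors with entries in `levels`, excluding the zero vector.
    vectors = [np.array(v) for v in itertools.product(levels, repeat=dim) if any(v)]

    # Step 2: divide each vector by its Euclidean norm.
    # Step 3: drop duplicates; rounding makes proportional vectors such as
    # (0.5, 0.5, 0) and (1, 1, 0) compare equal after normalization.
    unit = {tuple(np.round(v / np.linalg.norm(v), decimals)) for v in vectors}
    return np.array(sorted(unit))

directions = candidate_directions()
print(directions.shape)  # (19, 3)
```

Of the 26 nonzero vectors, 7 are proportional to another vector in the set, which leaves the 19 directions used in the experiment: 3 with one nonzero component, 9 with two, and 7 with all three components positive.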

The 19 directions are represented by the markers in Figure 11 and create a balanced grid on the eighth of a unit sphere, our isoquant. The median direction is $[1/\sqrt{3},1/\sqrt{3},1/\sqrt{3}]\approx[.58,.58,.58]$. The standard deviation of the normal distribution is $\lambda=0.1$. We perform this experiment $100$ times for each direction. Table 17 reports the averaged radial MSE values on a testing set of $n$ observations lying on the true function. The MSE results are also illustrated in Figure 11, where the size of each marker has a positive affine relation with its MSE value and the marker color ranges from yellow to red, with larger MSE values shown by redder markers.

We can establish three categories of directions that correspond to certain ranges of MSE values. The first category corresponds to the worst MSE values, which are almost twice the smallest values. These are the directions that have only one non-zero component, shown with red markers on the corners of the surface in Figure 11. The second category is for the MSE values that are above $5\cdot 10^{-4}$ but less than $8\cdot 10^{-4}$. These directions are labeled with the orange markers in Figure 11 that are on the edges of the surface but not the corners. One of their directional components, $\left(\bm{g}^{x},\bm{g}^{y}\right)$, is zero but all others are not. The third category of directions, which has the smallest MSEs, corresponds to the yellow markers in Figure 11. These directions have only positive components. Thus, we observe a trend that the directions that have positive components in all variables correspond to the best MSE values. The median value direction, $[0.58,0.58,0.58]$, is among the yellow markers. These results support the selection approach proposed in Section 6 and confirm the results obtained on the US hospitals data set.

Appendix C U.S. Hospital Dataset Application

We describe the functional estimates provided by quadratic regression, CNLS-d using a direction with equal components in all dimensions, CNLS-d using the median direction, and the local linear kernel. Table 18 provides most productive scale size (MPSS) measurements in cost in \$M. Tables 19 and 20 provide the marginal cost of Minor Therapeutic procedures and the marginal cost of Major Therapeutic procedures, respectively. The units for Tables 19 and 20 are cost in \$k per Minor and Major Therapeutic procedure, respectively.

Our conclusions are the same as stated in the body of the paper: CNLS-d is more flexible than the parametric estimator (quadratic regression), while its shape constraints maintain the interpretability of the results.

Bibliography

1. Ackerberg, D. A., Caves, K., Frazer, G., 2015. Identification properties of recent production function estimators. Econometrica 83 (6), 2411–2451.
2. Adler, N., Volta, N., 2016. Accounting for externalities and disposability: a directional economic environmental distance function. European Journal of Operational Research 250 (1), 314–327.
3. Afriat, S. N., 1972. Efficiency estimation of production functions. International Economic Review 13 (3), 568–598.
4. Aparicio, J., Pastor, J., Zofio, J., 2017. Can Farrell’s allocative efficiency be generalized by the directional distance function approach? European Journal of Operational Research 257 (1), 345–351.
5. Atkinson, S., Cornwell, C., Honerkamp, O., 2003. Measuring and decomposing productivity change: stochastic distance function estimation versus data envelopment analysis. Journal of Business & Economic Statistics 21 (2), 284–294.
6. Atkinson, S., Tsionas, M., 2016. Directional distance functions: optimal endogenous directions. Journal of Econometrics 190 (2), 301–314.
7. Baležentis, T., De Witte, K., 2015. One- and multi-directional conditional efficiency measurement: efficiency in Lithuanian family farms. European Journal of Operational Research 245 (2), 612–622.
8. Bertsekas, D. P., 1999. Nonlinear programming. Athena Scientific, Belmont, MA.
