Title: | Stochastic Frontier Analysis Routines |
Version: | 1.0.1 |
Description: | Maximum likelihood estimation for stochastic frontier analysis (SFA) of production (profit) and cost functions. The package includes the basic stochastic frontier for cross-sectional or pooled data with several distributions for the one-sided error term (i.e., Rayleigh, gamma, Weibull, lognormal, uniform, generalized exponential and truncated skewed Laplace), the latent class stochastic frontier model (LCM) as described in Dakpo et al. (2021) <doi:10.1111/1477-9552.12422>, for cross-sectional and pooled data, and the sample selection model as described in Greene (2010) <doi:10.1007/s11123-009-0159-1>, and applied in Dakpo et al. (2021) <doi:10.1111/agec.12683>. Several possibilities in terms of optimization algorithms are proposed. |
License: | GPL (≥ 3) |
URL: | https://github.com/hdakpo/sfaR |
BugReports: | https://github.com/hdakpo/sfaR/issues |
Depends: | R (≥ 3.5.0) |
Imports: | cubature, fastGHQuad, Formula, marqLevAlg, maxLik, methods, mnorm, nleqslv, plm, qrng, randtoolbox, sandwich, stats, texreg, trustOptim, ucminf |
Suggests: | lmtest |
Encoding: | UTF-8 |
Language: | en-US |
LazyData: | true |
RoxygenNote: | 7.3.1 |
NeedsCompilation: | no |
Packaged: | 2024-10-28 15:04:10 UTC; Dakpo |
Author: | K Hervé Dakpo [aut, cre], Yann Desjeux [aut], Arne Henningsen [aut], Laure Latruffe [aut] |
Maintainer: | K Hervé Dakpo <k-herve.dakpo@inrae.fr> |
Repository: | CRAN |
Date/Publication: | 2024-10-29 08:40:01 UTC |
sfaR: A package for estimating stochastic frontier models
Description
The sfaR package provides a set of tools for estimating various specifications of stochastic frontier analysis (SFA) by maximum likelihood (ML) and maximum simulated likelihood (MSL).
Details
Three categories of functions are available: sfacross, sfalcmcross, and sfaselectioncross,
which estimate different types of frontiers and offer eleven alternative
optimization algorithms (i.e., "bfgs", "bhhh", "nr", "nm", "cg", "sann",
"ucminf", "mla", "sr1", "sparse", "nlminb").
sfacross
sfacross estimates the basic stochastic
frontier analysis (SFA) for cross-sectional or pooled data and allows for
ten different distributions for the one-sided error term. These distributions
include the exponential, the gamma, the generalized exponential,
the half normal, the lognormal, the truncated normal, the truncated skewed
Laplace, the Rayleigh, the uniform, and the Weibull distributions.
In the case of the gamma, lognormal, and Weibull distributions, maximum
simulated likelihood (MSL) is used with the possibility of four specific
distributions to construct the draws: halton, generalized halton, sobol and
uniform. Heteroscedasticity in both error terms can be implemented, in
addition to heterogeneity in the truncated mean parameter in the case of the
truncated normal and lognormal distributions. In addition, in the case of the
truncated normal distribution, the scaling property can be estimated.
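MSL approximates an integral by averaging over draws. As a minimal sketch of one of the draw types named above, the base R function below builds a Halton sequence with the radical inverse in a prime base. The name halton_draws is hypothetical and this is not the package's internal generator.

```r
# Radical-inverse (van der Corput) sequence in a given prime base.
# Hypothetical helper for illustration; not sfaR's own draw generator.
halton_draws <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    f <- 1
    r <- 0
    while (i > 0) {
      f <- f / base              # weight of the current digit
      r <- r + f * (i %% base)   # add the mirrored digit
      i <- i %/% base            # move to the next digit
    }
    r
  }, numeric(1))
}

u <- halton_draws(5, base = 2)   # low-discrepancy points in (0, 1)
z <- qnorm(u)                    # map to standard normal draws if needed
```

Using a different prime base per dimension is the usual way to extend this to several dimensions.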
sfalcmcross
sfalcmcross estimates latent class
stochastic frontier models (LCM) for cross-sectional or pooled data.
It accounts for technological heterogeneity by splitting the observations
into a maximum number of five classes. The classification operates based on
a logit functional form that can be specified using some covariates (namely,
the separating variables allowing the separation of observations in several
classes). Only the half normal distribution is available for the one-sided
error term. Heteroscedasticity in both error terms is possible. The choice of
the number of classes can be guided by several information criteria (i.e.,
AIC, BIC, or HQIC).
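The logit classification step can be sketched in base R as follows. The function names, the choice of the last class as reference, and the made-up likelihood values are assumptions for illustration; they are not taken from the package.

```r
# Prior class probabilities from a multinomial logit; the last class is the
# reference (an assumption made for this sketch).
prior_prob <- function(Z, theta) {
  # Z: n x q matrix of separating variables (including an intercept column)
  # theta: q x (J - 1) matrix of logit coefficients
  eta <- cbind(Z %*% theta, 0)
  eta <- eta - apply(eta, 1, max)       # guard against overflow
  exp(eta) / rowSums(exp(eta))
}

# Posterior class probabilities via Bayes' rule, given per-class likelihoods.
posterior_prob <- function(prior, lik) {
  w <- prior * lik
  w / rowSums(w)
}

Z <- cbind(1, c(-1, 0, 1))
theta <- matrix(c(0.2, 0.5), ncol = 1)            # two classes
prior <- prior_prob(Z, theta)
lik <- cbind(c(0.3, 0.2, 0.1), c(0.1, 0.2, 0.3))  # made-up class likelihoods
post <- posterior_prob(prior, lik)
max.col(post)                                     # most probable class per row
```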
sfaselectioncross
sfaselectioncross estimates the
frontier for cross-sectional or pooled data in the presence of sample
selection. The model corrects for the selection bias due to the correlation
between the two-sided error terms in both the selection and the frontier
equations. The likelihood can be evaluated in five different
ways: Gauss-Kronrod quadrature, adaptive integration over hypercubes
(hcubature and pcubature), Gauss-Hermite quadrature, and
maximum simulated likelihood. Only the half normal
distribution is available for the one-sided error term. Heteroscedasticity
in both error terms is possible.
Bugreport
Any bug or suggestion can be reported using the sfaR tracker facilities at:
https://github.com/hdakpo/sfaR/issues
Author(s)
K Hervé Dakpo, Yann Desjeux, Arne Henningsen and Laure Latruffe
Extract coefficients of stochastic frontier models
Description
From an object of class 'summary.sfacross', 'summary.sfalcmcross', or 'summary.sfaselectioncross',
coef extracts the coefficients, their standard errors, z-values, and (asymptotic) P-values.
From an object of class 'sfacross', 'sfalcmcross', or 'sfaselectioncross', it extracts only the estimated coefficients.
Usage
## S3 method for class 'sfacross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.sfacross'
coef(object, ...)
## S3 method for class 'sfalcmcross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.sfalcmcross'
coef(object, ...)
## S3 method for class 'sfaselectioncross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.sfaselectioncross'
coef(object, ...)
Arguments
object |
A stochastic frontier model returned by |
extraPar |
Logical (default =
|
... |
Currently ignored. |
Value
For objects of class 'summary.sfacross', 'summary.sfalcmcross', or 'summary.sfaselectioncross',
coef returns a matrix with four columns, namely the
estimated coefficients, their standard errors, z-values,
and (asymptotic) P-values.
For objects of class 'sfacross', 'sfalcmcross', or 'sfaselectioncross', coef returns a numeric vector of
the estimated coefficients. If extraPar = TRUE, additional parameters,
detailed in the section ‘Arguments’, are also returned. In the case
of an object of class 'sfalcmcross', each additional
parameter ends with '#', which represents the class number.
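As a rough illustration of how the four columns relate to each other, the sketch below rebuilds such a matrix from made-up estimates and standard errors. The column labels are assumptions and may differ from those the package prints.

```r
# Made-up estimates and standard errors, for illustration only.
est <- c(beta0 = 1.20, beta1 = 0.45, beta2 = -0.10)
se  <- c(0.30, 0.09, 0.08)

z <- est / se              # z-values
p <- 2 * pnorm(-abs(z))    # two-sided asymptotic P-values

# Column labels are hypothetical; only the four quantities come from the text.
coef_table <- cbind(Coefficient = est, `Std. Error` = se,
                    `z value` = z, `Pr(>|z|)` = p)
coef_table
```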
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
coef(tl_u_ts, extraPar = TRUE)
coef(summary(tl_u_ts))
## End(Not run)
Data on Norwegian dairy farms
Description
This dataset contains nine years (1998-2006) of information on Norwegian dairy farms.
Format
A data frame with 2,727 observations on the following 23 variables.
- farmid
Farm identification.
- year
Year identification.
- y1
Milk sold (1000 liters).
- y2
Meat (1000 NOK).
- y3
Support payments (1000 NOK).
- y4
Other outputs (1000 NOK).
- p1
Milk price (NOK/liter).
- p2
Meat price (cattle index).
- p3
Support payments price (CP index).
- p4
Other outputs price index.
- x1
Land (decare (daa) = 0.1 ha).
- x2
Labour (1000 hours).
- x3
Purchase feed (1000 NOK).
- x4
Other variable costs (1000 NOK).
- x5
Cattle capital (1000 NOK).
- x6
Other capital (1000 NOK).
- w1
Land price (NOK/daa).
- w2
Labour price (NOK/hour).
- w3
Feed price index.
- w4
Other variable cost index.
- w5
Cattle capital rent.
- w6
Other capital rent and depreciation.
- tc
Total cost.
Source
https://sites.google.com/view/sfbook-stata/home
References
Kumbhakar, S.C., H.J. Wang, and A. Horncastle. 2014. A Practitioner's Guide to Stochastic Frontier Analysis Using Stata. Cambridge University Press.
Examples
str(dairynorway)
summary(dairynorway)
Data on Spanish dairy farms
Description
This dataset contains six years of observations (1993-1998) on 247 dairy farms in northern Spain. The original data consist of the farm and year identifications, plus measurements on one output (i.e. milk) and four inputs (i.e. cows, land, labor and feed).
Format
A data frame with 1,482 observations on the following 29 variables.
- FARM
Farm identification.
- AGEF
Age of the farmer.
- YEAR
Year identification.
- COWS
Number of milking cows.
- LAND
Agricultural area.
- MILK
Milk production.
- LABOR
Labor.
- FEED
Feed.
- YIT
Log of MILK.
- X1
Log of COWS.
- X2
Log of LAND.
- X3
Log of LABOR.
- X4
Log of FEED.
- X11
1/2 * X1^2.
- X22
1/2 * X2^2.
- X33
1/2 * X3^2.
- X44
1/2 * X4^2.
- X12
X1 * X2.
- X13
X1 * X3.
- X14
X1 * X4.
- X23
X2 * X3.
- X24
X2 * X4.
- X34
X3 * X4.
- YEAR93
Dummy for YEAR = 1993.
- YEAR94
Dummy for YEAR = 1994.
- YEAR95
Dummy for YEAR = 1995.
- YEAR96
Dummy for YEAR = 1996.
- YEAR97
Dummy for YEAR = 1997.
- YEAR98
Dummy for YEAR = 1998.
Details
This dataset has been used in Alvarez et al. (2004). The data have been normalized so that the logs of the inputs sum to zero over the 1,482 observations.
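One way to obtain logged inputs that sum to zero is to divide each input by its geometric mean before taking logs. The sketch below does this on made-up data and then builds second-order terms following the variable definitions in the ‘Format’ section. Whether the original authors used exactly this procedure is an assumption.

```r
set.seed(1)
cows <- runif(10, 10, 60)   # made-up input levels, not the dairyspain data
land <- runif(10, 5, 40)

# Divide by the geometric mean so that the logs sum to zero.
X1 <- log(cows / exp(mean(log(cows))))
X2 <- log(land / exp(mean(log(land))))

# Second-order translog terms, named as in the 'Format' section.
X11 <- 0.5 * X1^2
X22 <- 0.5 * X2^2
X12 <- X1 * X2

c(sum(X1), sum(X2))         # both numerically zero
```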
Source
https://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm
References
Alvarez, A., C. Arias, and W. Greene. 2004. Accounting for unobservables in production models: management and inefficiency. Econometric Society, 341:1–20.
Examples
str(dairyspain)
summary(dairyspain)
Compute conditional (in-)efficiency estimates of stochastic frontier models
Description
efficiencies returns (in-)efficiency estimates of models
estimated with sfacross, sfalcmcross, or sfaselectioncross.
Usage
## S3 method for class 'sfacross'
efficiencies(object, level = 0.95, newData = NULL, ...)
## S3 method for class 'sfalcmcross'
efficiencies(object, level = 0.95, newData = NULL, ...)
## S3 method for class 'sfaselectioncross'
efficiencies(object, level = 0.95, newData = NULL, ...)
Arguments
object |
A stochastic frontier model returned
by |
level |
A number between 0 and 0.9999 used for the computation
of (in-)efficiency confidence intervals (default = |
newData |
Optional data frame that is used to calculate the efficiency
estimates. If NULL (the default), the efficiency estimates are calculated
for the observations that were used in the estimation. In the case of object of
class |
... |
Currently ignored. |
Details
In general, the conditional inefficiency is obtained following Jondrow et al. (1982) and the conditional efficiency is computed following Battese and Coelli (1988). In some cases the conditional mode is also returned (Jondrow et al. 1982). The confidence interval is computed following Horrace and Schmidt (1996), Hjalmarsson et al. (1996), or Bera and Sharma (1999) (see ‘Value’ section).
In the case of the half normal distribution for the one-sided error term,
the formulae are as follows (for notations, see the ‘Details’ section
of sfacross or sfalcmcross):
The conditional inefficiency is:
E\left\lbrack u_i|\epsilon_i\right
\rbrack=\mu_{i\ast} + \sigma_\ast\frac{\phi
\left(\frac{\mu_{i\ast}}{\sigma_\ast}\right)}{
\Phi\left(\frac{\mu_{i\ast}}{\sigma_\ast}\right)}
where
\mu_{i\ast}=\frac{-S\epsilon_i\sigma_u^2}{ \sigma_u^2 + \sigma_v^2}
and
\sigma_\ast^2 = \frac{\sigma_u^2 \sigma_v^2}{\sigma_u^2 + \sigma_v^2}
The Battese and Coelli (1988) conditional efficiency is obtained with:
E\left\lbrack\exp{\left(-u_i\right)}
|\epsilon_i\right\rbrack = \exp{\left(-\mu_{i\ast}+
\frac{1}{2}\sigma_\ast^2\right)}\frac{\Phi\left(
\frac{\mu_{i\ast}}{\sigma_\ast}-\sigma_\ast\right)}{
\Phi\left(\frac{\mu_{i\ast}}{\sigma_\ast}\right)}
The reciprocal of the Battese and Coelli (1988) conditional efficiency is obtained with:
E\left\lbrack\exp{\left(u_i\right)}
|\epsilon_i\right\rbrack = \exp{\left(\mu_{i\ast}+
\frac{1}{2}\sigma_\ast^2\right)} \frac{\Phi\left(
\frac{\mu_{i\ast}}{\sigma_\ast}+\sigma_\ast\right)}{
\Phi\left(\frac{\mu_{i\ast}}{\sigma_\ast}\right)}
The conditional mode is computed using:
M\left\lbrack u_i|\epsilon_i\right
\rbrack= \mu_{i\ast} \quad \hbox{For} \quad
\mu_{i\ast} > 0
and
M\left\lbrack u_i|\epsilon_i\right
\rbrack= 0 \quad \hbox{For} \quad \mu_{i\ast} \leq 0
The confidence intervals are obtained with:
\mu_{i\ast} + I_L\sigma_\ast \leq
E\left\lbrack u_i|\epsilon_i\right\rbrack \leq
\mu_{i\ast} + I_U\sigma_\ast
with LB_i = \mu_{i*} + I_L\sigma_*
and
UB_i = \mu_{i*} + I_U\sigma_*
and
I_L = \Phi^{-1}\left\lbrace 1 -
\left(1-\frac{\alpha}{2}\right)\left\lbrack 1-
\Phi\left(-\frac{\mu_{i\ast}}{\sigma_\ast}\right)
\right\rbrack\right\rbrace
and
I_U = \Phi^{-1}\left\lbrace 1-
\frac{\alpha}{2}\left\lbrack 1-\Phi
\left(-\frac{\mu_{i\ast}}{\sigma_\ast}\right)
\right\rbrack\right\rbrace
Thus
\exp{\left(-UB_i\right)} \leq E\left
\lbrack\exp{\left(-u_i\right)}|\epsilon_i\right\rbrack
\leq\exp{\left(-LB_i\right)}
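The half normal formulae above translate almost line by line into base R. The following is a minimal sketch, assuming the residuals and the two standard deviations are already available; the function name jlms_hnormal is hypothetical and the code is not the package's implementation.

```r
# Conditional (in-)efficiency for the normal-half normal model.
# eps: composed-error residuals; S = 1 (production) or S = -1 (cost).
jlms_hnormal <- function(eps, sigma_u, sigma_v, S = 1, level = 0.95) {
  s2       <- sigma_u^2 + sigma_v^2
  mu_star  <- -S * eps * sigma_u^2 / s2
  sig_star <- sqrt(sigma_u^2 * sigma_v^2 / s2)
  a        <- mu_star / sig_star
  alpha    <- 1 - level

  u    <- mu_star + sig_star * dnorm(a) / pnorm(a)        # E[u | eps]
  teBC <- exp(-mu_star + 0.5 * sig_star^2) *
    pnorm(a - sig_star) / pnorm(a)                        # E[exp(-u) | eps]
  teBC_reciprocal <- exp(mu_star + 0.5 * sig_star^2) *
    pnorm(a + sig_star) / pnorm(a)                        # E[exp(u) | eps]
  m <- pmax(mu_star, 0)                                   # conditional mode

  # Bounds I_L and I_U as defined in the text.
  I_L <- qnorm(1 - (1 - alpha / 2) * (1 - pnorm(-a)))
  I_U <- qnorm(1 - (alpha / 2) * (1 - pnorm(-a)))
  uLB <- mu_star + I_L * sig_star
  uUB <- mu_star + I_U * sig_star

  data.frame(u, uLB, uUB, m, teBC, teBC_reciprocal,
             teBCLB = exp(-uUB), teBCUB = exp(-uLB))
}

res <- jlms_hnormal(eps = c(-0.4, -0.1, 0.2), sigma_u = 0.3, sigma_v = 0.2)
res
```

The column names mirror those listed in the ‘Value’ section; the input values are made up.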
In the case of the sample selection, as underlined in Greene (2010), the conditional inefficiency could be computed using Jondrow et al. (1982). However, here the conditional (in)efficiency is obtained using the properties of the closed skew-normal (CSN) distribution (Lai, 2015). The conditional efficiency can be obtained using the moment generating functions of a CSN distribution (see Gonzalez-Farias et al. (2004)). We have:
E\left\lbrack\exp{\left(tu_i\right)}
|\epsilon_i\right\rbrack = M_{u|\epsilon}(t)=\frac{\Phi_2\left(\tilde{\mathbf{D}}
\tilde{\bm{\Sigma}}t; \tilde{\bm{\kappa}}, \tilde{\bm{\Delta}} +
\tilde{\mathbf{D}}\tilde{\bm{\Sigma}}\tilde{\mathbf{D}}' \right)}{
\Phi_2\left(\mathbf{0}; \tilde{\bm{\kappa}}, \tilde{\bm{\Delta}} +
\tilde{\mathbf{D}}\tilde{\bm{\Sigma}}\tilde{\mathbf{D}}'\right)}\exp{
\left(t\tilde{\bm{\pi}} + \frac{1}{2}t^2\tilde{\bm{\Sigma}}\right)}
where \tilde{\bm{\pi}} = \frac{-S\epsilon_i\sigma_u^2}{\sigma_v^2 + \sigma_u^2}
,
\tilde{\bm{\Sigma}} = \frac{\sigma_v^2\sigma_u^2}{\sigma_v^2 + \sigma_u^2}
,
\tilde{\mathbf{D}} = \begin{pmatrix} \frac{S\rho}{\sigma_v} \\ 1 \end{pmatrix}
,
\tilde{\bm{\kappa}} = \begin{pmatrix} - \mathbf{Z}'_{si}\bm{\gamma} -
\frac{\rho\sigma_v\epsilon_i}{\sigma_v^2 + \sigma_u^2}\\
\frac{S\sigma_u^2\epsilon_i}{\sigma_v^2 + \sigma_u^2} \end{pmatrix}
,
\tilde{\bm{\Delta}} = \begin{pmatrix}1-\rho^2 & 0 \\ 0 & 0\end{pmatrix}
.
The efficiency and the reciprocal efficiency are obtained by setting
t = -1 and t = 1, respectively. Obtaining the inefficiency
E\left[u_i|\epsilon_i\right]
is more complicated, as it requires differentiating a multivariate normal cdf. We have:
E\left[u_i|\epsilon_i\right] = \left. \frac{\partial M_{u|\epsilon}(t)}{\partial t}\right\rvert_{t = 0}
Then
E\left[u_i|\epsilon_i\right] = \tilde{\bm{\pi}} +
\left(\tilde{\mathbf{D}}\tilde{\bm{\Sigma}}\right)'\frac{\Phi_2^*
\left(\mathbf{0}; \tilde{\bm{\kappa}}, \ddot{\bm{\Delta}}\right)}{
\Phi_2\left(\mathbf{0}; \tilde{\bm{\kappa}}, \ddot{\bm{\Delta}}\right)}
where \Phi_2^* \left(\mathbf{s}; \tilde{\bm{\kappa}}, \ddot{\bm{\Delta}}\right)=
\frac{\partial \Phi_2\left(\mathbf{s}; \tilde{\bm{\kappa}}, \ddot{\bm{\Delta}} \right)}{\partial \mathbf{s}}
Value
A data frame that contains individual (in-)efficiency estimates. These are ordered in the same way as the corresponding observations in the dataset used for the estimation.
- For an object of class 'sfacross', the following elements are returned:
u |
Conditional inefficiency. In the case argument |
uLB |
Lower bound for conditional inefficiency. Only when the argument
|
uUB |
Upper bound for conditional inefficiency. Only when the argument
|
teJLMS |
|
m |
Conditional mode. Only when the argument |
teMO |
|
teBC |
Battese and Coelli (1988) conditional efficiency. Only when, in
the function sfacross, |
teBC_reciprocal |
Reciprocal of Battese and Coelli (1988) conditional
efficiency. Similar to |
teBCLB |
Lower bound for Battese and Coelli (1988) conditional
efficiency. Only when, in the function sfacross, |
teBCUB |
Upper bound for Battese and Coelli (1988) conditional
efficiency. Only when, in the function sfacross, |
theta |
In the case |
- For an object of class 'sfalcmcross', the following elements are returned:
Group_c |
Most probable class for each observation. |
PosteriorProb_c |
Highest posterior probability. |
u_c |
Conditional inefficiency of the most probable class given the posterior probability. |
teJLMS_c |
|
teBC_c |
|
teBC_reciprocal_c |
|
PosteriorProb_c# |
Posterior probability of class #. |
PriorProb_c# |
Prior probability of class #. |
u_c# |
Conditional inefficiency associated to class #, regardless of
|
teBC_c# |
Conditional efficiency
( |
teBC_reciprocal_c# |
Reciprocal conditional efficiency
( |
ineff_c# |
Conditional inefficiency ( |
effBC_c# |
Conditional efficiency ( |
ReffBC_c# |
Reciprocal conditional efficiency ( |
theta_c# |
In the case |
- For an object of class 'sfaselectioncross', the following elements are returned:
u |
Conditional inefficiency. |
teJLMS |
|
teBC |
Battese and Coelli (1988) conditional efficiency. Only when, in
the function sfaselectioncross,
|
teBC_reciprocal |
Reciprocal of Battese and Coelli (1988) conditional
efficiency. Similar to |
References
Battese, G.E., and T.J. Coelli. 1988. Prediction of firm-level technical efficiencies with a generalized frontier production function and panel data. Journal of Econometrics, 38:387–399.
Bera, A.K., and S.C. Sharma. 1999. Estimating production uncertainty in stochastic frontier production function models. Journal of Productivity Analysis, 12:187-210.
Gonzalez-Farias, G., A. Dominguez-Molina, and A.K. Gupta. 2004. Additive properties of skew normal random vectors. Journal of Statistical Planning and Inference, 126:521–534.
Greene, W. 2010. A stochastic frontier model with correction for sample selection. Journal of Productivity Analysis, 34:15–24.
Hjalmarsson, L., S.C. Kumbhakar, and A. Heshmati. 1996. DEA, DFA and SFA: A comparison. Journal of Productivity Analysis, 7:303-327.
Horrace, W.C., and P. Schmidt. 1996. Confidence statements for efficiency estimates from stochastic frontier models. Journal of Productivity Analysis, 7:257-282.
Jondrow, J., C.A.K. Lovell, I.S. Materov, and P. Schmidt. 1982. On the estimation of technical inefficiency in the stochastic frontier production function model. Journal of Econometrics, 19:233–238.
Lai, H.P. 2015. Maximum likelihood estimation of the stochastic frontier model with endogenous switching or sample selection. Journal of Productivity Analysis, 43:105–117.
Nguyen, N.B. 2010. Estimation of technical efficiency in stochastic frontier analysis. PhD Dissertation, Bowling Green State University, August.
See Also
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) + log(wl/wf) +
log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) + I(log(wl/wf) *
log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)), udist = 'tnormal',
muhet = ~ regu, uhet = ~ regu, data = utility, S = -1, scaling = TRUE, method = 'mla')
eff.tl_u_ts <- efficiencies(tl_u_ts)
head(eff.tl_u_ts)
summary(eff.tl_u_ts)
## End(Not run)
Data on U.S. electric power generation
Description
This dataset is on electric power generation in the United States.
Format
A data frame with 123 observations on the following 9 variables.
- firm
Firm identification.
- cost
Total cost in 1970, MM USD.
- output
Output in million KwH.
- lprice
Labor price.
- lshare
Labor's cost share.
- cprice
Capital price.
- cshare
Capital's cost share.
- fprice
Fuel price.
- fshare
Fuel's cost share.
Details
The dataset is from Christensen and Greene (1976) and has also been used in Greene (1990).
Source
https://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm
References
Christensen, L.R., and W.H. Greene. 1976. Economies of scale in US electric power generation. The Journal of Political Economy, 84:655–676.
Greene, W.H. 1990. A Gamma-distributed stochastic frontier model. Journal of Econometrics, 46:141–163.
Examples
str(electricity)
summary(electricity)
Extract frontier information to be used with texreg package
Description
Extract coefficients and additional information for stochastic frontier models
returned by sfacross, sfalcmcross, or sfaselectioncross.
Usage
extract.sfacross(model, ...)
extract.sfalcmcross(model, ...)
extract.sfaselectioncross(model, ...)
Arguments
model |
objects of class |
... |
Currently ignored |
Value
A texreg object representing the statistical model.
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
hlf <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
trnorm <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, data = utility, S = -1, method = 'bfgs')
tscal <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility,
S = -1, method = 'bfgs', scaling = TRUE)
expo <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'exponential', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
texreg::screenreg(list(hlf, trnorm, tscal, expo))
Extract fitted values of stochastic frontier models
Description
fitted returns the fitted frontier values from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
Usage
## S3 method for class 'sfacross'
fitted(object, ...)
## S3 method for class 'sfalcmcross'
fitted(object, ...)
## S3 method for class 'sfaselectioncross'
fitted(object, ...)
Arguments
object |
A stochastic frontier model returned
by |
... |
Currently ignored. |
Value
In the case of an object of class 'sfacross' or 'sfaselectioncross', a vector of fitted values is returned.
In the case of an object of class 'sfalcmcross', a data frame
containing the fitted values for each class is returned, where each variable
ends with '_c#', '#' being the class number.
Note
The fitted values are ordered in the same way as the corresponding observations in the dataset used for the estimation.
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
data = worldprod)
fit.cb_2c_h <- fitted(cb_2c_h)
head(fit.cb_2c_h)
## End(Not run)
Extract information criteria of stochastic frontier models
Description
ic returns an information criterion from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
Usage
## S3 method for class 'sfacross'
ic(object, IC = "AIC", ...)
## S3 method for class 'sfalcmcross'
ic(object, IC = "AIC", ...)
## S3 method for class 'sfaselectioncross'
ic(object, IC = "AIC", ...)
Arguments
object |
A stochastic frontier model returned
by |
IC |
Character string. Information criterion measure. Three criteria are available:
. |
... |
Currently ignored. |
Details
The different information criteria are computed as follows:
- AIC: -2 \log{LL} + 2 * K
- BIC: -2 \log{LL} + \log{N} * K
- HQIC: -2 \log{LL} + 2 \log{\left[\log{N}\right]} * K
where LL is the maximum likelihood value, K the number of parameters
estimated and N the number of observations.
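These three formulas can be checked by hand with a few lines of base R. The helper below is a hypothetical sketch, not the package's ic method; its loglik argument is the log of the maximized likelihood, i.e. \log{LL} in the notation above, and the input values are made up.

```r
# Information criteria from a log-likelihood value, the number of estimated
# parameters K and the number of observations N.
info_criteria <- function(loglik, K, N) {
  c(AIC  = -2 * loglik + 2 * K,
    BIC  = -2 * loglik + log(N) * K,
    HQIC = -2 * loglik + 2 * log(log(N)) * K)
}

crit <- info_criteria(loglik = -120.5, K = 8, N = 250)
crit
```

When comparing models with different numbers of classes, the model with the smallest value of the chosen criterion is usually preferred.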
Value
ic returns the value of the information criterion
(AIC, BIC or HQIC) of the maximum likelihood coefficients.
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on Swiss railway
# LCM (cost function) half normal distribution
cb_2c_u <- sfalcmcross(formula = LNCT ~ LNQ2 + LNQ3 + LNNET + LNPK + LNPL,
udist = 'hnormal', uhet = ~ 1, data = swissrailways, S = -1, method='ucminf')
ic(cb_2c_u)
ic(cb_2c_u, IC = 'BIC')
ic(cb_2c_u, IC = 'HQIC')
## End(Not run)
Extract log-likelihood value of stochastic frontier models
Description
logLik extracts the log-likelihood value(s) from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
Usage
## S3 method for class 'sfacross'
logLik(object, individual = FALSE, ...)
## S3 method for class 'sfalcmcross'
logLik(object, individual = FALSE, ...)
## S3 method for class 'sfaselectioncross'
logLik(object, individual = FALSE, ...)
Arguments
object |
A stochastic frontier model returned
by |
individual |
Logical. If |
... |
Currently ignored. |
Value
logLik returns either an object of class 'logLik', which is the log-likelihood value with the total number of
observations (nobs) and the number of free parameters (df) as
attributes, when individual = FALSE, or a list of elements, containing
the log-likelihood of each observation (logLik), the total number of
observations (Nobs) and the number of free parameters (df),
when individual = TRUE.
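The two return shapes can be mimicked in base R without fitting a model. The sketch below uses made-up per-observation contributions and assumes that the total log-likelihood is their sum; the object names are hypothetical.

```r
# Made-up per-observation log-likelihood contributions.
ll_i  <- c(-1.2, -0.8, -1.5, -0.9)
n_par <- 3

# Shape described for individual = TRUE: a plain list.
ll_list <- list(logLik = ll_i, Nobs = length(ll_i), df = n_par)

# Shape described for individual = FALSE: a 'logLik' object with
# nobs and df stored as attributes.
ll_obj <- structure(sum(ll_i), nobs = length(ll_i), df = n_par,
                    class = "logLik")

AIC(ll_obj)   # stats::AIC has a method for 'logLik' objects
```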
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
logLik(tl_u_ts)
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
data = worldprod, S = 1)
logLik(cb_2c_h, individual = TRUE)
## End(Not run)
Marginal effects of the inefficiency drivers in stochastic frontier models
Description
This function returns marginal effects of the inefficiency drivers from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
Usage
## S3 method for class 'sfacross'
marginal(object, newData = NULL, ...)
## S3 method for class 'sfalcmcross'
marginal(object, newData = NULL, ...)
## S3 method for class 'sfaselectioncross'
marginal(object, newData = NULL, ...)
Arguments
object |
A stochastic frontier model returned
by |
newData |
Optional data frame that is used to calculate the marginal
effect of |
... |
Currently ignored. |
Details
marginal operates in the presence of exogenous
variables that explain inefficiency, namely the inefficiency drivers
(uhet = ~ Z_u or muhet = ~ Z_{mu}).
Two components are computed for each variable: the marginal effects on the
expected inefficiency (\frac{\partial E[u]}{\partial Z_{mu}}) and
the marginal effects on the variance of inefficiency (\frac{\partial V[u]}{\partial Z_{mu}}).
The model also allows the Wang (2002) parameterization of \mu and \sigma_u^2
by the same vector of exogenous variables. This double
parameterization accounts for non-monotonic relationships between the
inefficiency and its drivers.
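For intuition, the simplest case can be worked out by hand. The sketch below assumes a half normal inefficiency term with \sigma_u^2 = \exp(Z_u'\delta), so that E[u] = \sigma_u\sqrt{2/\pi} and V[u] = \sigma_u^2(\pi - 2)/\pi. Both the parameterization and the function name marginal_hnormal are assumptions made for this example, not the package's code.

```r
# Marginal effects for u ~ N+(0, sigma_u^2) with sigma_u^2 = exp(Z %*% delta).
# Z must carry column names; its first column is assumed to be the intercept.
marginal_hnormal <- function(Z, delta) {
  sigma_u <- sqrt(exp(drop(Z %*% delta)))
  d  <- delta[-1]                                    # skip the intercept
  Eu <- outer(0.5 * sqrt(2 / pi) * sigma_u, d)       # dE[u]/dZ_k
  Vu <- outer((pi - 2) / pi * sigma_u^2, d)          # dV[u]/dZ_k
  colnames(Eu) <- paste0("Eu_", colnames(Z)[-1])
  colnames(Vu) <- paste0("Vu_", colnames(Z)[-1])
  data.frame(Eu, Vu)
}

Z <- cbind(`(Intercept)` = 1, regu = c(0, 1, 0.5))   # made-up driver values
delta <- c(-2, 0.6)                                  # made-up coefficients
me <- marginal_hnormal(Z, delta)
me
```

With a single driver the sign of each effect follows the sign of its coefficient, which is why the double parameterization of Wang (2002) is needed to capture non-monotonic effects.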
Value
marginal returns a data frame containing the marginal
effects of the Z_u variables on the expected inefficiency (each
variable has the prefix 'Eu_') and on the variance of the
inefficiency (each variable has the prefix 'Vu_').
In the case of the latent class stochastic frontier (LCM), each variable
ends with '_c#' where '#' is the class number.
References
Wang, H.J. 2002. Heteroscedasticity and non-monotonic efficiency effects of a stochastic frontier model. Journal of Productivity Analysis, 18:241–253.
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu + wl, uhet = ~ regu + wl, data = utility,
S = -1, scaling = TRUE, method = 'mla')
marg.tl_u_ts <- marginal(tl_u_ts)
summary(marg.tl_u_ts)
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
data = worldprod, uhet = ~ initStat + h, S = 1, method = 'mla')
marg.cb_2c_h <- marginal(cb_2c_h)
summary(marg.cb_2c_h)
## End(Not run)
Extract total number of observations used in frontier models
Description
This function extracts the total number of 'observations' from a fitted frontier model.
Usage
## S3 method for class 'sfacross'
nobs(object, ...)
## S3 method for class 'sfalcmcross'
nobs(object, ...)
## S3 method for class 'sfaselectioncross'
nobs(object, ...)
Arguments
object |
a |
... |
Currently ignored. |
Details
nobs gives the number of observations actually
used by the estimation procedure. It is not necessarily the number
of observations of the model frame (number of rows in the model
frame), because sometimes the model frame is further reduced by the
estimation procedure, especially in the presence of NA. In the case of
sfaselectioncross, nobs returns the number of observations used in the
frontier equation.
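The effect of missing values on the observation count can be seen with an ordinary linear model. The sketch below is only an analogy using stats::lm with R's default na.action; it does not involve an sfaR model, and the data are made up.

```r
# Five rows supplied, two of them incomplete.
d <- data.frame(y = c(1.0, 2.1, NA, 3.9, 5.2),
                x = c(1, 2, 3, NA, 5))

fit <- lm(y ~ x, data = d)   # incomplete rows are dropped before fitting

n_supplied <- nrow(d)        # rows in the original data
n_used     <- nobs(fit)      # rows actually used in the estimation
c(supplied = n_supplied, used = n_used)
```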
Value
A single number, normally an integer.
See Also
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog (cost function) half normal with heteroscedasticity
tl_u_h <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
nobs(tl_u_h)
## End(Not run)
Extract residuals of stochastic frontier models
Description
This function returns the residual values from stochastic frontier models
estimated with sfacross
, sfalcmcross
, or
sfaselectioncross
.
Usage
## S3 method for class 'sfacross'
residuals(object, ...)
## S3 method for class 'sfalcmcross'
residuals(object, ...)
## S3 method for class 'sfaselectioncross'
residuals(object, ...)
Arguments
object |
A stochastic frontier model returned
by |
... |
Currently ignored. |
Value
When the object
is of class 'sfacross'
, or
'sfaselectioncross'
, residuals
returns a vector of
residual values.
When the object
is of 'sfalcmcross'
,
residuals
returns a data frame containing the residual values
for each latent class, where each variable ends with '_c#'
,
'#'
being the class number.
Note
The residual values are ordered in the same way as the corresponding observations in the dataset used for the estimation.
See Also
sfacross
, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross
, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross
for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
resid.tl_u_ts <- residuals(tl_u_ts)
head(resid.tl_u_ts)
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
data = worldprod, S = 1)
resid.cb_2c_h <- residuals(cb_2c_h)
head(resid.cb_2c_h)
## End(Not run)
Data on rice production in the Philippines
Description
This dataset contains annual data collected from 43 smallholder rice producers in the Tarlac region of the Philippines between 1990 and 1997.
Format
A data frame with 344 observations on the following 17 variables.
- YEARDUM
Time period (1= 1990, ..., 8 = 1997).
- FARMERCODE
Farmer code (1, ..., 43).
- PROD
Output (tonnes of freshly threshed rice).
- AREA
Area planted (hectares).
- LABOR
Labor used (man-days of family and hired labor).
- NPK
Fertiliser used (kg of active ingredients).
- OTHER
Other inputs used (Laspeyres index = 100 for Farm 17 in 1991).
- PRICE
Output price (pesos per kg).
- AREAP
Rental price of land (pesos per hectare).
- LABORP
Labor price (pesos per hired man-day).
- NPKP
Fertiliser price (pesos per kg of active ingredient).
- OTHERP
Price of other inputs (implicit price index).
- AGE
Age of the household head (years).
- EDYRS
Education of the household head (years).
- HHSIZE
Household size.
- NADULT
Number of adults in the household.
- BANRAT
Percentage of area classified as bantog (upland) fields.
Details
This dataset is published as supplement to Coelli et al. (2005). While most variables of this dataset were supplied by the International Rice Research Institute (IRRI), some were calculated by Coelli et al. (2005, see p. 325–326). The survey is described in Pandey et al. (1999).
References
Coelli, T. J., Rao, D. S. P., O'Donnell, C. J., and Battese, G. E. 2005. An Introduction to Efficiency and Productivity Analysis, Springer, New York.
Pandey, S., Masciat, P., Velasco, L., and Villano, R. 1999. Risk analysis of a rainfed rice production system in Tarlac, Central Luzon, Philippines. Experimental Agriculture, 35:225–237.
Examples
str(ricephil)
summary(ricephil)
Deprecated functions of sfaR
Description
These functions are provided only for compatibility with older versions of ‘sfaR’ and may become defunct in a future release.
Usage
lcmcross(
formula,
uhet,
vhet,
thet,
logDepVar = TRUE,
data,
subset,
weights,
wscale = TRUE,
S = 1L,
udist = "hnormal",
start = NULL,
whichStart = 2L,
initAlg = "nm",
initIter = 100,
lcmClasses = 2,
method = "bfgs",
hessianType = 1,
itermax = 2000L,
printInfo = FALSE,
tol = 1e-12,
gradtol = 1e-06,
stepmax = 0.1,
qac = "marquardt"
)
## S3 method for class 'lcmcross'
print(x, ...)
## S3 method for class 'lcmcross'
bread(x, ...)
## S3 method for class 'lcmcross'
estfun(x, ...)
## S3 method for class 'lcmcross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.lcmcross'
coef(object, ...)
## S3 method for class 'lcmcross'
fitted(object, ...)
## S3 method for class 'lcmcross'
ic(object, IC = "AIC", ...)
## S3 method for class 'lcmcross'
logLik(object, individual = FALSE, ...)
## S3 method for class 'lcmcross'
marginal(object, newData = NULL, ...)
## S3 method for class 'lcmcross'
nobs(object, ...)
## S3 method for class 'lcmcross'
residuals(object, ...)
## S3 method for class 'lcmcross'
summary(object, grad = FALSE, ci = FALSE, ...)
## S3 method for class 'summary.lcmcross'
print(x, digits = max(3, getOption("digits") - 2), ...)
## S3 method for class 'lcmcross'
efficiencies(object, level = 0.95, newData = NULL, ...)
## S3 method for class 'lcmcross'
vcov(object, ...)
Arguments
formula |
A symbolic description of the model to be estimated based on
the generic function |
uhet |
A one-part formula to account for heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to account for heteroscedasticity in the two-sided error variance (see section ‘Details’). |
thet |
A one-part formula to account for technological heterogeneity in the construction of the classes. |
logDepVar |
Logical. Informs whether the dependent variable is logged
( |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted
log-likelihood. Should be |
wscale |
Logical. When |
S |
If |
udist |
Character string. Distribution specification for the one-sided
error term. Only the half normal distribution |
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
whichStart |
Integer. If |
initAlg |
Character string specifying the algorithm used for
initialization and to obtain the starting values (when |
initIter |
Maximum number of iterations for initialization algorithm.
Default |
lcmClasses |
Number of classes to be estimated (default = |
method |
Optimization algorithm used for the estimation. Default =
|
hessianType |
Integer. If |
itermax |
Maximum number of iterations allowed for optimization.
Default = |
printInfo |
Logical. Print information during optimization. Default =
|
tol |
Numeric. Convergence tolerance. Default = |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
|
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class lcmcross (returned by the function
|
... |
additional arguments of frontier are passed to lcmcross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
object |
an object of class lcmcross (returned by the function
|
extraPar |
Logical (default = |
IC |
Character string. Information criterion measure. Three criteria are available:
. |
individual |
Logical. If |
newData |
Optional data frame that is used to calculate the efficiency estimates. If NULL (the default), the efficiency estimates are calculated for the observations that were used in the estimation. |
grad |
Logical. Default = |
ci |
Logical. Default = |
digits |
Numeric. Number of digits displayed in values. |
level |
A number between 0 and 0.9999 used for the computation
of (in-)efficiency confidence intervals (default = |
Details
The following functions are deprecated and may be removed from sfaR in the near future. Use the replacements indicated below:
lcmcross:
sfalcmcross
bread.lcmcross:
bread.sfalcmcross
coef.lcmcross:
coef.sfalcmcross
coef.summary.lcmcross:
coef.summary.sfalcmcross
efficiencies.lcmcross:
efficiencies.sfalcmcross
estfun.lcmcross:
estfun.sfalcmcross
fitted.lcmcross:
fitted.sfalcmcross
ic.lcmcross:
ic.sfalcmcross
logLik.lcmcross:
logLik.sfalcmcross
marginal.lcmcross:
marginal.sfalcmcross
nobs.lcmcross:
nobs.sfalcmcross
print.lcmcross:
print.sfalcmcross
print.summary.lcmcross:
print.summary.sfalcmcross
residuals.lcmcross:
residuals.sfalcmcross
summary.lcmcross:
summary.sfalcmcross
vcov.lcmcross:
vcov.sfalcmcross
Stochastic frontier estimation using cross-sectional data
Description
sfacross
is a symbolic formula-based function for the
estimation of stochastic frontier models in the case of cross-sectional or
pooled cross-sectional data, using maximum (simulated) likelihood - M(S)L.
The function accounts for heteroscedasticity in both one-sided and two-sided error terms as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999), but also heterogeneity in the mean of the pre-truncated distribution as in Kumbhakar et al. (1991), Huang and Liu (1994) and Battese and Coelli (1995).
Ten distributions are possible for the one-sided error term and eleven optimization algorithms are available.
The truncated normal - normal distribution with scaling property as in Wang and Schmidt (2002) is also implemented.
Usage
sfacross(
formula,
muhet,
uhet,
vhet,
logDepVar = TRUE,
data,
subset,
weights,
wscale = TRUE,
S = 1L,
udist = "hnormal",
scaling = FALSE,
start = NULL,
method = "bfgs",
hessianType = 1L,
simType = "halton",
Nsim = 100,
prime = 2L,
burn = 10,
antithetics = FALSE,
seed = 12345,
itermax = 2000,
printInfo = FALSE,
tol = 1e-12,
gradtol = 1e-06,
stepmax = 0.1,
qac = "marquardt"
)
## S3 method for class 'sfacross'
print(x, ...)
## S3 method for class 'sfacross'
bread(x, ...)
## S3 method for class 'sfacross'
estfun(x, ...)
Arguments
formula |
A symbolic description of the model to be estimated based on
the generic function |
muhet |
A one-part formula to consider heterogeneity in the mean of the pre-truncated distribution (see section ‘Details’). |
uhet |
A one-part formula to consider heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to consider heteroscedasticity in the two-sided error variance (see section ‘Details’). |
logDepVar |
Logical. Informs whether the dependent variable is logged
( |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted
log-likelihood. Should be |
wscale |
Logical. When |
S |
If |
udist |
Character string. Default =
|
scaling |
Logical. Only when |
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
method |
Optimization algorithm used for the estimation. Default =
|
hessianType |
Integer. If |
simType |
Character string. If |
Nsim |
Number of draws for MSL. Default 100. |
prime |
Prime number considered for Halton and Generalized-Halton
draws. Default = |
burn |
Number of the first observations discarded in the case of Halton
draws. Default = |
antithetics |
Logical. Default = |
seed |
Numeric. Seed for the random draws. |
itermax |
Maximum number of iterations allowed for optimization.
Default = |
printInfo |
Logical. Print information during optimization. Default =
|
tol |
Numeric. Convergence tolerance. Default = |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
|
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class sfacross (returned by the function
|
... |
additional arguments of frontier are passed to sfacross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
Details
The stochastic frontier model for the cross-sectional data is defined as:
y_i = \alpha + \mathbf{x_i^{\prime}}\bm{\beta} + v_i - Su_i
with
\epsilon_i = v_i -Su_i
where i
is the observation, y
is the
output (cost, revenue, profit), \mathbf{x}
is the vector of main explanatory
variables (inputs and other control variables), u
is the one-sided
error term with variance \sigma_{u}^2
, and v
is the two-sided
error term with variance \sigma_{v}^2
.
S = 1
in the case of production (profit) frontier function and
S = -1
in the case of cost frontier function.
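The sign convention can be checked on simulated data. The sketch below uses base R only and makes no sfaR calls. The parameter values, the half-normal choice for u, and the helper `skew` are illustrative assumptions, not part of the package.

```r
set.seed(123)
n <- 5000
x <- rnorm(n)                    # single explanatory variable
v <- rnorm(n, sd = 0.3)          # two-sided noise
u <- abs(rnorm(n, sd = 1))       # one-sided (half-normal) inefficiency

y_prod <- 1 + 0.5 * x + v - u    # S = 1: production frontier, epsilon = v - u
y_cost <- 1 + 0.5 * x + v + u    # S = -1: cost frontier, epsilon = v + u

# Sample skewness (third standardised moment); hypothetical helper
skew <- function(e) mean((e - mean(e))^3) / mean((e - mean(e))^2)^1.5

skew_prod <- skew(residuals(lm(y_prod ~ x)))
skew_cost <- skew(residuals(lm(y_cost ~ x)))
c(production = skew_prod, cost = skew_cost)
```

With S = 1 the OLS residuals are expected to be left-skewed, and with S = -1 right-skewed, which is the pattern the OLS skewness diagnostics returned by the fitting function are meant to detect.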
The model is estimated using maximum likelihood (ML) for most distributions
except the Gamma, Weibull and log-normal distributions, for which maximum
simulated likelihood (MSL) is used. For the latter, several types of draws are
available, namely Halton, Generalized Halton, Sobol and uniform. In the
case of uniform draws, antithetics can also be computed: first Nsim/2
draws are obtained, then the Nsim/2
other draws are obtained as
their complements to one (1-draw
).
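The antithetic construction for uniform draws can be sketched in base R as follows. The function `antithetic_uniform` is a hypothetical helper written for illustration and is not the package's internal implementation.

```r
# Hypothetical helper: Nsim uniform draws, the second half being the
# antithetic counterparts (1 - draw) of the first half.
antithetic_uniform <- function(nsim, seed = 12345) {
  stopifnot(nsim %% 2 == 0)      # an even number of draws is assumed here
  set.seed(seed)
  half <- runif(nsim / 2)        # first Nsim/2 draws
  c(half, 1 - half)              # remaining Nsim/2 draws as 1 - draw
}

draws <- antithetic_uniform(100)
length(draws)
mean(draws)
```

Because each draw is paired with its complement, the sample mean sits at 0.5 up to floating-point error, which is the variance-reduction idea behind antithetics.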
To account for heteroscedasticity in the variance parameters of the error
terms, a single part (right) formula can also be specified. To ensure the
positivity of these parameters, the variances are modelled as:
\sigma^2_u = \exp{(\bm{\delta}'\mathbf{Z}_u)}
and \sigma^2_v =
\exp{(\bm{\phi}'\mathbf{Z}_v)}
, where \mathbf{Z}_u
and \mathbf{Z}_v
are the heteroscedasticity
variables (inefficiency drivers in the case of \mathbf{Z}_u
) and \bm{\delta}
and \bm{\phi}
the coefficients. In the case of heterogeneity in the
truncated mean \mu
, it is modelled as \mu=\bm{\omega}'\mathbf{Z}_{\mu}
. The
scaling property can be applied for the truncated normal distribution:
u \sim h(\mathbf{Z}_u, \bm{\delta})u^*
where u^*
follows a truncated normal
distribution N^+(\tau, \exp{(cu)})
.
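As a minimal illustration of the exponential link, the base R sketch below turns a one-part formula into the design matrix Z_u and then into observation-specific variances. The data values and the coefficients in `delta` are made up, and the presence of an intercept column in Z_u is an assumption of this sketch.

```r
# Toy data: 'regu' stands in for a dummy inefficiency driver (values made up)
dat <- data.frame(regu = c(0, 1, 0, 1))

# Design matrix implied by a one-part formula such as uhet = ~ regu
Zu <- model.matrix(~ regu, data = dat)   # columns: intercept, regu

delta <- c(-2, 0.5)                      # illustrative coefficients only
sigma2_u <- as.vector(exp(Zu %*% delta)) # exp() keeps every variance positive
sigma2_u
```

Whatever the sign of the linear index, the resulting variances stay strictly positive, so the optimizer can search over unrestricted coefficients.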
In the case of the truncated normal distribution, the convolution of
u_i
and v_i
is:
f(\epsilon_i)=\frac{1}{\sqrt{\sigma_u^2 +
\sigma_v^2}}\phi\left(\frac{S\epsilon_i + \mu}{\sqrt{
\sigma_u^2 + \sigma_v^2}}\right)\Phi\left(\frac{
\mu_{i*}}{\sigma_*}\right)\Big/\Phi\left(\frac{
\mu}{\sigma_u}\right)
where
\mu_{i*}=\frac{\mu\sigma_v^2 -
S\epsilon_i\sigma_u^2}{\sigma_u^2 + \sigma_v^2}
and
\sigma_*^2 = \frac{\sigma_u^2
\sigma_v^2}{\sigma_u^2 + \sigma_v^2}
In the case of the half normal distribution the convolution is obtained by
setting \mu=0
.
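The convolution above can be written directly in base R and checked numerically. The function name `dconv_tnormal` and the parameter values are assumptions made for this sketch; it is not the package's internal likelihood code.

```r
# Density of epsilon = v - S*u, with u ~ N+(mu, sigma_u^2) and v ~ N(0, sigma_v^2)
dconv_tnormal <- function(eps, mu, sigma_u, sigma_v, S = 1) {
  s2 <- sigma_u^2 + sigma_v^2
  mu_star <- (mu * sigma_v^2 - S * eps * sigma_u^2) / s2
  sigma_star <- sigma_u * sigma_v / sqrt(s2)
  dnorm((S * eps + mu) / sqrt(s2)) / sqrt(s2) *
    pnorm(mu_star / sigma_star) / pnorm(mu / sigma_u)
}

# A proper density integrates to one, for production and cost frontiers alike
area_prod <- integrate(dconv_tnormal, -Inf, Inf,
                       mu = 0.3, sigma_u = 0.8, sigma_v = 0.4, S = 1)$value
area_cost <- integrate(dconv_tnormal, -Inf, Inf,
                       mu = 0.3, sigma_u = 0.8, sigma_v = 0.4, S = -1)$value
# Half normal case: set mu = 0
area_half <- integrate(dconv_tnormal, -Inf, Inf,
                       mu = 0, sigma_u = 0.8, sigma_v = 0.4)$value
c(area_prod, area_cost, area_half)
```

Setting mu = 0 makes the denominator equal to 1/2, which gives the familiar factor 2 of the normal-half normal density.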
sfacross
allows for the maximization of weighted log-likelihood.
When option weights
is specified and wscale = TRUE
, the weights
are scaled as:
new_{weights} = sample_{size} \times
\frac{old_{weights}}{\sum(old_{weights})}
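A small worked example of this rescaling, in base R with made-up weights:

```r
w_old <- c(2, 1, 1, 4)                        # illustrative raw weights
w_new <- length(w_old) * w_old / sum(w_old)   # sample size times normalised weights
w_new        # 1.0 0.5 0.5 2.0
sum(w_new)   # 4, i.e. the sample size
```

The scaled weights keep their relative sizes but sum to the number of observations, so the weighted log-likelihood stays on the same scale as the unweighted one.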
For complex problems, non-gradient methods (e.g. nm
or sann
)
can be used to warm start the optimization and zoom in on the neighborhood of
the solution. Then a gradient-based method is recommended in the second
step. In the case of sann
, we recommend significantly increasing the
iteration limit (e.g. itermax = 20000
). The Conjugate Gradient
(cg
) can also be used in the first stage.
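The two-step strategy can be illustrated with base R's `optim` on a hand-written normal-half normal log-likelihood. This is an analogy only: the simulated data, the function `negll` and the starting values are assumptions of the sketch. With the fitting function itself, the first-stage estimates would presumably be supplied through the `start` argument.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3) - abs(rnorm(n, sd = 0.5))

# Negative log-likelihood of the normal-half normal production frontier (S = 1).
# p = (alpha, beta, log sigma_u^2, log sigma_v^2)
negll <- function(p) {
  eps <- y - p[1] - p[2] * x
  s2u <- exp(p[3])
  s2v <- exp(p[4])
  sig <- sqrt(s2u + s2v)
  -sum(log(2) - log(sig) + dnorm(eps / sig, log = TRUE) +
         pnorm(-eps * sqrt(s2u / s2v) / sig, log.p = TRUE))
}

start <- c(unname(coef(lm(y ~ x))), log(0.1), log(0.1))

# Step 1: derivative-free warm start with a small iteration budget
step1 <- optim(start, negll, method = "Nelder-Mead",
               control = list(maxit = 200))
# Step 2: gradient-based refinement starting from the warm start
step2 <- optim(step1$par, negll, method = "BFGS")

c(warm_start = step1$value, refined = step2$value)
```

The second stage starts from the first-stage parameters, so its objective value cannot end up worse than the warm start.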
A set of extractor functions for fitted model objects is available for
objects of class 'sfacross'
including methods to the generic functions
print
,
summary
, coef
,
fitted
,
logLik
,
residuals
,
vcov
,
efficiencies
,
ic
,
marginal
,
skewnessTest
,
estfun
and
bread
(from the sandwich package),
lmtest::coeftest()
(from the lmtest package).
Value
sfacross
returns a list of class 'sfacross'
containing the following elements:
call |
The matched call. |
formula |
The estimated model. |
S |
The argument |
typeSfa |
Character string. 'Stochastic Production/Profit Frontier, e =
v - u' when |
Nobs |
Number of observations used for optimization. |
nXvar |
Number of explanatory variables in the production or cost frontier. |
nmuZUvar |
Number of variables explaining heterogeneity in the
truncated mean, only if |
scaling |
The argument |
logDepVar |
The argument |
nuZUvar |
Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar |
Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm |
Total number of parameters estimated. |
udist |
The argument |
startVal |
Numeric vector. Starting value for M(S)L estimation. |
dataTable |
A data frame (tibble format) containing information on data
used for optimization along with residuals and fitted values of the OLS and
M(S)L estimations, and the individual observation log-likelihood. When
|
olsParam |
Numeric vector. OLS estimates. |
olsStder |
Numeric vector. Standard errors of OLS estimates. |
olsSigmasq |
Numeric. Estimated variance of OLS random error. |
olsLoglik |
Numeric. Log-likelihood value of OLS estimation. |
olsSkew |
Numeric. Skewness of the residuals of the OLS estimation. |
olsM3Okay |
Logical. Indicating whether the residuals of the OLS estimation have the expected skewness. |
CoelliM3Test |
Coelli's test for OLS residuals skewness. (See Coelli, 1995). |
AgostinoTest |
D'Agostino's test for OLS residuals skewness. (See D'Agostino and Pearson, 1973). |
isWeights |
Logical. If |
optType |
Optimization algorithm used. |
nIter |
Number of iterations of the ML estimation. |
optStatus |
Optimization algorithm termination message. |
startLoglik |
Log-likelihood at the starting values. |
mlLoglik |
Log-likelihood value of the M(S)L estimation. |
mlParam |
Parameters obtained from M(S)L estimation. |
gradient |
Each variable gradient of the M(S)L estimation. |
gradL_OBS |
Matrix. Each variable individual observation gradient of the M(S)L estimation. |
gradientNorm |
Gradient norm of the M(S)L estimation. |
invHessian |
Covariance matrix of the parameters obtained from the M(S)L estimation. |
hessianType |
The argument |
mlDate |
Date and time of the estimated model. |
simDist |
The argument |
Nsim |
The argument |
FiMat |
Matrix of random draws used for MSL, only if |
Note
For the Halton draws, the code is adapted from the mlogit package.
References
Aigner, D., Lovell, C. A. K., and Schmidt, P. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21–37.
Battese, G. E., and Coelli, T. J. 1995. A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empirical Economics, 20(2), 325–332.
Caudill, S. B., and Ford, J. M. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and Gropper, D. M. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and Pearson, E. S. 1973. Tests for departure from normality. Empirical results for the distributions of b_2 and \sqrt{b_1}. Biometrika, 60:613–622.
Greene, W. H. 2003. Simulated likelihood estimation of the normal-Gamma stochastic frontier function. Journal of Productivity Analysis, 19(2-3), 179–190.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Hajargasht, G. 2015. Stochastic frontiers with a Rayleigh distribution. Journal of Productivity Analysis, 44(2), 199–208.
Huang, C. J., and Liu, J.-T. 1994. Estimation of a non-neutral stochastic frontier production function. Journal of Productivity Analysis, 5(2), 171–180.
Kumbhakar, S. C., Ghosh, S., and McGuckin, J. T. 1991. A generalized production frontier approach for estimating determinants of inefficiency in U.S. dairy farms. Journal of Business & Economic Statistics, 9(3), 279–286.
Li, Q. 1996. Estimating a stochastic production frontier when the adjusted error is symmetric. Economics Letters, 52(3), 221–228.
Meeusen, W., and Vandenbroeck, J. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–445.
Migon, H. S., and Medici, E. V. 2001. Bayesian hierarchical models for stochastic production frontier. Lacea, Montevideo, Uruguay.
Nguyen, N. B. 2010. Estimation of technical efficiency in stochastic frontier analysis. PhD dissertation, Bowling Green State University, August.
Papadopoulos, A. 2021. Stochastic frontier models using the generalized exponential distribution. Journal of Productivity Analysis, 55:15–29.
Reifschneider, D., and Stevenson, R. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
Stevenson, R. E. 1980. Likelihood Functions for Generalized Stochastic Frontier Estimation. Journal of Econometrics, 13(1), 57–66.
Tsionas, E. G. 2007. Efficiency measurement with the Weibull stochastic frontier. Oxford Bulletin of Economics and Statistics, 69(5), 693–706.
Wang, K., and Ye, X. 2020. Development of alternative stochastic frontier models for estimating time-space prism vertices. Transportation.
Wang, H.J., and Schmidt, P. 2002. One-step and two-step estimation of the effects of exogenous variables on technical efficiency levels. Journal of Productivity Analysis, 18:129–144.
Wang, J. 2012. A normal truncated skewed-Laplace model in stochastic frontier analysis. Master thesis, Western Kentucky University, May.
See Also
print
for printing sfacross
object.
summary
for creating and printing
summary results.
coef
for extracting coefficients of the
estimation.
efficiencies
for computing
(in-)efficiency estimates.
fitted
for extracting the fitted frontier
values.
ic
for extracting information criteria.
logLik
for extracting log-likelihood
value(s) of the estimation.
marginal
for computing marginal effects of
inefficiency drivers.
residuals
for extracting residuals of the
estimation.
skewnessTest
for conducting residuals
skewness test.
vcov
for computing the variance-covariance
matrix of the coefficients.
bread
for bread for sandwich estimator.
estfun
for gradient extraction for each
observation.
Examples
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog (cost function) half normal with heteroscedasticity
tl_u_h <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
summary(tl_u_h)
# Translog (cost function) truncated normal with heteroscedasticity
tl_u_t <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, data = utility, S = -1, method = 'bhhh')
summary(tl_u_t)
# Translog (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
summary(tl_u_ts)
## Using data on Philippine rice producers
# Cobb Douglas (production function) generalized exponential distribution
cb_p_ge <- sfacross(formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK) +
log(OTHER), udist = 'genexponential', data = ricephil, S = 1, method = 'bfgs')
summary(cb_p_ge)
## Using data on U.S. electric utility industry
# Cost frontier Gamma distribution
tl_u_g <- sfacross(formula = log(cost/fprice) ~ log(output) + I(log(output)^2) +
I(log(lprice/fprice)) + I(log(cprice/fprice)), udist = 'gamma', uhet = ~ 1,
data = electricity, S = -1, method = 'bfgs', simType = 'halton', Nsim = 200,
hessianType = 2)
summary(tl_u_g)
Latent class stochastic frontier using cross-sectional data
Description
sfalcmcross
is a symbolic formula-based function for the
estimation of the latent class stochastic frontier model (LCM) in the case
of cross-sectional or pooled cross-sectional data. The model is estimated
using maximum likelihood (ML). See Orea and Kumbhakar (2004) and Parmeter and
Kumbhakar (2014, p. 282).
Only the half-normal distribution is possible for the one-sided error term. Eleven optimization algorithms are available.
The function also accounts for heteroscedasticity in both one-sided and two-sided error terms, as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999).
The model can estimate up to five classes.
Usage
sfalcmcross(
formula,
uhet,
vhet,
thet,
logDepVar = TRUE,
data,
subset,
weights,
wscale = TRUE,
S = 1L,
udist = "hnormal",
start = NULL,
whichStart = 2L,
initAlg = "nm",
initIter = 100,
lcmClasses = 2,
method = "bfgs",
hessianType = 1,
itermax = 2000L,
printInfo = FALSE,
tol = 1e-12,
gradtol = 1e-06,
stepmax = 0.1,
qac = "marquardt"
)
## S3 method for class 'sfalcmcross'
print(x, ...)
## S3 method for class 'sfalcmcross'
bread(x, ...)
## S3 method for class 'sfalcmcross'
estfun(x, ...)
Arguments
formula |
A symbolic description of the model to be estimated based on
the generic function |
uhet |
A one-part formula to account for heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to account for heteroscedasticity in the two-sided error variance (see section ‘Details’). |
thet |
A one-part formula to account for technological heterogeneity in the construction of the classes. |
logDepVar |
Logical. Informs whether the dependent variable is logged
( |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted
log-likelihood. Should be |
wscale |
Logical. When |
S |
If |
udist |
Character string. Distribution specification for the one-sided
error term. Only the half normal distribution |
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
whichStart |
Integer. If |
initAlg |
Character string specifying the algorithm used for
initialization and to obtain the starting values (when |
initIter |
Maximum number of iterations for initialization algorithm.
Default |
lcmClasses |
Number of classes to be estimated (default = |
method |
Optimization algorithm used for the estimation. Default =
|
hessianType |
Integer. If |
itermax |
Maximum number of iterations allowed for optimization.
Default = |
printInfo |
Logical. Print information during optimization. Default =
|
tol |
Numeric. Convergence tolerance. Default = |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
|
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class sfalcmcross (returned by the function
|
... |
additional arguments of frontier are passed to sfalcmcross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
Details
LCM is an estimation of a finite mixture of production functions:
y_i = \alpha_j + \mathbf{x_i^{\prime}}
\bm{\beta_j} + v_{i|j} - Su_{i|j}
\epsilon_{i|j} = v_{i|j} - Su_{i|j}
where i
is the observation, j
is the class, y
is the
output (cost, revenue, profit), x
is the vector of main explanatory
variables (inputs and other control variables), u
is the one-sided
error term with variance \sigma_{u}^2
, and v
is the two-sided
error term with variance \sigma_{v}^2
.
S = 1
in the case of production (profit) frontier function and
S = -1
in the case of cost frontier function.
The contribution of observation i
to the likelihood conditional on
class j
is defined as:
P(i|j) = \frac{2}{\sqrt{\sigma_{u|j}^2 +
\sigma_{v|j}^2}}\phi\left(\frac{S\epsilon_{i|j}}{\sqrt{
\sigma_{u|j}^2 +\sigma_{v|j}^2}}\right)\Phi\left(\frac{
\mu_{i*|j}}{\sigma_{*|j}}\right)
where
\mu_{i*|j}=\frac{- S\epsilon_{i|j}
\sigma_{u|j}^2}{\sigma_{u|j}^2 + \sigma_{v|j}^2}
and
\sigma_{*|j}^2 = \frac{\sigma_{u|j}^2
\sigma_{v|j}^2}{\sigma_{u|j}^2 + \sigma_{v|j}^2}
The prior probability of using a particular technology can depend on some covariates (namely the variables separating the observations into classes) using a logit specification:
\pi(i,j) = \frac{\exp{(\bm{\theta}_j'\mathbf{Z}_{hi})}}{
\sum_{m=1}^{J}\exp{(\bm{\theta}_m'\mathbf{Z}_{hi})}}
with \mathbf{Z}_h
the covariates, \bm{\theta}
the coefficients estimated for
the covariates, and \exp(\bm{\theta}_J'\mathbf{Z}_h)=1
.
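The logit specification, with the last class as reference, can be sketched in base R. The helper `class_prior`, the covariate values and the coefficients are hypothetical and only illustrate the formula.

```r
# Hypothetical helper: prior class probabilities from a multinomial logit.
# Zh:    n x k matrix of separating variables (here including a constant)
# theta: k x (J - 1) coefficient matrix; class J is the reference class
class_prior <- function(Zh, theta) {
  eta <- cbind(Zh %*% theta, 0)             # theta_J = 0, so exp(theta_J' Z) = 1
  expeta <- exp(eta - apply(eta, 1, max))   # subtract row maxima for stability
  expeta / rowSums(expeta)
}

Zh <- cbind(1, c(-1, 0, 2))                 # constant plus one made-up covariate
theta <- matrix(c(0.2, 0.8), ncol = 1)      # two classes: one free coefficient vector
prior <- class_prior(Zh, theta)
prior
rowSums(prior)
```

Each row holds the prior probabilities of one observation across the J classes, and they sum to one by construction.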
The unconditional likelihood of observation i
is the prior-probability-weighted average
over the J
classes:
P(i) = \sum_{m=1}^{J}\pi(i,m)P(i|m)
The number of classes to retain can be based on an information criterion (see
for instance ic
).
Class assignment is based on the largest posterior probability. This
probability is obtained using Bayes' rule, as follows for class j
:
w\left(j|i\right)=\frac{P\left(i|j\right)
\pi\left(i,j\right)}{\sum_{m=1}^JP\left(i|m\right)
\pi\left(i, m\right)}
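A small numeric sketch of the unconditional likelihood, the posterior probabilities and the resulting class assignment, in base R. The conditional likelihoods and prior probabilities below are made-up numbers for three observations and two classes.

```r
# Rows = observations, columns = classes (all values are illustrative)
lik_cond <- matrix(c(0.8, 0.1,
                     0.2, 0.6,
                     0.3, 0.3), ncol = 2, byrow = TRUE)   # P(i|j)
prior    <- matrix(c(0.5, 0.5,
                     0.7, 0.3,
                     0.4, 0.6), ncol = 2, byrow = TRUE)   # pi(i,j)

joint      <- lik_cond * prior          # P(i|j) * pi(i,j)
lik_uncond <- rowSums(joint)            # P(i)
posterior  <- joint / lik_uncond        # w(j|i), Bayes' rule

# Assign each observation to the class with the largest posterior probability
assigned <- max.col(posterior, ties.method = "first")
lik_uncond   # 0.45 0.32 0.30
assigned     # 1 2 2
```

The third observation has identical conditional likelihoods in both classes, so its assignment is driven entirely by the prior probabilities.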
To accommodate heteroscedasticity in the variance parameters of the error
terms, a single part (right) formula can also be specified. To ensure the
positivity of these parameters, the variances are modelled respectively as:
\sigma^2_{u|j} = \exp{(\bm{\delta}_j'\mathbf{Z}_u)}
and \sigma^2_{v|j} =
\exp{(\bm{\phi}_j'\mathbf{Z}_v)}
, where \mathbf{Z}_u
and \mathbf{Z}_v
are the
heteroscedasticity variables (inefficiency drivers in the case of \mathbf{Z}_u
)
and \bm{\delta}
and \bm{\phi}
the coefficients. 'sfalcmcross'
only
supports the half-normal distribution for the one-sided error term.
sfalcmcross
allows for the maximization of weighted log-likelihood.
When option weights
is specified and wscale = TRUE
, the weights
are scaled as:
new_{weights} = sample_{size} \times
\frac{old_{weights}}{\sum(old_{weights})}
For complex problems, non-gradient methods (e.g. nm
or
sann
) can be used to warm start the optimization and zoom in on the
neighborhood of the solution. Then a gradient-based method is recommended
in the second step. In the case of sann
, we recommend significantly
increasing the iteration limit (e.g. itermax = 20000
). The Conjugate
Gradient (cg
) can also be used in the first stage.
A set of extractor functions for fitted model objects is available for
objects of class 'sfalcmcross'
including methods to the generic functions
print
,
summary
,
coef
,
fitted
,
logLik
,
residuals
,
vcov
,
efficiencies
,
ic
,
marginal
,
estfun
and
bread
(from the sandwich package),
lmtest::coeftest()
(from the lmtest package).
Value
sfalcmcross
returns a list of class 'sfalcmcross'
containing the following elements:
call |
The matched call. |
formula |
Multi parts formula describing the estimated model. |
S |
The argument |
typeSfa |
Character string. 'Latent Class Production/Profit Frontier, e
= v - u' when |
Nobs |
Number of observations used for optimization. |
nXvar |
Number of main explanatory variables. |
nZHvar |
Number of variables in the logit specification of the finite mixture model (i.e. number of covariates). |
logDepVar |
The argument |
nuZUvar |
Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar |
Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm |
Total number of parameters estimated. |
udist |
The argument |
startVal |
Numeric vector. Starting value for ML estimation. |
dataTable |
A data frame (tibble format) containing information on data
used for optimization along with residuals and fitted values of the OLS and
ML estimations, and the individual observation log-likelihood. When
|
initHalf |
When |
isWeights |
Logical. If |
optType |
The optimization algorithm used. |
nIter |
Number of iterations of the ML estimation. |
optStatus |
An optimization algorithm termination message. |
startLoglik |
Log-likelihood at the starting values. |
nClasses |
The number of classes estimated. |
mlLoglik |
Log-likelihood value of the ML estimation. |
mlParam |
Numeric vector. Parameters obtained from ML estimation. |
mlParamMatrix |
Double. Matrix of ML parameters by class. |
gradient |
Numeric vector. Each variable gradient of the ML estimation. |
gradL_OBS |
Matrix. Each variable individual observation gradient of the ML estimation. |
gradientNorm |
Numeric. Gradient norm of the ML estimation. |
invHessian |
The covariance matrix of the parameters obtained from the ML estimation. |
hessianType |
The argument |
mlDate |
Date and time of the estimated model. |
Note
In the case of panel data, sfalcmcross
estimates a pooled
cross-section where the probability of belonging to a class a priori is not
permanent (not fixed over time).
References
Aigner, D., Lovell, C. A. K., and P. Schmidt. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21–37.
Caudill, S. B., and J. M. Ford. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and D. M. Gropper. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Meeusen, W., and J. Vandenbroeck. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–445.
Orea, L., and S.C. Kumbhakar. 2004. Efficiency measurement using a latent class stochastic frontier model. Empirical Economics, 29, 169–183.
Parmeter, C.F., and S.C. Kumbhakar. 2014. Efficiency analysis: A primer on recent advances. Foundations and Trends in Econometrics, 7, 191–385.
Reifschneider, D., and R. Stevenson. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
See Also
print
for printing sfalcmcross
object.
summary
for creating and printing
summary results.
coef
for extracting coefficients of the
estimation.
efficiencies
for computing
(in-)efficiency estimates.
fitted
for extracting the fitted frontier
values.
ic
for extracting information criteria.
logLik
for extracting log-likelihood
value(s) of the estimation.
marginal
for computing marginal effects of
inefficiency drivers.
residuals
for extracting residuals of the
estimation.
vcov
for computing the variance-covariance
matrix of the coefficients.
bread
for bread for sandwich estimator.
estfun
for gradient extraction for each
observation.
Examples
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
# Intercept and initStat used as separating variables
cb_2c_h1 <- sfalcmcross(formula = ly ~ lk + ll + yr, thet = ~initStat,
data = worldprod)
summary(cb_2c_h1)
# summary of the initial ML model
summary(cb_2c_h1$InitHalf)
# Only the intercept is used as the separating variable
# and only variable initStat is used as inefficiency driver
cb_2c_h3 <- sfalcmcross(formula = ly ~ lk + ll + yr, uhet = ~initStat,
data = worldprod)
summary(cb_2c_h3)
Sample selection in stochastic frontier estimation using cross-section data
Description
sfaselectioncross
is a symbolic formula based function for the
estimation of the stochastic frontier model in the presence of sample
selection. The model accommodates cross-sectional or pooled cross-sectional data.
The model can be estimated using different quadrature approaches or
maximum simulated likelihood (MSL). See Greene (2010).
Only the half-normal distribution is possible for the one-sided error term. Eleven optimization algorithms are available.
The function also accounts for heteroscedasticity in both one-sided and two-sided error terms, as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999).
Usage
sfaselectioncross(
selectionF,
frontierF,
uhet,
vhet,
modelType = "greene10",
logDepVar = TRUE,
data,
subset,
weights,
wscale = TRUE,
S = 1L,
udist = "hnormal",
start = NULL,
method = "bfgs",
hessianType = 2L,
lType = "ghermite",
Nsub = 100,
uBound = Inf,
simType = "halton",
Nsim = 100,
prime = 2L,
burn = 10,
antithetics = FALSE,
seed = 12345,
itermax = 2000,
printInfo = FALSE,
intol = 1e-06,
tol = 1e-12,
gradtol = 1e-06,
stepmax = 0.1,
qac = "marquardt"
)
## S3 method for class 'sfaselectioncross'
print(x, ...)
## S3 method for class 'sfaselectioncross'
bread(x, ...)
## S3 method for class 'sfaselectioncross'
estfun(x, ...)
Arguments
selectionF |
A symbolic (formula) description of the selection equation. |
frontierF |
A symbolic (formula) description of the outcome (frontier) equation. |
uhet |
A one-part formula to consider heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to consider heteroscedasticity in the two-sided error variance (see section ‘Details’). |
modelType |
Character string. Model used to solve the selection bias. Only the model discussed in Greene (2010) is currently available. |
logDepVar |
Logical. Informs whether the dependent variable is logged
( |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted log-likelihood.
Should be |
wscale |
Logical. When |
S |
If |
udist |
Character string. Distribution specification for the one-sided
error term. Only the half normal distribution |
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
method |
Optimization algorithm used for the estimation. Default =
|
hessianType |
Integer. If |
lType |
Specifies the way the likelihood is estimated. Five possibilities are
available: |
Nsub |
Integer. Number of subdivisions/nodes used for quadrature approaches.
Default |
uBound |
Numeric. Upper bound for the inefficiency component when solving
integrals using quadrature approaches except Gauss-Hermite for which the upper
bound is automatically infinite ( |
simType |
Character string. If |
Nsim |
Number of draws for MSL (default 100). |
prime |
Prime number considered for Halton and Generalized-Halton
draws. Default = |
burn |
Number of the first observations discarded in the case of Halton
draws. Default = |
antithetics |
Logical. Default = |
seed |
Numeric. Seed for the random draws. |
itermax |
Maximum number of iterations allowed for optimization.
Default = |
printInfo |
Logical. Print information during optimization. Default =
|
intol |
Numeric. Integration tolerance for quadrature approaches
( |
tol |
Numeric. Convergence tolerance. Default = |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
|
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class sfaselectioncross (returned by the function |
... |
additional arguments of frontier are passed to sfaselectioncross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
Details
The current model extends the Heckman (1976, 1979) sample selection model to nonlinear models, in particular the stochastic frontier model. The model was first discussed in Greene (2010), and an application can be found in Dakpo et al. (2022). In practice, we have:
y_{1i} = \left\{ \begin{array}{ll}
1 & \mbox{if} \quad y_{1i}^* > 0 \\
0 & \mbox{if} \quad y_{1i}^* \leq 0 \\
\end{array}
\right.
where
y_{1i}^*=\mathbf{Z}_{si}^{\prime} \mathbf{\gamma} + w_i, \quad
w_i \sim \mathcal{N}(0, 1)
and
y_{2i} = \left\{ \begin{array}{ll}
y_{2i}^* & \mbox{if} \quad y_{1i}^* > 0 \\
NA & \mbox{if} \quad y_{1i}^* \leq 0 \\
\end{array}
\right.
where
y_{2i}^*=\mathbf{x_{i}^{\prime}} \mathbf{\beta} + v_i - Su_i, \quad
v_i = \sigma_vV_i \quad \wedge \quad V_i \sim \mathcal{N}(0, 1), \quad
u_i = \sigma_u|U_i| \quad \wedge \quad U_i \sim \mathcal{N}(0, 1)
y_{1i}
describes the selection equation while y_{2i}
represents
the frontier equation. The selection bias arises from the correlation
between the two symmetric random components v_i
and w_i
:
(w_i, v_i) \sim \mathcal{N}_2\left\lbrack(0,0), (1, \rho \sigma_v, \sigma_v^2) \right\rbrack
Conditionally on |U_i|
, the probability associated with each observation is:
Pr \left\lbrack y_{1i}^* \leq 0 \right\rbrack^{1-y_{1i}} \cdot \left\lbrace
f(y_{2i}|y_{1i}^* > 0) \times Pr\left\lbrack y_{1i}^* > 0
\right\rbrack \right\rbrace^{y_{1i}}
Using the conditional probability formula:
P\left(A\cap B\right) = P(A) \cdot P(B|A) = P(B) \cdot P(A|B)
Therefore:
f(y_{2i}|y_{1i}^* > 0) \cdot Pr\left\lbrack y_{1i}^* > 0\right\rbrack =
f(y_{2i}) \cdot Pr(y_{1i}^* > 0|y_{2i})
Using the properties of a bivariate normal distribution, we have:
y_{1i}^* | y_{2i} \sim \mathcal{N}\left(\mathbf{Z_{si}^{\prime}} \bm{\gamma}+\frac{\rho}{
\sigma_v}v_i, 1-\rho^2\right)
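The conditional mean slope \rho/\sigma_v and the conditional variance 1-\rho^2 can be checked by simulation. The sketch below uses hypothetical values for \rho and \sigma_v and omits the selection index, which only shifts the conditional mean.

```r
set.seed(42)
n       <- 1e5
rho     <- 0.6   # hypothetical correlation between w_i and v_i
sigma_v <- 0.5   # hypothetical standard deviation of v_i

V <- rnorm(n)
v <- sigma_v * V                            # v_i = sigma_v * V_i
w <- rho * V + sqrt(1 - rho^2) * rnorm(n)   # w_i ~ N(0, 1), corr(w_i, v_i) = rho

# Regressing w on v recovers the conditional mean slope and residual variance
fit     <- lm(w ~ v)
slope   <- unname(coef(fit)["v"])
res_var <- var(residuals(fit))

print(c(slope = slope, target = rho / sigma_v))
print(c(res_var = res_var, target = 1 - rho^2))
```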
Hence conditionally on |U_i|
, we have:
f(y_{2i}|y_{1i}^* > 0) \cdot Pr\left\lbrack y_{1i}^* > 0\right\rbrack =
\frac{1}{\sigma_v}\phi\left(\frac{v_i}{\sigma_v}\right)\Phi\left(\frac{
\mathbf{Z_{si}^{\prime}} \bm{\gamma}+\frac{\rho}{\sigma_v}v_i}{
\sqrt{1-\rho^2}}\right)
The conditional likelihood is equal to:
L_i\big||U_i| = \Phi(-\mathbf{Z_{si}^{\prime}} \bm{\gamma})^{1-y_{1i}} \times
\left\lbrace \frac{1}{\sigma_v}\phi\left(\frac{y_{2i}-\mathbf{x_{i}^{\prime}}
\bm{\beta} + S\sigma_u|U_i|}{\sigma_v}\right)\Phi\left(\frac{
\mathbf{Z_{si}^{\prime}} \bm{\gamma}+\frac{\rho}{\sigma_v}\left(y_{2i}-
\mathbf{x_{i}^{\prime}} \bm{\beta} + S\sigma_u|U_i|\right)}{\sqrt{1-\rho^2}}
\right) \right\rbrace ^{y_{1i}}
Since the non-selected observations bring no additional information, the conditional likelihood to be considered is:
L_i\big||U_i| = \frac{1}{\sigma_v}\phi\left(\frac{y_{2i}-\mathbf{x_{i}^{\prime}}
\bm{\beta} + S\sigma_u|U_i|}{\sigma_v}\right) \Phi\left(\frac{\mathbf{Z_{si}^{\prime}}
\bm{\gamma}+\frac{\rho}{\sigma_v}\left(y_{2i}-\mathbf{x_{i}^{\prime}} \bm{\beta} +
S\sigma_u|U_i|\right)}{\sqrt{1-\rho^2}}\right)
The unconditional likelihood is obtained by integrating |U_i|
out of the conditional likelihood. Thus
L_i = \int_{|U_i|} \frac{1}{\sigma_v}\phi\left(\frac{y_{2i}-\mathbf{x_{i}^{\prime}}
\bm{\beta} + S\sigma_u|U_i|}{\sigma_v}\right) \Phi\left(\frac{\mathbf{Z_{si}^{\prime}}
\bm{\gamma}+ \frac{\rho}{\sigma_v}\left(y_{2i}-\mathbf{x_{i}^{\prime}} \bm{\beta} +
S\sigma_u|U_i|\right)}{\sqrt{1-\rho^2}}\right)p\left(|U_i|\right)d|U_i|
To simplify the estimation, the likelihood can be estimated using a two-step approach.
In the first step, a probit model is fitted to obtain an estimate of \bm{\gamma}
.
Then, in the second step, the following model is estimated:
L_i = \int_{|U_i|} \frac{1}{\sigma_v}\phi\left(\frac{y_{2i}-\mathbf{x_{i}^{\prime}}
\bm{\beta} + S\sigma_u|U_i|}{\sigma_v}\right) \Phi\left(\frac{a_i +
\frac{\rho}{\sigma_v}\left(y_{2i}-\mathbf{x_{i}^{\prime}} \bm{\beta} +
S\sigma_u|U_i|\right)}{\sqrt{1-\rho^2}}\right)p\left(|U_i|\right)d|U_i|
where a_i = \mathbf{Z_{si}^{\prime}} \hat{\bm{\gamma}}
. This likelihood can be estimated using
five different approaches: Gauss-Kronrod quadrature, adaptive integration over hypercubes
(hcubature and pcubature), Gauss-Hermite quadrature, and
maximum simulated likelihood. We also use the BHHH estimator to obtain
the asymptotic standard errors for the parameter estimators.
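To make the integral concrete, the sketch below evaluates the second-step likelihood of a single observation in two ways: adaptive quadrature with base R's integrate(), and simulation with a base-2 Halton sequence. All numerical values, the helper names cond_lik and halton, and the use of integrate() are assumptions made for illustration; this is not the package's implementation.

```r
# Hypothetical values for one selected observation (illustration only)
S <- 1; sigma_u <- 1; sigma_v <- 0.5; rho <- 0.5
eps <- -0.3   # y_2i - x_i' beta
a_i <- 0.4    # Z_si' gamma_hat from the first-step probit

# Conditional likelihood given |U_i| = u
cond_lik <- function(u) {
  v <- eps + S * sigma_u * u
  dnorm(v / sigma_v) / sigma_v *
    pnorm((a_i + rho / sigma_v * v) / sqrt(1 - rho^2))
}

# (1) Quadrature: integrate against p(|U_i|) = 2 * dnorm(u) on [0, Inf)
L_quad <- integrate(function(u) cond_lik(u) * 2 * dnorm(u), 0, Inf)$value

# (2) Simulation: Halton (van der Corput) points mapped to half-normal draws
halton <- function(n, base = 2L) {
  vapply(seq_len(n), function(i) {
    f <- 1; r <- 0
    while (i > 0) {
      f <- f / base
      r <- r + f * (i %% base)
      i <- i %/% base
    }
    r
  }, numeric(1))
}
u_draws <- qnorm(0.5 + halton(1000) / 2)   # inverse CDF of the half-normal
L_sim   <- mean(cond_lik(u_draws))

print(c(quadrature = L_quad, simulation = L_sim))
```

The two numbers should agree closely; in estimation, the log of this quantity is summed over the selected observations.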
sfaselectioncross
allows for the maximization of weighted log-likelihood.
When option weights
is specified and wscale = TRUE
, the weights
are scaled as:
new_{weights} = sample_{size} \times \frac{old_{weights}}{\sum(old_{weights})}
For complex problems, non-gradient methods (e.g. nm
or sann
) can be
used to warm start the optimization and zoom in on the neighborhood of the
solution. A gradient-based method is then recommended in the second step. In the case
of sann
, we recommend significantly increasing the iteration limit
(e.g. itermax = 20000
). The Conjugate Gradient (cg
) can also be used
in the first stage.
A set of extractor functions for fitted model objects is available for objects of class
'sfaselectioncross'
including methods for the generic functions print
,
summary
, coef
,
fitted
, logLik
,
residuals
, vcov
,
efficiencies
, ic
,
marginal
,
estfun
and
bread
(from the sandwich package), and
lmtest::coeftest()
(from the lmtest package).
Value
sfaselectioncross
returns a list of class 'sfaselectioncross'
containing the following elements:
call |
The matched call. |
selectionF |
The selection equation formula. |
frontierF |
The frontier equation formula. |
S |
The argument |
typeSfa |
Character string. 'Stochastic Production/Profit Frontier, e =
v - u' when |
Ninit |
Number of initial observations in all samples. |
Nobs |
Number of observations used for optimization. |
nXvar |
Number of explanatory variables in the production or cost frontier. |
logDepVar |
The argument |
nuZUvar |
Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar |
Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm |
Total number of parameters estimated. |
udist |
The argument |
startVal |
Numeric vector. Starting value for M(S)L estimation. |
dataTable |
A data frame (tibble format) containing information on data
used for optimization along with residuals and fitted values of the OLS and
M(S)L estimations, and the individual observation log-likelihood. When argument |
lpmObj |
Linear probability model used for initializing the first step probit model. |
probitObj |
Probit model. Object of class |
ols2stepParam |
Numeric vector. OLS second step estimates for selection correction. Inverse Mills Ratio is introduced as an additional explanatory variable. |
ols2stepStder |
Numeric vector. Standard errors of OLS second step estimates. |
ols2stepSigmasq |
Numeric. Estimated variance of OLS second step random error. |
ols2stepLoglik |
Numeric. Log-likelihood value of OLS second step estimation. |
ols2stepSkew |
Numeric. Skewness of the residuals of the OLS second step estimation. |
ols2stepM3Okay |
Logical. Indicating whether the residuals of the OLS second step estimation have the expected skewness. |
CoelliM3Test |
Coelli's test for OLS residuals skewness. (See Coelli, 1995). |
AgostinoTest |
D'Agostino's test for OLS residuals skewness. (See D'Agostino and Pearson, 1973). |
isWeights |
Logical. If |
lType |
Type of likelihood estimated. See the section ‘Arguments’. |
optType |
Optimization algorithm used. |
nIter |
Number of iterations of the ML estimation. |
optStatus |
Optimization algorithm termination message. |
startLoglik |
Log-likelihood at the starting values. |
mlLoglik |
Log-likelihood value of the M(S)L estimation. |
mlParam |
Parameters obtained from M(S)L estimation. |
gradient |
Each variable gradient of the M(S)L estimation. |
gradL_OBS |
Matrix. Each variable individual observation gradient of the M(S)L estimation. |
gradientNorm |
Gradient norm of the M(S)L estimation. |
invHessian |
Covariance matrix of the parameters obtained from the M(S)L estimation. |
hessianType |
The argument |
mlDate |
Date and time of the estimated model. |
simDist |
The argument |
Nsim |
The argument |
FiMat |
Matrix of random draws used for MSL, only if |
gHermiteData |
List. Gauss-Hermite quadrature rule as provided by
|
Nsub |
Number of subdivisions used for quadrature approaches. |
uBound |
Upper bound for the inefficiency component when solving
integrals using quadrature approaches except Gauss-Hermite for which the upper
bound is automatically infinite ( |
intol |
Integration tolerance for quadrature approaches except Gauss-Hermite. |
Note
For the Halton draws, the code is adapted from the mlogit package.
References
Caudill, S. B., and Ford, J. M. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and Gropper, D. M. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and E.S. Pearson. 1973. Tests for departure from normality.
Empirical results for the distributions of b_2
and \sqrt{b_1}
.
Biometrika, 60:613–622.
Dakpo, K. H., Latruffe, L., Desjeux, Y., and Jeanneaux, P. 2022. Modeling heterogeneous technologies in the presence of sample selection: The case of dairy farms and the adoption of agri-environmental schemes in France. Agricultural Economics, 53(3), 422–438.
Greene, W. 2010. A stochastic frontier model with correction for sample selection. Journal of Productivity Analysis, 34, 15–24.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Heckman, J. 1976. Discrete, qualitative and limited dependent variables. Ann Econ Soc Meas, 4, 475–492.
Heckman, J. 1979. Sample selection bias as a specification error. Econometrica, 47, 153–161.
Reifschneider, D., and Stevenson, R. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
See Also
print
for printing sfaselectioncross
object.
summary
for creating and printing
summary results.
coef
for extracting coefficients of the
estimation.
efficiencies
for computing
(in-)efficiency estimates.
fitted
for extracting the fitted frontier
values.
ic
for extracting information criteria.
logLik
for extracting log-likelihood
value(s) of the estimation.
marginal
for computing marginal effects of
inefficiency drivers.
residuals
for extracting residuals of the
estimation.
vcov
for computing the variance-covariance
matrix of the coefficients.
bread
for bread for sandwich estimator.
estfun
for gradient extraction for each
observation.
Examples
## Not run:
## Simulated example
N <- 2000 # sample size
set.seed(12345)
z1 <- rnorm(N)
z2 <- rnorm(N)
v1 <- rnorm(N)
v2 <- rnorm(N)
e1 <- v1
e2 <- 0.7071 * (v1 + v2)
ds <- z1 + z2 + e1
d <- ifelse(ds > 0, 1, 0)
u <- abs(rnorm(N))
x1 <- rnorm(N)
x2 <- rnorm(N)
y <- x1 + x2 + e2 - u
data <- cbind(y = y, x1 = x1, x2 = x2, z1 = z1, z2 = z2, d = d)
## Estimation using quadrature (Gauss-Kronrod)
selecRes1 <- sfaselectioncross(selectionF = d ~ z1 + z2, frontierF = y ~ x1 + x2,
modelType = 'greene10', method = 'bfgs',
logDepVar = TRUE, data = as.data.frame(data),
S = 1L, udist = 'hnormal', lType = 'kronrod', Nsub = 100, uBound = Inf,
simType = 'halton', Nsim = 300, prime = 2L, burn = 10, antithetics = FALSE,
seed = 12345, itermax = 2000, printInfo = FALSE)
summary(selecRes1)
## Estimation using maximum simulated likelihood
selecRes2 <- sfaselectioncross(selectionF = d ~ z1 + z2, frontierF = y ~ x1 + x2,
modelType = 'greene10', method = 'bfgs',
logDepVar = TRUE, data = as.data.frame(data),
S = 1L, udist = 'hnormal', lType = 'msl', Nsub = 100, uBound = Inf,
simType = 'halton', Nsim = 300, prime = 2L, burn = 10, antithetics = FALSE,
seed = 12345, itermax = 2000, printInfo = FALSE)
summary(selecRes2)
## End(Not run)
Skewness test for stochastic frontier models
Description
skewnessTest
computes a skewness test for stochastic frontier
models (i.e. objects of class 'sfacross'
).
Usage
skewnessTest(object, test = "agostino")
Arguments
object |
An object of class |
test |
A character string specifying the test to implement. If
|
Value
skewnessTest
returns the results of either the D'Agostino's
or the Coelli's skewness test.
Note
skewnessTest
is currently only available for objects of
class 'sfacross'
.
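For intuition, the Coelli (1995) statistic can be computed by hand from OLS residuals as M3T = m_3 / \sqrt{6 m_2^3 / N}, where m_2 and m_3 are the second and third central moments. The base R sketch below uses simulated data and is only an illustration of that formula, not the code behind skewnessTest.

```r
set.seed(123)
n <- 2000
x <- rnorm(n)
# Simulated production frontier, so residuals should be negatively skewed
y <- 1 + 0.5 * x + rnorm(n, sd = 0.5) - abs(rnorm(n))

res <- residuals(lm(y ~ x))
m2  <- mean((res - mean(res))^2)   # second central moment
m3  <- mean((res - mean(res))^3)   # third central moment

# Asymptotically N(0, 1) when the residuals are symmetric
m3t     <- m3 / sqrt(6 * m2^3 / n)
p_value <- 2 * pnorm(-abs(m3t))

print(c(M3T = m3t, p.value = p_value))
```

A significantly negative statistic is the expected sign for a production frontier; for a cost frontier the expected skewness is positive.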
References
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and E.S. Pearson. 1973. Tests for departure from normality.
Empirical results for the distributions of b_2
and \sqrt{b_1}
.
Biometrika, 60:613–622.
Examples
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
skewnessTest(tl_u_ts)
skewnessTest(tl_u_ts, test = 'coelli')
## End(Not run)
Summary of results for stochastic frontier models
Description
Create and print summary results for stochastic frontier models returned by
sfacross
, sfalcmcross
, or
sfaselectioncross
.
Usage
## S3 method for class 'sfacross'
summary(object, grad = FALSE, ci = FALSE, ...)
## S3 method for class 'summary.sfacross'
print(x, digits = max(3, getOption("digits") - 2), ...)
## S3 method for class 'sfalcmcross'
summary(object, grad = FALSE, ci = FALSE, ...)
## S3 method for class 'summary.sfalcmcross'
print(x, digits = max(3, getOption("digits") - 2), ...)
## S3 method for class 'sfaselectioncross'
summary(object, grad = FALSE, ci = FALSE, ...)
## S3 method for class 'summary.sfaselectioncross'
print(x, digits = max(3, getOption("digits") - 2), ...)
Arguments
object |
An object of either class |
grad |
Logical. Default = |
ci |
Logical. Default = |
... |
Currently ignored. |
x |
An object of either class |
digits |
Numeric. Number of digits displayed in values. |
Value
The summary
method returns a list of class
'summary.sfacross'
, 'summary.sfalcmcross'
, or
'summary.sfaselectioncross'
that contains the same elements as an object returned by sfacross
,
sfalcmcross
, or sfaselectioncross
with the
following additional elements:
AIC |
Akaike information criterion. |
BIC |
Bayesian information criterion. |
HQIC |
Hannan-Quinn information criterion. |
sigmavSq |
For |
sigmauSq |
For |
Varu |
For |
theta |
For |
Eu |
For |
Expu |
For |
olsRes |
For |
ols2StepRes |
For |
mlRes |
Matrix of ML estimates, their standard errors, z-values,
asymptotic P-values, and when |
chisq |
For |
df |
Degrees of freedom for the inefficiency model. |
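The three information criteria follow from the log-likelihood, the number of estimated parameters k and the number of observations N. The sketch below uses the usual textbook definitions on a stand-in linear model from base R; whether the package applies exactly these definitions is an assumption.

```r
# Stand-in model (not a frontier), fitted on a dataset shipped with R
fit <- lm(dist ~ speed, data = cars)

ll     <- logLik(fit)
loglik <- as.numeric(ll)
k      <- attr(ll, "df")   # number of estimated parameters
n      <- nobs(fit)

aic  <- -2 * loglik + 2 * k
bic  <- -2 * loglik + k * log(n)
hqic <- -2 * loglik + 2 * k * log(log(n))

print(c(AIC = aic, BIC = bic, HQIC = hqic))
```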
See Also
sfacross
, for the stochastic frontier analysis model
fitting function for cross-sectional or pooled data.
sfalcmcross
, for the latent class stochastic frontier analysis
model fitting function for cross-sectional or pooled data.
sfaselectioncross
for sample selection in stochastic frontier
model fitting function for cross-sectional or pooled data.
print
for printing sfacross
object.
coef
for extracting coefficients of the
estimation.
efficiencies
for computing
(in-)efficiency estimates.
fitted
for extracting the fitted frontier
values.
ic
for extracting information criteria.
logLik
for extracting log-likelihood
value(s) of the estimation.
marginal
for computing marginal effects of
inefficiency drivers.
residuals
for extracting residuals of the
estimation.
vcov
for computing the variance-covariance
matrix of the coefficients.
bread
for bread for sandwich estimator.
estfun
for gradient extraction for each
observation.
skewnessTest
for implementing skewness test.
Examples
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
summary(tl_u_ts, grad = TRUE, ci = TRUE)
Data on Swiss railway companies
Description
This dataset is an unbalanced panel of 50 Swiss railway companies over the period 1985-1997.
Format
A data frame with 605 observations on the following 42 variables.
- ID
Firm identification.
- YEAR
Year identification.
- NI
Number of years observed.
- STOPS
Number of stops in network.
- NETWORK
Network length (in meters).
- NARROW_T
Dummy variable for railroads with narrow track.
- RACK
Dummy variable for ‘rack rail’ in network.
- TUNNEL
Dummy variable for network with tunnels over 300 meters on average.
- T
Time indicator, first year = 0.
- Q2
Passenger output – passenger km.
- Q3
Freight output – ton km.
- CT
Total cost (1,000 Swiss franc).
- PL
Labor price.
- PE
Electricity price.
- PK
Capital price.
- VIRAGE
1 for railroads with curvy tracks.
- LNCT
Log of CT/PE.
- LNQ2
Log of Q2.
- LNQ3
Log of Q3.
- LNNET
Log of NETWORK/1000.
- LNPL
Log of PL/PE.
- LNPE
Log of PE.
- LNPK
Log of PK/PE.
- LNSTOP
Log of STOPS.
- MLNQ2
Mean of LNQ2.
- MLNQ3
Mean of LNQ3.
- MLNNET
Mean of LNNET.
- MLNPL
Mean of LNPL.
- MLNPK
Mean of LNPK.
- MLNSTOP
Mean of LNSTOP.
Details
The dataset is extracted from the annual reports of the Swiss Federal Office of Statistics on public transport companies and has been used in Farsi et al. (2005).
Source
https://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm
References
Farsi, M., M. Filippini, and W. Greene. 2005. Efficiency measurement in network industries: Application to the Swiss railway companies. Journal of Regulatory Economics, 28:69–90.
Examples
str(swissrailways)
Data on U.S. electricity generating plants
Description
This dataset contains data on fossil fuel fired steam electric power generation plants in the United States between 1986 and 1996.
Format
A data frame with 791 observations on the following 11 variables.
- firm
Plant identification.
- year
Year identification.
- y
Net-steam electric power generation in megawatt-hours.
- regu
Dummy variable which takes a value equal to 1 if the power plant is in a state which enacted legislation or issued a regulatory order to implement retail access during the sample period, and 0 otherwise.
- k
Capital stock.
- labor
Labor and maintenance.
- fuel
Fuel.
- wl
Labor price.
- wf
Fuel price.
- wk
Capital price.
- tc
Total cost.
Details
The dataset has been used in Kumbhakar et al. (2014).
Source
https://sites.google.com/view/sfbook-stata/home
References
Kumbhakar, S.C., H.J. Wang, and A. Horncastle. 2014. A Practitioner's Guide to Stochastic Frontier Analysis Using Stata. Cambridge University Press.
Examples
str(utility)
summary(utility)
Compute variance-covariance matrix of stochastic frontier models
Description
vcov
computes the variance-covariance matrix of the maximum
likelihood (ML) coefficients from stochastic frontier models estimated with
sfacross
, sfalcmcross
,
or sfaselectioncross
.
Usage
## S3 method for class 'sfacross'
vcov(object, extraPar = FALSE, ...)
## S3 method for class 'sfalcmcross'
vcov(object, ...)
## S3 method for class 'sfaselectioncross'
vcov(object, extraPar = FALSE, ...)
Arguments
object |
A stochastic frontier model returned
by |
extraPar |
Logical. Only available for non-heteroscedastic models
returned by
|
... |
Currently ignored |
Details
The variance-covariance matrix is obtained by the inversion of the
negative Hessian matrix. Depending on the distribution and the
'hessianType'
option, the analytical/numeric Hessian or the BHHH
Hessian is evaluated.
The argument extraPar
is currently available only for objects of class
'sfacross'
and 'sfaselectioncross'
. When
'extraPar = TRUE'
, the variance-covariance of the additional
parameters is obtained using the delta method.
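Both steps, inverting the negative Hessian and applying the delta method, can be illustrated on a simple normal sample. The sketch below is a toy example in base R under the assumption that the variance is parameterized on the log scale; it is not the package's vcov method.

```r
set.seed(7)
x <- rnorm(500, mean = 2, sd = 1.5)
n <- length(x)

# Negative log-likelihood with par = (mu, log(sigma^2))
negll <- function(par) {
  -sum(dnorm(x, mean = par[1], sd = sqrt(exp(par[2])), log = TRUE))
}

fit <- optim(c(0, 0), negll, method = "BFGS", hessian = TRUE)

# The Hessian of the negative log-likelihood is the negative Hessian of the
# log-likelihood, so its inverse is the variance-covariance matrix
V <- solve(fit$hessian)

# Delta method for the additional parameter sigma^2 = exp(par[2])
sigma2     <- exp(fit$par[2])
jac        <- c(0, sigma2)                 # derivative of sigma^2 w.r.t. par
var_sigma2 <- drop(t(jac) %*% V %*% jac)

print(sqrt(diag(V)))      # standard errors of mu and log(sigma^2)
print(sqrt(var_sigma2))   # delta-method standard error of sigma^2
```

For this toy model the results can be compared with the analytical values sigma^2 / n and 2 * sigma^4 / n.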
Value
The variance-covariance matrix of the maximum likelihood coefficients is returned.
See Also
sfacross
, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross
, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross
for sample selection in stochastic frontier
model fitting function using cross-sectional data.
Examples
## Using data on Spanish dairy farms
# Cobb Douglas (production function) half normal distribution
cb_s_h <- sfacross(formula = YIT ~ X1 + X2 + X3 + X4, udist = 'hnormal',
data = dairyspain, S = 1, method = 'bfgs')
vcov(cb_s_h)
vcov(cb_s_h, extraPar = TRUE)
# Other variance-covariance matrices can be obtained using the sandwich package
# Robust variance-covariance matrix
requireNamespace('sandwich', quietly = TRUE)
sandwich::vcovCL(cb_s_h)
# Coefficients and standard errors can be obtained using lmtest package
requireNamespace('lmtest', quietly = TRUE)
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovCL)
# Clustered standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovCL, cluster = ~ FARM)
# Doubly clustered standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovCL, cluster = ~ FARM + YEAR)
# BHHH standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovOPG)
# Adjusted BHHH standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovOPG, adjust = TRUE)
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
data = worldprod, uhet = ~ initStat, S = 1)
vcov(cb_2c_h)
Data on world production
Description
This dataset provides information on production related variables for eighty-two countries over the period 1960–1987.
Format
A data frame with 2,296 observations on the following 12 variables.
- country
Country name.
- code
Country identification.
- yr
Year identification.
- y
GDP in 1987 U.S. dollars.
- k
Physical capital stock in 1987 U.S. dollars.
- l
Labor (number of individuals in the workforce between the age of 15 and 64).
- h
Human capital-adjusted labor.
- ly
Log of y.
- lk
Log of k.
- ll
Log of l.
- lh
Log of h.
- initStat
Log of the initial capital to labor ratio of each country, lk - ll, measured at the beginning of the sample period.
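As a reading aid for initStat, the sketch below builds a variable of this kind on a toy panel: the first-year value of lk - ll is repeated for every year of the same country. The numbers are hypothetical and the construction is an assumption about how such a variable could be derived, not a record of how worldprod was assembled.

```r
# Toy panel with hypothetical numbers (not taken from worldprod)
toy <- data.frame(
  code = c(1, 1, 1, 2, 2, 2),
  yr   = c(1960, 1961, 1962, 1960, 1961, 1962),
  lk   = c(10.0, 10.2, 10.4, 8.0, 8.1, 8.3),
  ll   = c(6.0, 6.1, 6.2, 5.5, 5.6, 5.7)
)

# Sort by country and year so that the first row of each group is the first year
toy <- toy[order(toy$code, toy$yr), ]

# First-year value of lk - ll, held constant within each country
toy$initStat <- ave(toy$lk - toy$ll, toy$code, FUN = function(z) z[1])
print(toy)
```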
Details
The dataset is from the World Bank STARS database and has been used in Kumbhakar et al. (2014).
Source
https://sites.google.com/view/sfbook-stata/home
References
Kumbhakar, S.C., H.J. Wang, and A. Horncastle. 2014. A Practitioner's Guide to Stochastic Frontier Analysis Using Stata. Cambridge University Press.
Examples
str(worldprod)
summary(worldprod)