Title: | Stochastic Frontier Analysis Routines |
---|---|
Description: | Maximum likelihood estimation for stochastic frontier analysis (SFA) of production (profit) and cost functions. The package includes the basic stochastic frontier for cross-sectional or pooled data with several distributions for the one-sided error term (i.e., Rayleigh, gamma, Weibull, lognormal, uniform, generalized exponential and truncated skewed Laplace), the latent class stochastic frontier model (LCM) as described in Dakpo et al. (2021) <doi:10.1111/1477-9552.12422>, for cross-sectional and pooled data, and the sample selection model as described in Greene (2010) <doi:10.1007/s11123-009-0159-1>, and applied in Dakpo et al. (2021) <doi:10.1111/agec.12683>. Several possibilities in terms of optimization algorithms are proposed. |
Authors: | K Hervé Dakpo [aut, cre], Yann Desjeux [aut], Arne Henningsen [aut], Laure Latruffe [aut] |
Maintainer: | K Hervé Dakpo <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.1.9000 |
Built: | 2024-10-31 16:24:42 UTC |
Source: | https://github.com/hdakpo/sfar |
The sfaR package provides a set of tools (maximum likelihood, ML, and maximum simulated likelihood, MSL) for various specifications of stochastic frontier analysis (SFA).
Three categories of functions are available: sfacross, sfalcmcross, and sfaselectioncross,
which estimate different types of frontiers and offer eleven alternative
optimization algorithms (i.e., "bfgs", "bhhh", "nr", "nm", "cg", "sann",
"ucminf", "mla", "sr1", "sparse", "nlminb").
sfacross estimates the basic stochastic
frontier analysis (SFA) for cross-sectional or pooled data and allows for
ten different distributions for the one-sided error term. These distributions
include the exponential, the gamma, the generalized exponential,
the half normal, the lognormal, the truncated normal, the truncated skewed
Laplace, the Rayleigh, the uniform, and the Weibull distributions.
In the case of the gamma, lognormal, and Weibull distributions, maximum
simulated likelihood (MSL) is used with the possibility of four specific
distributions to construct the draws: halton, generalized halton, sobol and
uniform. Heteroscedasticity in both error terms can be implemented, in
addition to heterogeneity in the truncated mean parameter in the case of the
truncated normal and lognormal distributions. In addition, in the case of the
truncated normal distribution, the scaling property can be estimated.
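To give a sense of what quasi-random draws look like, the following base R sketch builds a one-dimensional Halton sequence and maps it to half normal draws. The helper name halton_draws and the value of sigma_u are hypothetical; this is not the generator used internally by sfaR, only an illustration of the idea.

```r
# Radical-inverse (van der Corput) sequence in a prime base: the
# one-dimensional building block of a Halton sequence.
halton_draws <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    f <- 1
    r <- 0
    while (i > 0) {
      f <- f / base
      r <- r + f * (i %% base)
      i <- i %/% base
    }
    r
  }, numeric(1))
}

h <- halton_draws(1000, base = 2)

# Map the uniform draws to a one-sided (half normal) variable by inversion.
sigma_u <- 0.3                       # hypothetical scale parameter
u_draws <- sigma_u * qnorm(0.5 + h / 2)

# The simulated mean should be close to the analytical mean sigma_u * sqrt(2 / pi).
c(simulated = mean(u_draws), analytical = sigma_u * sqrt(2 / pi))
```

Halton points cover the unit interval more evenly than pseudo-random uniform draws, which is why they are a common choice for simulated likelihoods.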
sfalcmcross estimates latent class
stochastic frontier models (LCM) for cross-sectional or pooled data.
It accounts for technological heterogeneity by splitting the observations
into at most five classes. Class membership is modelled with
a logit functional form that can be specified using covariates (namely,
the separating variables, which allow the observations to be separated into several
classes). Only the half normal distribution is available for the one-sided
error term. Heteroscedasticity in both error terms is possible. The choice of
the number of classes can be guided by several information criteria (i.e.,
AIC, BIC, or HQIC).
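The logit classification step can be sketched in base R as follows. The matrix of separating variables Z, the coefficient matrix theta, and the function name prior_prob are all hypothetical, and the last class is taken as the reference class by assumption; the code only illustrates how a multinomial logit turns covariates into prior class probabilities.

```r
# Hypothetical separating variables (intercept plus one covariate),
# one row per observation.
Z <- cbind(1, c(0.2, 1.5, -0.7))

# Hypothetical coefficients for classes 1 and 2; class 3 is the reference
# class, whose coefficients are normalised to zero.
theta <- cbind(c(0.5, 1.0), c(-0.3, 0.4))

prior_prob <- function(Z, theta) {
  eta <- cbind(Z %*% theta, 0)        # linear index of each class
  eta <- eta - apply(eta, 1, max)     # stabilise the exponentials row by row
  expo <- exp(eta)
  expo / rowSums(expo)                # multinomial logit probabilities
}

P <- prior_prob(Z, theta)
P                                     # one row per observation, one column per class
max.col(P)                            # class with the highest prior probability
```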
sfaselectioncross estimates the
frontier for cross-sectional or pooled data in the presence of sample
selection. The model corrects for the selection bias due to the correlation
between the two-sided error terms in the selection and the frontier
equations. The likelihood can be evaluated in five different
ways: Gauss-Kronrod quadrature, adaptive integration over hypercubes
(hcubature and pcubature), Gauss-Hermite quadrature, and
maximum simulated likelihood. Only the half normal
distribution is available for the one-sided error term. Heteroscedasticity
in both error terms is possible.
Any bug or suggestion can be reported using the sfaR tracker facilities at:
https://github.com/hdakpo/sfaR/issues
K Hervé Dakpo, Yann Desjeux, Arne Henningsen and Laure Latruffe
From an object of class 'summary.sfacross', 'summary.sfalcmcross', or 'summary.sfaselectioncross',
coef extracts the coefficients, their standard errors, z-values, and (asymptotic) P-values.
From an object of class 'sfacross', 'sfalcmcross', or 'sfaselectioncross',
it extracts only the estimated coefficients.
## S3 method for class 'sfacross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.sfacross'
coef(object, ...)
## S3 method for class 'sfalcmcross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.sfalcmcross'
coef(object, ...)
## S3 method for class 'sfaselectioncross'
coef(object, extraPar = FALSE, ...)
## S3 method for class 'summary.sfaselectioncross'
coef(object, ...)
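object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross, or an object of class 'summary.sfacross', 'summary.sfalcmcross', or 'summary.sfaselectioncross'. |
extraPar | Logical (default = FALSE). |
... | Currently ignored. |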
For objects of class 'summary.sfacross', 'summary.sfalcmcross', or 'summary.sfaselectioncross',
coef returns a matrix with four columns: the
estimated coefficients, their standard errors, z-values,
and (asymptotic) P-values.
For objects of class 'sfacross', 'sfalcmcross', or 'sfaselectioncross',
coef returns a numeric vector of
the estimated coefficients. If extraPar = TRUE, additional parameters,
detailed in the section ‘Arguments’, are also returned. In the case
of an object of class 'sfalcmcross', each additional
parameter ends with '#', which represents the class number.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
  scaling = TRUE, method = 'mla')
coef(tl_u_ts, extraPar = TRUE)
coef(summary(tl_u_ts))
## End(Not run)
This dataset contains nine years (1998-2006) of information on Norwegian dairy farms.
A data frame with 2,727 observations on the following 23 variables.
Farm identification.
Year identification.
Milk sold (1000 liters).
Meat (1000 NOK).
Support payments (1000 NOK).
Other outputs (1000 NOK).
Milk price (NOK/liter).
Meat price (cattle index).
Support payments price (CP index).
Other outputs price index.
Land (decare (daa) = 0.1 ha).
Labour (1000 hours).
Purchase feed (1000 NOK).
Other variable costs (1000 NOK).
Cattle capital (1000 NOK).
Other capital (1000 NOK).
Land price (NOK/daa).
Labour price (NOK/hour).
Feed price index.
Other variable cost index.
Cattle capital rent.
Other capital rent and depreciation.
Total cost.
https://sites.google.com/view/sfbook-stata/home
Kumbhakar, S.C., H.J. Wang, and A. Horncastle. 2014. A Practitioner's Guide to Stochastic Frontier Analysis Using Stata. Cambridge University Press.
str(dairynorway)
summary(dairynorway)
This dataset contains six years of observations (1993-1998) on 247 dairy farms in northern Spain. The original data consist of the farm and year identifications, plus measurements on one output (i.e. milk) and four inputs (i.e. cows, land, labor and feed).
A data frame with 1,482 observations on the following 29 variables.
Farm identification.
Age of the farmer.
Year identification.
Number of milking cows.
Agricultural area.
Milk production.
Labor.
Feed.
Log of MILK.
Log of COWS.
Log of LAND.
Log of LABOR.
Log of FEED.
1/2 * X1^2.
1/2 * X2^2.
1/2 * X3^2.
1/2 * X4^2.
X1 * X2.
X1 * X3.
X1 * X4.
X2 * X3.
X2 * X4.
X3 * X4.
Dummy for YEAR = 1993.
Dummy for YEAR = 1994.
Dummy for YEAR = 1995.
Dummy for YEAR = 1996.
Dummy for YEAR = 1997.
Dummy for YEAR = 1998.
This dataset has been used in Alvarez et al. (2004). The data have been normalized so that the logs of the inputs sum to zero over the 1,482 observations.
https://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm
Alvarez, A., C. Arias, and W. Greene. 2004. Accounting for unobservables in production models: management and inefficiency. Econometric Society, 341:1–20.
str(dairyspain)
summary(dairyspain)
efficiencies returns (in-)efficiency estimates of models
estimated with sfacross, sfalcmcross, or sfaselectioncross.
## S3 method for class 'sfacross'
efficiencies(object, level = 0.95, newData = NULL, ...)
## S3 method for class 'sfalcmcross'
efficiencies(object, level = 0.95, newData = NULL, ...)
## S3 method for class 'sfaselectioncross'
efficiencies(object, level = 0.95, newData = NULL, ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
level | A number between 0 and 0.9999 used for the computation of (in-)efficiency confidence intervals (default = 0.95). |
newData | Optional data frame that is used to calculate the efficiency estimates. If NULL (the default), the efficiency estimates are calculated for the observations that were used in the estimation. In the case of object of class |
... | Currently ignored. |
In general, the conditional inefficiency is obtained following Jondrow et al. (1982) and the conditional efficiency is computed following Battese and Coelli (1988). In some cases the conditional mode is also returned (Jondrow et al. 1982). The confidence interval is computed following Horrace and Schmidt (1996), Hjalmarsson et al. (1996), or Bera and Sharma (1999) (see ‘Value’ section).
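For the normal-half normal model, the Jondrow et al. (1982) and Battese and Coelli (1988) estimators have well-known closed forms, which the base R sketch below evaluates directly. The function name jlms_halfnormal, its argument names, and the numerical inputs are hypothetical; the composed error is assumed to be eps = v - S * u, with S = 1 for a production frontier and S = -1 for a cost frontier. This is a textbook illustration, not the code used by efficiencies.

```r
jlms_halfnormal <- function(eps, sigma_u, sigma_v, S = 1) {
  sigma2     <- sigma_u^2 + sigma_v^2
  mu_star    <- -S * eps * sigma_u^2 / sigma2        # conditional location
  sigma_star <- sigma_u * sigma_v / sqrt(sigma2)     # conditional scale
  ratio      <- mu_star / sigma_star

  # Jondrow et al. (1982): E[u | eps]
  u <- mu_star + sigma_star * dnorm(ratio) / pnorm(ratio)

  # Battese and Coelli (1988): E[exp(-u) | eps]
  teBC <- exp(-mu_star + sigma_star^2 / 2) *
    pnorm(ratio - sigma_star) / pnorm(ratio)

  data.frame(u = u, exp_minus_u = exp(-u), teBC = teBC)
}

# Hypothetical residuals and standard deviations
res <- jlms_halfnormal(eps = c(-0.4, 0, 0.3), sigma_u = 0.3, sigma_v = 0.2)
res
```

By Jensen's inequality the Battese and Coelli efficiency is never smaller than exp(-E[u | eps]), which is a quick sanity check on the output.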
In the case of the half normal distribution for the one-sided error term,
the formulae are as follows (for notations, see the ‘Details’ section
of sfacross or sfalcmcross):
The conditional inefficiency is:
where
and
The Battese and Coelli (1988) conditional efficiency is obtained with:
The reciprocal of the Battese and Coelli (1988) conditional efficiency is obtained with:
The conditional mode is computed using:
and
The confidence intervals are obtained with:
with and
and
and
Thus
In the case of sample selection, as noted in Greene (2010), the conditional inefficiency could be computed using Jondrow et al. (1982). However, here the conditional (in)efficiency is obtained using the properties of the closed skew-normal (CSN) distribution (Lai, 2015). The conditional efficiency can be obtained using the moment generating functions of a CSN distribution (see Gonzalez-Farias et al. (2004)). We have:
where ,
,
,
,
.
The derivation of the efficiency and the reciprocal efficiency is obtained by replacing
and
, respectively. To obtain the inefficiency as
is more complicated as it requires the
derivation of a multivariate normal cdf. We have:
Then
where
A data frame that contains individual (in-)efficiency estimates. These are ordered in the same way as the corresponding observations in the dataset used for the estimation.
- For an object of class 'sfacross', the following elements are returned:
u | Conditional inefficiency. In the case argument |
uLB | Lower bound for conditional inefficiency. Only when the argument |
uUB | Upper bound for conditional inefficiency. Only when the argument |
teJLMS | |
m | Conditional mode. Only when the argument |
teMO | |
teBC | Battese and Coelli (1988) conditional efficiency. Only when, in the function sfacross, |
teBC_reciprocal | Reciprocal of Battese and Coelli (1988) conditional efficiency. Similar to |
teBCLB | Lower bound for Battese and Coelli (1988) conditional efficiency. Only when, in the function sfacross, |
teBCUB | Upper bound for Battese and Coelli (1988) conditional efficiency. Only when, in the function sfacross, |
theta | In the case |
- For an object of class 'sfalcmcross', the following elements are returned:
Group_c | Most probable class for each observation. |
PosteriorProb_c | Highest posterior probability. |
u_c | Conditional inefficiency of the most probable class given the posterior probability. |
teJLMS_c | |
teBC_c | |
teBC_reciprocal_c | |
PosteriorProb_c# | Posterior probability of class #. |
PriorProb_c# | Prior probability of class #. |
u_c# | Conditional inefficiency associated with class #, regardless of |
teBC_c# | Conditional efficiency ( |
teBC_reciprocal_c# | Reciprocal conditional efficiency ( |
ineff_c# | Conditional inefficiency ( |
effBC_c# | Conditional efficiency ( |
ReffBC_c# | Reciprocal conditional efficiency ( |
theta_c# | In the case |
- For an object of class 'sfaselectioncross', the following elements are returned:
u | Conditional inefficiency. |
teJLMS | |
teBC | Battese and Coelli (1988) conditional efficiency. Only when, in the function sfaselectioncross, |
teBC_reciprocal | Reciprocal of Battese and Coelli (1988) conditional efficiency. Similar to |
Battese, G.E., and T.J. Coelli. 1988. Prediction of firm-level technical efficiencies with a generalized frontier production function and panel data. Journal of Econometrics, 38:387–399.
Bera, A.K., and S.C. Sharma. 1999. Estimating production uncertainty in stochastic frontier production function models. Journal of Productivity Analysis, 12:187–210.
Gonzalez-Farias, G., A. Dominguez-Molina, and A.K. Gupta. 2004. Additive properties of skew normal random vectors. Journal of Statistical Planning and Inference, 126:521–534.
Greene, W. 2010. A stochastic frontier model with correction for sample selection. Journal of Productivity Analysis, 34:15–24.
Hjalmarsson, L., S.C. Kumbhakar, and A. Heshmati. 1996. DEA, DFA and SFA: A comparison. Journal of Productivity Analysis, 7:303–327.
Horrace, W.C., and P. Schmidt. 1996. Confidence statements for efficiency estimates from stochastic frontier models. Journal of Productivity Analysis, 7:257–282.
Jondrow, J., C.A.K. Lovell, I.S. Materov, and P. Schmidt. 1982. On the estimation of technical inefficiency in the stochastic frontier production function model. Journal of Econometrics, 19:233–238.
Lai, H.P. 2015. Maximum likelihood estimation of the stochastic frontier model with endogenous switching or sample selection. Journal of Productivity Analysis, 43:105–117.
Nguyen, N.B. 2010. Estimation of technical efficiency in stochastic frontier analysis. PhD Dissertation, Bowling Green State University, August.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
  scaling = TRUE, method = 'mla')
eff.tl_u_ts <- efficiencies(tl_u_ts)
head(eff.tl_u_ts)
summary(eff.tl_u_ts)
## End(Not run)
This dataset is on electric power generation in the United States.
A data frame with 123 observations on the following 9 variables.
Firm identification.
Total cost in 1970, MM USD.
Output in million KwH.
Labor price.
Labor's cost share.
Capital price.
Capital's cost share.
Fuel price.
Fuel's cost share.
The dataset is from Christensen and Greene (1976) and has also been used in Greene (1990).
https://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm
Christensen, L.R., and W.H. Greene. 1976. Economies of scale in US electric power generation. The Journal of Political Economy, 84:655–676.
Greene, W.H. 1990. A Gamma-distributed stochastic frontier model. Journal of Econometrics, 46:141–163.
str(electricity)
summary(electricity)
Extract coefficients and additional information for stochastic frontier models
returned by sfacross, sfalcmcross, or sfaselectioncross.
extract.sfacross(model, ...)
extract.sfalcmcross(model, ...)
extract.sfaselectioncross(model, ...)
model | Objects of class 'sfacross', 'sfalcmcross', or 'sfaselectioncross'. |
... | Currently ignored. |
A texreg object representing the statistical model.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional data.
# Half normal with heteroscedasticity in u
hlf <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
# Truncated normal with heterogeneity in the truncated mean
trnorm <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, data = utility, S = -1, method = 'bfgs')
# Truncated normal with scaling property
tscal <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
  method = 'bfgs', scaling = TRUE)
# Exponential with heteroscedasticity in u
expo <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'exponential', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
# Compare the four models side by side
texreg::screenreg(list(hlf, trnorm, tscal, expo))
fitted returns the fitted frontier values from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
## S3 method for class 'sfacross'
fitted(object, ...)
## S3 method for class 'sfalcmcross'
fitted(object, ...)
## S3 method for class 'sfaselectioncross'
fitted(object, ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
... | Currently ignored. |
In the case of an object of class 'sfacross' or 'sfaselectioncross',
a vector of fitted values is returned.
In the case of an object of class 'sfalcmcross', a data frame
containing the fitted values for each class is returned, where each variable
ends with '_c#', '#' being the class number.
The fitted values are ordered in the same way as the corresponding observations in the dataset used for the estimation.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
  data = worldprod)
fit.cb_2c_h <- fitted(cb_2c_h)
head(fit.cb_2c_h)
## End(Not run)
ic returns an information criterion for stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
## S3 method for class 'sfacross'
ic(object, IC = "AIC", ...)
## S3 method for class 'sfalcmcross'
ic(object, IC = "AIC", ...)
## S3 method for class 'sfaselectioncross'
ic(object, IC = "AIC", ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
IC | Character string. Information criterion measure. Three criteria are available: 'AIC' (default), 'BIC', and 'HQIC'. |
... | Currently ignored. |
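The different information criteria are computed as follows:
AIC: $-2 \log{LL} + 2K$
BIC: $-2 \log{LL} + \log(N)\,K$
HQIC: $-2 \log{LL} + 2 \log\left[\log(N)\right] K$
where $LL$ is the maximum likelihood value, $K$ the number of parameters
estimated and $N$ the number of observations.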
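As a quick check of the three criteria, they can be computed by hand in base R from a log-likelihood value, a parameter count, and a sample size. The function name info_criteria and the numerical inputs are hypothetical, and the formulas are the standard AIC, BIC, and Hannan-Quinn definitions, assumed here to match what ic computes.

```r
info_criteria <- function(logLik, K, N) {
  c(AIC  = -2 * logLik + 2 * K,
    BIC  = -2 * logLik + log(N) * K,
    HQIC = -2 * logLik + 2 * log(log(N)) * K)
}

# Hypothetical model: log-likelihood of -100, 5 parameters, 200 observations
crit <- info_criteria(logLik = -100, K = 5, N = 200)
crit
```

With a fitted sfaR model, the same inputs could presumably be taken from logLik(object) and nobs(object), which makes the hand computation a useful cross-check against ic.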
ic returns the value of the information criterion
(AIC, BIC or HQIC) of the maximum likelihood coefficients.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on Swiss railway
# LCM (cost function) half normal distribution
cb_2c_u <- sfalcmcross(formula = LNCT ~ LNQ2 + LNQ3 + LNNET + LNPK + LNPL,
  udist = 'hnormal', uhet = ~ 1, data = swissrailways, S = -1,
  method = 'ucminf')
ic(cb_2c_u)
ic(cb_2c_u, IC = 'BIC')
ic(cb_2c_u, IC = 'HQIC')
## End(Not run)
logLik extracts the log-likelihood value(s) from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
## S3 method for class 'sfacross'
logLik(object, individual = FALSE, ...)
## S3 method for class 'sfalcmcross'
logLik(object, individual = FALSE, ...)
## S3 method for class 'sfaselectioncross'
logLik(object, individual = FALSE, ...)
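object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
individual | Logical. If TRUE, the log-likelihood of each observation is returned; if FALSE (the default), the overall log-likelihood value is returned. |
... | Currently ignored. |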
logLik returns either an object of class 'logLik', which is the
log-likelihood value with the total number of observations (nobs) and the
number of free parameters (df) as attributes, when individual = FALSE,
or a list of elements, containing the log-likelihood of each observation
(logLik), the total number of observations (Nobs) and the number of free
parameters (df), when individual = TRUE.
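The structure of a 'logLik' object, and the link between observation-level and total log-likelihood, can be seen with a plain linear model from base R. This sketch uses lm and the built-in cars dataset rather than an sfaR model, on the assumption that the sfaR methods follow the same convention for the nobs and df attributes.

```r
fit <- lm(dist ~ speed, data = cars)
ll  <- logLik(fit)

class(ll)           # "logLik"
attr(ll, "nobs")    # number of observations
attr(ll, "df")      # number of free parameters (two coefficients and sigma)

# Observation-level contributions under the ML variance estimate
res  <- residuals(fit)
ll_i <- dnorm(res, mean = 0, sd = sqrt(mean(res^2)), log = TRUE)

# The individual contributions add up to the total log-likelihood
all.equal(sum(ll_i), as.numeric(ll))
```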
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
  scaling = TRUE, method = 'mla')
logLik(tl_u_ts)
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
  data = worldprod, S = 1)
logLik(cb_2c_h, individual = TRUE)
## End(Not run)
This function returns marginal effects of the inefficiency drivers from stochastic
frontier models estimated with sfacross, sfalcmcross,
or sfaselectioncross.
## S3 method for class 'sfacross'
marginal(object, newData = NULL, ...)
## S3 method for class 'sfalcmcross'
marginal(object, newData = NULL, ...)
## S3 method for class 'sfaselectioncross'
marginal(object, newData = NULL, ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
newData | Optional data frame that is used to calculate the marginal effect of |
... | Currently ignored. |
marginal operates in the presence of exogenous
variables that explain inefficiency, namely the inefficiency drivers
(specified through the uhet or muhet arguments).
Two components are computed for each variable: the marginal effects on the
expected inefficiency ($\partial E[u]/\partial Z$) and
the marginal effects on the variance of inefficiency
($\partial V[u]/\partial Z$).
The model also allows the Wang (2002) parameterization of $\mu$ and
$\sigma_u^2$ by the same vector of exogenous variables. This double
parameterization accounts for non-monotonic relationships between the
inefficiency and its drivers.
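For intuition, the two marginal effects have simple closed forms in the half normal case when the variance of the one-sided error is modelled as sigma_u^2 = exp(Z'delta). The base R sketch below computes them for one driver and checks the first against a finite difference. The matrix Z, the coefficients delta, and the exponential variance specification are assumptions made for this illustration, not output from marginal.

```r
# Hypothetical driver values (intercept plus one driver) and coefficients
Z     <- cbind(1, c(0.5, 1.0, 2.0))
delta <- c(-2.0, 0.6)

# Half normal moments with sigma_u^2 = exp(Z %*% delta)
moments <- function(Z, delta) {
  sigma_u <- sqrt(exp(drop(Z %*% delta)))
  list(Eu = sigma_u * sqrt(2 / pi),
       Vu = sigma_u^2 * (pi - 2) / pi)
}
m <- moments(Z, delta)

# Analytical marginal effects of the driver (second column of Z)
Eu_z <- delta[2] / 2 * m$Eu      # effect on expected inefficiency
Vu_z <- delta[2] * m$Vu          # effect on the variance of inefficiency

# Finite-difference check of the first effect
h  <- 1e-6
Zh <- cbind(1, Z[, 2] + h)
Eu_fd <- (moments(Zh, delta)$Eu - m$Eu) / h

data.frame(Eu_z = Eu_z, Eu_fd = Eu_fd, Vu_z = Vu_z)
```

In this specification both effects share the sign of the coefficient, so the relationship is monotonic; the non-monotonic patterns mentioned above arise only when the truncated mean is parameterized as well.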
marginal returns a data frame containing the marginal
effects of the variables on the expected inefficiency (each
variable has the prefix 'Eu_') and on the variance of the
inefficiency (each variable has the prefix 'Vu_').
In the case of the latent class stochastic frontier (LCM), each variable
ends with '_c#' where '#' is the class number.
Wang, H.J. 2002. Heteroscedasticity and non-monotonic efficiency effects of a stochastic frontier model. Journal of Productivity Analysis, 18:241–253.
sfacross, for the stochastic frontier analysis model
fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis
model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier
model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu + wl, uhet = ~ regu + wl, data = utility,
  S = -1, scaling = TRUE, method = 'mla')
marg.tl_u_ts <- marginal(tl_u_ts)
summary(marg.tl_u_ts)
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
  data = worldprod, uhet = ~ initStat + h, S = 1, method = 'mla')
marg.cb_2c_h <- marginal(cb_2c_h)
summary(marg.cb_2c_h)
## End(Not run)
This function extracts the total number of 'observations' from a fitted frontier model.
## S3 method for class 'sfacross'
nobs(object, ...)
## S3 method for class 'sfalcmcross'
nobs(object, ...)
## S3 method for class 'sfaselectioncross'
nobs(object, ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
---|---|
... | Currently ignored. |
nobs gives the number of observations actually used by the estimation procedure. It is not necessarily the number of observations of the model frame (number of rows in the model frame), because sometimes the model frame is further reduced by the estimation procedure, especially in the presence of NA. In the case of sfaselectioncross, nobs returns the number of observations used in the frontier equation.
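The difference between the rows supplied and the rows used can be illustrated with a minimal sketch. It uses base R's lm as a stand-in (an assumption made so the example runs without sfaR), and the toy data frame d is hypothetical; the same comparison applies to nobs on a fitted frontier model.

```r
# Hypothetical toy data with one missing response value
d <- data.frame(y = c(1.2, 2.3, NA, 4.1, 5.0),
                x = c(1, 2, 3, 4, 5))

# lm() is only a stand-in here: like most fitting functions,
# it drops incomplete rows before estimation
fit <- lm(y ~ x, data = d)

n_rows <- nrow(d)    # rows supplied by the user
n_used <- nobs(fit)  # rows actually used by the estimation
```

Comparing n_rows with n_used is a quick way to detect observations silently dropped because of NA.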
A single number, normally an integer.
sfacross, for the stochastic frontier analysis model fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog (cost function) half normal with heteroscedasticity
tl_u_h <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
    log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
    I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
    udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
nobs(tl_u_h)
## End(Not run)
This function returns the residuals from stochastic frontier models estimated with sfacross, sfalcmcross, or sfaselectioncross.
## S3 method for class 'sfacross'
residuals(object, ...)
## S3 method for class 'sfalcmcross'
residuals(object, ...)
## S3 method for class 'sfaselectioncross'
residuals(object, ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
---|---|
... | Currently ignored. |
When the object is of class 'sfacross' or 'sfaselectioncross', residuals returns a vector of residual values.
When the object is of class 'sfalcmcross', residuals returns a data frame containing the residual values for each latent class, where each variable name ends with '_c#', '#' being the class number.
The residual values are ordered in the same way as the corresponding observations in the dataset used for the estimation.
sfacross, for the stochastic frontier analysis model fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis model fitting function using cross-sectional or pooled data.
sfaselectioncross, for sample selection in stochastic frontier model fitting function using cross-sectional or pooled data.
## Not run:
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
    log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
    I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
    udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
    scaling = TRUE, method = 'mla')
resid.tl_u_ts <- residuals(tl_u_ts)
head(resid.tl_u_ts)

## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
    data = worldprod, S = 1)
resid.cb_2c_h <- residuals(cb_2c_h)
head(resid.cb_2c_h)
## End(Not run)
This dataset contains annual data collected from 43 smallholder rice producers in the Tarlac region of the Philippines between 1990 and 1997.
A data frame with 344 observations on the following 17 variables.
Time period (1 = 1990, ..., 8 = 1997).
Farmer code (1, ..., 43).
Output (tonnes of freshly threshed rice).
Area planted (hectares).
Labor used (man-days of family and hired labor).
Fertiliser used (kg of active ingredients).
Other inputs used (Laspeyres index = 100 for Farm 17 in 1991).
Output price (pesos per kg).
Rental price of land (pesos per hectare).
Labor price (pesos per hired man-day).
Fertiliser price (pesos per kg of active ingredient).
Price of other inputs (implicit price index).
Age of the household head (years).
Education of the household head (years).
Household size.
Number of adults in the household.
Percentage of area classified as bantog (upland) fields.
This dataset is published as a supplement to Coelli et al. (2005). While most variables of this dataset were supplied by the International Rice Research Institute (IRRI), some were calculated by Coelli et al. (2005, see pp. 325–326). The survey is described in Pandey et al. (1999).
Coelli, T. J., Rao, D. S. P., O'Donnell, C. J., and Battese, G. E. 2005. An Introduction to Efficiency and Productivity Analysis, Springer, New York.
Pandey, S., Masciat, P., Velasco, L., and Villano, R. 1999. Risk analysis of a rainfed rice production system in Tarlac, Central Luzon, Philippines. Experimental Agriculture, 35:225–237.
str(ricephil)
summary(ricephil)
sfacross is a symbolic formula-based function for the estimation of stochastic frontier models in the case of cross-sectional or pooled cross-sectional data, using maximum (simulated) likelihood - M(S)L.
The function accounts for heteroscedasticity in both one-sided and two-sided error terms as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999), but also heterogeneity in the mean of the pre-truncated distribution as in Kumbhakar et al. (1991), Huang and Liu (1994) and Battese and Coelli (1995).
Ten distributions are possible for the one-sided error term and eleven optimization algorithms are available.
The truncated normal - normal distribution with scaling property as in Wang and Schmidt (2002) is also implemented.
sfacross(
  formula,
  muhet,
  uhet,
  vhet,
  logDepVar = TRUE,
  data,
  subset,
  weights,
  wscale = TRUE,
  S = 1L,
  udist = "hnormal",
  scaling = FALSE,
  start = NULL,
  method = "bfgs",
  hessianType = 1L,
  simType = "halton",
  Nsim = 100,
  prime = 2L,
  burn = 10,
  antithetics = FALSE,
  seed = 12345,
  itermax = 2000,
  printInfo = FALSE,
  tol = 1e-12,
  gradtol = 1e-06,
  stepmax = 0.1,
  qac = "marquardt"
)

## S3 method for class 'sfacross'
print(x, ...)
## S3 method for class 'sfacross'
bread(x, ...)
## S3 method for class 'sfacross'
estfun(x, ...)
formula | A symbolic description of the model to be estimated based on the generic function formula. |
---|---|
muhet | A one-part formula to consider heterogeneity in the mean of the pre-truncated distribution (see section ‘Details’). |
uhet | A one-part formula to consider heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet | A one-part formula to consider heteroscedasticity in the two-sided error variance (see section ‘Details’). |
logDepVar | Logical. Informs whether the dependent variable is logged (TRUE) or not (FALSE). Default = TRUE. |
data | The data frame containing the data. |
subset | An optional vector specifying a subset of observations to be used in the optimization process. |
weights | An optional vector of weights to be used for weighted log-likelihood. Should be |
wscale | Logical. When weights is specified and wscale = TRUE, the weights are scaled (see section ‘Details’). Default = TRUE. |
S | If S = 1 (default), a production (profit) frontier is estimated; if S = -1, a cost frontier is estimated. |
udist | Character string. Default = "hnormal". |
scaling | Logical. Only when udist = "tnormal" and scaling = TRUE is the scaling property model estimated. Default = FALSE. |
start | Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
method | Optimization algorithm used for the estimation. Default = "bfgs". |
hessianType | Integer. If |
simType | Character string. If |
Nsim | Number of draws for MSL. Default = 100. |
prime | Prime number considered for Halton and Generalized-Halton draws. Default = 2. |
burn | Number of the first observations discarded in the case of Halton draws. Default = 10. |
antithetics | Logical. Default = FALSE. |
seed | Numeric. Seed for the random draws. |
itermax | Maximum number of iterations allowed for optimization. Default = 2000. |
printInfo | Logical. Print information during optimization. Default = FALSE. |
tol | Numeric. Convergence tolerance. Default = 1e-12. |
gradtol | Numeric. Convergence tolerance for gradient. Default = 1e-06. |
stepmax | Numeric. Step max for |
qac | Character. Quadratic Approximation Correction for |
x | An object of class sfacross (returned by the function sfacross). |
... | Additional arguments of frontier are passed to sfacross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
The stochastic frontier model for the cross-sectional data is defined as:
$$y_i = \alpha + \mathbf{x}_i^{\prime}\boldsymbol{\beta} + v_i - S u_i$$
with
$$\epsilon_i = v_i - S u_i$$
where $i$ is the observation, $y$ is the output (cost, revenue, profit), $\mathbf{x}$ is the vector of main explanatory variables (inputs and other control variables), $u$ is the one-sided error term with variance $\sigma_u^2$, and $v$ is the two-sided error term with variance $\sigma_v^2$.
S = 1 in the case of production (profit) frontier function and S = -1 in the case of cost frontier function.
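To make the role of S concrete, the following is a minimal simulation sketch in base R. The sample size, coefficient values and standard deviations are hypothetical choices made only for illustration, and a half normal inefficiency term is assumed.

```r
set.seed(1)
n <- 5000
S <- 1                          # 1 = production/profit frontier, -1 = cost frontier

x <- rnorm(n)                   # single explanatory variable (hypothetical)
v <- rnorm(n, sd = 0.3)         # two-sided noise term
u <- abs(rnorm(n, sd = 0.5))    # one-sided (half normal) inefficiency term

eps <- v - S * u                # composed error
y <- 1 + 0.6 * x + eps          # frontier with hypothetical alpha = 1, beta = 0.6
```

With S = 1 the inefficiency term pulls observations below the frontier, so the composed error has a negative mean; with S = -1 it pushes costs above the frontier.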
The model is estimated using maximum likelihood (ML) for most distributions, except the gamma, Weibull and lognormal distributions, for which maximum simulated likelihood (MSL) is used. For the latter, several types of draws can be used, namely Halton, Generalized Halton, Sobol and uniform. In the case of uniform draws, antithetics can also be computed: first Nsim/2 draws are obtained, then the Nsim/2 other draws are obtained as counterpart of one (1-draw).
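The antithetic construction described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the package's internal code; the variable names are hypothetical and Nsim is assumed to be even.

```r
set.seed(12345)
Nsim <- 100                  # total number of draws (assumed even)

half  <- runif(Nsim / 2)     # first Nsim/2 uniform draws
draws <- c(half, 1 - half)   # remaining Nsim/2 draws: antithetic counterparts 1 - draw
```

Because each draw is paired with its mirror image around 0.5, the sample mean of the draws equals 0.5 by construction, which tends to reduce simulation noise.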
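To account for heteroscedasticity in the variance parameters of the error terms, a single part (right) formula can also be specified. To impose positivity on these parameters, the variances are modelled as: $\sigma_u^2 = \exp(\boldsymbol{\delta}^{\prime}\mathbf{Z}_u)$ or $\sigma_v^2 = \exp(\boldsymbol{\phi}^{\prime}\mathbf{Z}_v)$, where $\mathbf{Z}_u$ and $\mathbf{Z}_v$ are the heteroscedasticity variables (inefficiency drivers in the case of $\mathbf{Z}_u$) and $\boldsymbol{\delta}$ and $\boldsymbol{\phi}$ the coefficients. In the case of heterogeneity in the truncated mean $\mu$, it is modelled as $\mu = \boldsymbol{\omega}^{\prime}\mathbf{Z}_{\mu}$. The scaling property can be applied for the truncated normal distribution: $u \sim h(\mathbf{Z}_u, \boldsymbol{\delta})\,u$ where $u$ follows a truncated normal distribution $N^{+}(\tau, \exp(cu))$.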
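As a small numerical sketch of the exponential link for the variances and the linear specification for the pre-truncation mean, the following base R code uses hypothetical design matrices and coefficient values; none of the names below come from the package.

```r
# Hypothetical heteroscedasticity variables (first column = intercept)
Zu <- cbind(1, regu = c(0, 1, 0, 1))
Zv <- cbind(1, size = c(0.1, 0.4, 0.2, 0.3))

# Hypothetical coefficients
delta <- c(-2, 0.5)   # for the one-sided error variance
phi   <- c(-3, 1)     # for the two-sided error variance
omega <- c(0.2, -0.1) # for the pre-truncation mean

# The exponential link keeps both variances strictly positive
sigma2_u <- exp(drop(Zu %*% delta))
sigma2_v <- exp(drop(Zv %*% phi))

# The pre-truncation mean is linear in its covariates (here the same as Zu)
mu <- drop(Zu %*% omega)
```

The coefficients can take any real value while the implied variances remain positive, which is why no constraint is needed during optimization.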
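In the case of the truncated normal distribution, the convolution of $u_i$ and $v_i$ is:
$$f(\epsilon_i) = \frac{1}{\sqrt{\sigma_u^2+\sigma_v^2}}\,\phi\!\left(\frac{S\epsilon_i+\mu}{\sqrt{\sigma_u^2+\sigma_v^2}}\right)\Phi\!\left(\frac{\mu_{i*}}{\sigma_*}\right)\Big/\Phi\!\left(\frac{\mu}{\sigma_u}\right)$$
where
$$\mu_{i*} = \frac{\mu\sigma_v^2 - S\epsilon_i\sigma_u^2}{\sigma_u^2+\sigma_v^2}$$
and
$$\sigma_*^2 = \frac{\sigma_u^2\sigma_v^2}{\sigma_u^2+\sigma_v^2}$$
In the case of the half normal distribution the convolution is obtained by setting $\mu = 0$.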
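The convolution can be checked numerically. The sketch below codes the standard normal-truncated normal density of the composed error (the textbook form, assumed here to match the package's specification) and verifies that it integrates to one; the function name dcomposed and the parameter values are hypothetical.

```r
# Density of eps = v - S*u with u ~ N+(mu, sigma_u^2) and v ~ N(0, sigma_v^2)
# (standard normal-truncated normal convolution; mu = 0 gives the half normal case)
dcomposed <- function(eps, mu = 0, sigma_u, sigma_v, S = 1) {
  s2         <- sigma_u^2 + sigma_v^2
  mu_star    <- (mu * sigma_v^2 - S * eps * sigma_u^2) / s2
  sigma_star <- sqrt(sigma_u^2 * sigma_v^2 / s2)
  dnorm((S * eps + mu) / sqrt(s2)) / sqrt(s2) *
    pnorm(mu_star / sigma_star) / pnorm(mu / sigma_u)
}

# Both densities should integrate to (approximately) one
area_tn <- integrate(dcomposed, -Inf, Inf,
                     mu = 0.3, sigma_u = 0.5, sigma_v = 0.3)$value
area_hn <- integrate(dcomposed, -Inf, Inf,
                     sigma_u = 0.5, sigma_v = 0.3)$value
```

Summing log(dcomposed(...)) over the residuals gives the log-likelihood that an optimizer would maximize under these assumptions.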
sfacross allows for the maximization of weighted log-likelihood. When option weights is specified and wscale = TRUE, the weights are scaled as:
$$new\_weights = sample\_size \times \frac{old\_weights}{\sum(old\_weights)}$$
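A minimal sketch of this rescaling, assuming the weights are normalized so that they sum to the sample size; the weight values are hypothetical.

```r
old_weights <- c(2, 1, 1, 4)   # hypothetical raw weights

# Rescale so that the weights sum to the number of observations
new_weights <- length(old_weights) * old_weights / sum(old_weights)
```

Under this assumption the relative importance of each observation is preserved while the total weight equals the sample size, which keeps the log-likelihood on the same scale as the unweighted case.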
For complex problems, non-gradient methods (e.g. nm or sann) can be used to warm start the optimization and zoom in on the neighborhood of the solution. A gradient-based method is then recommended in the second step. In the case of sann, we recommend significantly increasing the iteration limit (e.g. itermax = 20000). The Conjugate Gradient (cg) can also be used in the first stage.
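The two-step strategy can be sketched with base R's optim as a stand-in for the package's optimizers (an assumption made so the example is self-contained). The data are simulated, and negll is a hypothetical hand-written negative log-likelihood of the normal-half normal production frontier.

```r
set.seed(42)
n <- 1000
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3) - abs(rnorm(n, sd = 0.6))

# Negative log-likelihood of the normal-half normal production frontier (S = 1);
# variances are parameterized on the log scale to keep them positive
negll <- function(p) {
  eps <- y - p[1] - p[2] * x
  s2u <- exp(p[3])
  s2v <- exp(p[4])
  s   <- sqrt(s2u + s2v)
  lam <- sqrt(s2u / s2v)
  -sum(log(2) - log(s) + dnorm(eps / s, log = TRUE) +
         pnorm(-eps * lam / s, log.p = TRUE))
}

# OLS coefficients plus small variances as crude starting values
start <- c(coef(lm(y ~ x)), log(0.1), log(0.1))

# Step 1: derivative-free warm start (Nelder-Mead)
step1 <- optim(start, negll, method = "Nelder-Mead",
               control = list(maxit = 200))

# Step 2: gradient-based refinement started from the step 1 solution
step2 <- optim(step1$par, negll, method = "BFGS")
```

In the package itself, the analogous workflow would be to pass the estimates of a first fit through the start argument of a second call; the exact extraction step is left out here because it depends on the fitted object.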
A set of extractor functions for fitted model objects is available for objects of class 'sfacross' including methods to the generic functions print, summary, coef, fitted, logLik, residuals, vcov, efficiencies, ic, marginal, skewnessTest, estfun and bread (from the sandwich package), lmtest::coeftest() (from the lmtest package).
sfacross returns a list of class 'sfacross' containing the following elements:
call | The matched call. |
---|---|
formula | The estimated model. |
S | The argument S. |
typeSfa | Character string. 'Stochastic Production/Profit Frontier, e = v - u' when S = 1 and 'Stochastic Cost Frontier, e = v + u' when S = -1. |
Nobs | Number of observations used for optimization. |
nXvar | Number of explanatory variables in the production or cost frontier. |
nmuZUvar | Number of variables explaining heterogeneity in the truncated mean, only if |
scaling | The argument scaling. |
logDepVar | The argument logDepVar. |
nuZUvar | Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar | Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm | Total number of parameters estimated. |
udist | The argument udist. |
startVal | Numeric vector. Starting value for M(S)L estimation. |
dataTable | A data frame (tibble format) containing information on data used for optimization along with residuals and fitted values of the OLS and M(S)L estimations, and the individual observation log-likelihood. When |
olsParam | Numeric vector. OLS estimates. |
olsStder | Numeric vector. Standard errors of OLS estimates. |
olsSigmasq | Numeric. Estimated variance of OLS random error. |
olsLoglik | Numeric. Log-likelihood value of OLS estimation. |
olsSkew | Numeric. Skewness of the residuals of the OLS estimation. |
olsM3Okay | Logical. Indicating whether the residuals of the OLS estimation have the expected skewness. |
CoelliM3Test | Coelli's test for OLS residuals skewness. (See Coelli, 1995). |
AgostinoTest | D'Agostino's test for OLS residuals skewness. (See D'Agostino and Pearson, 1973). |
isWeights | Logical. If |
optType | Optimization algorithm used. |
nIter | Number of iterations of the ML estimation. |
optStatus | Optimization algorithm termination message. |
startLoglik | Log-likelihood at the starting values. |
mlLoglik | Log-likelihood value of the M(S)L estimation. |
mlParam | Parameters obtained from M(S)L estimation. |
gradient | Each variable gradient of the M(S)L estimation. |
gradL_OBS | Matrix. Each variable individual observation gradient of the M(S)L estimation. |
gradientNorm | Gradient norm of the M(S)L estimation. |
invHessian | Covariance matrix of the parameters obtained from the M(S)L estimation. |
hessianType | The argument hessianType. |
mlDate | Date and time of the estimated model. |
simDist | The argument |
Nsim | The argument Nsim. |
FiMat | Matrix of random draws used for MSL, only if |
For the Halton draws, the code is adapted from the mlogit package.
Aigner, D., Lovell, C. A. K., and Schmidt, P. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21–37.
Battese, G. E., and Coelli, T. J. 1995. A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empirical Economics, 20(2), 325–332.
Caudill, S. B., and Ford, J. M. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and Gropper, D. M. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and E.S. Pearson. 1973. Tests for departure from normality. Empirical results for the distributions of $b_2$ and $\sqrt{b_1}$. Biometrika, 60:613–622.
Greene, W. H. 2003. Simulated likelihood estimation of the normal-Gamma stochastic frontier function. Journal of Productivity Analysis, 19(2-3), 179–190.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Hajargasht, G. 2015. Stochastic frontiers with a Rayleigh distribution. Journal of Productivity Analysis, 44(2), 199–208.
Huang, C. J., and Liu, J.-T. 1994. Estimation of a non-neutral stochastic frontier production function. Journal of Productivity Analysis, 5(2), 171–180.
Kumbhakar, S. C., Ghosh, S., and McGuckin, J. T. 1991. A generalized production frontier approach for estimating determinants of inefficiency in U.S. dairy farms. Journal of Business & Economic Statistics, 9(3), 279–286.
Li, Q. 1996. Estimating a stochastic production frontier when the adjusted error is symmetric. Economics Letters, 52(3), 221–228.
Meeusen, W., and Vandenbroeck, J. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–445.
Migon, H. S., and Medici, E. V. 2001. Bayesian hierarchical models for stochastic production frontier. Lacea, Montevideo, Uruguay.
Nguyen, N. B. 2010. Estimation of technical efficiency in stochastic frontier analysis. PhD dissertation, Bowling Green State University, August.
Papadopoulos, A. 2021. Stochastic frontier models using the generalized exponential distribution. Journal of Productivity Analysis, 55:15–29.
Reifschneider, D., and Stevenson, R. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
Stevenson, R. E. 1980. Likelihood Functions for Generalized Stochastic Frontier Estimation. Journal of Econometrics, 13(1), 57–66.
Tsionas, E. G. 2007. Efficiency measurement with the Weibull stochastic frontier. Oxford Bulletin of Economics and Statistics, 69(5), 693–706.
Wang, K., and Ye, X. 2020. Development of alternative stochastic frontier models for estimating time-space prism vertices. Transportation.
Wang, H.J., and Schmidt, P. 2002. One-step and two-step estimation of the effects of exogenous variables on technical efficiency levels. Journal of Productivity Analysis, 18:129–144.
Wang, J. 2012. A normal truncated skewed-Laplace model in stochastic frontier analysis. Master thesis, Western Kentucky University, May.
print for printing sfacross object.
summary for creating and printing summary results.
coef for extracting coefficients of the estimation.
efficiencies for computing (in-)efficiency estimates.
fitted for extracting the fitted frontier values.
ic for extracting information criteria.
logLik for extracting log-likelihood value(s) of the estimation.
marginal for computing marginal effects of inefficiency drivers.
residuals for extracting residuals of the estimation.
skewnessTest for conducting residuals skewness test.
vcov for computing the variance-covariance matrix of the coefficients.
bread for bread for sandwich estimator.
estfun for gradient extraction for each observation.
skewnessTest for implementing skewness test.
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog (cost function) half normal with heteroscedasticity
tl_u_h <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
    log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
    I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
    udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
summary(tl_u_h)

# Translog (cost function) truncated normal with heteroscedasticity
tl_u_t <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
    log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
    I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
    udist = 'tnormal', muhet = ~ regu, data = utility, S = -1, method = 'bhhh')
summary(tl_u_t)

# Translog (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
    log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
    I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
    udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
    scaling = TRUE, method = 'mla')
summary(tl_u_ts)

## Using data on Philippine rice producers
# Cobb Douglas (production function) generalized exponential, and Weibull
# distributions
cb_p_ge <- sfacross(formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK) +
    log(OTHER), udist = 'genexponential', data = ricephil, S = 1, method = 'bfgs')
summary(cb_p_ge)

## Using data on U.S. electric utility industry
# Cost frontier Gamma distribution
tl_u_g <- sfacross(formula = log(cost/fprice) ~ log(output) + I(log(output)^2) +
    I(log(lprice/fprice)) + I(log(cprice/fprice)), udist = 'gamma', uhet = ~ 1,
    data = electricity, S = -1, method = 'bfgs', simType = 'halton', Nsim = 200,
    hessianType = 2)
summary(tl_u_g)
sfalcmcross is a symbolic formula-based function for the estimation of the latent class stochastic frontier model (LCM) in the case of cross-sectional or pooled cross-sectional data. The model is estimated using maximum likelihood (ML). See Orea and Kumbhakar (2004), Parmeter and Kumbhakar (2014, p. 282).
Only the half-normal distribution is possible for the one-sided error term. Eleven optimization algorithms are available.
The function also accounts for heteroscedasticity in both one-sided and two-sided error terms, as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999).
The model can estimate up to five classes.
sfalcmcross(
  formula,
  uhet,
  vhet,
  thet,
  logDepVar = TRUE,
  data,
  subset,
  weights,
  wscale = TRUE,
  S = 1L,
  udist = "hnormal",
  start = NULL,
  whichStart = 2L,
  initAlg = "nm",
  initIter = 100,
  lcmClasses = 2,
  method = "bfgs",
  hessianType = 1,
  itermax = 2000L,
  printInfo = FALSE,
  tol = 1e-12,
  gradtol = 1e-06,
  stepmax = 0.1,
  qac = "marquardt"
)

## S3 method for class 'sfalcmcross'
print(x, ...)
## S3 method for class 'sfalcmcross'
bread(x, ...)
## S3 method for class 'sfalcmcross'
estfun(x, ...)
formula | A symbolic description of the model to be estimated based on the generic function formula. |
---|---|
uhet | A one-part formula to account for heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet | A one-part formula to account for heteroscedasticity in the two-sided error variance (see section ‘Details’). |
thet | A one-part formula to account for technological heterogeneity in the construction of the classes. |
logDepVar | Logical. Informs whether the dependent variable is logged (TRUE) or not (FALSE). Default = TRUE. |
data | The data frame containing the data. |
subset | An optional vector specifying a subset of observations to be used in the optimization process. |
weights | An optional vector of weights to be used for weighted log-likelihood. Should be |
wscale | Logical. When weights is specified and wscale = TRUE, the weights are scaled (see section ‘Details’). Default = TRUE. |
S | If S = 1 (default), a production (profit) frontier is estimated; if S = -1, a cost frontier is estimated. |
udist | Character string. Distribution specification for the one-sided error term. Only the half normal distribution "hnormal" is supported. |
start | Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
whichStart | Integer. If |
initAlg | Character string specifying the algorithm used for initialization and obtain the starting values (when |
initIter | Maximum number of iterations for initialization algorithm. Default = 100. |
lcmClasses | Number of classes to be estimated (default = 2). |
method | Optimization algorithm used for the estimation. Default = "bfgs". |
hessianType | Integer. If |
itermax | Maximum number of iterations allowed for optimization. Default = 2000. |
printInfo | Logical. Print information during optimization. Default = FALSE. |
tol | Numeric. Convergence tolerance. Default = 1e-12. |
gradtol | Numeric. Convergence tolerance for gradient. Default = 1e-06. |
stepmax | Numeric. Step max for |
qac | Character. Quadratic Approximation Correction for |
x | An object of class sfalcmcross (returned by the function sfalcmcross). |
... | Additional arguments of frontier are passed to sfalcmcross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
LCM is an estimation of a finite mixture of production functions:
$$y_i = \alpha_j + \mathbf{x}_i^{\prime}\boldsymbol{\beta}_j + v_{i|j} - S u_{i|j}$$
$$\epsilon_{i|j} = v_{i|j} - S u_{i|j}$$
where $i$ is the observation, $j$ is the class, $y$ is the output (cost, revenue, profit), $\mathbf{x}$ is the vector of main explanatory variables (inputs and other control variables), $u_{i|j}$ is the one-sided error term with variance $\sigma_{u|j}^2$, and $v_{i|j}$ is the two-sided error term with variance $\sigma_{v|j}^2$.
S = 1 in the case of production (profit) frontier function and S = -1 in the case of cost frontier function.
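The contribution of observation $i$ to the likelihood conditional on class $j$ is defined as:
$$P(i|j) = \frac{2}{\sqrt{\sigma_{u|j}^2+\sigma_{v|j}^2}}\,\phi\!\left(\frac{S\epsilon_{i|j}}{\sqrt{\sigma_{u|j}^2+\sigma_{v|j}^2}}\right)\Phi\!\left(\frac{\mu_{i*|j}}{\sigma_{*|j}}\right)$$
where
$$\mu_{i*|j} = \frac{-S\epsilon_{i|j}\sigma_{u|j}^2}{\sigma_{u|j}^2+\sigma_{v|j}^2}$$
and
$$\sigma_{*|j}^2 = \frac{\sigma_{u|j}^2\sigma_{v|j}^2}{\sigma_{u|j}^2+\sigma_{v|j}^2}$$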
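The prior probability of using a particular technology can depend on some covariates (namely the variables separating the observations into classes) using a logit specification:
$$\pi(i,j) = \frac{\exp(\boldsymbol{\theta}_j^{\prime}\mathbf{Z}_{hi})}{\sum_{m=1}^{J}\exp(\boldsymbol{\theta}_m^{\prime}\mathbf{Z}_{hi})}$$
with $\mathbf{Z}_h$ the covariates, $\boldsymbol{\theta}$ the coefficients estimated for the covariates, and $\exp(\boldsymbol{\theta}_J^{\prime}\mathbf{Z}_h) = 1$.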
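The logit specification of the prior class probabilities can be sketched as follows. The covariate values and coefficients are hypothetical, and the last class is taken as the reference class (its coefficients are set to zero, so its exponential term equals one).

```r
# Hypothetical separating variables for 4 observations (first column = intercept)
Zh <- cbind(1, initStat = c(-1, 0, 0.5, 2))

# Hypothetical coefficients for J = 3 classes, one column per class;
# the last class is the reference, so its coefficients are zero
theta <- cbind(c(0.5, 1), c(-0.2, 0.3), c(0, 0))

num   <- exp(Zh %*% theta)    # n x J matrix of exp(theta_j' Z_h)
prior <- num / rowSums(num)   # prior probability of each class, by observation
```

Each row of prior sums to one, and only J - 1 sets of coefficients need to be estimated.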
The unconditional likelihood of observation $i$ is simply the average, weighted by the prior probabilities, over the $J$ classes:
$$P(i) = \sum_{m=1}^{J}\pi(i,m)\,P(i|m)$$
The number of classes to retain can be based on information criteria (see for instance ic).
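As an illustration of choosing the number of classes, the sketch below computes AIC and BIC by hand from the textbook formulas. The log-likelihood values, parameter counts and sample size are hypothetical numbers, not results from any dataset; in practice they would come from fitted models with different values of lcmClasses.

```r
# Hypothetical log-likelihoods and parameter counts for 2-, 3- and 4-class models
loglik <- c(`2` = -310.4, `3` = -298.7, `4` = -296.9)
npar   <- c(13, 20, 27)
n_obs  <- 500

aic <- -2 * loglik + 2 * npar            # Akaike information criterion
bic <- -2 * loglik + log(n_obs) * npar   # Bayesian information criterion

best_aic <- names(which.min(aic))        # class count preferred by AIC
best_bic <- names(which.min(bic))        # class count preferred by BIC
```

With these made-up numbers the two criteria disagree, which is common: BIC penalizes the additional parameters of each extra class more heavily than AIC.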
Class assignment is based on the largest posterior probability. This probability is obtained using Bayes' rule, as follows for class $j$:
$$w(j|i) = \frac{P(i|j)\,\pi(i,j)}{\sum_{m=1}^{J}P(i|m)\,\pi(i,m)}$$
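The posterior probabilities and the resulting class assignment can be sketched in base R. The prior probabilities and conditional likelihood values below are hypothetical inputs for three observations and two classes.

```r
# Hypothetical prior class probabilities pi(i, j), one row per observation
prior <- matrix(c(0.6, 0.4,
                  0.5, 0.5,
                  0.2, 0.8), ncol = 2, byrow = TRUE)

# Hypothetical conditional likelihood contributions P(i | j)
lik <- matrix(c(0.9, 0.1,
                0.3, 0.2,
                0.2, 0.4), ncol = 2, byrow = TRUE)

joint <- prior * lik          # pi(i, j) * P(i | j)
P_i   <- rowSums(joint)       # unconditional likelihood of each observation
post  <- joint / P_i          # posterior probabilities via Bayes' rule

# Assign each observation to the class with the largest posterior probability
assigned <- max.col(post, ties.method = "first")
```

The sum of log(P_i) over observations is the log-likelihood of the mixture, while post is only needed after estimation to classify observations.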
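To accommodate heteroscedasticity in the variance parameters of the error terms, a single part (right) formula can also be specified. To impose positivity on these parameters, the variances are modelled respectively as: $\sigma_{u|j}^2 = \exp(\boldsymbol{\delta}_j^{\prime}\mathbf{Z}_u)$ and $\sigma_{v|j}^2 = \exp(\boldsymbol{\phi}_j^{\prime}\mathbf{Z}_v)$, where $\mathbf{Z}_u$ and $\mathbf{Z}_v$ are the heteroscedasticity variables (inefficiency drivers in the case of $\mathbf{Z}_u$) and $\boldsymbol{\delta}$ and $\boldsymbol{\phi}$ the coefficients.
'sfalcmcross' only supports the half-normal distribution for the one-sided error term.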
sfalcmcross allows for the maximization of weighted log-likelihood. When option weights is specified and wscale = TRUE, the weights are scaled as:
$$new\_weights = sample\_size \times \frac{old\_weights}{\sum(old\_weights)}$$
For complex problems, non-gradient methods (e.g. nm or sann) can be used to warm start the optimization and zoom in on the neighborhood of the solution. A gradient-based method is then recommended in the second step. In the case of sann, we recommend significantly increasing the iteration limit (e.g. itermax = 20000). The Conjugate Gradient (cg) can also be used in the first stage.
A set of extractor functions for fitted model objects is available for objects of class 'sfalcmcross' including methods to the generic functions print, summary, coef, fitted, logLik, residuals, vcov, efficiencies, ic, marginal, estfun and bread (from the sandwich package), lmtest::coeftest() (from the lmtest package).
sfalcmcross returns a list of class 'sfalcmcross' containing the following elements:
call | The matched call. |
---|---|
formula | Multi parts formula describing the estimated model. |
S | The argument S. |
typeSfa | Character string. 'Latent Class Production/Profit Frontier, e = v - u' when S = 1 and 'Latent Class Cost Frontier, e = v + u' when S = -1. |
Nobs | Number of observations used for optimization. |
nXvar | Number of main explanatory variables. |
nZHvar | Number of variables in the logit specification of the finite mixture model (i.e. number of covariates). |
logDepVar | The argument logDepVar. |
nuZUvar | Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar | Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm | Total number of parameters estimated. |
udist | The argument udist. |
startVal | Numeric vector. Starting value for ML estimation. |
dataTable | A data frame (tibble format) containing information on data used for optimization along with residuals and fitted values of the OLS and ML estimations, and the individual observation log-likelihood. When |
initHalf | When |
isWeights | Logical. If |
optType | The optimization algorithm used. |
nIter | Number of iterations of the ML estimation. |
optStatus | An optimization algorithm termination message. |
startLoglik | Log-likelihood at the starting values. |
nClasses | The number of classes estimated. |
mlLoglik | Log-likelihood value of the ML estimation. |
mlParam | Numeric vector. Parameters obtained from ML estimation. |
mlParamMatrix | Double. Matrix of ML parameters by class. |
gradient | Numeric vector. Each variable gradient of the ML estimation. |
gradL_OBS | Matrix. Each variable individual observation gradient of the ML estimation. |
gradientNorm | Numeric. Gradient norm of the ML estimation. |
invHessian | The covariance matrix of the parameters obtained from the ML estimation. |
hessianType | The argument hessianType. |
mlDate | Date and time of the estimated model. |
In the case of panel data, sfalcmcross estimates a pooled cross-section where the probability of belonging to a class a priori is not permanent (not fixed over time).
Aigner, D., Lovell, C. A. K., and P. Schmidt. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21–37.
Caudill, S. B., and J. M. Ford. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and D. M. Gropper. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Meeusen, W., and J. Vandenbroeck. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–445.
Orea, L., and S.C. Kumbhakar. 2004. Efficiency measurement using a latent class stochastic frontier model. Empirical Economics, 29, 169–183.
Parmeter, C.F., and S.C. Kumbhakar. 2014. Efficiency analysis: A primer on recent advances. Foundations and Trends in Econometrics, 7, 191–385.
Reifschneider, D., and R. Stevenson. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
print for printing sfalcmcross object.
summary for creating and printing summary results.
coef for extracting coefficients of the estimation.
efficiencies for computing (in-)efficiency estimates.
fitted for extracting the fitted frontier values.
ic for extracting information criteria.
logLik for extracting log-likelihood value(s) of the estimation.
marginal for computing marginal effects of inefficiency drivers.
residuals for extracting residuals of the estimation.
vcov for computing the variance-covariance matrix of the coefficients.
bread for bread for sandwich estimator.
estfun for gradient extraction for each observation.
## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
# Intercept and initStat used as separating variables
cb_2c_h1 <- sfalcmcross(formula = ly ~ lk + ll + yr, thet = ~initStat,
  data = worldprod)
summary(cb_2c_h1)

# summary of the initial ML model
# (the element is named initHalf in the returned object)
summary(cb_2c_h1$initHalf)

# Only the intercept is used as the separating variable
# and only variable initStat is used as inefficiency driver
cb_2c_h3 <- sfalcmcross(formula = ly ~ lk + ll + yr, uhet = ~initStat,
  data = worldprod)
summary(cb_2c_h3)
These functions are provided only for compatibility with older versions of ‘sfaR’ and may become defunct in a future release.
lcmcross(
  formula, uhet, vhet, thet, logDepVar = TRUE, data, subset, weights,
  wscale = TRUE, S = 1L, udist = "hnormal", start = NULL, whichStart = 2L,
  initAlg = "nm", initIter = 100, lcmClasses = 2, method = "bfgs",
  hessianType = 1, itermax = 2000L, printInfo = FALSE, tol = 1e-12,
  gradtol = 1e-06, stepmax = 0.1, qac = "marquardt"
)

## S3 method for class 'lcmcross'
print(x, ...)

## S3 method for class 'lcmcross'
bread(x, ...)

## S3 method for class 'lcmcross'
estfun(x, ...)

## S3 method for class 'lcmcross'
coef(object, extraPar = FALSE, ...)

## S3 method for class 'summary.lcmcross'
coef(object, ...)

## S3 method for class 'lcmcross'
fitted(object, ...)

## S3 method for class 'lcmcross'
ic(object, IC = "AIC", ...)

## S3 method for class 'lcmcross'
logLik(object, individual = FALSE, ...)

## S3 method for class 'lcmcross'
marginal(object, newData = NULL, ...)

## S3 method for class 'lcmcross'
nobs(object, ...)

## S3 method for class 'lcmcross'
residuals(object, ...)

## S3 method for class 'lcmcross'
summary(object, grad = FALSE, ci = FALSE, ...)

## S3 method for class 'summary.lcmcross'
print(x, digits = max(3, getOption("digits") - 2), ...)

## S3 method for class 'lcmcross'
efficiencies(object, level = 0.95, newData = NULL, ...)

## S3 method for class 'lcmcross'
vcov(object, ...)
formula |
A symbolic description of the model to be estimated based on
the generic function |
uhet |
A one-part formula to account for heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to account for heteroscedasticity in the two-sided error variance (see section ‘Details’). |
thet |
A one-part formula to account for technological heterogeneity in the construction of the classes. |
logDepVar |
Logical. Informs whether the dependent variable is logged
( |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted
log-likelihood. Should be |
wscale |
Logical. When |
S |
If |
udist |
Character string. Distribution specification for the one-sided
error term. Only the half normal distribution |
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
whichStart |
Integer. If |
initAlg |
Character string specifying the algorithm used for
initialization to obtain the starting values (when |
initIter |
Maximum number of iterations for initialization algorithm.
Default |
lcmClasses |
Number of classes to be estimated (default = |
method |
Optimization algorithm used for the estimation. Default =
|
hessianType |
Integer. If |
itermax |
Maximum number of iterations allowed for optimization.
Default = |
printInfo |
Logical. Print information during optimization. Default =
|
tol |
Numeric. Convergence tolerance. Default = |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
|
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class lcmcross (returned by the function
|
... |
additional arguments of frontier are passed to lcmcross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
object |
an object of class lcmcross (returned by the function
|
extraPar |
Logical (default = |
IC |
Character string. Information criterion measure. Three criteria are available:
. |
individual |
Logical. If |
newData |
Optional data frame that is used to calculate the efficiency estimates. If NULL (the default), the efficiency estimates are calculated for the observations that were used in the estimation. |
grad |
Logical. Default = |
ci |
Logical. Default = |
digits |
Numeric. Number of digits displayed in values. |
level |
A number between 0 and 0.9999 used for the computation
of (in-)efficiency confidence intervals (default = |
The following functions are deprecated and may be removed from sfaR in the near future. Use the replacements indicated below:
lcmcross: sfalcmcross
bread.lcmcross: bread.sfalcmcross
coef.lcmcross: coef.sfalcmcross
coef.summary.lcmcross: coef.summary.sfalcmcross
efficiencies.lcmcross: efficiencies.sfalcmcross
estfun.lcmcross: estfun.sfalcmcross
fitted.lcmcross: fitted.sfalcmcross
ic.lcmcross: ic.sfalcmcross
logLik.lcmcross: logLik.sfalcmcross
marginal.lcmcross: marginal.sfalcmcross
nobs.lcmcross: nobs.sfalcmcross
print.lcmcross: print.sfalcmcross
print.summary.lcmcross: print.summary.sfalcmcross
residuals.lcmcross: residuals.sfalcmcross
summary.lcmcross: summary.sfalcmcross
vcov.lcmcross: vcov.sfalcmcross
sfaselectioncross is a symbolic formula-based function for the
estimation of the stochastic frontier model in the presence of sample
selection. The model accommodates cross-sectional or pooled cross-sectional data.
The model can be estimated using different quadrature approaches or
maximum simulated likelihood (MSL). See Greene (2010).
Only the half-normal distribution is possible for the one-sided error term. Eleven optimization algorithms are available.
The function also accounts for heteroscedasticity in both one-sided and two-sided error terms, as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999).
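To give a feel for what "quadrature" versus "simulation" means here, the following base R sketch integrates the inefficiency term out of a composed-error density in two ways. It is a simplified illustration without the selection correction, not the package's code: the parameter values, the residual value `e`, and all object names are assumptions chosen for the example. Base R's `integrate()` relies on Gauss-Kronrod quadrature, and the normal/half-normal closed form is used only as a check.

```r
# Composed error of a production frontier: e = v - u,
# v ~ N(0, sigma_v^2) and u = sigma_u * |U| with U ~ N(0, 1).
sigma_v <- 0.5   # illustrative value
sigma_u <- 1.0   # illustrative value
e <- -0.3        # illustrative residual

# Density of e conditional on |U| = a, weighted by the half-normal density of a
integrand <- function(a) {
  dnorm(e + sigma_u * a, mean = 0, sd = sigma_v) * 2 * dnorm(a)
}

# (1) Quadrature: integrate a over [0, Inf)
f_quad <- integrate(integrand, lower = 0, upper = Inf)$value

# (2) Simulation: average the conditional density over half-normal draws
set.seed(12345)
a_draws <- abs(rnorm(1e5))
f_sim <- mean(dnorm(e + sigma_u * a_draws, mean = 0, sd = sigma_v))

# Closed-form normal/half-normal density, used as a benchmark
sigma <- sqrt(sigma_v^2 + sigma_u^2)
lambda <- sigma_u / sigma_v
f_exact <- 2 / sigma * dnorm(e / sigma) * pnorm(-e * lambda / sigma)

print(c(quadrature = f_quad, simulation = f_sim, exact = f_exact))
```

Once the selection term enters the integrand, no closed form is available, which is why the likelihood has to be approximated numerically.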
sfaselectioncross(
  selectionF, frontierF, uhet, vhet, modelType = "greene10",
  logDepVar = TRUE, data, subset, weights, wscale = TRUE, S = 1L,
  udist = "hnormal", start = NULL, method = "bfgs", hessianType = 2L,
  lType = "ghermite", Nsub = 100, uBound = Inf, simType = "halton",
  Nsim = 100, prime = 2L, burn = 10, antithetics = FALSE, seed = 12345,
  itermax = 2000, printInfo = FALSE, intol = 1e-06, tol = 1e-12,
  gradtol = 1e-06, stepmax = 0.1, qac = "marquardt"
)

## S3 method for class 'sfaselectioncross'
print(x, ...)

## S3 method for class 'sfaselectioncross'
bread(x, ...)

## S3 method for class 'sfaselectioncross'
estfun(x, ...)
selectionF |
A symbolic (formula) description of the selection equation. |
frontierF |
A symbolic (formula) description of the outcome (frontier) equation. |
uhet |
A one-part formula to consider heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to consider heteroscedasticity in the two-sided error variance (see section ‘Details’). |
modelType |
Character string. Model used to solve the selection bias. Only the model discussed in Greene (2010) is currently available. |
logDepVar |
Logical. Informs whether the dependent variable is logged
( |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted log-likelihood.
Should be |
wscale |
Logical. When |
S |
If |
udist |
Character string. Distribution specification for the one-sided
error term. Only the half normal distribution |
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
method |
Optimization algorithm used for the estimation. Default =
|
hessianType |
Integer. If |
lType |
Specifies the way the likelihood is estimated. Five possibilities are
available: |
Nsub |
Integer. Number of subdivisions/nodes used for quadrature approaches.
Default |
uBound |
Numeric. Upper bound for the inefficiency component when solving
integrals using quadrature approaches except Gauss-Hermite for which the upper
bound is automatically infinite ( |
simType |
Character string. If |
Nsim |
Number of draws for MSL (default 100). |
prime |
Prime number considered for Halton and Generalized-Halton
draws. Default = |
burn |
Number of the first observations discarded in the case of Halton
draws. Default = |
antithetics |
Logical. Default = |
seed |
Numeric. Seed for the random draws. |
itermax |
Maximum number of iterations allowed for optimization.
Default = |
printInfo |
Logical. Print information during optimization. Default =
|
intol |
Numeric. Integration tolerance for quadrature approaches
( |
tol |
Numeric. Convergence tolerance. Default = |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
|
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class sfaselectioncross (returned by the function |
... |
additional arguments of frontier are passed to sfaselectioncross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
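The roles of `prime` and `burn` for Halton draws can be illustrated with a small base R sketch. The helper `halton()` below is hypothetical and written only for this illustration (it is not the package's implementation); it builds the sequence by the radical-inverse construction and discards the first `burn` elements.

```r
# Hypothetical helper: Halton sequence in a given prime base,
# dropping the first `burn` elements of the sequence.
halton <- function(n, prime = 2L, burn = 10L) {
  idx <- seq_len(n) + burn
  vapply(idx, function(i) {
    f <- 1
    r <- 0
    while (i > 0) {
      f <- f / prime            # next digit weight: 1/prime, 1/prime^2, ...
      r <- r + f * (i %% prime) # add the current base-`prime` digit
      i <- i %/% prime
    }
    r
  }, numeric(1))
}

u <- halton(100, prime = 2L, burn = 10L) # quasi-random uniforms in (0, 1)
a <- abs(qnorm(u))                       # mapped to half-normal draws
summary(a)
```

Compared with pseudo-random numbers, such draws cover the unit interval more evenly, which is the usual motivation for using them in simulated likelihood.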
The current model extends the Heckman (1976, 1979) sample selection model to nonlinear models, in particular the stochastic frontier model. The model was first discussed in Greene (2010), and an application can be found in Dakpo et al. (2022). Practically, we have:
where
and
where
describes the selection equation while
represents
the frontier equation. The selection bias arises from the correlation
between the two symmetric random components
and
:
Conditionally on the inefficiency term, the probability associated with each observation is:
Using the conditional probability formula:
Therefore:
Using the properties of a bivariate normal distribution, we have:
Hence, conditionally on the inefficiency term, we have:
The conditional likelihood is equal to:
Since the non-selected observations bring no additional information, the conditional likelihood to be considered is:
The unconditional likelihood is obtained by integrating the inefficiency term out of the conditional likelihood. Thus
To simplify the estimation, the likelihood can be estimated using a two-step approach.
In the first step, the probit model is estimated to obtain estimates of the selection equation parameters.
Then, in the second step, the following model is estimated:
where . This likelihood can be estimated using
five different approaches: Gauss-Kronrod quadrature, adaptive integration over hypercubes
(hcubature and pcubature), Gauss-Hermite quadrature, and
maximum simulated likelihood. We also use the BHHH estimator to obtain
the asymptotic standard errors for the parameter estimators.
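As a compact sketch of the model and of the second-step likelihood contribution of a selected observation, the block below follows the usual presentation in Greene (2010). The notation is assumed for this illustration rather than taken from the package: $d_i$ is the selection indicator, $\mathbf{z}_i$ and $\mathbf{x}_i$ are the covariates of the selection and frontier equations, $\rho$ is the correlation between the two symmetric errors, $\epsilon_i = y_i - \mathbf{x}_i^{\prime}\boldsymbol{\beta}$, and $a_i = \mathbf{z}_i^{\prime}\hat{\boldsymbol{\gamma}}$ is the first-step probit index.

```latex
\begin{aligned}
d_i &= \mathbb{1}\left[\mathbf{z}_i^{\prime}\boldsymbol{\gamma} + w_i > 0\right],
      \qquad w_i \sim \mathcal{N}(0, 1), \\
y_i &= \mathbf{x}_i^{\prime}\boldsymbol{\beta} + v_i - S u_i
      \qquad \text{(observed only if } d_i = 1\text{)}, \\
v_i &= \sigma_v V_i, \quad u_i = \sigma_u |U_i|, \quad
      V_i, U_i \sim \mathcal{N}(0, 1), \quad
      \operatorname{corr}(w_i, v_i) = \rho, \\
L_i \,\big|\, |U_i| &= \frac{1}{\sigma_v}\,
      \phi\!\left(\frac{\epsilon_i + S \sigma_u |U_i|}{\sigma_v}\right)
      \Phi\!\left(\frac{a_i + \rho\,(\epsilon_i + S \sigma_u |U_i|)/\sigma_v}
      {\sqrt{1 - \rho^2}}\right), \\
L_i &= \int_{0}^{\infty} \left(L_i \,\big|\, |U_i|\right)\,
      2\,\phi(|U_i|)\; \mathrm{d}|U_i| .
\end{aligned}
```

The last integral has no closed form, which is what the quadrature and simulation options approximate.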
sfaselectioncross
allows for the maximization of weighted log-likelihood.
When option weights
is specified and wscale = TRUE
, the weights
are scaled as:
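A minimal sketch of such a rescaling in base R is given below. It assumes that the scaled weights are the original weights normalized so that they sum to the sample size; both this rule and the object names are assumptions of the example.

```r
# Illustrative weights for four observations
old_weights <- c(0.5, 2, 1, 4.5)

# Assumed rule: new = sample size * old / sum(old)
new_weights <- length(old_weights) * old_weights / sum(old_weights)

new_weights       # relative weights are preserved
sum(new_weights)  # equals the number of observations
```

Under this rule the weighted log-likelihood stays on the same scale as an unweighted one with the same number of observations.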
For complex problems, non-gradient methods (e.g. nm or sann) can be
used to warm start the optimization and zoom in on the neighborhood of the
solution. A gradient-based method is then recommended in the second step. In the case
of sann, we recommend significantly increasing the iteration limit
(e.g. itermax = 20000). The Conjugate Gradient (cg) can also be used
in the first stage.
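The two-stage strategy can be sketched with base R's `optim()` on a plain normal/half-normal frontier likelihood. This is an illustration under stated assumptions, not the package's internals: the simulated data, the function `negll()`, and the starting values are made up for the example. With sfaR itself, the analogous step would be to pass first-stage estimates to the second run through the `start` argument.

```r
# Simulated production frontier data (illustrative values)
set.seed(1)
n <- 500
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3) - abs(rnorm(n, sd = 0.6))

# Negative log-likelihood of the normal/half-normal model;
# p = (intercept, slope, log sigma_v, log sigma_u)
negll <- function(p) {
  e <- y - p[1] - p[2] * x
  sv <- exp(p[3])
  su <- exp(p[4])
  s <- sqrt(sv^2 + su^2)
  -sum(log(2) - log(s) + dnorm(e / s, log = TRUE) +
         pnorm(-e * su / (sv * s), log.p = TRUE))
}

start <- c(coef(lm(y ~ x)), log(0.5), log(0.5))

# Stage 1: derivative-free warm start with a small iteration budget
step1 <- optim(start, negll, method = "Nelder-Mead",
               control = list(maxit = 200))

# Stage 2: gradient-based refinement from the stage 1 solution
step2 <- optim(step1$par, negll, method = "BFGS",
               control = list(maxit = 2000))

c(stage1 = step1$value, stage2 = step2$value)
```

The first stage only needs to land in a reasonable region; the second stage does the fine convergence.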
A set of extractor functions for fitted model objects is available for objects of class
'sfaselectioncross', including methods for the generic functions print, summary, coef,
fitted, logLik, residuals, vcov, efficiencies, ic, marginal, estfun and bread
(from the sandwich package), and lmtest::coeftest() (from the lmtest package).
sfaselectioncross
returns a list of class 'sfaselectioncross'
containing the following elements:
call |
The matched call. |
selectionF |
The selection equation formula. |
frontierF |
The frontier equation formula. |
S |
The argument |
typeSfa |
Character string. 'Stochastic Production/Profit Frontier, e =
v - u' when |
Ninit |
Number of initial observations in all samples. |
Nobs |
Number of observations used for optimization. |
nXvar |
Number of explanatory variables in the production or cost frontier. |
logDepVar |
The argument |
nuZUvar |
Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar |
Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm |
Total number of parameters estimated. |
udist |
The argument |
startVal |
Numeric vector. Starting value for M(S)L estimation. |
dataTable |
A data frame (tibble format) containing information on data
used for optimization along with residuals and fitted values of the OLS and
M(S)L estimations, and the individual observation log-likelihood. When argument |
lpmObj |
Linear probability model used for initializing the first step probit model. |
probitObj |
Probit model. Object of class |
ols2stepParam |
Numeric vector. OLS second step estimates for selection correction. Inverse Mills Ratio is introduced as an additional explanatory variable. |
ols2stepStder |
Numeric vector. Standard errors of OLS second step estimates. |
ols2stepSigmasq |
Numeric. Estimated variance of OLS second step random error. |
ols2stepLoglik |
Numeric. Log-likelihood value of OLS second step estimation. |
ols2stepSkew |
Numeric. Skewness of the residuals of the OLS second step estimation. |
ols2stepM3Okay |
Logical. Indicating whether the residuals of the OLS second step estimation have the expected skewness. |
CoelliM3Test |
Coelli's test for OLS residuals skewness. (See Coelli, 1995). |
AgostinoTest |
D'Agostino's test for OLS residuals skewness. (See D'Agostino and Pearson, 1973). |
isWeights |
Logical. If |
lType |
Type of likelihood estimated. See the section ‘Arguments’. |
optType |
Optimization algorithm used. |
nIter |
Number of iterations of the ML estimation. |
optStatus |
Optimization algorithm termination message. |
startLoglik |
Log-likelihood at the starting values. |
mlLoglik |
Log-likelihood value of the M(S)L estimation. |
mlParam |
Parameters obtained from M(S)L estimation. |
gradient |
Gradient of the log-likelihood with respect to each parameter, evaluated at the M(S)L estimates. |
gradL_OBS |
Matrix. Observation-level gradient of the log-likelihood with respect to each parameter, evaluated at the M(S)L estimates. |
gradientNorm |
Gradient norm of the M(S)L estimation. |
invHessian |
Covariance matrix of the parameters obtained from the M(S)L estimation. |
hessianType |
The argument |
mlDate |
Date and time of the estimated model. |
simDist |
The argument |
Nsim |
The argument |
FiMat |
Matrix of random draws used for MSL, only if |
gHermiteData |
List. Gauss-Hermite quadrature rule as provided by
|
Nsub |
Number of subdivisions used for quadrature approaches. |
uBound |
Upper bound for the inefficiency component when solving
integrals using quadrature approaches except Gauss-Hermite for which the upper
bound is automatically infinite ( |
intol |
Integration tolerance for quadrature approaches except Gauss-Hermite. |
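The probit and second-step OLS elements listed above correspond to a Heckman-type two-step correction, which can be sketched in base R as follows. The simulated data mirror the design of the package example; the object names (`dat`, `probit`, `imr`, `ols2`) are assumptions of this sketch and the code is not the package's internal implementation.

```r
# Simulated selection and frontier data (illustrative design)
set.seed(12345)
N <- 2000
z1 <- rnorm(N); z2 <- rnorm(N)
v1 <- rnorm(N); v2 <- rnorm(N)
e1 <- v1                    # selection equation error
e2 <- 0.7071 * (v1 + v2)    # frontier noise, correlated with e1
d <- as.integer(z1 + z2 + e1 > 0)
u <- abs(rnorm(N))
x1 <- rnorm(N); x2 <- rnorm(N)
y <- x1 + x2 + e2 - u
dat <- data.frame(y, x1, x2, z1, z2, d)

# Step 1: probit model for the selection equation
probit <- glm(d ~ z1 + z2, family = binomial(link = "probit"), data = dat)

# Inverse Mills ratio evaluated at the probit linear index
xb <- predict(probit, type = "link")
dat$imr <- dnorm(xb) / pnorm(xb)

# Step 2: OLS on selected observations with the IMR as extra regressor
ols2 <- lm(y ~ x1 + x2 + imr, data = dat, subset = d == 1)
coef(ols2)
```

A clearly nonzero coefficient on `imr` indicates that selection matters for the frontier equation.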
For the Halton draws, the code is adapted from the mlogit package.
Caudill, S. B., and Ford, J. M. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and Gropper, D. M. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and E.S. Pearson. 1973. Tests for departure from normality. Empirical results for the distributions of b2 and √b1. Biometrika, 60:613–622.
Dakpo, K. H., Latruffe, L., Desjeux, Y., Jeanneaux, P., 2022. Modeling heterogeneous technologies in the presence of sample selection: The case of dairy farms and the adoption of agri-environmental schemes in France. Agricultural Economics, 53(3), 422-438.
Greene, W., 2010. A stochastic frontier model with correction for sample selection. Journal of Productivity Analysis. 34, 15–24.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Heckman, J., 1976. Discrete, qualitative and limited dependent variables. Ann Econ Soc Meas. 4, 475–492.
Heckman, J., 1979. Sample Selection Bias as a Specification Error. Econometrica. 47, 153–161.
Reifschneider, D., and Stevenson, R. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
print for printing sfaselectioncross object.
summary for creating and printing summary results.
coef for extracting coefficients of the estimation.
efficiencies for computing (in-)efficiency estimates.
fitted for extracting the fitted frontier values.
ic for extracting information criteria.
logLik for extracting log-likelihood value(s) of the estimation.
marginal for computing marginal effects of inefficiency drivers.
residuals for extracting residuals of the estimation.
vcov for computing the variance-covariance matrix of the coefficients.
bread for bread for sandwich estimator.
estfun for gradient extraction for each observation.
## Not run: 
## Simulated example
N <- 2000 # sample size
set.seed(12345)
z1 <- rnorm(N)
z2 <- rnorm(N)
v1 <- rnorm(N)
v2 <- rnorm(N)
e1 <- v1
e2 <- 0.7071 * (v1 + v2)
ds <- z1 + z2 + e1
d <- ifelse(ds > 0, 1, 0)
u <- abs(rnorm(N))
x1 <- rnorm(N)
x2 <- rnorm(N)
y <- x1 + x2 + e2 - u
data <- cbind(y = y, x1 = x1, x2 = x2, z1 = z1, z2 = z2, d = d)

## Estimation using quadrature (Gauss-Kronrod)
selecRes1 <- sfaselectioncross(selectionF = d ~ z1 + z2,
  frontierF = y ~ x1 + x2, modelType = 'greene10', method = 'bfgs',
  logDepVar = TRUE, data = as.data.frame(data), S = 1L, udist = 'hnormal',
  lType = 'kronrod', Nsub = 100, uBound = Inf, simType = 'halton',
  Nsim = 300, prime = 2L, burn = 10, antithetics = FALSE, seed = 12345,
  itermax = 2000, printInfo = FALSE)
summary(selecRes1)

## Estimation using maximum simulated likelihood
selecRes2 <- sfaselectioncross(selectionF = d ~ z1 + z2,
  frontierF = y ~ x1 + x2, modelType = 'greene10', method = 'bfgs',
  logDepVar = TRUE, data = as.data.frame(data), S = 1L, udist = 'hnormal',
  lType = 'msl', Nsub = 100, uBound = Inf, simType = 'halton',
  Nsim = 300, prime = 2L, burn = 10, antithetics = FALSE, seed = 12345,
  itermax = 2000, printInfo = FALSE)
summary(selecRes2)
## End(Not run)
skewnessTest computes a skewness test for stochastic frontier
models (i.e. objects of class 'sfacross').
skewnessTest(object, test = "agostino")
object |
An object of class |
test |
A character string specifying the test to implement. If
|
skewnessTest returns the results of either D'Agostino's
or Coelli's skewness test.
skewnessTest is currently only available for objects of
class 'sfacross'.
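To illustrate the kind of quantity such a test looks at, the base R sketch below computes a third-moment statistic on OLS residuals in the spirit of Coelli (1995): the third sample moment divided by sqrt(6 * m2^3 / N), compared with a standard normal. The exact form used by the package is not restated here, so this formula, the simulated data, and the object names are assumptions of the example.

```r
# Simulated production frontier data (illustrative values)
set.seed(123)
n <- 1000
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3) - abs(rnorm(n, sd = 0.8))

# OLS residuals have mean zero, so raw and central moments coincide
res <- residuals(lm(y ~ x))
m2 <- mean(res^2)
m3 <- mean(res^3)

# Third-moment statistic, approximately N(0, 1) under no skewness
m3t <- m3 / sqrt(6 * m2^3 / n)
p_value <- 2 * pnorm(-abs(m3t))

c(statistic = m3t, p.value = p_value)
```

For a production frontier, inefficiency pushes the residual distribution to the left, so a clearly negative statistic is the expected sign.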
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and E.S. Pearson. 1973. Tests for departure from normality. Empirical results for the distributions of b2 and √b1. Biometrika, 60:613–622.
## Not run: 
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
  scaling = TRUE, method = 'mla')
skewnessTest(tl_u_ts)
skewnessTest(tl_u_ts, test = 'coelli')
## End(Not run)
Create and print summary results for stochastic frontier models returned by
sfacross
, sfalcmcross
, or
sfaselectioncross
.
## S3 method for class 'sfacross'
summary(object, grad = FALSE, ci = FALSE, ...)

## S3 method for class 'summary.sfacross'
print(x, digits = max(3, getOption("digits") - 2), ...)

## S3 method for class 'sfalcmcross'
summary(object, grad = FALSE, ci = FALSE, ...)

## S3 method for class 'summary.sfalcmcross'
print(x, digits = max(3, getOption("digits") - 2), ...)

## S3 method for class 'sfaselectioncross'
summary(object, grad = FALSE, ci = FALSE, ...)

## S3 method for class 'summary.sfaselectioncross'
print(x, digits = max(3, getOption("digits") - 2), ...)
object |
An object of either class |
grad |
Logical. Default = |
ci |
Logical. Default = |
... |
Currently ignored. |
x |
An object of either class |
digits |
Numeric. Number of digits displayed in values. |
The summary method returns a list of class 'summary.sfacross',
'summary.sfalcmcross', or 'summary.sfaselectioncross'
that contains the same elements as an object returned by sfacross,
sfalcmcross, or sfaselectioncross with the
following additional elements:
AIC |
Akaike information criterion. |
BIC |
Bayesian information criterion. |
HQIC |
Hannan-Quinn information criterion. |
sigmavSq |
For |
sigmauSq |
For |
Varu |
For |
theta |
For |
Eu |
For |
Expu |
For |
olsRes |
For |
ols2StepRes |
For |
mlRes |
Matrix of ML estimates, their standard errors, z-values,
asymptotic P-values, and when |
chisq |
For |
df |
Degrees of freedom for the inefficiency model. |
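The three information criteria can be reproduced by hand from a log-likelihood value, the number of estimated parameters and the number of observations. The sketch below uses the standard textbook definitions; the helper `info_criteria()` and the numeric inputs are hypothetical and only serve the illustration.

```r
# Hypothetical helper using the standard definitions of AIC, BIC and HQIC
info_criteria <- function(loglik, k, n) {
  c(AIC  = -2 * loglik + 2 * k,
    BIC  = -2 * loglik + log(n) * k,
    HQIC = -2 * loglik + 2 * log(log(n)) * k)
}

# Illustrative inputs: log-likelihood, number of parameters, sample size
crit <- info_criteria(loglik = -123.4, k = 7, n = 250)
crit
```

All three penalize the number of parameters; they differ only in how fast the penalty grows with the sample size.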
sfacross, for the stochastic frontier analysis model fitting function for cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis model fitting function for cross-sectional or pooled data.
sfaselectioncross, for the stochastic frontier model fitting function in the presence of sample selection, for cross-sectional or pooled data.
print for printing sfacross object.
coef for extracting coefficients of the estimation.
efficiencies for computing (in-)efficiency estimates.
fitted for extracting the fitted frontier values.
ic for extracting information criteria.
logLik for extracting log-likelihood value(s) of the estimation.
marginal for computing marginal effects of inefficiency drivers.
residuals for extracting residuals of the estimation.
vcov for computing the variance-covariance matrix of the coefficients.
bread for bread for sandwich estimator.
estfun for gradient extraction for each observation.
skewnessTest for implementing skewness test.
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog SFA (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
  log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
  I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
  udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
  scaling = TRUE, method = 'mla')
summary(tl_u_ts, grad = TRUE, ci = TRUE)
This dataset is an unbalanced panel of 50 Swiss railway companies over the period 1985-1997.
A data frame with 605 observations on the following 42 variables.
Firm identification.
Year identification.
Number of years observed.
Number of stops in network.
Network length (in meters).
Dummy variable for railroads with narrow track.
Dummy variable for ‘rack rail’ in network.
Dummy variable for network with tunnels over 300 meters on average.
Time indicator, first year = 0.
Passenger output – passenger km.
Freight output – ton km.
Total cost (1,000 Swiss franc).
Labor price.
Electricity price.
Capital price.
1 for railroads with curvy tracks.
Log of CT/PE.
Log of Q2.
Log of Q3.
Log of NETWORK/1000.
Log of PL/PE.
Log of PE.
Log of PK/PE.
Log of STOPS.
Mean of LNQ2.
Mean of LNQ3.
Mean of LNNET.
Mean of LNPL.
Mean of LNPK.
Mean of LNSTOP.
The dataset is extracted from the annual reports of the Swiss Federal Office of Statistics on public transport companies and has been used in Farsi et al. (2005).
https://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm
Farsi, M., M. Filippini, and W. Greene. 2005. Efficiency measurement in network industries: Application to the Swiss railway companies. Journal of Regulatory Economics, 28:69–90.
str(swissrailways)
This dataset contains data on fossil fuel fired steam electric power generation plants in the United States between 1986 and 1996.
A data frame with 791 observations on the following 11 variables.
Plant identification.
Year identification.
Net-steam electric power generation in megawatt-hours.
Dummy variable which takes a value equal to 1 if the power plant is in a state which enacted legislation or issued a regulatory order to implement retail access during the sample period, and 0 otherwise.
Capital stock.
Labor and maintenance.
Fuel.
Labor price.
Fuel price.
Capital price.
Total cost.
The dataset has been used in Kumbhakar et al. (2014).
https://sites.google.com/view/sfbook-stata/home
Kumbhakar, S.C., H.J. Wang, and A. Horncastle. 2014. A Practitioner's Guide to Stochastic Frontier Analysis Using Stata. Cambridge University Press.
str(utility)
summary(utility)
vcov computes the variance-covariance matrix of the maximum
likelihood (ML) coefficients from stochastic frontier models estimated with
sfacross, sfalcmcross, or sfaselectioncross.
## S3 method for class 'sfacross'
vcov(object, extraPar = FALSE, ...)

## S3 method for class 'sfalcmcross'
vcov(object, ...)

## S3 method for class 'sfaselectioncross'
vcov(object, extraPar = FALSE, ...)
object | A stochastic frontier model returned by sfacross, sfalcmcross, or sfaselectioncross. |
extraPar | Logical. Only available for non-heteroscedastic models returned by sfacross and sfaselectioncross. Default = FALSE. |
... | Currently ignored. |
The variance-covariance matrix is obtained by inverting the negative Hessian matrix. Depending on the distribution and the 'hessianType' option, the analytical/numeric Hessian or the BHHH Hessian is evaluated.
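The two covariance estimators mentioned above can be illustrated on a toy problem. The sketch below is not sfaR's internal code: it assumes a simple normal model, parameterized as (mu, log(sigma)), rather than a frontier model, and uses only base R (optim and optimHess). The object names (loglik, vcov_hessian, vcov_bhhh) are illustrative.

```r
# Toy model (an assumption, not a frontier model): x ~ N(mu, sigma^2),
# parameterized as theta = (mu, log(sigma)).
set.seed(1)
x <- rnorm(200, mean = 2, sd = 1.5)

loglik <- function(theta) {
  sum(dnorm(x, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

# Maximize the log-likelihood by minimizing its negative
fit <- optim(c(mean(x), log(sd(x))), function(theta) -loglik(theta),
  method = "BFGS")

# (1) Inverse of the negative Hessian of the log-likelihood
H <- optimHess(fit$par, loglik)
vcov_hessian <- solve(-H)

# (2) BHHH approximation: the negative Hessian is replaced by the
# outer product of the per-observation scores
sigma <- exp(fit$par[2])
z <- (x - fit$par[1]) / sigma
scores <- cbind(mu = z / sigma, logsigma = z^2 - 1)
vcov_bhhh <- solve(crossprod(scores))

# Standard errors under both estimators
sqrt(diag(vcov_hessian))
sqrt(diag(vcov_bhhh))
```

Both estimators are consistent under correct specification, so their standard errors should be of similar magnitude in moderately large samples.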
The argument extraPar is currently available only for objects of class 'sfacross' and 'sfaselectioncross'. When 'extraPar = TRUE', the variance-covariance matrix of the additional parameters is obtained using the delta method.
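As a rough illustration of the delta method, suppose the estimated parameters are log-variances and the additional parameters are smooth transformations of them. The parameter names (Wu, Wv), the numeric values, and the choice of transformations below are all hypothetical and are not taken from a fitted sfaR model. The covariance of g(theta) is approximated by J V J', where J is the Jacobian of g evaluated at the estimates.

```r
# Hypothetical estimates of two log-variance parameters and their
# covariance matrix (illustrative values, not from a fitted model)
theta <- c(Wu = -2.0, Wv = -3.0)
V <- matrix(c(0.040, 0.005, 0.005, 0.020), nrow = 2,
  dimnames = list(names(theta), names(theta)))

# Additional parameters as functions of theta (assumed transformations)
g <- function(th) {
  c(sigmauSq = exp(th[["Wu"]]),
    sigmavSq = exp(th[["Wv"]]),
    lambda = exp((th[["Wu"]] - th[["Wv"]]) / 2))
}

# Analytical Jacobian of g with respect to (Wu, Wv)
lambda <- exp((theta[["Wu"]] - theta[["Wv"]]) / 2)
J <- rbind(
  sigmauSq = c(exp(theta[["Wu"]]), 0),
  sigmavSq = c(0, exp(theta[["Wv"]])),
  lambda = c(0.5, -0.5) * lambda
)
colnames(J) <- names(theta)

# Delta method: Var[g(theta)] is approximately J V J'
V_extra <- J %*% V %*% t(J)
se_extra <- sqrt(diag(V_extra))

cbind(estimate = g(theta), se = se_extra)
```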
The variance-covariance matrix of the maximum likelihood coefficients is returned.
sfacross, for the stochastic frontier analysis model fitting function using cross-sectional or pooled data.
sfalcmcross, for the latent class stochastic frontier analysis model fitting function using cross-sectional or pooled data.
sfaselectioncross, for the sample selection stochastic frontier model fitting function using cross-sectional data.
## Using data on Spanish dairy farms
# Cobb Douglas (production function) half normal distribution
cb_s_h <- sfacross(formula = YIT ~ X1 + X2 + X3 + X4, udist = 'hnormal',
  data = dairyspain, S = 1, method = 'bfgs')
vcov(cb_s_h)
vcov(cb_s_h, extraPar = TRUE)

# Other variance-covariance matrices can be obtained using the sandwich package

# Robust variance-covariance matrix
requireNamespace('sandwich', quietly = TRUE)
sandwich::vcovCL(cb_s_h)

# Coefficients and standard errors can be obtained using lmtest package
requireNamespace('lmtest', quietly = TRUE)
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovCL)

# Clustered standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovCL, cluster = ~ FARM)

# Doubly clustered standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovCL, cluster = ~ FARM + YEAR)

# BHHH standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovOPG)

# Adjusted BHHH standard errors
lmtest::coeftest(cb_s_h, vcov. = sandwich::vcovOPG, adjust = TRUE)

## Using data on eighty-two countries production (GDP)
# LCM Cobb Douglas (production function) half normal distribution
cb_2c_h <- sfalcmcross(formula = ly ~ lk + ll + yr, udist = 'hnormal',
  data = worldprod, uhet = ~ initStat, S = 1)
vcov(cb_2c_h)
This dataset provides information on production related variables for eighty-two countries over the period 1960–1987.
A data frame with 2,296 observations on the following 12 variables.
Country name.
Country identification.
Year identification.
GDP in 1987 U.S. dollars.
Physical capital stock in 1987 U.S. dollars.
Labor (number of individuals in the workforce between the ages of 15 and 64).
Human capital-adjusted labor.
Log of y.
Log of k.
Log of l.
Log of h.
Log of the initial capital to labor ratio of each country, lk - ll, measured at the beginning of the sample period.
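The definition of the initial capital to labor ratio can be sketched on a toy panel. The data frame below is made up for illustration, and the country identifier name (id) is an assumption; the column names yr, lk, ll, and initStat follow those used in the package examples. The idea is to take lk - ll in each country's first observed year and repeat that value across all of the country's rows.

```r
# Toy panel (hypothetical values): two countries observed over three years
toy <- data.frame(
  id = rep(1:2, each = 3),
  yr = rep(1960:1962, times = 2),
  lk = c(10.0, 10.2, 10.5, 8.0, 8.1, 8.4),
  ll = c(7.0, 7.1, 7.2, 6.5, 6.6, 6.6)
)

# Sort so that the first row of each country is its first year
toy <- toy[order(toy$id, toy$yr), ]

# Initial log capital to labor ratio, constant within country
toy$initStat <- ave(toy$lk - toy$ll, toy$id, FUN = function(z) z[1])
toy
```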
The dataset is from the World Bank STARS database and has been used in Kumbhakar et al. (2014).
https://sites.google.com/view/sfbook-stata/home
Kumbhakar, S.C., H.J. Wang, and A. Horncastle. 2014. A Practitioner's Guide to Stochastic Frontier Analysis Using Stata. Cambridge University Press.
str(worldprod)
summary(worldprod)