README


R/nlpred

Description

nlpred is an R package for computing estimates of cross-validated prediction metrics. These estimates are tailored for improved performance in small samples. Several estimators are available, including ones based on cross-validated targeted minimum loss-based estimation, estimating equations, and one-step estimation.

Installation

You can install nlpred from CRAN with:

install.packages("nlpred")

You can also install the current release of nlpred from GitHub via devtools with:

devtools::install_github("benkeser/nlpred")

Usage

The main functions in the package are cv_auc and cv_scrnp, which compute, respectively, the K-fold cross-validated area under the receiver operating characteristic curve (CVAUC) and the K-fold cross-validated sensitivity-constrained rate of negative prediction (SCRNP). Rather than using standard cross-validation estimators (where the prediction algorithm is developed in a training sample and AUC/SCRNP is estimated in the validation sample), we use techniques from efficiency theory to estimate these quantities. This allows us to use the training data both to develop the prediction algorithm and to estimate the key nuisance parameters needed to evaluate AUC/SCRNP. By reserving more data for estimation of these key parameters, we obtain improved performance in small samples.
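For reference, the standard K-fold estimator that the paragraph above contrasts against can be sketched in base R. This is only an illustration under our own assumptions: the helper name empirical_auc, the seed, and the fold assignment are hypothetical and not part of nlpred, and the result may differ in detail from the "standard" row that the package reports.

```r
# Sketch of a *standard* K-fold cross-validated AUC (not nlpred's estimators).
set.seed(1234)  # arbitrary seed, chosen only for reproducibility

# simulate data in the same way as the package example
n <- 200
p <- 10
X <- data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))
Y <- rbinom(n, 1, plogis(X[, 1] + X[, 10]))

# empirical AUC: probability that a random case scores higher than a
# random control, counting ties as 1/2 (hypothetical helper)
empirical_auc <- function(score, y) {
  cases <- score[y == 1]
  controls <- score[y == 0]
  diffs <- outer(cases, controls, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

# randomly assign each observation to one of K folds
K <- 5
fold <- sample(rep(seq_len(K), length.out = n))

# fit logistic regression on the training folds and
# evaluate the AUC on the held-out validation fold
fold_auc <- vapply(seq_len(K), function(k) {
  train <- fold != k
  train_data <- data.frame(Y = Y[train], X[train, , drop = FALSE])
  fit <- glm(Y ~ ., data = train_data, family = binomial())
  pred <- predict(fit, newdata = X[!train, , drop = FALSE], type = "response")
  empirical_auc(pred, Y[!train])
}, numeric(1))

# average the fold-specific AUCs
standard_cv_auc <- mean(fold_auc)
standard_cv_auc
```

Here each validation fold holds only about n/K = 40 observations, so each fold-specific AUC is estimated from few cases and controls. That small-sample setting is the one the package's estimators are designed for.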

# load package
library(nlpred)
#> Loading required package: data.table

# turn off messages from np package
options(np.messages=FALSE)

# simulate data
n <- 200
p <- 10
X <- data.frame(matrix(rnorm(n*p), nrow = n, ncol = p))
Y <- rbinom(n, 1, plogis(X[,1] + X[,10]))

# get cv auc estimates for logistic regression
logistic_cv_auc_ests <- cv_auc(Y = Y, X = X, K = 5, learner = "glm_wrapper")
logistic_cv_auc_ests
#>                est         se       cil       ciu
#> cvtmle   0.7598522 0.03223410 0.6966745 0.8230299
#> onestep  0.7601000 0.03252870 0.6963449 0.8238551
#> esteq    0.7557129 0.03252870 0.6919578 0.8194680
#> standard 0.7660940 0.03348094 0.7004726 0.8317154

# get cv auc estimates for random forest using nested 
# cross-validation for nuisance parameter estimation. nested
# cross-validation is unfortunately necessary when aggressive learners 
# are used. 
rf_cv_auc_ests <- cv_auc(Y = Y, X = X, K = 5, 
                         learner = "randomforest_wrapper", 
                         nested_cv = TRUE)
rf_cv_auc_ests
#>                est         se       cil       ciu
#> cvtmle   0.7305404 0.03606462 0.6598550 0.8012257
#> onestep  0.7308869 0.03625171 0.6598349 0.8019390
#> esteq    0.7281639 0.03625171 0.6571118 0.7992159
#> standard 0.7435551 0.03553040 0.6739168 0.8131934

# same examples for scrnp
logistic_cv_scrnp_ests <- cv_scrnp(Y = Y, X = X, K = 5, learner = "glm_wrapper")
logistic_cv_scrnp_ests
#>                est         se        cil       ciu
#> cvtmle   0.1099379 0.03873987 0.03400918 0.1858667
#> onestep  0.1237150 0.03857579 0.04810785 0.1993222
#> esteq    0.1237150 0.03857579 0.04810785 0.1993222
#> standard 0.1612586 0.03851825 0.08576425 0.2367530


rf_cv_scrnp_ests <- cv_scrnp(Y = Y, X = X, K = 5, 
                             learner = "randomforest_wrapper", 
                             nested_cv = TRUE)
rf_cv_scrnp_ests
#>                 est         se         cil       ciu
#> cvtmle   0.09331934 0.02851627 0.037428470 0.1492102
#> onestep  0.09642105 0.02851279 0.040536999 0.1523051
#> esteq    0.09642105 0.02851279 0.040536999 0.1523051
#> standard 0.08475865 0.04111922 0.004166465 0.1653508
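The cil and ciu columns in the printed output above appear consistent with two-sided 95% Wald intervals, that is, est plus or minus qnorm(0.975) times se. The sketch below checks this for one row using numbers copied from the output. It is our reading of the printed values, not a statement about how the package computes its intervals internally.

```r
# values copied from the cvtmle row of the logistic regression CVAUC output
est <- 0.7598522
se  <- 0.03223410

# two-sided 95% Wald interval: est +/- z * se (assumed form, see text above)
z <- qnorm(0.975)
ci <- c(cil = est - z * se, ciu = est + z * se)

# compare against the cil and ciu values printed for that row
round(ci, 7)
```

The same calculation can be repeated for any other row by substituting its est and se values.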

Issues

If you encounter any bugs or have any specific feature requests, please file an issue.

Contributions

Interested contributors can consult our contribution guidelines prior to submitting a pull request.

Citation

License

The contents of this repository are distributed under the MIT license.
