The JOUSBoost package implements under/oversampling with jittering (JOUS) for probability estimation. It is intended to improve the probability estimates that come from boosting algorithms such as AdaBoost, but it is modular enough to be used with virtually any classification algorithm from machine learning. See Mease et al. (2007) for more information.
You can install the released version from CRAN:

install.packages("JOUSBoost")

or the development version from GitHub:

devtools::install_github("molson2/JOUSBoost")
The following example gives a usage case for JOUSBoost. It illustrates the improvement in probability estimates one gets from applying the JOUS procedure to AdaBoost on a simulated data set. First, we’ll train AdaBoost on depth-three decision trees, and then we’ll extract the estimated probabilities.
# Generate data from Friedman model
library(JOUSBoost)
set.seed(111)
dat = friedman_data(n = 1000, gamma = 0.5)
train_index = sample(1:1000, 800)

# Train AdaBoost classifier using depth 3 decision trees
ada = adaboost(dat$X[train_index, ], dat$y[train_index], tree_depth = 3, n_rounds = 400)

# Get probability estimates on the test data
phat_ada = predict(ada, dat$X[-train_index, ], type = "prob")
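As a quick aside (not part of the original example), predict() can also return class labels; a minimal sketch, assuming the adaboost predict method accepts type = "response" and that labels are coded -1/1 as produced by friedman_data:

# Hypothetical check: test-set misclassification rate
# (assumes type = "response" returns class labels in {-1, 1})
yhat_ada = predict(ada, dat$X[-train_index, ], type = "response")
mean(yhat_ada != dat$y[-train_index])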
Next, we’ll compute probability estimates using the JOUS procedure.
# Apply jous to the adaboost classifier
class_func = function(X, y) adaboost(X, y, tree_depth = 3, n_rounds = 400)
pred_func = function(fit_obj, X_test) predict(fit_obj, X_test)

jous_fit = jous(dat$X[train_index, ], dat$y[train_index], class_func,
                pred_func, type = "under", delta = 10, keep_models = TRUE)

# Get probability estimates on the test data
phat_jous = predict(jous_fit, dat$X[-train_index, ], type = "prob")
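Note that keep_models = TRUE keeps the fitted models around so that predict() can be applied to new data. The procedure also has an oversampling variant; a minimal sketch, assuming type = "over" accepts the same arguments as the call above:

# Hypothetical alternative: oversampling instead of undersampling
jous_fit_over = jous(dat$X[train_index, ], dat$y[train_index], class_func,
                     pred_func, type = "over", delta = 10, keep_models = TRUE)
phat_jous_over = predict(jous_fit_over, dat$X[-train_index, ], type = "prob")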
Finally, we can see the benefit of using JOUSBoost!
# Compare MSE of probability estimates
p_true = dat$p[-train_index]
mean((p_true - phat_jous)^2)
#> [1] 0.05455999
mean((p_true - phat_ada)^2)
#> [1] 0.1277416
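Beyond comparing mean squared errors, it can help to look at the estimates directly; a minimal base-R sketch using the vectors defined above:

# Plot estimated vs. true probabilities; points near the dashed line are well calibrated
plot(p_true, phat_ada, pch = 20, col = "gray",
     xlab = "True probability", ylab = "Estimated probability")
points(p_true, phat_jous, pch = 20, col = "blue")
abline(0, 1, lty = 2)
legend("topleft", legend = c("AdaBoost", "JOUS"), col = c("gray", "blue"), pch = 20)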
Mease, D., Wyner, A., and Buja, A. 2007. “Cost-Weighted Boosting with Jittering and Over/Under-Sampling: JOUS-Boost.” Journal of Machine Learning Research 8: 409–439.