The ml package implements the split-fit-evaluate-assess
workflow described in Hastie, Tibshirani, and Friedman (2009), The Elements
of Statistical Learning, Chapter 7. The key idea: keep a held-out test set
sacred until you are done experimenting, then assess on it exactly once.
Formula interfaces are not supported. Instead, pass the data
frame and the name of the target column as a string, for example
`ml_fit(data, "target", seed = 42)`.
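If existing code already carries a formula, the target name can be pulled out of it with base R and passed along as a string. This is only a sketch: `target_from_formula` is a hypothetical helper and not part of the ml package, and the `ml_fit()` call is left commented out so the snippet runs without the package installed.

```r
# Hypothetical helper (not part of the ml package): extract the response
# column name from a two-sided formula so it can be passed as a string.
target_from_formula <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 3L)  # requires a left-hand side
  lhs <- all.vars(f[[2L]])
  stopifnot(length(lhs) == 1L)                        # exactly one response variable
  lhs
}

target <- target_from_formula(Species ~ .)
print(target)

# Call shape taken from the text above; `data` stands for your own data frame.
# model <- ml_fit(data, target, seed = 42)
```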
Before modeling, understand what you have:
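The package's own inspection call is not shown in this text, so the following is a base-R stand-in rather than the ml API. It assumes the built-in `iris` data with `Species` as the target and looks at column types, missing values, and class balance.

```r
# Base-R look at a dataset before modeling (iris ships with R).
data <- iris
target <- "Species"

str(data)                                  # dimensions and column types
print(colSums(is.na(data)))                # missing values per column
print(prop.table(table(data[[target]])))   # class balance of the target
print(summary(data[setdiff(names(data), target)]))  # ranges of the features
```

Class balance matters most here: a heavily skewed target is the main reason to stratify the split that follows.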
Three-way split (60/20/20). Stratified by default for classification.
Access the partitions with `$train`, `$valid`, and
`$test`. The `$dev` property combines train and
valid for final retraining.
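To make the mechanics concrete, here is a minimal base-R sketch of a stratified 60/20/20 split. It illustrates the idea only and is not the package's implementation; `stratified_split` and its arguments are hypothetical names, and the returned list merely imitates the `$train`, `$valid`, `$test`, and `$dev` accessors described above.

```r
# Stratified 60/20/20 split in base R (illustration, not the ml package's code).
stratified_split <- function(data, target, seed = 42) {
  set.seed(seed)
  part <- character(nrow(data))
  # Split row indices by class, then partition each class separately
  for (idx in split(seq_len(nrow(data)), data[[target]])) {
    idx <- idx[sample.int(length(idx))]   # shuffle within the class
    n <- length(idx)
    n_train <- round(0.6 * n)
    n_valid <- round(0.2 * n)
    part[idx] <- rep(c("train", "valid", "test"),
                     times = c(n_train, n_valid, n - n_train - n_valid))
  }
  out <- split(data, factor(part, levels = c("train", "valid", "test")))
  out$dev <- rbind(out$train, out$valid)  # train + valid, for final retraining
  out
}

s <- stratified_split(iris, "Species")
print(sapply(s, nrow))          # rows per partition
print(table(s$test$Species))    # class counts in the test partition
```

Because sampling happens within each class, every partition keeps the class proportions of the full data, which a plain random split does not guarantee on small or imbalanced datasets.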
Find candidates quickly before tuning:
Iterate freely on the validation set:
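As a package-independent illustration of iterating on the validation set, the sketch below tunes `k` for a k-nearest-neighbours classifier using only the train and valid partitions. It assumes the `class` package (a recommended package that ships with standard R installations) and uses a plain random split for brevity; none of this is the ml API.

```r
library(class)  # provides knn()

set.seed(42)
idx <- sample(nrow(iris))
train <- iris[idx[1:90], ]
valid <- iris[idx[91:120], ]
# Rows idx[121:150] form the test set and are deliberately not touched here.

features <- setdiff(names(iris), "Species")
ks <- c(1, 3, 5, 7, 9, 15)

# Validation accuracy for each candidate k
valid_acc <- sapply(ks, function(k) {
  pred <- knn(train[features], valid[features], cl = train$Species, k = k)
  mean(pred == valid$Species)
})
names(valid_acc) <- ks
print(valid_acc)

best_k <- ks[which.max(valid_acc)]
print(best_k)
```

Repeating this loop as often as needed is safe precisely because the test rows never enter it.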
Gate your model before final assessment:
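The package's gate is not shown in this text. One plausible reading, sketched here purely as an assumption, is a check that the candidate beats a trivial majority-class baseline on the validation set before the test set is touched. `passes_gate` and its `margin` argument are hypothetical, and the labels are toy values made up for the demonstration.

```r
# Hypothetical gate: the model must beat the majority-class baseline
# on the validation set by at least `margin`.
passes_gate <- function(truth, pred, margin = 0.05) {
  stopifnot(length(truth) == length(pred))
  baseline <- max(table(truth)) / length(truth)  # always predict the commonest class
  accuracy <- mean(as.character(pred) == as.character(truth))
  list(accuracy = accuracy, baseline = baseline,
       pass = accuracy >= baseline + margin)
}

# Toy validation labels (5 x "a", 3 x "b") and two sets of predictions
truth <- factor(c("a", "a", "a", "b", "b", "a", "b", "a"))
good  <- factor(c("a", "a", "a", "b", "b", "a", "a", "a"))  # one mistake
lazy  <- factor(rep("a", 8), levels = c("a", "b"))          # always "a"

print(passes_gate(truth, good)$pass)  # TRUE:  0.875 >= 0.625 + 0.05
print(passes_gate(truth, lazy)$pass)  # FALSE: 0.625 <  0.625 + 0.05
```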
The final exam. Call this only when done experimenting.
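In base-R terms, the final assessment amounts to refitting on train plus valid and scoring the test partition a single time. The sketch below again uses `class::knn` as a stand-in model rather than the ml API; the value `k = 5` is assumed to have been chosen earlier on the validation set.

```r
library(class)  # recommended package shipped with R; provides knn()

set.seed(42)
idx <- sample(nrow(iris))
dev  <- iris[idx[1:120], ]     # train + valid combined
test <- iris[idx[121:150], ]   # held out until this point

features <- setdiff(names(iris), "Species")
k <- 5  # assumption: selected earlier using the validation set only

# Fit on dev, predict the test set once, and report
test_pred <- knn(dev[features], test[features], cl = dev$Species, k = k)
test_acc <- mean(test_pred == test$Species)

print(table(predicted = test_pred, actual = test$Species))
print(test_acc)
```

If this number prompts further tuning, the test set has effectively become a second validation set and no longer gives an unbiased estimate.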
All functions are also available via the `ml$verb()`
pattern, which mirrors Python's `import ml; ml.fit(...)`:
The same workflow applies to regression:
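A compact base-R sketch of the same steps for a numeric target, using the built-in `mtcars` data and `lm()`. It is an illustration of the workflow, not the ml API: the formulas belong to `lm()`, the `rmse` helper is defined locally, and the 20/6/6 row counts only approximate 60/20/20 on 32 rows.

```r
set.seed(42)
idx <- sample(nrow(mtcars))   # 32 rows
train <- mtcars[idx[1:20], ]
valid <- mtcars[idx[21:26], ]
test  <- mtcars[idx[27:32], ]

rmse <- function(truth, pred) sqrt(mean((truth - pred)^2))

# Iterate on the validation set: compare two candidate feature sets.
candidates <- list(small = mpg ~ wt, larger = mpg ~ wt + hp)
valid_rmse <- sapply(candidates, function(f) {
  rmse(valid$mpg, predict(lm(f, data = train), newdata = valid))
})
print(valid_rmse)

# Final assessment: refit the winner on train + valid, score the test set once.
best <- candidates[[which.min(valid_rmse)]]
final <- lm(best, data = rbind(train, valid))
test_rmse <- rmse(test$mpg, predict(final, newdata = test))
print(test_rmse)
```

Only the metric changes relative to classification: the split, the validation loop, and the single test-set evaluation stay the same.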
| Algorithm | Classification | Regression | Package |
|---|---|---|---|
| `"logistic"` | yes | – | `nnet` (recommended package, ships with R) |
| `"xgboost"` | yes | yes | `xgboost` |
| `"random_forest"` | yes | yes | `ranger` |
| `"linear"` (Ridge) | – | yes | `glmnet` |
| `"elastic_net"` | – | yes | `glmnet` |
| `"svm"` | yes | yes | `e1071` |
| `"knn"` | yes | yes | `kknn` |
| `"naive_bayes"` | yes | – | `naivebayes` |
LightGBM support is planned for v1.1.