The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Here are presented, in a full and complete example, all main functions (starting with BIOMOD_[...]) of biomod2.
The data set used is the DataSTOC containing abundance data.
Similar examples are presented for binary data on the Main functions (bin) webpage.
library(biomod2)
library(terra)
# Load species abundances (20 species available)
data('DataSTOC')
head(DataSTOC)
# Divide into calibration/validation and evaluation
period1 <- which(DataSTOC[, 'period'] == '2006-2011')
period2 <- which(DataSTOC[, 'period'] == '2012-2017')
# Select the name of the studied species
myRespName <- 'Periparus.ater'
# Get corresponding count data
myResp <- as.numeric(DataSTOC[, myRespName])
# Get corresponding XY coordinates
myRespXY <- DataSTOC[, c('X_WGS84', 'Y_WGS84')]
# Get corresponding environmental variables
myExpl <- DataSTOC[, c('temp', 'precip', 'sdiv_hab',
'cover_agri', 'cover_water', 'cover_wet')]
Main difference when providing abundance data instead of binary is to define the data.type parameter.
The distribution used in the selected algorithms afterwards will be set accordingly, for example :
| Type | Data | Distribution |
|---|---|---|
binary |
Only 1, and 0 or NA | binomial |
count |
Positive integer values | poisson |
multiclass |
Categorical classes | classification |
ordinal |
Ordered categorical classes | classification |
relative |
Continuous values between 0 and 1 | beta |
abundance |
Positive continuous values | gaussian |
# Format data with count data type
myBiomodData <- BIOMOD_FormatingData(resp.name = myRespName,
resp.var = myResp[period1],
resp.xy = myRespXY[period1, ],
expl.var = myExpl[period1, ],
eval.resp.var = myResp[period2],
eval.resp.xy = myRespXY[period2, ],
eval.expl.var = myExpl[period2, ],
data.type = 'count')
myBiomodData
summary(myBiomodData)
plot(myBiomodData, plot.eval = TRUE)
Note that pseudo-absence selection is only possible when data.type = 'binary'.
Several cross-validation methods are available and can be selected with the BIOMOD_Modeling function, which calls the bm_CrossValidation function to do so. More examples are presented on the Auxiliary functions webpage.
When using the balance parameter with multiclass or ordinal data, proportions are balanced within each class as much as possible.
# # k-fold selection
# cv.k <- bm_CrossValidation(bm.format = myBiomodData,
# strategy = 'kfold',
# nb.rep = 2,
# k = 3)
#
# # stratified selection (geographic)
# cv.s <- bm_CrossValidation(bm.format = myBiomodData,
# strategy = 'strat',
# k = 2,
# balance = 'presences',
# strat = 'x')
# head(cv.k)
# head(cv.s)
Modeling options are automatically retrieved from selected models within the BIOMOD_Modeling function, which calls the bm_ModelingOptions function to do so. Model parameters can also be automatically tuned to a specific dataset, by calling the bm_Tuning function, however it can be quite long. More examples are presented on the Auxiliary functions webpage.
# # bigboss parameters
# opt.b <- bm_ModelingOptions(data.type = 'count',
# models = c('DNN', 'XGBOOST'),
# strategy = 'bigboss')
#
# # tuned parameters with formated data
# opt.t <- bm_ModelingOptions(data.type = 'count',
# models = c('DNN', 'XGBOOST'),
# strategy = 'tuned',
# bm.format = myBiomodData)
#
# opt.b
# opt.t
Not all algorithms are available when it comes to data type other than binary.
| ANN | CTA | DNN | FDA | GAM | GBM | GLM | MARS | MAXENT | MAXNET | RF | RFd | SRE | XGBOOST | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| binary | x | x | x | x | x | x | x | x | x | x | x | x | x | x |
| multiclass | x | x | x | x | x | x | ||||||||
| ordinal | x | x | x | x | x | x | x | x | ||||||
| count / relative / abundance | x | x | x | x | x | x | x | x |
New evaluation metrics have been added :
Accuracy, Recall, Precision and F1 for multiclass and ordinalRMSE, MSE, MAE, Max_error, Rsquared and Rsquared_aj for count, relative and abundanceas well as new ensemble models :
EMmode and EMfreq for multiclass and ordinal The selection of single models for the ensemble modeling is different for RMSE, MSE, MAE and Max_error metrics : are kept all models whose evaluation value is < (best value + the given threshold).
# Model single models
myBiomodModelOut <- BIOMOD_Modeling(bm.format = myBiomodData,
modeling.id = 'AllModels',
models = c('DNN', 'GBM', 'GLM', 'MARS', 'RF', 'XGBOOST'),
CV.strategy = 'block',
OPT.strategy = 'bigboss',
metric.eval = c('Rsquared', 'Rsquared_aj', 'RMSE'),
var.import = 3)
# seed.val = 123)
# nb.cpu = 8)
myBiomodModelOut
# Get evaluation scores & variables importance
get_evaluations(myBiomodModelOut)
get_variables_importance(myBiomodModelOut)
# Represent evaluation scores & variables importance
bm_PlotEvalMean(bm.out = myBiomodModelOut, dataset = 'calibration')
bm_PlotEvalMean(bm.out = myBiomodModelOut, dataset = 'validation')
bm_PlotEvalBoxplot(bm.out = myBiomodModelOut, dataset = 'calibration', group.by = c('algo', 'algo'))
bm_PlotEvalBoxplot(bm.out = myBiomodModelOut, dataset = 'calibration', group.by = c('algo', 'run'))
bm_PlotVarImpBoxplot(bm.out = myBiomodModelOut, group.by = c('expl.var', 'algo', 'algo'))
bm_PlotVarImpBoxplot(bm.out = myBiomodModelOut, group.by = c('expl.var', 'algo', 'run'))
bm_PlotVarImpBoxplot(bm.out = myBiomodModelOut, group.by = c('algo', 'expl.var', 'run'))
# Represent response curves
bm_PlotResponseCurves(bm.out = myBiomodModelOut,
models.chosen = get_built_models(myBiomodModelOut)[c(1:3, 7:9)],
fixed.var = 'median')
bm_PlotResponseCurves(bm.out = myBiomodModelOut,
models.chosen = get_built_models(myBiomodModelOut)[c(1:3, 7:9)],
fixed.var = 'min')
bm_PlotResponseCurves(bm.out = myBiomodModelOut,
models.chosen = get_built_models(myBiomodModelOut)[3],
fixed.var = 'median',
do.bivariate = TRUE)
# Explore models' outliers & residuals
bm_ModelAnalysis(bm.mod = myBiomodModelOut,
models.chosen = get_built_models(myBiomodModelOut)[c(1:3, 7:9)])
# Model ensemble models
myBiomodEM <- BIOMOD_EnsembleModeling(bm.mod = myBiomodModelOut,
models.chosen = 'all',
em.by = 'all',
em.algo = c('EMmedian', 'EMmean', 'EMwmean'),
metric.select = c('Rsquared_aj', 'RMSE'),
metric.select.thresh = c(0.4, 2),
metric.eval = c('Rsquared', 'Rsquared_aj', 'RMSE'),
var.import = 3)
myBiomodEM
# Get evaluation scores & variables importance
get_evaluations(myBiomodEM)
get_variables_importance(myBiomodEM)
# Represent evaluation scores & variables importance
bm_PlotEvalMean(bm.out = myBiomodEM, dataset = 'calibration', group.by = 'full.name')
bm_PlotEvalBoxplot(bm.out = myBiomodEM, dataset = 'calibration', group.by = c('full.name', 'full.name'))
bm_PlotVarImpBoxplot(bm.out = myBiomodEM, group.by = c('expl.var', 'full.name', 'full.name'))
bm_PlotVarImpBoxplot(bm.out = myBiomodEM, group.by = c('expl.var', 'algo', 'merged.by.run'))
bm_PlotVarImpBoxplot(bm.out = myBiomodEM, group.by = c('algo', 'expl.var', 'merged.by.run'))
# Represent response curves
bm_PlotResponseCurves(bm.out = myBiomodEM,
models.chosen = get_built_models(myBiomodEM)[c(2, 5)],
fixed.var = 'median')
bm_PlotResponseCurves(bm.out = myBiomodEM,
models.chosen = get_built_models(myBiomodEM)[c(2, 5)],
fixed.var = 'min')
bm_PlotResponseCurves(bm.out = myBiomodEM,
models.chosen = get_built_models(myBiomodEM)[5],
fixed.var = 'median',
do.bivariate = TRUE)
The digits parameter is used to define the number of digits after the decimal point for the predicted values.
For relative data, the on_0_1000 parameter can be used in the same way than with binary data.
Note that integer values are lighter when saved in memory than float (decimal values).
# Project single models
myBiomodProj <- BIOMOD_Projection(bm.mod = myBiomodModelOut,
proj.name = 'Period1',
new.env = myExpl[period1, ],
new.env.xy = myRespXY[period1, ],
models.chosen = 'all',
build.clamping.mask = TRUE,
digits = 1)
myBiomodProj
plot(myBiomodProj, coord = myRespXY[period1, ])
# Project ensemble models (from single projections)
myBiomodEMProj <- BIOMOD_EnsembleForecasting(bm.em = myBiomodEM,
bm.proj = myBiomodProj,
models.chosen = 'all')
# Project ensemble models (building single projections)
myBiomodEMProj <- BIOMOD_EnsembleForecasting(bm.em = myBiomodEM,
proj.name = 'Period1EM',
new.env = myExpl[period1, ],
models.chosen = 'all')
myBiomodEMProj
plot(myBiomodEMProj, coord = myRespXY[period1, ])
# Project onto future conditions
myBiomodProjectionFuture <- BIOMOD_Projection(bm.mod = myBiomodModelOut,
proj.name = 'Period2',
new.env = myExpl[period2, ],
new.env.xy = myRespXY[period2, ],
models.chosen = 'all',
build.clamping.mask = TRUE)
# Compute differences
myBiomodRangeSize <- BIOMOD_RangeSize(proj.current = myBiomodProj,
proj.future = myBiomodProjectionFuture,
metric.binary = 'TSS')
myBiomodRangeSize@Compt.By.Models
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.