The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

bean: an overview

bean reduces sampling bias in species occurrence data by thinning it in environmental space rather than in geographic space. The result is a cleaner training set for species distribution models (SDM / ENM).

The protocol is:

  1. Prepare raw occurrences with prepare_bean().
  2. Choose a grid resolution with find_env_resolution(), which selects a kernel-density bandwidth for each environmental variable.
  3. Thin occurrences with thin_env_nd() (stochastic) or thin_env_center() (deterministic).
  4. Fit a niche ellipsoid with fit_ellipsoid().
  5. Predict suitability with predict() on the fitted ellipsoid.
library(bean)

Quickstart

data(origin_dat_prepared, package = "bean")
env_vars <- c("bio_1", "bio_4", "bio_12", "bio_15")

# 1. Pick an objective grid resolution from the data
res <- find_env_resolution(origin_dat_prepared, env_vars = env_vars)
res
#> --- Bean environmental grid resolution ---
#> Bandwidth selector: sheather-jones
#> 
#>  variable  resolution
#>     bio_1 0.056162684
#>     bio_4 0.013438324
#>    bio_12 0.004615848
#>    bio_15 0.006501067

# 2. Thin in environmental space
thinned <- thin_env_nd(
  data = origin_dat_prepared,
  env_vars = env_vars,
  grid_resolution = res$suggested_resolution,
  seed = 1
)
thinned
#> --- Bean Stochastic Thinning Results ---
#> 
#> Thinned 1024 original points to 78 points.
#> This represents a retention of 7.6% of the data.
#> 
#> --------------------------------------

The remaining vignettes walk through each step in detail.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.