The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

bigsimr

bigsimr is an R package for simulating high-dimensional multivariate data with a target correlation and arbitrary marginal distributions via Gaussian copula. It utilizes Bigsimr.jl for its core routines. For full documentation and examples, please see the Bigsimr.jl docs.

Features

Installation

You can install the release version of the package from GitHub:

remotes::install_github("SchisslerGroup/r-bigsimr")

To get a bug fix or to use a new feature, you can install the development version from GitHub:

remotes::install_github("SchisslerGroup/r-bigsimr", ref="develop")

Note that the first invocation of bigsimr::bigsimr_setup() will install both Julia and the required packages if they are missing. If you wish to have it use an existing Julia binary, make sure that julia is found in the path. For more information see the julia_setup() function from JuliaCall.

Usage

library(bigsimr)
bs <- bigsimr_setup()
dist <- distributions_setup()

set.seed(2024-02-20)

Examples

Pearson matching

(target_corr <- bs$cor_randPD(3))
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000 -0.1770422  0.2197788
#> [2,] -0.1770422  1.0000000 -0.8153085
#> [3,]  0.2197788 -0.8153085  1.0000000

margins <- c(
  dist$Binomial(20, 0.2), 
  dist$Beta(2, 3), 
  dist$LogNormal(3, 1)
)

(adjusted_corr <- bs$pearson_match(target_corr, margins))
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000 -0.1820291  0.2874494
#> [2,] -0.1820291  1.0000000 -0.9941172
#> [3,]  0.2874494 -0.9941172  1.0000000

x <- bs$rvec(100000, adjusted_corr, margins)

bs$cor(x, bs$Pearson)
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000 -0.1712600  0.2117211
#> [2,] -0.1712600  1.0000000 -0.6679074
#> [3,]  0.2117211 -0.6679074  1.0000000

Spearman/Kendall matching

(spearman_corr <- bs$cor_randPD(3))
#>           [,1]      [,2]      [,3]
#> [1,] 1.0000000 0.4768102 0.8865416
#> [2,] 0.4768102 1.0000000 0.5318276
#> [3,] 0.8865416 0.5318276 1.0000000

(adjusted_corr <- bs$cor_convert(spearman_corr, bs$Spearman, bs$Pearson))
#>           [,1]      [,2]      [,3]
#> [1,] 1.0000000 0.4941437 0.8954010
#> [2,] 0.4941437 1.0000000 0.5497588
#> [3,] 0.8954010 0.5497588 1.0000000

x <- bs$rvec(100000, adjusted_corr, margins)

bs$cor(x, bs$Spearman)
#>           [,1]      [,2]      [,3]
#> [1,] 1.0000000 0.4663302 0.8746458
#> [2,] 0.4663302 1.0000000 0.5276638
#> [3,] 0.8746458 0.5276638 1.0000000

Nearest correlation matrix

s <- bs$cor_randPSD(200)
r <- bs$cor_convert(s, bs$Spearman, bs$Pearson)
bs$is_correlation(r)
#> [1] FALSE
p <- bs$cor_nearPD(r)
bs$is_correlation(p)
#> [1] TRUE

Fast approximate nearest correlation matrix

s <- bs$cor_randPSD(2000)
r <- bs$cor_convert(s, bs$Spearman, bs$Pearson)
bs$is_correlation(r)
#> [1] FALSE
p <- bs$cor_fastPD(r)
bs$is_correlation(p)
#> [1] TRUE

Issues

This package is just a wrapper for the Julia package. Please file any bug reports or feature requests over at the Bigsimr.jl package repo.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.