Data is synthesised by sampling from a multivariate cumulative distribution (a copula) using the simstudy package.
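To give a feel for what sampling from a copula means, the following is a minimal base R sketch of a Gaussian copula. It is an illustration only and is not how simstudy or RESIDE implement the step; the marginals (a normal "age" and a binary "sex_f"), the correlation of 0.6 and all variable names are assumptions made for the example.

```r
# Illustrative Gaussian copula in base R (not the simstudy implementation).
set.seed(42)
n <- 1000

# Assumed correlation between the two latent normal variables.
rho <- matrix(c(1.0, 0.6,
                0.6, 1.0), nrow = 2)

# Step 1: draw correlated standard normals using the Cholesky factor of rho.
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(rho)

# Step 2: map each column to the unit interval with the normal CDF.
u <- pnorm(z)

# Step 3: apply the quantile function of each (hypothetical) marginal.
age   <- qnorm(u[, 1], mean = 50, sd = 10)      # continuous marginal
sex_f <- qbinom(u[, 2], size = 1, prob = 0.5)   # binary marginal

# The rank-order correlation of the result reflects the dependence put in.
cor(age, sex_f, method = "spearman")
```

Each variable keeps its own marginal distribution, while the dependence between variables comes from the correlated uniforms in step 2.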
Data can be synthesised from marginal distributions using the synthesise_data() function:
library(RESIDE)
# Import the marginal distributions
marginals <- import_marginal_distributions()
# Synthesise data from the imported marginals
simulated_data <- synthesise_data(marginals)
User-specified correlations can be added to the synthesised data by supplying a correlation matrix. An empty correlation matrix can be generated using the export_empty_cor_matrix() function, supplying the marginals imported using import_marginal_distributions() and a folder path, respectively:
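library(RESIDE)
# Import the marginal distributions
marginals <- import_marginal_distributions()
# Write an empty correlation matrix (CSV) to a temporary folder
export_empty_cor_matrix(marginals, folder_path = tempdir())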
The exported CSV file will be a symmetric table which looks like:
Correlations should then be added to the CSV file without modifying the column or row names. The values should be rank-order correlations. Categorical variables are represented as dummy variables, named using the format variable name, underscore, category name (e.g. SEX_F). Note that the correlation matrix must be symmetric and positive semi-definite.
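As a sketch of how a rank-order correlation for the CSV file might be obtained, the example below computes a Spearman correlation between a numeric variable and a dummy variable in base R. The reference data, its column names and its values are invented for illustration; only the SEX_F naming pattern comes from the text above.

```r
# Hypothetical reference data; names and values are invented for illustration.
reference <- data.frame(
  AGE = c(34, 51, 47, 29, 62, 58, 41, 36),
  SEX = c("F", "M", "F", "F", "M", "M", "F", "M")
)

# Dummy variable named <variable>_<category>, e.g. SEX_F (1 if female, else 0).
reference$SEX_F <- as.integer(reference$SEX == "F")

# Rank-order (Spearman) correlations between the numeric and dummy columns.
cor_values <- cor(reference[, c("AGE", "SEX_F")], method = "spearman")
round(cor_values, 2)
```

The off-diagonal entry is the kind of value that would be entered in both the AGE / SEX_F and SEX_F / AGE cells so that the matrix stays symmetric.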
Once the correlations have been added to the CSV file, they can be imported using the import_cor_matrix() function:
library(RESIDE)
# Import the completed correlation matrix
correlation_matrix <- import_cor_matrix()
By default, the file name of the correlation matrix is that of the exported file (correlation_matrix.csv), and it is imported from the current working directory. This can be changed by specifying the file_path parameter of the import_cor_matrix() function; the path may be relative or absolute.
The import_cor_matrix() function will produce an error if the matrix is not symmetric and positive semi-definite, or if the file does not exist.
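The two matrix conditions can be checked by hand before importing. The sketch below is one way to do this in base R; is_valid_cor_matrix() is a hypothetical helper written for this example, not part of RESIDE, and the tolerance is an assumed value. It does not claim to reproduce the checks that import_cor_matrix() performs internally.

```r
# Hypothetical helper (not part of RESIDE): TRUE if `m` is a symmetric,
# positive semi-definite numeric matrix, within the tolerance `tol`.
is_valid_cor_matrix <- function(m, tol = 1e-8) {
  if (!is.matrix(m) || !is.numeric(m) || nrow(m) != ncol(m)) {
    return(FALSE)
  }
  if (!isSymmetric(unname(m), tol = tol)) {
    return(FALSE)
  }
  # A symmetric matrix is positive semi-definite when no eigenvalue is negative.
  eigenvalues <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(eigenvalues >= -tol)
}

valid <- matrix(c(1.0, 0.3,
                  0.3, 1.0), nrow = 2)

# Symmetric, but the three pairwise correlations are mutually inconsistent.
invalid <- matrix(c( 1.0, 0.9, -0.9,
                     0.9, 1.0,  0.9,
                    -0.9, 0.9,  1.0), nrow = 3)

is_valid_cor_matrix(valid)
is_valid_cor_matrix(invalid)
```

The first call returns TRUE and the second returns FALSE. The second matrix shows that a table can be symmetric with every entry between -1 and 1 and still fail the positive semi-definite condition.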
With a correlation matrix, data can now be synthesised with the user-specified correlations using the synthesise_data() function, passing the correlation matrix imported by the import_cor_matrix() function:
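library(RESIDE)
# Import the marginal distributions
marginals <- import_marginal_distributions()
# Export an empty correlation matrix to fill in
export_empty_cor_matrix(marginals)
# Import the completed correlation matrix
correlation_matrix <- import_cor_matrix()
# Synthesise data with the user-specified correlations
simulated_data <- synthesise_data(
  marginals,
  correlation_matrix
)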
NB: It is not possible to entirely maintain all the marginal distributions when specifying correlations. This is a known limitation and is not likely to change.