Ex. 2 - Understanding the elements in output

Yuan-Ling Liaw and Waldir Leoncio

library(lsasim)

packageVersion("lsasim")

[1] '2.1.6'

questionnaire_gen(n_obs, cat_prop = NULL, n_vars = NULL, n_X = NULL, n_W = NULL, cor_matrix = NULL,
    cov_matrix = NULL, c_mean = NULL, c_sd = NULL, theta = FALSE, family = NULL, full_output = FALSE,
    verbose = TRUE)

By default, the function returns a data.frame object whose first column (“subject”) numbers the \(n\) observations \(1, \ldots, n\) and whose remaining columns hold the questionnaire answers. If theta = TRUE, the first column after “subject” is the latent variable theta. In either case, the continuous variables always come before the categorical ones.
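As a minimal sketch of the default return value, the call below reuses the cumulative proportions and covariance matrix from the example later in this document, leaving full_output at its default of FALSE. It assumes lsasim is installed; the object name bg_only is only illustrative.

```r
library(lsasim)

set.seed(1234)

# Cumulative proportions: scalar 1 for theta, then two categorical items
props <- list(1, c(0.25, 1), c(0.2, 0.8, 1))

# Covariance matrix for theta and the two categorical items
yw_cov <- matrix(c(1, 0.5, 0.5,
                   0.5, 1, 0.8,
                   0.5, 0.8, 1), nrow = 3)

# full_output defaults to FALSE, so only the questionnaire data are returned
bg_only <- questionnaire_gen(n_obs = 10, cat_prop = props, cov_matrix = yw_cov,
                             theta = TRUE, family = "gaussian", verbose = FALSE)

class(bg_only)   # class of the returned object
names(bg_only)   # column names, starting with "subject"
```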

If the logical argument full_output is TRUE, output will be a list containing the questionnaire data as well as several objects that might be of interest for further analysis of the data, listed below:

bg: a data frame containing the background questionnaire answers (i.e., the same object returned when full_output = FALSE).
c_mean: a vector of population means for each continuous variable (\(Y\) and \(X\)).
c_sd: a vector of population standard deviations for each continuous variable (\(Y\) and \(X\)).
cat_prop: list of cumulative proportions for each item. If theta = TRUE, the first element of cat_prop must be a scalar 1, which corresponds to theta.
cat_prop_W_p: a list containing the probabilities for each category of the categorical variables (cat_prop_W contains the cumulative probabilities).
cor_matrix: latent correlation matrix. The first row/column corresponds to the latent trait (\(Y\)). The other rows/columns correspond to the continuous (\(X\) or \(Z\)) or the discrete (\(W\)) background variables, in the same order as cat_prop.
cov_matrix: latent covariance matrix, formatted as cor_matrix.
family: distribution of the background variables. Can be NULL (default) or "gaussian".
n_obs: number of observations to generate.
n_tot: named vector containing the total number of variables (n_vars), the number of continuous background variables excluding theta (n_X), the number of categorical variables (n_W) and an entry for theta.
n_W: vector containing the number of categorical variables.
n_X: vector containing the number of continuous variables (except theta).
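sd_YXW: vector with the standard deviations of all the variables. In the example below, it has one entry for theta and one entry per category of each categorical variable.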
sd_YXZ: vector containing the standard deviations of theta, the background continuous variables (\(X\)) and the Normally-distributed variables \(Z\) which will generate the background categorical variables (\(W\)).
theta: if TRUE, the first continuous variable will be labeled “theta”. Otherwise, it will be labeled q1.
var_W: list containing the variances of the categorical variables.
var_YX: list containing the variances of the continuous variables (including theta).
linear_regression: This list is returned only if theta = TRUE, family = "gaussian" and full_output = TRUE. It contains one vector named betas and one table named vcov_YXW. The former holds the true linear regression coefficients of theta on the background questionnaire answers; the latter contains the covariance matrix between all these variables.
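The relationship between the cumulative proportions, the category probabilities and the variances of the categorical variables can be sketched in base R. This is an illustration rather than lsasim's internal code: the names cat_prop_W, cat_prop_W_p, var_W and sd_W are used here only as local variables, and the variance formula p(1 - p) is an assumption that treats each category as a 0/1 indicator. The resulting values can be compared with the cat_prop_W_p, var_W and sd_YXW elements printed in the example output later in this document.

```r
# Cumulative proportions for two categorical items (same values as the
# example later in this document)
cat_prop_W <- list(c(0.25, 1), c(0.2, 0.8, 1))

# Category probabilities: successive differences of the cumulative proportions
cat_prop_W_p <- lapply(cat_prop_W, function(p) diff(c(0, p)))

# Assumption: each category is treated as a 0/1 indicator, so its
# variance is p * (1 - p)
var_W <- lapply(cat_prop_W_p, function(p) p * (1 - p))

# Standard deviations of the category indicators
sd_W <- sqrt(unlist(var_W))

cat_prop_W_p
var_W
sd_W
```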

We generate one continuous variable (the latent trait theta) and two ordinal covariates, and we specify the covariance matrix between them. The data are generated from a multivariate normal distribution (family = "gaussian"), and we set the logical argument full_output = TRUE.

The output is a list containing the following elements: bg, c_mean, c_sd, cat_prop, cat_prop_W_p, cor_matrix, cov_matrix, family, n_W, n_X, n_obs, n_tot, sd_YXW, sd_YXZ, theta, var_W, var_YX, verbose, linear_regression.

?questionnaire_gen

set.seed(1234)
(props <- list(1, c(0.25, 1), c(0.2, 0.8, 1)))

[[1]]
[1] 1

[[2]]
[1] 0.25 1.00

[[3]]
[1] 0.2 0.8 1.0
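One way to read these cumulative proportions is as cut points on a standard normal variable \(Z\), in line with the description of sd_YXZ above. The sketch below illustrates that idea in base R for the three-category item; it is not lsasim's internal code, and cum_prop, thresholds, z and w are illustrative names.

```r
set.seed(1234)

# Cumulative proportions of the three-category item
cum_prop <- c(0.2, 0.8, 1)

# Thresholds on the standard normal scale
# (the final 1 would map to Inf, so it is dropped)
thresholds <- qnorm(cum_prop[-length(cum_prop)])

# Draw a standard normal Z and cut it into categories 1, 2, 3
z <- rnorm(10000)
w <- findInterval(z, thresholds) + 1

# Observed category shares should be close to 0.2, 0.6 and 0.2
prop.table(table(w))
```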

(yw_cov <- matrix(c(1, 0.5, 0.5, 0.5, 1, 0.8, 0.5, 0.8, 1), nrow = 3))

     [,1] [,2] [,3]
[1,]  1.0  0.5  0.5
[2,]  0.5  1.0  0.8
[3,]  0.5  0.8  1.0
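Before passing a matrix as cov_matrix, it can be worth checking that it is a valid covariance matrix. The following base R sketch checks symmetry and positive definiteness for yw_cov, and also shows that a covariance matrix with unit variances on the diagonal equals its implied correlation matrix, which is consistent with cor_matrix and cov_matrix being identical in the output of this example.

```r
yw_cov <- matrix(c(1, 0.5, 0.5,
                   0.5, 1, 0.8,
                   0.5, 0.8, 1), nrow = 3)

# A covariance matrix must be symmetric
isSymmetric(yw_cov)

# ... and positive definite: all eigenvalues strictly positive
eigen(yw_cov, symmetric = TRUE)$values

# With unit variances on the diagonal, the implied correlation matrix
# is the same matrix
all.equal(cov2cor(yw_cov), yw_cov)
```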

questionnaire_gen(n_obs = 10, cat_prop = props, cov_matrix = yw_cov, theta = TRUE, family = "gaussian",
    full_output = TRUE)

$bg
   subject      theta q1 q2
1        1 -0.8440231  2  2
2        2 -2.0198262  2  2
3        3 -0.7921984  1  1
4        4 -1.1724355  1  1
5        5 -0.5099209  2  2
6        6 -0.4202077  1  1
7        7 -0.2292551  2  3
8        8 -0.4616903  2  2
9        9 -0.8524573  1  2
10      10 -1.1829590  2  1

$c_mean
[1] 0

$c_sd
[1] 1

$cat_prop
$cat_prop[[1]]
[1] 1

$cat_prop[[2]]
[1] 0.25 1.00

$cat_prop[[3]]
[1] 0.2 0.8 1.0


$cat_prop_W_p
$cat_prop_W_p[[1]]
[1] 0.25 0.75

$cat_prop_W_p[[2]]
[1] 0.2 0.6 0.2


$cor_matrix
      theta  q1  q2
theta   1.0 0.5 0.5
q1      0.5 1.0 0.8
q2      0.5 0.8 1.0

$cov_matrix
      theta  q1  q2
theta   1.0 0.5 0.5
q1      0.5 1.0 0.8
q2      0.5 0.8 1.0

$family
[1] "gaussian"

$n_W
[1] 2

$n_X
[1] 0

$n_obs
[1] 10

$n_tot
n_vars    n_X    n_W  theta 
     3      0      2      1 

$sd_YXW
[1] 1.0000000 0.4330127 0.4330127 0.4000000 0.4898979 0.4000000

$sd_YXZ
[1] 1 1 1

$theta
[1] TRUE

$var_W
$var_W[[1]]
[1] 0.1875 0.1875

$var_W[[2]]
[1] 0.16 0.24 0.16


$var_YX
[1] 1

$verbose
[1] TRUE

$linear_regression
$linear_regression$betas
     theta       q1.2       q2.2       q2.3 
-0.8218134  0.4547365  0.4450622  1.0686187 

$linear_regression$vcov_YXW
             theta       q1.2          q2.2        q2.3
theta 1.000000e+00 0.15888829  5.551115e-17  0.13998096
q1.2  1.588883e-01 0.18750000  4.710271e-02  0.04928003
q2.2  5.551115e-17 0.04710271  2.400000e-01 -0.12000000
q2.3  1.399810e-01 0.04928003 -1.200000e-01  0.16000000
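The betas can be related to vcov_YXW with the usual population least-squares formula. The sketch below is a base R illustration under stated assumptions, not lsasim's internal code: the matrix entries are copied from the printed output (so they carry rounding error), the element of betas labeled theta is read as the intercept, and the indicator means are taken to be the category probabilities in cat_prop_W_p. The names vars, w, slopes, p and intercept are illustrative.

```r
# Covariance matrix copied from the vcov_YXW output above
# (the 5.55e-17 entries are treated as 0)
vars <- c("theta", "q1.2", "q2.2", "q2.3")
vcov_YXW <- matrix(c(1,          0.15888829, 0,          0.13998096,
                     0.15888829, 0.1875,     0.04710271, 0.04928003,
                     0,          0.04710271, 0.24,       -0.12,
                     0.13998096, 0.04928003, -0.12,      0.16),
                   nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

w <- vars[-1]  # indicator columns for the non-reference categories

# Slopes: solve Sigma_WW %*% b = Sigma_W,theta
slopes <- solve(vcov_YXW[w, w], vcov_YXW[w, "theta"])

# Intercept: mean of theta (0) minus slopes times the indicator means
# (assumed to be the category probabilities from cat_prop_W_p)
p <- c(q1.2 = 0.75, q2.2 = 0.6, q2.3 = 0.2)
intercept <- 0 - sum(slopes * p[w])

round(c(intercept = intercept, slopes), 4)
```

Up to the rounding of the copied inputs, the result should agree with the betas printed above.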
