Type: | Package |
Title: | Structural Modeling for Multiple Latent Class Variables |
Version: | 1.3.0 |
Maintainer: | Youngsun Kim <yskstat@gmail.com> |
Description: | Provides comprehensive tools for the implementation of Structural Latent Class Models (SLCM), including Latent Transition Analysis (LTA; Linda M. Collins and Stephanie T. Lanza, 2009) <doi:10.1002/9780470567333>, Latent Class Profile Analysis (LCPA; Hwan Chung et al., 2010) <doi:10.1111/j.1467-985x.2010.00674.x>, and Joint Latent Class Analysis (JLCA; Saebom Jeon et al., 2017) <doi:10.1080/10705511.2017.1340844>, and any other extended models involving multiple latent class variables. |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.1 |
LinkingTo: | Rcpp |
Imports: | DiagrammeR, magrittr, MASS, Rcpp, stats |
Depends: | R (≥ 2.10) |
URL: | https://kim0sun.github.io/slca/ |
BugReports: | https://github.com/kim0sun/slca/issues |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | yes |
Packaged: | 2024-12-12 18:18:21 UTC; kim0sun |
Author: | Youngsun Kim |
Repository: | CRAN |
Date/Publication: | 2024-12-13 03:20:02 UTC |
slca: An R Package for Structural Modeling for Multiple Latent Class Variables
Description
The slca package provides a comprehensive toolkit for the implementation of Structural Latent Class Models (SLCM). This includes tools for conducting Latent Transition Analysis (LTA), Latent Class Profile Analysis (LCPA), and Joint Latent Class Analysis (JLCA), facilitating the analysis of complex latent structures in statistical data.
Author(s)
Maintainer: Youngsun Kim <yskstat@gmail.com>
Authors:
Hwan Chung <hwanch@korea.ac.kr>
See Also
Useful links:
https://kim0sun.github.io/slca/
Report bugs at https://github.com/kim0sun/slca/issues
Adolescent Depression Data from the Add Health Study
Description
This dataset contains responses from the National Longitudinal Study of Adolescent Health (Add Health), focusing on adolescents' experiences with depression. The subjects, who were in Grades 10 and 11 during the 1994–1995 academic year, provided data on at least one measure of adolescent delinquency in Wave I.
These data can be used to replicate the latent class analysis conducted by Collins and Lanza (2009).
The dataset includes two covariates, grade level and sex of respondents, along with variables capturing depressive emotions: sadness (S1-S4), feeling disliked (D1-D2), and feelings of failure (F1-F2).
Responses for these variables were initially categorized as "Never," "Sometimes," "Often," or "Most or All of the Time." In this dataset, responses have been recoded as "No" for "Never" and "Yes" for all other responses, providing a longitudinal perspective on adolescent depression across Waves I and II. Variables with the suffix "w1" are from Wave I, while those with the suffix "w2" are from Wave II.
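The recoding described above can be sketched in base R. This is only an illustration: the raw response vector is invented, and the treatment of missing values (left as NA here) is an assumption rather than a documented detail of how the dataset was prepared.

```r
# Hypothetical raw responses on the original four-point scale
raw <- c("Never", "Sometimes", "Often", "Most or All of the Time", NA)

# "Never" becomes "No"; every other observed answer becomes "Yes".
# A missing response stays missing because NA == "Never" evaluates to NA.
recoded <- factor(ifelse(raw == "Never", "No", "Yes"),
                  levels = c("No", "Yes"))

table(recoded, useNA = "ifany")
```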
Usage
addhealth
Format
A data frame with 2061 rows and 18 variables:
GRADE
Respondent's grade level at Wave I.
SEX
Respondent's sex. Levels: (1) Male, (2) Female.
S1w1, S1w2
I felt that I could not shake off the blues even with help from my family and friends.
S2w1, S2w2
I felt depressed.
S3w1, S3w2
I felt lonely.
S4w1, S4w2
I felt sad.
D1w1, D1w2
People were unfriendly to me.
D2w1, D2w2
I felt that people disliked me.
F1w1, F1w2
I thought my life had been a failure.
F2w1, F2w2
I felt life was not worth living.
Source
https://addhealth.cpc.unc.edu/data/#public-use
References
Collins, L.M., & Lanza, S.T. (2009). Latent Class and Latent Transition Analysis: With Applications in the Social, Behavioral, and Health Sciences.
J.R. Udry. The National Longitudinal Study of Adolescent Health (Add Health), Waves I & II, 1994-1996. Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 2003.
Examples
library(magrittr)
data <- addhealth[1:300,]
lta5 <- slca(
DEP1(5) ~ S1w1 + S2w1 + S3w1 + S4w1 + D1w1 + D2w1 + F1w1 + F2w1,
DEP2(5) ~ S1w2 + S2w2 + S3w2 + S4w2 + D1w2 + D2w2 + F1w2 + F2w2,
DEP1 ~ DEP2
) %>% estimate(data, control = list(em.tol = 1e-6))
lta5inv <- slca(
DEP1(5) ~ S1w1 + S2w1 + S3w1 + S4w1 + D1w1 + D2w1 + F1w1 + F2w1,
DEP2(5) ~ S1w2 + S2w2 + S3w2 + S4w2 + D1w2 + D2w2 + F1w2 + F2w2,
DEP1 ~ DEP2,
constraints = c("DEP1", "DEP2")
) %>% estimate(data, control = list(em.tol = 1e-6))
compare(lta5inv, lta5, test = "chisq")
lta5inv %>% param()
Compare Two Fitted slca Models
Description
Conducts a relative model fit test between two fitted SLCM models using the deviance statistic.
Usage
compare(
model1,
model2,
test = c("none", "chisq", "boot"),
nboot = 50,
method = c("hybrid", "em", "nlm"),
plot = FALSE,
maxiter = 1000,
tol = 1e-08,
verbose = FALSE
)
Arguments
model1 |
an object of class |
model2 |
another object of class |
test |
a character string specifying the type of test to be conducted. If |
nboot |
an integer specifying the number of bootstrap iterations to perform (used only when |
method |
a character string specifying the estimation method for bootstrapping. |
plot |
a logical value indicating whether to display a histogram of G-squared statistics for the bootstrap samples (applicable only for |
maxiter |
an integer specifying the maximum number of iterations allowed during each bootstrap estimation round. The default is 1000. |
tol |
numeric value setting the convergence tolerance for each bootstrap iteration. The default is |
verbose |
a logical value indicating whether to print progress updates on completed bootstrap iterations. The default is |
Value
A data.frame containing the number of parameters (Df), loglikelihood, AIC, BIC, G-squared statistics, and the residual degrees of freedom for each object.
If a statistical test is conducted (via test), the resulting p-value for the comparison is also included.
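As a rough illustration of what a deviance-based comparison with a chi-square reference distribution involves, the base R sketch below computes a likelihood-ratio statistic and its p-value for two nested models. The log-likelihoods and parameter counts are hypothetical, and the assumption that test = "chisq" follows this textbook calculation is not confirmed by the text above.

```r
# Hypothetical fit summaries for two nested models
ll_constrained   <- -1250.4; npar_constrained   <- 11
ll_unconstrained <- -1243.9; npar_unconstrained <- 15

# Deviance difference (likelihood-ratio statistic) and its degrees of freedom
g2_diff <- 2 * (ll_unconstrained - ll_constrained)
df_diff <- npar_unconstrained - npar_constrained

# Upper-tail p-value under a chi-square reference distribution
p_value <- pchisq(g2_diff, df = df_diff, lower.tail = FALSE)

c(G2 = g2_diff, df = df_diff, p = p_value)
```

A small p-value suggests that the constrained model fits noticeably worse. When the chi-square approximation is doubtful, a bootstrap reference distribution (test = "boot") is the alternative offered by the function.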
Examples
library(magrittr)
data <- gss7677[gss7677$COHORT == "YOUNG", ]
stat2 <- slca(status(2) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = data, control = list(verbose = FALSE))
stat3 <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = data, control = list(verbose = FALSE))
stat4 <- slca(status(4) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = data, control = list(verbose = FALSE))
gof(stat2, stat3, stat4)
gof(stat2, stat3, stat4, test = "chisq")
gof(stat2, stat3, stat4, test = "boot")
compare(stat3, stat4)
compare(stat3, stat4, test = "chisq")
compare(stat3, stat4, test = "boot")
Confidence Intervals for Model Parameters
Description
Computes confidence intervals for one or more parameters of a fitted model.
Usage
## S3 method for class 'slcafit'
confint(object, parm, level = 0.95, type = c("param", "logit"), ...)
Arguments
object |
an object of class |
parm |
an integer or string specifying the parameters for which confidence intervals are to be computed. |
level |
a numeric value representing the confidence level for the intervals. The default is |
type |
a character string specifying the format in which the results should be returned. Options include |
... |
additional arguments. |
Value
A matrix with two columns representing the confidence intervals for the selected parameters. The column names correspond to the specified confidence level:
- 100 * ((1 - level) / 2)%: The lower bound of the confidence interval.
- 100 * (1 - (1 - level) / 2)%: The upper bound of the confidence interval.
The level argument determines the confidence level, with common values being 0.95 for a 95% confidence interval and 0.99 for a 99% confidence interval. For example, level = 0.95 yields the 2.5% and 97.5% bounds.
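The relationship between level and the two bounds can be checked with a short base R sketch. The estimate and standard error are hypothetical, and the use of a normal-theory (Wald) interval is an assumption made for illustration; the text above does not state how the package constructs its intervals.

```r
level <- 0.95
alpha <- 1 - level

# Tail probabilities that define the lower and upper bounds
probs  <- c(alpha / 2, 1 - alpha / 2)
labels <- paste0(format(100 * probs, trim = TRUE), " %")

# Hypothetical estimate and standard error (illustration only)
est <- 0.30
se  <- 0.05

# Wald-type interval: estimate plus normal quantile times standard error
ci <- est + qnorm(probs) * se
names(ci) <- labels
ci
```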
Examples
param(nlsy_jlcpa, index = TRUE)
confint(nlsy_jlcpa)
confint(nlsy_jlcpa, 1:4)
Estimate Parameters of an slca Object
Description
Estimates the parameters of a model created using the slca function.
Usage
estimate(x, ...)
## S3 method for class 'slca'
estimate(x,
data,
method = c("em", "hybrid", "nlm"),
fix2zero = NULL,
control = slcaControl(), ...)
Arguments
x |
an |
... |
additional arguments passed to the estimation process. |
data |
a |
method |
a character string specifying the estimation method for SLCM parameters. The default is |
fix2zero |
a |
control |
a |
Details
The fix2zero argument allows you to constrain specific parameters to zero. Each parameter is associated with a unique index, which can be identified using the param function with the argument index = TRUE. To apply constraints, supply the relevant parameter indices to the fix2zero argument as a vector.
Value
An object of class slcafit containing the following components:
model |
a |
method |
the estimation method used. |
arg |
a brief description of the model used during estimation. |
mf |
the |
par |
the log of the estimated parameters. |
logit |
the log-odds of the estimated parameters. |
score |
the score function for the estimated parameters. |
posterior |
a |
convergence |
a logical indicator of whether convergence was achieved. |
loglikelihood |
the loglikelihood value of the estimated model. |
control |
the control settings used during the estimation process. |
The returned object can be further processed using the param function to extract the estimated parameters or their standard errors. The regress function allows for logistic regression analysis using a three-step approach to evaluate the effects of external variables on latent class variables. Additionally, several other methods are available, including predict.slcafit, reorder.slcafit, gof, and others.
Examples
m <- slca(lc[3] ~ y1 + y2 + y3 + y4)
pi <- rep(1 / 3, 3)
rho <- c(.9, .1, .9, .1, .9, .1, .9, .1, # class 1
.9, .1, .9, .1, .1, .9, .1, .9, # class 2
.1, .9, .1, .9, .1, .9, .1, .9) # class 3
dt <- simulate(m, 200, parm = c(pi, rho))
estimate(m, dt$response)
# Several estimation methods
estimate(m, dt$response, method = "em",
control = slcaControl(verbose = TRUE)) # default
estimate(m, dt$response, method = "nlm",
control = slcaControl(verbose = TRUE))
estimate(m, dt$response, method = "hybrid",
control = slcaControl(verbose = TRUE))
# Parameter restriction
mf <- estimate(m, dt$response)
param(mf, index = TRUE)
mf0 <- estimate(mf, fix2zero = c(4, 6, 8, 10))
param(mf0)
# Estimation control
estimate(m, dt$response, control = slcaControl(nrep = 3, verbose = TRUE))
estimate(m, dt$response, control = slcaControl(init.param = c(pi, rho)))
Goodness-of-Fit Test for Fitted slca Model
Description
Computes the AIC, BIC, and deviance statistic (G-squared) for assessing the goodness-of-fit of a fitted slca model. If the test argument is specified, absolute model fit can be evaluated using deviance statistics.
Usage
gof(object, ...)
## S3 method for class 'slcafit'
gof(
object, ..., test = c("none", "chisq", "boot"),
nboot = 100, plot = FALSE,
maxiter = 100, tol = 1e-6, verbose = FALSE
)
Arguments
object |
an object of class |
... |
additional objects of class |
test |
a character string specifying the type of test to be conducted. If |
nboot |
an integer specifying the number of bootstrap rounds to be performed. |
plot |
a logical value indicating whether to print a histogram of G-squared statistics for bootstrap samples, only for |
maxiter |
an integer specifying the maximum number of iterations allowed for the estimation process during each bootstrap iteration. The default is 100. |
tol |
a numeric value specifying the convergence tolerance for each bootstrap iteration. The default is |
verbose |
a logical value indicating whether to print progress updates on the number of bootstrapping rounds completed. |
Value
A data.frame containing the number of parameters (Df), loglikelihood, AIC, BIC, G-squared statistics, and the residual degrees of freedom for each object.
If a statistical test is performed (using test), the result includes the corresponding p-value.
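For readers who want to see how the information criteria in this table relate to the log-likelihood, the sketch below applies the textbook definitions of AIC and BIC in base R. All numbers are hypothetical, and the sample-size convention used inside the package is an assumption here.

```r
# Hypothetical fit summary (illustration only)
loglik <- -1243.9   # maximized log-likelihood
npar   <- 15        # number of free parameters
nobs   <- 500       # number of observations

# Textbook definitions of the two information criteria
aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + log(nobs) * npar

c(AIC = aic, BIC = bic)
```

Lower values indicate a better trade-off between fit and complexity. BIC penalizes each additional parameter more heavily than AIC whenever the sample has more than about seven observations, because log(n) then exceeds 2.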
Examples
library(magrittr)
data <- gss7677[gss7677$COHORT == "YOUNG", ]
stat2 <- slca(status(2) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = data, control = list(verbose = FALSE))
stat3 <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = data, control = list(verbose = FALSE))
stat4 <- slca(status(4) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = data, control = list(verbose = FALSE))
gof(stat2, stat3, stat4)
gof(stat2, stat3, stat4, test = "chisq")
gof(stat2, stat3, stat4, test = "boot")
compare(stat3, stat4)
compare(stat3, stat4, test = "chisq")
compare(stat3, stat4, test = "boot")
GSS 1976-1977 Data on Social Status and Tolerance towards Minorities
Description
This dataset contains responses from the General Social Survey (GSS) for the years 1976 and 1977, focusing on social status and tolerance towards minorities.
The dataset can be used to replicate the analyses conducted in McCutcheon (1985) and Bakk and Kuha (2021).
It includes covariates such as interview year, age, sex, race, education level, and income. Social status-related variables include father's occupation and education level, as well as mother's education level. Tolerance towards minorities is measured by agreement with three questions: (1) allowing public speaking, (2) allowing teaching, and (3) allowing literature publication.
Usage
gss7677
Format
A data frame with 2942 rows and 14 variables:
YEAR
Interview year (1976, 1977).
COHORT
Respondent's age cohort. Levels: (1) YOUNG, (2) YOUNG-MIDDLE, (4) MIDDLE, (5) OLD.
SEX
Respondent's sex. Levels: (1) MALE, (2) FEMALE.
RACE
Respondent's race. Levels: (1) WHITE, (2) BLACK, (3) OTHER.
DEGREE
Respondent's education level. Levels: (1) LT HS, (2) HIGH-SCH, (3) HIGHER.
REALRINC
Respondent's income.
PAPRES
Father's occupational prestige. Levels: (1) LOW, (2) MEDIUM, (3) HIGH.
PADEG
Father's education level. Levels: (1) LT HS, (2) HIGH-SCH, (3) COLLEGE, (4) BACHELOR, (5) GRADUATE.
MADEG
Mother's education level. Levels: (1) LT HS, (2) HIGH-SCH, (3) COLLEGE, (4) BACHELOR, (5) GRADUATE.
TOLRAC
Tolerance towards racists.
TOLCOM
Tolerance towards communists.
TOLHOMO
Tolerance towards homosexuals.
TOLATH
Tolerance towards atheists.
TOLMIL
Tolerance towards militarists.
Source
General Social Survey (GSS) 1976, 1977
References
Bakk Z, Kuha J. (2021) Relating latent class membership to external variables: An overview. Br J Math Stat Psychol. 74(2):340-362.
McCutcheon, A. L. (1985). A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly, 49, 474–488.
Examples
library(magrittr)
gss500 <- gss7677[1:500,] %>% na.omit
model_stat <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
estimate(data = gss500, control = list(em.tol = 1e-6))
summary(model_stat)
param(model_stat)
model_tol <- slca(tol(4) ~ TOLRAC + TOLCOM + TOLHOMO + TOLATH + TOLMIL) %>%
estimate(data = gss500, control = list(em.tol = 1e-6))
summary(model_tol)
param(model_tol)
model_lta <- slca(
status(3) ~ PAPRES + PADEG + MADEG,
tol(4) ~ TOLRAC + TOLCOM + TOLHOMO + TOLATH + TOLMIL,
status ~ tol
) %>% estimate(data = gss500, control = list(em.tol = 1e-6))
summary(model_lta)
param(model_lta)
regress(model_lta, status ~ SEX, gss500)
regress(model_lta, status ~ SEX, gss500, method = "BCH")
regress(model_lta, status ~ SEX, gss500, method = "ML")
NLSY97 Substance Use Data
Description
This dataset contains substance use behavior data from the National Longitudinal Survey of Youth 1997 (NLSY97) for three years: 1998, 2003, and 2008. The dataset focuses on youth born in 1984 and tracks three types of substance use behaviors: tobacco/cigarette smoking, alcohol drinking, and marijuana use.
Usage
nlsy97
Format
A data frame with 1004 rows and 38 columns:
SEX
Respondent's sex.
RACE
Respondent's race.
ESMK_98, ESMK_03, ESMK_08
(Ever smoked) Ever smoked in 1998, 2003, and 2008 (0: No, 1: Yes).
FSMK_98, FSMK_03, FSMK_08
(Frequent smoke) Monthly smoking in 1998, 2003, and 2008 (0: No, 1: Yes).
DSMK_98, DSMK_03, DSMK_08
(Daily smoke) Daily smoking in 1998, 2003, and 2008 (0: No, 1: Yes).
HSMK_98, HSMK_03, HSMK_08
(Heavy smoke) 10+ cigarettes per day in 1998, 2003, and 2008 (0: No, 1: Yes).
EDRK_98, EDRK_03, EDRK_08
(Ever drunk) Ever drunk in 1998, 2003, and 2008 (0: No, 1: Yes).
CDRK_98, CDRK_03, CDRK_08
(Current drinker) Monthly drinking in 1998, 2003, and 2008 (0: No, 1: Yes).
WDRK_98, WDRK_03, WDRK_08
(Weekly drinker) 5+ days drinking in a month in 1998, 2003, and 2008 (0: No, 1: Yes).
BDRK_98, BDRK_03, BDRK_08
(Binge drinker) 5+ drinks on the same day at least one time in the last 30 days (0: No, 1: Yes).
EMRJ_98, EMRJ_03, EMRJ_08
(Ever used marijuana) Ever used marijuana in 1998, 2003, and 2008 (0: No, 1: Yes).
CMRJ_98, CMRJ_03, CMRJ_08
(Current marijuana user) Monthly marijuana use in 1998, 2003, and 2008 (0: No, 1: Yes).
OMRJ_98, OMRJ_03, OMRJ_08
(Occasional marijuana user) 10+ days marijuana use in a month in 1998, 2003, and 2008 (0: No, 1: Yes).
SMRJ_98, SMRJ_03, SMRJ_08
(School/work marijuana user) Marijuana use before/during school or work in 1998, 2003, and 2008 (0: No, 1: Yes).
Source
National Longitudinal Survey of Youth 1997 (NLSY97)
References
Bureau of Labor Statistics, U.S. Department of Labor. National Longitudinal Survey of Youth 1997 cohort, 1997-2017 (rounds 1-18). Produced and distributed by the Center for Human Resource Research (CHRR), The Ohio State University. Columbus, OH: 2019.
Examples
library(magrittr)
nlsy_smoke <- slca(SMK_98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98) %>%
estimate(data = nlsy97, control = list(verbose = FALSE))
summary(nlsy_smoke)
# JLCA
model_jlca <- slca(
SMK_98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98,
DRK_98(3) ~ EDRK_98 + CDRK_98 + WDRK_98 + BDRK_98,
MRJ_98(3) ~ EMRJ_98 + CMRJ_98 + OMRJ_98 + SMRJ_98,
SUB_98(4) ~ SMK_98 + DRK_98 + MRJ_98
) %>% estimate(data = nlsy97, control = list(verbose = FALSE))
summary(model_jlca)
param(model_jlca)
# JLCPA
nlsy_jlcpa <- slca(
SMK_98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98,
DRK_98(3) ~ EDRK_98 + CDRK_98 + WDRK_98 + BDRK_98,
MRJ_98(3) ~ EMRJ_98 + CMRJ_98 + OMRJ_98 + SMRJ_98,
SUB_98(5) ~ SMK_98 + DRK_98 + MRJ_98,
SMK_03(3) ~ ESMK_03 + FSMK_03 + DSMK_03 + HSMK_03,
DRK_03(3) ~ EDRK_03 + CDRK_03 + WDRK_03 + BDRK_03,
MRJ_03(3) ~ EMRJ_03 + CMRJ_03 + OMRJ_03 + SMRJ_03,
SUB_03(5) ~ SMK_03 + DRK_03 + MRJ_03,
SMK_08(3) ~ ESMK_08 + FSMK_08 + DSMK_08 + HSMK_08,
DRK_08(3) ~ EDRK_08 + CDRK_08 + WDRK_08 + BDRK_08,
MRJ_08(3) ~ EMRJ_08 + CMRJ_08 + OMRJ_08 + SMRJ_08,
SUB_08(5) ~ SMK_08 + DRK_08 + MRJ_08,
PROF(4) ~ SUB_98 + SUB_03 + SUB_08,
constraints = list(
c("SMK_98", "SMK_03", "SMK_08"),
c("DRK_98", "DRK_03", "DRK_08"),
c("MRJ_98", "MRJ_03", "MRJ_08"),
c("SUB_98 ~ SMK_98", "SUB_03 ~ SMK_03", "SUB_08 ~ SMK_08"),
c("SUB_98 ~ DRK_98", "SUB_03 ~ DRK_03", "SUB_08 ~ DRK_08"),
c("SUB_98 ~ MRJ_98", "SUB_03 ~ MRJ_03", "SUB_08 ~ MRJ_08")
)
) %>% estimate(nlsy97, control = list(verbose = FALSE))
JLCPA Model Estimated with NLSY97 Data
Description
An slca model estimated using the NLSY97 dataset.
Usage
nlsy_jlcpa
Format
An slcafit object estimated for the JLCPA model using the nlsy97 dataset.
References
Bureau of Labor Statistics, U.S. Department of Labor. National Longitudinal Survey of Youth 1997 cohort, 1997-2017 (rounds 1-18). Produced and distributed by the Center for Human Resource Research (CHRR), The Ohio State University. Columbus, OH: 2019.
Jeon, S., Seo, T. S., Anthony, J. C., & Chung, H. (2022). Latent Class Analysis for Repeatedly Measured Multiple Latent Class Variables. Multivariate Behavioral Research, 57(2–3), 341–355.
Print Estimated Parameters of an slcafit Object
Description
Prints the estimated parameters of an slca model using an slcafit object.
Usage
param(object, ...)
## S3 method for class 'slcafit'
param(
object, type = c("probs", "logit"),
se = FALSE, index = FALSE, ...
)
Arguments
object |
an object of class |
... |
additional arguments passed to other methods. |
type |
a character string specifying the format in which the estimated parameters should be displayed. The options are |
se |
a logical value indicating whether to display standard errors ( |
index |
a logical value indicating whether to include ( |
Value
A list containing the requested estimated parameters or their standard errors (if se = TRUE). The components of the list include:
pi |
Membership probabilities for the root latent variable. |
tau |
Conditional probabilities between latent class variables, represented with uppercase letters to account for measurement invariance. |
rho |
Item response probabilities for each measurement model, represented with lowercase letters to account for measurement invariance. |
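How pi and tau work together can be illustrated with a small base R calculation: the class prevalences of a child latent variable follow from the root prevalences and the conditional probabilities. The numbers are invented, and the orientation of the tau matrix (child classes in rows, parent classes in columns) is an assumption that may not match the layout printed by param.

```r
# Hypothetical root class prevalences (pi)
pi_root <- c(0.6, 0.4)

# Hypothetical conditional probabilities (tau).
# Assumed layout: tau[child, parent] = P(child class | parent class),
# so every column sums to 1.
tau <- matrix(c(0.8, 0.2,    # parent class 1
                0.3, 0.7),   # parent class 2
              nrow = 2)

# Marginal prevalence of each child class
child_marginal <- as.vector(tau %*% pi_root)
child_marginal
```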
Model Predictions for Estimated slca Object
Description
Provides predicted class memberships or posterior probabilities for new data based on a fitted slca model.
Usage
## S3 method for class 'slcafit'
predict(object, newdata, type = c("class", "posterior"), ...)
Arguments
object |
An object of class |
newdata |
A |
type |
A character string indicating the type of prediction. Use |
... |
Additional arguments passed to other methods. |
Value
A data.frame or list, depending on the type:
- For type = "class", a data.frame is returned where rows represent observations and columns correspond to latent class variables.
- For type = "posterior", a list is returned containing data.frames with posterior probabilities for each latent class variable.
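The link between the two return types can be sketched in base R: a class prediction is commonly obtained by picking, for each observation, the class with the largest posterior probability. The posterior matrix below is invented, and whether predict uses exactly this modal rule (including its tie handling) is an assumption.

```r
# Hypothetical posterior probabilities: 3 observations, 3 latent classes
post <- matrix(c(0.70, 0.20, 0.10,
                 0.10, 0.30, 0.60,
                 0.25, 0.50, 0.25),
               nrow = 3, byrow = TRUE,
               dimnames = list(NULL, paste0("class", 1:3)))

# Modal assignment: index of the largest posterior in each row
modal <- max.col(post, ties.method = "first")
modal
```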
Regress Exogenous Variables on Latent Variables
Description
Performs regression analysis to examine the influence of exogenous (external) variables on latent class variables in an estimated slca model. The function uses logistic regression with a three-step approach to account for measurement error.
Usage
regress(object, ...)
## S3 method for class 'slcafit'
regress(
object, formula, data = parent.frame(),
imputation = c("modal", "prob"),
method = c("naive", "BCH", "ML"), ...
)
Arguments
object |
an object of class |
... |
additional arguments. |
formula |
a formula specifying the regression model, including both latent class variables (from the estimated model) and exogenous variables. |
data |
an optional |
imputation |
a character string specifying the imputation method for latent class assignment. Options include:
|
method |
a character string specifying the method to adjust for bias in the three-step approach. Options include:
|
Value
A list of class reg.slca with the following components:
coefficients |
A matrix of regression coefficients representing the odds ratios for each latent class against the baseline class (the last class). |
std.err |
A matrix of standard errors corresponding to the regression coefficients. |
vcov |
The variance-covariance matrix of the regression coefficients. |
dim |
The dimensions of the coefficients matrix. |
ll |
The log-likelihood of the regression model. |
The summary function can be used to display the regression coefficients, standard errors, Wald statistics, and p-values.
References
Vermunt, J. K. (2010). Latent Class Modeling with Covariates: Two Improved Three-Step Approaches. Political Analysis, 18(4), 450–469. http://www.jstor.org/stable/25792024
Examples
library(magrittr)
names(nlsy97)
nlsy_jlcpa %>% regress(SMK_98 ~ SEX, nlsy97)
nlsy_jlcpa %>% regress(PROF ~ SEX, nlsy97)
Reorder Latent Class Membership of Latent Class Variables
Description
Reorders the latent class membership for specified latent class variables in an slcafit object.
Usage
## S3 method for class 'slcafit'
reorder(x, ...)
Arguments
x |
an object of class |
... |
additional arguments specifying the new order for the latent class variables. |
Value
A modified slcafit object with the latent classes reordered according to the specified order.
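What reordering means for the parameters can be sketched with a plain matrix in base R. The item response probabilities are invented, and the interpretation that an order such as c(1, 3, 2) makes the old class 3 the new class 2 is an assumption about how reorder reads its arguments.

```r
# Hypothetical item response probabilities for one binary item:
# rows are response categories, columns are latent classes
rho <- matrix(c(0.9, 0.1,
                0.2, 0.8,
                0.5, 0.5),
              nrow = 2,
              dimnames = list(c("No", "Yes"), paste0("class", 1:3)))

# Assumed meaning of the order vector: new class j takes old class new_order[j]
new_order <- c(1, 3, 2)
rho_reordered <- rho[, new_order]
colnames(rho_reordered) <- paste0("class", 1:3)

rho_reordered
```

Only the labels change: each column is still a probability distribution, and the fitted likelihood is unaffected by such a permutation.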
Examples
library(magrittr)
nlsy_jlcpa %>% param
# Reorder the RHO parameters in ascending order
reordered1 <- nlsy_jlcpa %>%
reorder(SMK_98 = c(1, 3, 2),
DRK_98 = c(3, 2, 1),
MRJ_98 = c(3, 1, 2))
reordered1 %>% param
# Label class1: nonuse
# class2: lifetime use
# class3: current use
# Reorder the TAU parameters for joint classes in ascending order
reordered2 <- reordered1 %>%
reorder(SUB_98 = c(3, 4, 5, 1, 2))
reordered2 %>% param
# Label class1: nonuse
# class2: heavy drinking only
# class3: not heavy use
# class4: heavy drinking & smoking
# class5: heavy use
# Reorder the TAU parameters for profiles in ascending order
reordered3 <- reordered2 %>%
reorder(PROF = c(4, 1, 3, 2))
reordered3 %>% param
# Label class1: nonuse stayer
# class2: heavy drinking advancer
# class3: heavy drk & smk advancer
# class4: heavy use advancer
Simulate Data from an slca Model
Description
Simulates data based on a specified slca model. If the model parameters are not already estimated, they can either be provided by the user or generated randomly.
Usage
## S3 method for class 'slca'
simulate(object, nsim = 500, seed = NULL, parm, nlevel, ...)
Arguments
object |
an |
nsim |
an integer specifying the number of response observations to simulate. The default is 500. |
seed |
an integer specifying the random seed for reproducibility. If not provided, results will vary across runs. |
parm |
a user-specified set of parameters to guide the simulation. This is required if the model has not been previously estimated. |
nlevel |
an integer or integer vector specifying the number of levels for each manifest item in the model. If a single integer is provided, all manifest items will have the same number of levels. The default is 2. |
... |
Additional arguments passed to other methods. |
Value
A list with the following components:
class |
A |
response |
A |
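The generating process behind a simulated latent class dataset can be sketched in base R for the simplest case of one latent variable with binary items. This is a conceptual illustration with invented parameter values, not the package's implementation; the object names class_prob, item_prob, cls, and resp are hypothetical.

```r
set.seed(1)
n <- 200

# Hypothetical class prevalences
class_prob <- c(0.5, 0.3, 0.2)

# Hypothetical item_prob[k, j] = P(item j = 1 | class k) for 4 binary items
item_prob <- rbind(c(0.9, 0.9, 0.9, 0.9),
                   c(0.9, 0.9, 0.1, 0.1),
                   c(0.1, 0.1, 0.1, 0.1))

# Step 1: draw a latent class for every observation
cls <- sample(seq_along(class_prob), n, replace = TRUE, prob = class_prob)

# Step 2: draw each item response given the observation's class.
# item_prob[cls, ] is an n x 4 matrix; rbinom consumes it column by column,
# and matrix() refills column by column, so positions stay aligned.
resp <- matrix(rbinom(n * ncol(item_prob), size = 1, prob = item_prob[cls, ]),
               nrow = n,
               dimnames = list(NULL, paste0("y", 1:4)))

table(cls)
head(resp)
```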
Examples
m1 <- slca(lc1[3] ~ x1 + x2 + x3 + x4 + x5,
lc2[4] ~ y1 + y2 + y3 + y4 + y5)
sim <- simulate(m1, 1000)
sapply(sim$class, table)
# simulate data with defined number of levels of manifest items
m2 <- slca(lc1[3] ~ x1 + x2 + x3 + x4)
sim <- simulate(m2, nlevel = c(3, 3, 3, 3))
d <- sim$response
sapply(d, table)
sim <- simulate(m2, nlevel = c(x1 = 2, x2 = 3, x3 = 4, x4 = 5))
d <- sim$response
sapply(d, table)
# simulate data with user-defined parameters
pi <- rep(1 / 3, 3)
rho <- c(.9, .1, .9, .1, .9, .1, .9, .1,
.9, .1, .9, .1, .1, .9, .1, .9,
.1, .9, .1, .9, .1, .9, .1, .9)
par <- c(pi, rho)
m3 <- slca(lc[3] ~ y1 + y2 + y3 + y4)
sim <- simulate(m3, parm = par)
mf <- estimate(m3, sim$response)
param(mf)
Construct Structural Latent Class Model
Description
Constructs a latent structure with multiple latent class variables.
Usage
slca(formula = NULL, ..., constraints = NULL)
Arguments
formula |
a formula specifying the latent structure. Detailed model specifications are provided under 'Details'. |
... |
additional formulae for defining the model structure. |
constraints |
a list of constraints to enforce measurement invariance. Detailed explanations of applying constraints are available under 'Details'. |
Details
The formula can be categorized into three types, each serving a distinct purpose:
- Defining Latent Class Variables with Manifest Indicators: Specify the relationship between a latent class variable and its manifest indicators. The latent class variable is on the left-hand side (lhs), denoted with square brackets [] or parentheses () to indicate the number of classes, and manifest indicators are listed on the right-hand side (rhs). For example:
LC1[k] ~ x1 + x2 + x3
LC2[k] ~ y1 + y2 + y3
LC3(k) ~ z1 + z2 + z3
Here, k denotes the number of latent classes for the variable.
- Relating Latent Class Variables to Each Other: Define relationships where one latent class variable influences another. For example:
LC1 ~ LC2
This formula implies that LC2 is conditionally dependent on LC1.
- Defining Higher-Level Latent Class Variables: Specify relationships where a latent class variable is measured by other latent class variables instead of manifest indicators. For example:
P[k] ~ LC1 + LC2 + LC3
This indicates that the latent variable P is measured by the latent class variables LC1, LC2, and LC3.
In all formulas, variables on the lhs influence those on the rhs.
The constraints argument enforces specific conditions to ensure precise inference, such as measurement invariance. This is particularly useful for longitudinal analysis (e.g., LTA or LCPA), where consistent meanings of latent classes across time are essential.
- Measurement Invariance for the Measurement Model: Ensures that the measurement probabilities associated with the listed latent class variables remain consistent. For example:
c("LC1", "LC2", "LC3")
This ensures that LC1, LC2, and LC3 have semantically consistent measurement probabilities.
- Measurement Invariance for the Structural Model: Applies constraints to ensure consistent interpretations of transition probabilities between latent class variables. For example:
c("P ~ LC1", "P ~ LC2")
This ensures that the transitions from P to LC1 and P to LC2 are consistent.
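The bracket notation on the left-hand side is ordinary R syntax, so a formula such as LC1[3] ~ x1 + x2 + x3 can be taken apart with base R tools. The sketch below shows one way to read the latent variable name, the number of classes, and the indicators from such a formula. It is an illustration of the notation only and is not the parser used by the package.

```r
# A measurement formula in the bracket notation described above
f <- LC1[3] ~ x1 + x2 + x3

# The lhs is the unevaluated call `[`(LC1, 3)
lhs <- f[[2]]
latent_name <- as.character(lhs[[2]])  # name of the latent class variable
n_class     <- lhs[[3]]                # number of latent classes

# The rhs lists the manifest indicators
indicators <- all.vars(f[[3]])

list(latent = latent_name, classes = n_class, indicators = indicators)
```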
Value
An object of class slca with the following components:
tree |
A |
latent |
A |
measure |
A |
struct |
A |
The printed model description is divided into four parts:
- Latent variables: Lists the latent class variables and the number of classes for each variable. The root variable is marked with an asterisk (*).
- Measurement model: Displays manifest indicators for each latent class variable and any applied measurement constraints (lowercase letters indicate consistency).
- Structural model: Describes the conditional relationships between latent class variables.
- Dependency constraints: Outlines constraints applied to conditional dependencies, where uppercase letters represent consistent dependency structures.
Examples
# Standard LCA
slca(lc[3] ~ y1 + y2 + y3 + y4)
# Latent transition analysis (LTA)
slca(lx[3] ~ x1 + x2 + x3 + x4,
ly[2] ~ y1 + y2 + y3 + y4,
lx ~ ly)
# LTA with measurement invariance
slca(l1[3] ~ y11 + y21 + y31 + y41,
l2[3] ~ y12 + y22 + y32 + y42,
l1 ~ l2, constraints = c("l1", "l2"))
# Joint latent class analysis
slca(lx[2] ~ x1 + x2 + x3 + x4,
ly[3] ~ y1 + y2 + y3 + y4,
lz[2] ~ z1 + z2 + z3 + z4,
jc[3] ~ lx + ly + lz)
# Latent class profile analysis (with measurement invariance)
slca(l1[3] ~ x1 + x2 + x3 + x4,
l2[3] ~ y1 + y2 + y3 + y4,
l3[3] ~ z1 + z2 + z3 + z4,
pf[4] ~ l1 + l2 + l3,
constraints = c("l1", "l2", "l3"))
Control Parameters for slca Estimation
Description
Specifies control parameters for estimating an slca model.
Usage
slcaControl(
em.iterlim = 5000,
em.tol = 1e-08,
nlm.iterlim = 1000,
nlm.tol = 1e-10,
init.param = NULL,
nrep = 1,
test.iter = 500,
na.rm = FALSE,
verbose = FALSE
)
Arguments
em.iterlim |
an integer specifying the maximum number of iterations allowed for the EM algorithm. The default is |
em.tol |
a numeric value setting the tolerance for convergence of the EM algorithm. The default is |
nlm.iterlim |
an integer specifying the maximum number of iterations allowed when using the |
nlm.tol |
a numeric value setting the tolerance for convergence of the |
init.param |
a numeric vector specifying the initial parameter values for estimation. |
nrep |
an integer specifying the number of estimation trials. The default is |
test.iter |
an integer specifying the maximum number of iterations allowed for parameter testing. The default is |
na.rm |
a logical value indicating whether to remove observations containing missing values ( |
verbose |
a logical value indicating whether to display progress updates during the estimation process. The default is A |