The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

bayesmsm for longitudinal data with informative right-censoring

Xiao Yan, Kuan Liu

Introduction

Simulated longitudinal observational data with right-censoring

In this vignette, we demonstrate how to simulate a longitudinal dataset that replicates features of real-world clinical studies with right-censoring using simData(). Here, we generate data for 200 individuals observed over 2 visits. At each visit, two normally distributed covariates (L1_j, L2_j) and a binary treatment assignment (A_j) are created. Right-censoring is induced at each visit via a logistic model (C_j), and once an individual is censored at visit j, all subsequent covariates, treatments, and the end-of-study outcome Y are set to NA. The outcome Y is binary, drawn from a logistic regression on the full covariate and treatment history.

Variable Description
L1_1, L2_1 Baseline covariates (continuous)
L1_2, L2_2 Time-varying covariates at visit 2
A1, A2 Binary treatments at visits 1 and 2
C1, C2 Right-censoring indicators at each visit
Y Binary end-of-study outcome (1 = event)
# 1) Define coefficient lists for 2 visits
amodel <- list(
  # Visit 1: logit P(A1=1) = -0.3 + 0.4*L1_1 - 0.2*L2_1
  c("(Intercept)" = -0.3, "L1_1" = 0.4, "L2_1" = -0.2),
  # Visit 2: logit P(A2=1) = -0.1 + 0.3*L1_2 - 0.1*L2_2 + 0.5*A_prev
  c("(Intercept)" = -0.1, "L1_2" = 0.3, "L2_2" = -0.1, "A_prev" = 0.5)
)

# 2) Define outcome model: logistic on treatments and last covariates
ymodel <- c(
  "(Intercept)" = -0.8,
  "A1"          = 0.2,
  "A2"          = 0.4,
  "L1_2"        = 0.3,
  "L2_2"        = -0.3
)

# 3) Define right-censoring models at each visit
cmodel <- list(
  # Censor at visit 1 based on baseline covariates and A1
  c("(Intercept)" = -1.5, "L1_1" = 0.2, "L2_1" = -0.2, "A" = 0.2),
  # Censor at visit 2 based on visit-2 covariates and A2
  c("(Intercept)" = -1.5, "L1_2" = 0.1, "L2_2" = -0.1, "A" = 0.3)
)

# 4) Load package and simulate data
simdat_cen <- simData(
  n                = 100,
  n_visits         = 2,
  covariate_counts = c(2, 2),
  amodel           = amodel,
  ymodel           = ymodel,
  y_type           = "binary",
  right_censor     = TRUE,
  cmodel           = cmodel,
  seed             = 123
)

# 5) Inspect first rows
head(simdat_cen)
#>          L1_1        L2_1 A1 C1       L1_2        L2_2 A2 C2  Y
#> 1 -0.56047565 -0.71040656  1  0 -0.7152422 -0.07355602  1  1 NA
#> 2 -0.23017749  0.25688371  0  0 -0.7526890 -1.16865142  1  0  1
#> 3  1.55870831 -0.24669188  0  0 -0.9385387 -0.63474826  0  0  0
#> 4  0.07050839 -0.34754260  1  0 -1.0525133 -0.02884155  0  0  1
#> 5  0.12928774 -0.95161857  0  0 -0.4371595  0.67069597  1  1 NA
#> 6  1.71506499 -0.04502772  1  0  0.3311792 -1.65054654  1  0  0

Bayesian treatment effect weight estimation using bayesweight_cen

Next, we use the bayesweight_cen function to estimate the weights with censoring. We specify the treatment and censoring models for each time point, including the relevant covariates.

weights_cen <- bayesweight_cen(
  trtmodel.list = list(
    A1 ~ L1_1 + L2_1,
    A2 ~ L1_2 + L2_2 + A1),
  cenmodel.list = list(
    C1 ~ L1_1 + L2_1 + A1,
    C2 ~ L1_2 + L2_2 + A2),
  data = simdat_cen,
  n.chains = 1,
  n.iter = 200,
  n.burnin = 100,
  n.thin = 1,
  seed = 890123,
  parallel = FALSE)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 700
#>    Unobserved stochastic nodes: 21
#>    Total graph size: 2495
#> 
#> Initializing model

summary(weights_cen$weights)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
#>  0.6291  1.0823  1.7307  3.1605  2.5831 49.4854      25

Similarly, the function will automatically run MCMC with JAGS based on the specified treatment and censoring model inputs, generating a JAGS model string as part of the function output. The function returns a list containing the updated weights for subject-specific treatment and censoring effects as well as the JAGS model.

cat(weights_cen$model_string)
#> model{
#> 
#> for (i in 1:N1) {
#> 
#> # conditional model;
#> A1[i] ~ dbern(p1[i])
#> logit(p1[i]) <- b10 + b11*L1_1[i] + b12*L2_1[i]
#> C1[i] ~ dbern(cp1[i])
#> logit(cp1[i]) <- s10 + s11*L1_1[i] + s12*L2_1[i] + s13*A1[i]
#> 
#> # marginal model;
#> A1s[i] ~ dbern(p1s[i])
#> logit(p1s[i]) <- bs10
#> C1s[i] ~ dbern(cp1s[i])
#> logit(cp1s[i]) <- ts10
#> }
#> 
#> for (i in 1:N2) {
#> 
#> # conditional model;
#> A2[i] ~ dbern(p2[i])
#> logit(p2[i]) <- b20 + b21*L1_2[i] + b22*L2_2[i] + b23*A1[i]
#> C2[i] ~ dbern(cp2[i])
#> logit(cp2[i]) <- s20 + s21*L1_2[i] + s22*L2_2[i] + s23*A2[i]
#> 
#> # marginal model;
#> A2s[i] ~ dbern(p2s[i])
#> logit(p2s[i]) <- bs20 + bs21*A1s[i]
#> C2s[i] ~ dbern(cp2s[i])
#> logit(cp2s[i]) <- ts20 + ts21*A1s[i]
#> }
#> 
#> # Priors
#> b10 ~ dunif(-10, 10)
#> b11 ~ dunif(-10, 10)
#> b12 ~ dunif(-10, 10)
#> s10 ~ dunif(-10, 10)
#> s11 ~ dunif(-10, 10)
#> s12 ~ dunif(-10, 10)
#> s13 ~ dunif(-10, 10)
#> bs10 ~ dunif(-10, 10)
#> ts10 ~ dunif(-10, 10)
#> b20 ~ dunif(-10, 10)
#> b21 ~ dunif(-10, 10)
#> b22 ~ dunif(-10, 10)
#> b23 ~ dunif(-10, 10)
#> s20 ~ dunif(-10, 10)
#> s21 ~ dunif(-10, 10)
#> s22 ~ dunif(-10, 10)
#> s23 ~ dunif(-10, 10)
#> bs20 ~ dunif(-10, 10)
#> bs21 ~ dunif(-10, 10)
#> ts20 ~ dunif(-10, 10)
#> ts21 ~ dunif(-10, 10)
#> }

Bayesian non-parametric bootstrap to maximize the utility function with respect to the causal effect using bayesmsm

Using the weights estimated by bayesweight_cen, we now fit the Bayesian Marginal Structural Model and estimate the marginal treatment effects using the bayesmsm function as before. We specify the outcome model and other relevant parameters.

# Remove all NAs (censored observations) from the original dataset and weights
simdat_cen$weights <- weights_cen$weights
simdat_cen2 <- na.omit(simdat_cen)

model <- bayesmsm(ymodel = Y ~ A1 + A2,
  nvisit = 2,
  reference = c(rep(0,2)),
  comparator = c(rep(1,2)),
  family = "binomial",
  data = simdat_cen2,
  wmean = simdat_cen2$weights,
  nboot = 50,
  optim_method = "BFGS",
  parallel = TRUE,
  seed = 890123,
  ncore = 2)
str(model)
#> List of 12
#>  $ RD_mean    : num 0.206
#>  $ RR_mean    : num 3.79
#>  $ OR_mean    : num 6.75
#>  $ RD_sd      : num 0.209
#>  $ RR_sd      : num 3.48
#>  $ OR_sd      : num 8.36
#>  $ RD_quantile: Named num [1:2] -0.174 0.565
#>   ..- attr(*, "names")= chr [1:2] "2.5%" "97.5%"
#>  $ RR_quantile: Named num [1:2] 0.547 13.219
#>   ..- attr(*, "names")= chr [1:2] "2.5%" "97.5%"
#>  $ OR_quantile: Named num [1:2] 0.435 26.76
#>   ..- attr(*, "names")= chr [1:2] "2.5%" "97.5%"
#>  $ bootdata   :'data.frame': 50 obs. of  5 variables:
#>   ..$ effect_reference : num [1:50] -1.02 -1.09 -1.24 -1.04 -2.12 ...
#>   ..$ effect_comparator: num [1:50] -1.136 -1.852 -0.389 -1.593 -0.857 ...
#>   ..$ RD               : num [1:50] -0.0223 -0.1158 0.1789 -0.0912 0.1906 ...
#>   ..$ RR               : num [1:50] 0.916 0.54 1.795 0.65 2.778 ...
#>   ..$ OR               : num [1:50] 0.889 0.467 2.334 0.578 3.532 ...
#>  $ reference  : num [1:2] 0 0
#>  $ comparator : num [1:2] 1 1

The bayesmsm function returns a model object containing the following: the mean, standard deviation, and 95% credible interval of the Risk Difference (RD), Risk Ratio (RR), and Odds Ratio (OR). It also includes a data frame containing the bootstrap samples for the reference effect, comparator effect, RD, RR, and OR, as well as the reference and comparator levels chosen by the user.

summary_bayesmsm(model)
#>                  mean        sd       2.5%      97.5%
#> Reference  -1.8573457 0.7393902 -3.3904462 -0.4320041
#> Comparator -0.6155699 0.6582818 -1.7697885  0.5451637
#> RD          0.2063189 0.2092313 -0.1738570  0.5648371
#> RR          3.7886208 3.4781279  0.5468277 13.2191157
#> OR          6.7493770 8.3562757  0.4349579 26.7600428

Visualization functions: plot_ATE, plot_APO, plot_est_box

Similarly, we can use the built-in functions as well as summary_bayesmsm to visualize and summarize the results.

plot_ATE(model)

plot_APO(model, effect_type = "effect_comparator")

plot_APO(model, effect_type = "effect_reference")

plot_est_box(model)

Reference

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.