The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
The package FrailtyCompRisk is originally designed for statisticians and epidemiologists in order to analyze time-to-event data where individuals are nested within centers (e.g., hospitals or clinics), and where multiple causes of failure may occur.
The package is build upon the random-effects model for the subdistribution hazard presented in “Analysing multicentre competing risks data with a mixed proportional hazards model for the subdistribution” [Katsahian S, Resche-Rigon M, Chevret S, Porcher R.].
Consider \(N\) observations of a time-to-event variable \(T\), a cause of event \(\epsilon\), and a covariate \(Z\): \((T_i, \epsilon_i, Z_i)_{i=1,\dots,N}\). Let \(\epsilon = 1\) denote the cause of failure of interest.
The cumulative incidence function (CIF) for cause 1 is defined as: \[ F_1(t)=P(T \le t,\epsilon = 1) \] Fine and Gray proposed a proportional hazards model for the subdistribution hazard associated with \(F_1(t)\), defined as: \[ \alpha_1(t) = \frac{\frac{dF_1(t)}{dt}}{1 - F_1(t)} \] The subdistribution hazard for individual \(i\) is modeled as:
\[ \alpha_{1}(t | Z_i, u) = \alpha_{1,0}(t) e^{(Z_i^\top \beta)} \] where:
\(\alpha_{1,0}(t)\) is an unspecified baseline subdistribution hazard,
\(\beta\) is the vector of covariate effects.
This model assumes that all individuals are independent. To account for clustering (e.g., patients within centers), a random effect is added:
Let \(k(i) \in {1, \dots, K}\) be the cluster (center) to which subject \(i\) belongs. Then the subdistribution hazard becomes:
\[ \alpha_{1,k}(t | Z_i, u) = \alpha_{1,0}(t) e^{(Z_i^\top \beta \ + \ u_{k(i)})} \] where \(u_k \sim \mathcal{N}(0, \theta)\) are independent Gaussian random effects for each cluster \(k\), with variance \(\theta\).
The package uses a restricted maximum likelihood (REML) or maximum likelihood (ML) approach to estimate the parameters \(\beta\), \(\theta\), and \(u_k\). The estimation relies on a Newton-Raphson optimization routine applied to the (restricted) log-likelihood.
The package provides several main functions, \(\\\) Reml_CompRisk_frailty()
Estimates \(\beta\), \(\theta\), and \(u_k\) under the subdistribution hazard
model with random effects. Also returns the \(p\)-value for testing \(H_0: \theta = 0\).
Reml_CompRisk_frailty() can take 5 arguments:
data (mandatory), a data frame with the following
columns:
times : Observed times to event or
censoring.
status : Event type: 0 = censored, 1 = cause of
interest, >1 = other competing risks.
clusters : Cluster membership (e.g., centers or
subjects).
covariates : (optional) Columns from the 4th onward
are used as covariates.
cluster_censoring (optional): Logical. If TRUE,
accounts for center-specific censoring using a frailty model on
censoring times. Default is FALSE.
max_iter (optional): Maximum number of iterations
for the Newton-Raphson algorithm (default = 300).
tol (optional): Convergence tolerance for the
likelihood and frailty variance (default = 1e-6).
threshold (optional): Lower bound for the frailty
variance parameter \(\theta\). If the
estimated value falls below this threshold, frailty is considered
negligible (default = 1e-5).
\(\\\)
Ml_CompRisk() Estimates \(\beta\) for the subdistribution hazard
model without random effects.
Ml_CompRisk() can take 3 arguments:
data (mandatory), a data frame with the following
columns:
times : Observed times to event or
censoring.
status : Event type: 0 = censored, 1 = cause of
interest, >1 = other competing risks.
covariates : (optional) Columns from the 3th onward
are used as covariates.
max_iter (optional): Maximum number of iterations
for the Newton-Raphson algorithm (default = 100).
tol (optional): Convergence tolerance for the
likelihood and frailty variance (default = 1e-6).
\(\\\)
Reml_Cox_frailty() Estimates \(\beta\), \(\theta\), and \(u_k\) under a Cox model with random
effects, and provides a \(p_{value}\)
for testing \(\theta = 0\).
Reml_Cox_frailty() can take 4 arguments:
data (mandatory), a data frame with the following
columns:
times : Observed times to event or
censoring.
status : Event type: 0 = censored, 1 = cause of
interest
clusters : Cluster membership (e.g., centers or
subjects).
covariates : (optional) Columns from the 4th onward
are used as covariates.
max_iter (optional): Maximum number of iterations
for the Newton-Raphson algorithm (default = 300).
tol (optional): Convergence tolerance for the
likelihood and frailty variance (default = 1e-6).
threshold (optional): Lower bound for the frailty
variance parameter \(\theta\). If the
estimated value falls below this threshold, frailty is considered
negligible (default = 1e-5).
\(\\\)
Ml_Cox() Estimates \(\beta\) under a standard Cox proportional
hazards model.
Ml_Cox() can take 3 arguments:
data (mandatory), a data frame with the following
columns:
times : Observed times to event or
censoring.
status : Event type: 0 = censored, 1 = cause of
interest
covariates : (optional) Columns from the 3th onward
are used as covariates.
max_iter (optional): Maximum number of iterations
for the Newton-Raphson algorithm (default = 100).
tol (optional): Convergence tolerance for the
likelihood and frailty variance (default = 1e-6).
Parameters_estimation() Encapsulates the four previous
methods into one unified function.
Parameters_estimation() can take 6 arguments:
data (mandatory), a data frame with the following
columns:
times : Observed times to event or
censoring.
status : Event type: 0 = censored, 1 = cause of
interest, >1 = other competing risks.
clusters : Cluster membership (e.g., centers or
subjects).
covariates : (optional) Columns from the 4th onward
are used as covariates.
method (optional):
cluster_censoring (optional): Logical. If TRUE,
accounts for center-specific censoring using a frailty model on
censoring times. Default is FALSE.
max_iter (optional): Maximum number of iterations
for the Newton-Raphson algorithm (default = 100 or 300, depends on the
method chosen).
tol (optional): Convergence tolerance for the
likelihood and frailty variance (default = 1e-6).
threshold (optional): Lower bound for the frailty
variance parameter \(\theta\). If the
estimated value falls below this threshold, frailty is considered
negligible (default = 1e-5).
\(\\\)
simulate_data() Simulates clustered time-to-event data
with competing risks.
simulate_data() can take 8 arguments:
G : a vector of group or cluster identifiers (length
N, where N is the size sample). Each value indicates which cluster the
individual belongs to.
Z : a matrix of covariates (dimensions N x p), where
p is the size of the vector describing an individual). Can be set to
Matrix(0,0,N) if no covariates are used.
prop : proportion of individuals susceptible to
cause 1 (default: 0.6).
beta : a numeric vector of regression coefficients,
one per covariate (p).
theta : variance of the shared frailty term for
event times (cause 1).
cens : logical, indicating whether censoring should
be simulated (default: FALSE).
pcens : target censoring proportion (default:
0.25).
tau : variance of the shared frailty term for
censoring times (default: 0). \(\\\)
check_data_format() Verifies input data structure and
formatting.
check_data_format() can only take 1 argument, a data
frame. \(\\\)
set.seed(123)
n_cov = 2
n_per_cluster = 15
n_cluster = 20
n = n_cluster * n_per_cluster
G = rep(1:n_cluster, each = n_per_cluster)
Z = matrix(rnorm(n*n_cov,0,1),ncol = n_cov)
df = simulate_data(G,Z,prop = 0.6,beta = c(1,1.2),theta = 0.6,cens = TRUE)
#Estimate using REML with frailty
res <- Reml_CompRisk_frailty(df)
#Print estimated coefficients
res$beta
#> V1 V2
#> 0.8818009 1.0918708
res$theta
#> [1] 0.6087043The algorithm involves computing weighted matrices and solving penalized systems in each iteration. For \(N\) individuals, \(K\) clusters and \(p\) covariables describing each individual:
Sparse matrix multiplication and integration steps: ~ \(\mathcal{O}(N^2)\)
Newton-Raphson updates (inversion of Fisher information matrix): ~ \(\mathcal{O}((p + K)^3)\) in the worst case
In total, the complexity of the algorithm is ~\(\mathcal{O}(N^2 + Np + NK + (p + K)^3)\) in the worst case.
However, the implementation uses sparse matrices (Matrix
package) to significantly reduce practical computational cost.
The Matrix package
Katsahian S, Resche-Rigon M, Chevret S, Porcher R. (2006). Analysing multicentre competing risks data with a mixed proportional hazards model for the subdistribution. Statistics in Medicine, 25(26), 4267–4278.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.