The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Getting started with aftPenCDA
package
Overview
aftPenCDA is an R package for fitting penalized
accelerated failure time (AFT) models using induced smoothing. The
package supports variable selection for both right-censored and
clustered partly interval-censored survival data.
Several penalty functions are implemented, including broken adaptive
ridge (BAR), LASSO, adaptive LASSO (ALASSO), and SCAD. For variance
estimation, the package provides both a closed-form estimator and a
perturbation-based estimator.
Core computational routines are implemented in C++ via Rcpp
(RcppArmadillo backend) to ensure scalability for high-dimensional
settings.
Methodological background
The accelerated failure time (AFT) model with rank-based estimating
equations involves nonsmooth objective functions, which pose challenges
for numerical optimization.
Induced smoothing replaces the nonsmooth estimating equations with
smooth approximations, allowing the use of gradient-based methods. This
approach avoids direct optimization of nonsmooth rank-based estimating
equations, significantly improving computational efficiency.
This leads to a quadratic approximation of the objective function. By
applying a Cholesky decomposition, the problem is transformed into a
least-squares-type formulation, which enables efficient coordinate
descent updates for penalized estimation in high-dimensional
settings.
The resulting formulation enables efficient computation even when the
number of covariates is large relative to the sample size.
Installation
You can install the development version of aftPenCDA
from GitHub:
devtools::install_github("seonsy/aftPenCDA")
Main functions
The main functions in aftPenCDA are:
aftpen(): penalized AFT model for right-censored
data
aftpen_pic(): penalized AFT model for clustered partly
interval-censored data
Both functions support the following penalty types:
"BAR": Broken Adaptive Ridge
"LASSO": LASSO penalty
"ALASSO": Adaptive LASSO penalty
"SCAD": Smoothly Clipped Absolute Deviation
penalty
Example 1: Right-censored data
We simulate right-censored survival data under an AFT model and fit
the penalized estimator.
library(aftPenCDA)
set.seed(1)
n <- 100
p <- 10
beta0 <- c(1, 1, 1, rep(0, p - 3))
x <- matrix(rnorm(n * p), n, p)
T <- exp(x %*% beta0 + rnorm(n))
C <- rexp(n, rate = 0.5)
y <- pmin(T, C)
d <- as.numeric(T <= C)
dt <- data.frame(y = y, d = d, x)
We fit the model using the BAR penalty.
fit_bar <- aftpen(dt, lambda = 0.1, se = "CF", type = "BAR")
fit_bar$beta
Other penalties are also available.
fit_lasso <- aftpen(dt, lambda = 0.1, se = "CF", type = "LASSO")
fit_alasso <- aftpen(dt, lambda = 0.1, se = "CF", type = "ALASSO")
fit_scad <- aftpen(dt, lambda = 0.1, se = "CF", type = "SCAD")
Example 2: Clustered partly interval-censored data
We generate clustered partly interval-censored data and apply the
proposed method.
set.seed(1)
## simplified generator for clustered partly interval-censored data
n <- 100
p <- 2
beta0 <- c(1,1)
clu_rate <- 0.5
exactrates <- 0.8
left <- 0.001
right <- 0.01
## cluster-level frailty and informative cluster sizes
eta <- 1 / clu_rate
v <- rgamma(n, shape = eta, rate = eta)
m <- ifelse(v > median(v), 5, 3)
id <- rep(seq_len(n), m)
vi <- rep(v, m)
## subject-level covariates and failure times
N <- sum(m)
x <- matrix(rnorm(N * p), ncol = p)
colnames(x) <- paste0("x", seq_len(p))
T <- as.vector(exp(x %*% beta0 + vi * log(rexp(N))))
## build (L, R, delta)
L <- R <- delta <- numeric(N)
index <- rbinom(N, 1, exactrates)
for (i in seq_len(N)) {
if (index[i] == 1) {
L[i] <- T[i]
R[i] <- T[i]
delta[i] <- 1
} else {
U <- cumsum(c(1e-8, runif(10, left, right)))
LL <- U[-length(U)]
RR <- U[-1]
if (T[i] < min(LL)) {
L[i] <- 1e-8
R[i] <- min(LL)
delta[i] <- 0
} else if (T[i] > max(RR)) {
L[i] <- max(RR)
R[i] <- 1e8
delta[i] <- 0
} else {
idd <- which(T[i] > LL & T[i] < RR)
if (length(idd) == 1) {
L[i] <- LL[idd]
R[i] <- RR[idd]
delta[i] <- 0
} else {
L[i] <- T[i]
R[i] <- T[i]
delta[i] <- 1
}
}
}
}
dt_pic <- data.frame(
L = L,
R = R,
delta = delta,
id = id,
x1 = x[, 1],
x2 = x[, 2]
)
We fit the model using the BAR penalty.
fit_pic <- aftpen_pic(dt_pic, lambda = 0.0005, se = "CF", type = "BAR")
fit_pic$beta
Other penalties are also available for partly interval-censored
data.
fit_pic_lasso <- aftpen_pic(dt_pic, lambda = 0.001, se = "CF", type = "LASSO")
fit_pic_alasso <- aftpen_pic(dt_pic, lambda = 0.001, se = "CF", type = "ALASSO")
fit_pic_scad <- aftpen_pic(dt_pic, lambda = 0.001, se = "CF", type = "SCAD")
Variance estimation
The argument se specifies the variance estimation
method.
"CF": closed-form estimator
"ZL": perturbation-based estimator
For example:
fit_zl <- aftpen(dt, lambda = 0.1, se = "ZL", type = "BAR")
References
Wang, You-Gan, and Yudong Zhao (2008). “Weighted Rank Regression for
Clustered Data Analysis.” Biometrics 64(1),
39–45.
Dai, L., K. Chen, Z. Sun, Z. Liu, and G. Li (2018). “Broken Adaptive
Ridge Regression and Its Asymptotic Properties.” Journal of
Multivariate Analysis 168, 334–351.
Zeng, Donglin, and D. Y. Lin (2008).“Efficient Resampling Methods for
Nonsmooth Estimating Functions.” Biostatistics
9(2), 355–363.
Tibshirani, Robert (1996).“Regression Shrinkage and Selection via the
Lasso.” Journal of the Royal Statistical Society: Series B
58(1), 267–288.
Fan, Jianqing, and Runze Li (2001). “Variable Selection via
Nonconcave Penalized Likelihood and Its Oracle Properties.” Journal
of the American Statistical Association 96(456),
1348–1360.
Zou, Hui (2006).“The Adaptive Lasso and Its Oracle Properties.”
Journal of the American Statistical Association
101(476), 1418–1429.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.