Getting started with aftPenCDA package

The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Getting started with aftPenCDA package

Overview

aftPenCDA is an R package for fitting penalized accelerated failure time (AFT) models using induced smoothing. The package supports variable selection for both right-censored and clustered partly interval-censored survival data.

Several penalty functions are implemented, including broken adaptive ridge (BAR), LASSO, adaptive LASSO (ALASSO), and SCAD. For variance estimation, the package provides both a closed-form estimator and a perturbation-based estimator.

Core computational routines are implemented in ‘C++’ via ‘Rcpp’ (‘RcppArmadillo’ backend) to ensure scalability for high-dimensional settings.

Methodological background

The accelerated failure time (AFT) model with rank-based estimating equations involves nonsmooth objective functions, which pose challenges for numerical optimization.

Induced smoothing replaces the nonsmooth estimating equations with smooth approximations, allowing the use of gradient-based methods. This approach avoids direct optimization of nonsmooth rank-based estimating equations, significantly improving computational efficiency.

This leads to a quadratic approximation of the objective function. By applying a Cholesky decomposition, the problem is transformed into a least-squares-type formulation, which enables efficient coordinate descent updates for penalized estimation in high-dimensional settings.

The resulting formulation enables efficient computation even when the number of covariates is large relative to the sample size.

Installation

You can install the development version of aftPenCDA from GitHub:

devtools::install_github("seonsy/aftPenCDA")

Main functions

The main functions in aftPenCDA are:

aftpen(): penalized AFT model for right-censored data
aftpen_pic(): penalized AFT model for clustered partly interval-censored data

Both functions support the following penalty types:

"BAR": Broken Adaptive Ridge
"LASSO": LASSO penalty
"ALASSO": Adaptive LASSO penalty
"SCAD": Smoothly Clipped Absolute Deviation penalty

library(aftPenCDA)

Example 1: Right-censored data

We use the example right-censored dataset included in the package and fit the penalized estimator.

data("simdat_rc")

We fit the model using the BAR penalty.

fit_bar <- aftpen(simdat_rc, lambda = 0.3, se = "CF", type = "BAR")
fit_bar$beta

Other penalties are also available.

fit_lasso  <- aftpen(simdat_rc, lambda = 0.1, se = "CF", type = "LASSO")
fit_alasso <- aftpen(simdat_rc, lambda = 0.1, se = "CF", type = "ALASSO")
fit_scad   <- aftpen(simdat_rc, lambda = 0.1, se = "CF", type = "SCAD")

Example 2: Clustered partly interval-censored data

We use the example clustered partly interval-censored dataset included in the package and apply the proposed method.

data("simdat_pic")

We fit the model using the BAR penalty.

fit_pic <- aftpen_pic(simdat_pic, lambda = 0.0005, se = "CF", type = "BAR")
fit_pic$beta

Other penalties are also available for partly interval-censored data.

fit_pic_lasso  <- aftpen_pic(simdat_pic, lambda = 0.001, se = "CF", type = "LASSO")
fit_pic_alasso <- aftpen_pic(simdat_pic, lambda = 0.001, se = "CF", type = "ALASSO")
fit_pic_scad   <- aftpen_pic(simdat_pic, lambda = 0.001, se = "CF", type = "SCAD")

Variance estimation

The argument se specifies the variance estimation method.

"CF": closed-form estimator
"ZL": perturbation-based estimator

For example:

fit_zl <- aftpen(simdat_rc, lambda = 0.1, se = "ZL", type = "BAR")

References

Wang, You-Gan, and Yudong Zhao (2008). “Weighted Rank Regression for Clustered Data Analysis.” Biometrics 64(1), 39–45.

Dai, L., K. Chen, Z. Sun, Z. Liu, and G. Li (2018). “Broken Adaptive Ridge Regression and Its Asymptotic Properties.” Journal of Multivariate Analysis 168, 334–351.

Zeng, Donglin, and D. Y. Lin (2008).“Efficient Resampling Methods for Nonsmooth Estimating Functions.” Biostatistics 9(2), 355–363.

Tibshirani, Robert (1996).“Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society: Series B 58(1), 267–288.

Fan, Jianqing, and Runze Li (2001). “Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties.” Journal of the American Statistical Association 96(456), 1348–1360.

Zou, Hui (2006).“The Adaptive Lasso and Its Oracle Properties.” Journal of the American Statistical Association 101(476), 1418–1429.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.