
Clustering Individualized Survival Curves with unsurv

Imad EL BADISY

2026-03-12

The unsurv package clusters full survival trajectories rather than baseline covariates. This vignette walks through a minimal workflow: simulate curves, fit the model, choose the number of clusters automatically, predict new samples, and assess stability.

Simulate individualized survival curves

We create three prognosis groups with different exponential hazards and add noise so the curves are not perfectly smooth.

library(unsurv)

set.seed(2026)
n <- 150
Q <- 60
times <- seq(0, 5, length.out = Q)

group <- sample(1:3, n, TRUE, prob = c(0.35, 0.4, 0.25))
haz   <- c(0.18, 0.45, 0.8)[group]

S <- sapply(times, function(t) exp(-haz * t))
S <- S + matrix(rnorm(n * Q, 0, 0.02), nrow = n)
S[S < 0] <- 0
S[S > 1] <- 1
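
The added noise means many simulated curves are no longer non-increasing, which is why the fit below asks for monotonicity to be enforced. The sketch that follows re-creates the matrix with base R only, counts the curves that increase somewhere, and repairs them with a running minimum. The running minimum is an assumption chosen for illustration; it is not necessarily what enforce_monotone does internally, and the names non_monotone and S_mono are hypothetical.

```r
# Re-create the simulated matrix so this block runs on its own (base R only).
set.seed(2026)
n <- 150
Q <- 60
times <- seq(0, 5, length.out = Q)
group <- sample(1:3, n, TRUE, prob = c(0.35, 0.4, 0.25))
haz   <- c(0.18, 0.45, 0.8)[group]
S <- sapply(times, function(t) exp(-haz * t))
S <- S + matrix(rnorm(n * Q, 0, 0.02), nrow = n)
S[S < 0] <- 0
S[S > 1] <- 1

dim(S)  # rows are subjects, columns are time points

# Share of curves that increase at least once (not valid survival functions).
non_monotone <- apply(S, 1, function(s) any(diff(s) > 0))
mean(non_monotone)

# One simple repair (an assumption, not the package's documented method):
# take the running minimum along each curve.
S_mono <- t(apply(S, 1, cummin))
```

After the repair every row of S_mono is non-increasing and still lies in [0, 1].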

Fit clustering with automatic K selection

Leaving K = NULL lets unsurv choose the number of clusters: it evaluates each K in 2:K_max and keeps the one with the highest mean silhouette width.

fit <- unsurv(S, times, K = NULL, K_max = 6, distance = "L2",
              enforce_monotone = TRUE, smooth_median_width = 5,
              standardize_cols = TRUE, eps_jitter = 0.0005)
fit
#> unsurv (PAM) fit
#>   K:3
#>   distance:L2 silhouette_mean:0.810
#>   n:150 Q:60

Key slots:

Visualize medoids

plot(fit)
Cluster medoid survival curves.

For a ggplot2 version:

library(ggplot2)
autoplot(fit)
Medoid curves via ggplot2 autoplot.

Predict cluster membership for new curves

New curves must use the same time grid as the fit. Preprocessing (clamping, monotonicity, smoothing, standardization) is reused automatically.

new_curves <- S[1:5, ]
predict(fit, new_curves)
#> [1] 1 1 2 2 1
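
With a medoid-based model, a natural way to label a new curve is to assign it to the nearest medoid. The sketch below shows that rule in base R with an L2 distance; it assumes this is roughly what predict() does and does not use the package's internals. The medoid curves and the helper assign_nearest() are hypothetical.

```r
# Hypothetical medoid curves (one per row) on a shared time grid.
times   <- seq(0, 5, length.out = 20)
medoids <- rbind(exp(-0.18 * times), exp(-0.45 * times), exp(-0.80 * times))

# New curves on the same grid, with hazards close to medoids 1, 3 and 2.
new_curves <- rbind(exp(-0.20 * times), exp(-0.75 * times), exp(-0.50 * times))

assign_nearest <- function(curves, medoids) {
  # The curves must share the medoids' time grid, hence the column check.
  stopifnot(ncol(curves) == ncol(medoids))
  apply(curves, 1, function(s) {
    # L2 distance from this curve to every medoid.
    d <- sqrt(rowSums(sweep(medoids, 2, s)^2))
    which.min(d)
  })
}

labels <- assign_nearest(new_curves, medoids)
labels
#> [1] 1 3 2
```

The column check reflects the requirement that new curves use the same time grid as the fit: distances between curves sampled at different times would not be comparable.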

Stability assessment

Resampling gives a sense of how stable the clustering is to perturbations.

stab <- unsurv_stability(
  S, times, fit,
  B = 20, frac = 0.7,
  mode = "subsample",
  jitter_sd = 0.01,
  weight_perturb = 0.05,
  return_distribution = TRUE
)
stab$mean
#> [1] 1

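A higher mean adjusted Rand index (ARI) indicates more reproducible clusters. An ARI of 1 means two partitions agree exactly, so the mean of 1 above says every resample recovered the original clustering.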
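
For readers who want to see what such a score measures, the following sketch computes the ARI by hand and runs a simple subsampling loop with cluster::pam(). It is an illustration under stated assumptions, not the implementation of unsurv_stability(): the toy data, the helper adjusted_rand(), and the choice to compare each subsample against a reference partition on the same rows are all hypothetical.

```r
library(cluster)  # recommended package that ships with R

# Adjusted Rand index between two label vectors, from their contingency table.
adjusted_rand <- function(a, b) {
  tab     <- table(a, b)
  comb2   <- function(x) sum(x * (x - 1) / 2)  # number of pairs
  n_pairs <- comb2(sum(tab))
  idx     <- comb2(tab)
  exp_idx <- comb2(rowSums(tab)) * comb2(colSums(tab)) / n_pairs
  max_idx <- (comb2(rowSums(tab)) + comb2(colSums(tab))) / 2
  (idx - exp_idx) / (max_idx - exp_idx)
}

set.seed(42)
# Toy curves: three exponential hazards plus small noise (hypothetical data).
times <- seq(0, 5, length.out = 30)
haz   <- rep(c(0.18, 0.45, 0.8), each = 30)
S_toy <- sapply(times, function(t) exp(-haz * t))
S_toy <- S_toy + matrix(rnorm(length(S_toy), 0, 0.02), nrow = length(haz))

# Reference partition on all curves.
ref <- pam(S_toy, k = 3)$clustering

# Re-cluster 20 subsamples of 70% of the curves and compare with the reference.
ari <- replicate(20, {
  idx <- sample(nrow(S_toy), size = floor(0.7 * nrow(S_toy)))
  sub <- pam(S_toy[idx, ], k = 3)$clustering
  adjusted_rand(ref[idx], sub)
})
mean(ari)
```

The ARI ignores how clusters are numbered, so a subsample that recovers the same groups under different labels still scores 1.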

Tips and troubleshooting

Reproducibility

Set seed inside unsurv() to make PAM initialization and silhouette-based selection of K deterministic. Vignette figures may still differ slightly because of the random noise added to the simulated curves.
