library(Colossus)
library(data.table)
For Cox Proportional Hazards regression, the model is generally assumed to be independent of event time. For more complex models, however, Colossus can perform regression using covariates that change over time. These fall into two general types: step functions that change with time, and multiplicative interactions with time. Colossus generates the new dataset by splitting each row of the original dataset into smaller intervals, assuming that the value of every covariate is approximately constant over each interval. For Cox Proportional Hazards, rows that do not contain an event time are not used for regression, so Colossus has an option to use only small intervals around each event time. With this option the time-dependent covariate is evaluated only at event times. For datasets with a small number of discrete event times, this can save memory.
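The idea of keeping only short intervals around event times can be sketched in base R. This is an illustration under stated assumptions, not the Colossus implementation: the helper name `split_at_events`, the interval width `dt`, and the convention that each short interval ends exactly at an event time are all hypothetical choices made for this example.

```r
# Hypothetical helper, not part of Colossus: replace one row's follow-up
# interval (t0, t1] with short intervals that end at each event time.
split_at_events <- function(t0, t1, event_times, dt = 0.5) {
  # Keep only the distinct event times that fall inside this row's interval
  inside <- sort(unique(event_times[event_times > t0 & event_times <= t1]))
  data.frame(
    start = pmax(inside - dt, t0), # never start before the original row
    end = inside
  )
}

# One row followed from time 0 to 10; events in the full dataset at 2, 4, 4, 12
pieces <- split_at_events(t0 = 0, t1 = 10, event_times = c(2, 4, 4, 12))
print(pieces)
```

The row is reduced to one short interval per distinct event time inside its follow-up, so the number of generated rows grows with the number of distinct event times rather than with the length of follow-up.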
The simplest type of time-dependent covariate is an interaction term between time and another covariate. Suppose we have a row in a dataset with a factor covariate “group” and some arbitrary endpoints to the time interval. Colossus starts by using a user-provided function to calculate the value of the time-dependent covariate at the endpoints. We assume that the value of “group” is constant over the interval and that time changes linearly. Colossus then calculates the value of the time-dependent covariate within the interval by linearly interpolating between the values at the endpoints. This process assumes that the interaction is linear, or that the interval is small enough for the interaction to be approximately linear.
# Covariate values at the interval endpoints, joined by straight lines
dft <- data.table("x" = c(1, 2, 3), "y" = c(2, 5, 10))
g <- ggplot2::ggplot(dft, ggplot2::aes(x = .data$x, y = .data$y)) +
  ggplot2::geom_point(color = "black") +
  ggplot2::geom_line(color = "black", alpha = 1) +
  ggplot2::labs(x = "age (days)", y = "Covariate Value")
# True curve, drawn dashed for comparison with the linear interpolation
x <- seq(1, 3, by = 0.1)
y <- 1 + x^2
dft <- data.table("x" = x, "y" = y)
g <- g + ggplot2::geom_line(
  data = dft, ggplot2::aes(x = .data$x, y = .data$y),
  color = "black", linetype = "dashed"
)
g
\[ \begin{aligned} Y(x)=x^2 + 1 \end{aligned} \]
This is most helpful in a situation where the user has continuous data over a series of intervals and believes that the values can be interpolated within each interval.
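To see how interval width affects the linear approximation, the following base R sketch compares the curve \(Y(x)=x^2+1\) with its linear interpolation for two interval widths. The function names `f` and `max_interp_error` are hypothetical and exist only for this illustration; `approx` is the standard base R interpolation routine, not a Colossus function.

```r
# Covariate used in the example: Y(x) = x^2 + 1
f <- function(x) x^2 + 1

# Largest gap between the true curve and its linear interpolation
# when the range [1, 3] is cut into intervals of width `step`.
max_interp_error <- function(step) {
  knots <- seq(1, 3, by = step) # interval endpoints
  grid <- seq(1, 3, by = 0.01)  # fine grid used for the comparison
  interp <- approx(knots, f(knots), xout = grid, rule = 2)$y
  max(abs(interp - f(grid)))
}

err_wide <- max_interp_error(1)      # endpoints at 1, 2, 3
err_narrow <- max_interp_error(0.25) # endpoints every 0.25
print(c(wide = err_wide, narrow = err_narrow))
```

For a quadratic covariate the gap on an interval of width \(h\) is at most \(h^2/4\), so narrower intervals reduce the error quickly. That is the sense in which the interval should be "small enough" for the interaction to be approximately linear.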
The second type of time-dependent covariate is one that changes based on conditional statements. One example is a covariate that splits data into bins by time. Colossus uses a string to identify where the covariate changes value. The user inputs a string of the form “#l?” for a time value “#”, a condition “l”, and a question mark as a delimiter. Colossus allows for four conditions:
So the following would be equivalent to “\(0g?6g?12g?\)”
# Covariate values at sample times
dft <- data.table("x" = c(-1, 1, 5, 8, 13), "y" = c(0, 1, 1, 2, 3))
g <- ggplot2::ggplot(dft, ggplot2::aes(x = .data$x, y = .data$y)) +
  ggplot2::geom_point(color = "black")
# Step function with jumps at 0, 6, and 12
dft <- data.table("x" = c(-1, -0.01, 0, 1, 5.99, 6, 11.99, 12, 13), "y" = c(0, 0, 1, 1, 1, 2, 2, 3, 3))
g <- g + ggplot2::geom_line(data = dft, ggplot2::aes(x = .data$x, y = .data$y), color = "black") +
  ggplot2::labs(x = "age (days)", y = "Covariate Value")
g
\[ \begin{aligned} Y(x)=\begin{cases} 0 & (x < 0) \\ 1 & (0 \le x < 6) \\ 2 & (6 \le x < 12) \\ 3 & (x \ge 12) \end{cases} \end{aligned} \]
Meanwhile the following is equivalent to “\(0g?6g?12l?\)”
# Covariate values at sample times
dft <- data.table("x" = c(-1, 1, 5, 8, 13), "y" = c(1, 2, 2, 3, 2))
g <- ggplot2::ggplot(dft, ggplot2::aes(x = .data$x, y = .data$y)) +
  ggplot2::geom_point(color = "black")
# Step function that rises at 0 and 6, then drops at 12
dft <- data.table("x" = c(-1, -0.01, 0, 1, 5.99, 6, 11.99, 12, 13), "y" = c(1, 1, 2, 2, 2, 3, 3, 2, 2))
g <- g + ggplot2::geom_line(data = dft, ggplot2::aes(x = .data$x, y = .data$y), color = "black") +
  ggplot2::labs(x = "age (days)", y = "Covariate Value")
g
\[ \begin{aligned} Y(x)=\begin{cases} 1 & (x < 0) \\ 2 & (0 \le x < 6) \\ 3 & (6 \le x < 12) \\ 2 & (x \ge 12) \end{cases} \end{aligned} \]
This is most helpful in situations where the user has reason to believe that the effect of a covariate on events is not uniform over time despite the covariate being constant over each interval. This allows the user to generate a list of factors to interact with any covariate of interest.
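The two example strings can be checked against the plotted values with a small base R evaluator. This is a hypothetical sketch, not the Colossus parser. It assumes that each “#g?” term adds one when time is greater than or equal to “#”, and that each “#l?” term adds one when time is below “#”. Both assumptions were chosen only because they reproduce the plotted points, and the behavior exactly at a boundary in Colossus may differ. The other two conditions are not covered.

```r
# Hypothetical evaluator, not the Colossus parser.
# Assumption: "#g?" adds 1 when time >= #, "#l?" adds 1 when time < #.
step_value <- function(spec, time) {
  # Split on the "?" delimiter and drop any empty pieces
  terms <- strsplit(spec, "?", fixed = TRUE)[[1]]
  terms <- terms[nzchar(terms)]
  total <- numeric(length(time))
  for (term in terms) {
    cond <- substring(term, nchar(term))                   # last character
    cut <- as.numeric(substring(term, 1, nchar(term) - 1)) # leading number
    hit <- switch(cond,
      g = time >= cut,
      l = time < cut,
      stop("condition not covered by this sketch: ", cond)
    )
    total <- total + hit
  }
  total
}

times <- c(-1, 1, 5, 8, 13)
all_g <- step_value("0g?6g?12g?", times)
mixed <- step_value("0g?6g?12l?", times)
print(rbind(times, all_g, mixed))
```

Under these assumptions the covariate is a count of the conditions satisfied at each time, which is why the second string starts at one for negative times and drops back to two after time 12.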