The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Here we provide a brief tutorial of the BayesChange
package. The BayesChange
package contains functions that
perform change points detection on univariate and multivariate time
series and functions that perform clustering of time series and survival
functions with common change points. Here we briefly show how to
implement the main functions.
Two functions of the package, detect_cp_uni
and
detect_cp_multi
, provide respectively a method for
detecting change points on univariate and multivariate time series. The
function detect_cp_uni
is based on the work Martínez and Mena (2014) and the function
detect_cp_multi
on the work Corradin, Danese, and Ongaro (2022).
In order to use the function detect_cp_uni
we need a
vector of observations. For example we can create a vector of 100
observations where the first 50 observations are sampled from a normal
distribution with mean 0 and variance 0.1 and the other 50 observations
still from a normal distribution with mean 0 but variance 0.25.
Now we can run the function detect_cp_uni
, as arguments
of the function we need to specify the number of iterations, the
probability q
of performing a split at each step, the
autoregressive coefficient phi
for the likelihood of the
data and the parameters a
, b
, c
for the priors.
out <- detect_cp_uni(data = data_uni,
n_iterations = 1000,
q = 0.25,
phi = 0.1, a = 1, b = 1, c = 0.1)
In order to get a point estimate of the final partition we can use
the function get_clust_VI
that choose as final estimate for
the partition the visited partition that minimizes the Variation of
Information loss function.
The function correctly detect a change point at time 50.
Remark: the function get_clust_VI
has been
included inside the package in order to give the user an easy way to
obtain a final estimate of the partition. In literature there are many
other methods based on other loss functions and other algorithms that
estimate a final partition from a sequence of visited partition by an
MCMC algorithm. See for example David B. Dahl and
Müller (2022)
Function detect_cp_multi
work similarly to its
univariate counterpart. Data must be a matrix where each row is a time
series.
data_multi <- matrix(NA, nrow = 3, ncol = 100)
data_multi[1,] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_multi[2,] <- as.numeric(c(rnorm(50,0,0.125), rnorm(50,1,0.225)))
data_multi[3,] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
Arguments k_0
, nu_0
, phi_0
,
m_0
, par_theta_c
, par_theta_d
and
prior_var_gamma
correspond to the parameters of the prior
distributions for the multivariate likelihood.
BayesChange
contains two functions,
clust_cp_uni
and clust_cp_multi
, that cluster
respectively univariate and multivariate time series with common change
points. Details about this methods can be found in Corradin et al. (2024).
In clust_cp_uni
data are a matrix where each row is a
time series.
data_mat <- matrix(NA, nrow = 5, ncol = 100)
data_mat[1,] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_mat[2,] <- as.numeric(c(rnorm(50,0,0.125), rnorm(50,1,0.225)))
data_mat[3,] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_mat[4,] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_mat[5,] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
Arguments that need to be specified in clust_cp_uni
are
the number of iterations n_iterations
, the number of
elements in the normalisation constant B
, the
split-and-merge step L
performed when a new partition is
proposed and the autoregressive coefficient gamma
for the
likelihood of the time series.
out <- clust_cp_uni(data = data_mat, n_iterations = 1000, B = 1000, L = 1, gamma = 0.5)
get_clust_VI(out$clust)
In clust_cp_multi
data must be an array, where each
element is a multivariate time series represented by a matrix. Each row
of the matrix is a component of the time series.
data_array <- array(data = NA, dim = c(3,100,5))
data_array[1,,1] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[2,,1] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[3,,1] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[1,,2] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[2,,2] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[3,,2] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[1,,3] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_array[2,,3] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_array[3,,3] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_array[1,,4] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_array[2,,4] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_array[3,,4] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_array[1,,5] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
data_array[2,,5] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
data_array[3,,5] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
Parameters for the algorithm are the same of
clust_cp_uni
, parameters of the multivariate likelihood
correspond to those in detect_cp_multi
Functions clust_cp_epi
provide a method for clustering
survival functions with common change points. Also here details can be
found in Corradin et al. (2024).
Data are a matrix where each row is the number of infected at each
time. Inside this package is included the function
sim_epi_data
that simulates infection times.
data_mat <- matrix(NA, nrow = 5, ncol = 50)
betas <- list(c(rep(0.45, 25),rep(0.14,25)),
c(rep(0.55, 25),rep(0.11,25)),
c(rep(0.50, 25),rep(0.12,25)),
c(rep(0.52, 10),rep(0.15,40)),
c(rep(0.53, 10),rep(0.13,40)))
inf_times <- list()
for(i in 1:5){
inf_times[[i]] <- sim_epi_data(S0 = 10000, I0 = 10, max_time = 50, beta_vec = betas[[i]], gamma_0 = 1/8)
vec <- rep(0,50)
names(vec) <- as.character(1:50)
for(j in 1:50){
if(as.character(j) %in% names(table(floor(inf_times[[i]])))){
vec[j] = table(floor(inf_times[[i]]))[which(names(table(floor(inf_times[[i]]))) == j)]
}
}
data_mat[i,] <- vec
}
In clust_cp_epi
we need to specify, besides the usual
parameters, the number of Monte Carlo replications M
for
the approximation of the integrated likelihood and the recovery rate
gamma
.
out <- clust_cp_epi(data = data_mat, n_iterations = 1000, M = 250, B = 1000, L = 1, q = 0.1, gamma = 1/8)
get_clust_VI(out$clust)
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.