Type: | Package |
Title: | Nonparametric Change Point Detection for Multivariate Time Series |
Version: | 0.3.0 |
Depends: | R (≥ 4.1.0) |
Maintainer: | Euan T. McGonigle <e.t.mcgonigle@soton.ac.uk> |
License: | GPL (≥ 3) |
Description: | Implements the nonparametric moving sum procedure for detecting changes in the joint characteristic function (NP-MOJO) for multiple change point detection in multivariate time series. See McGonigle, E. T., Cho, H. (2025) <doi:10.1093/biomet/asaf024> for a description of the NP-MOJO methodology. |
Encoding: | UTF-8 |
LinkingTo: | Rcpp |
Imports: | Rcpp, doParallel, parallel, parallelly, foreach, Rfast, iterators, stats |
URL: | https://github.com/EuanMcGonigle/CptNonPar |
BugReports: | https://github.com/EuanMcGonigle/CptNonPar/issues |
RoxygenNote: | 7.3.2 |
Suggests: | covr, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | yes |
Packaged: | 2025-04-16 14:05:12 UTC; euanmcgonigle |
Author: | Euan T. McGonigle [aut, cre], Haeran Cho [aut] |
Repository: | CRAN |
Date/Publication: | 2025-04-16 14:30:21 UTC |
CptNonPar: Nonparametric Change Point Detection for Multivariate Time Series
Description
Implements the nonparametric moving sum procedure for detecting changes in the joint characteristic function (NP-MOJO) for multiple change point detection in multivariate time series. See McGonigle, E. T., Cho, H. (2025) doi:10.1093/biomet/asaf024 for a description of the NP-MOJO methodology.
Author(s)
Maintainer: Euan T. McGonigle e.t.mcgonigle@soton.ac.uk
Authors:
Haeran Cho haeran.cho@bristol.ac.uk
See Also
np.mojo, np.mojo.multilag, multilag.cpts.merge
Examples
set.seed(1)
n <- 500
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
x.c <- np.mojo.multilag(x, G = 83, lags = c(0, 1))
x.c$cpts
x.c$cpt.clusters
Merge Change Point Estimators from Multiple Lags
Description
Merges change point estimators from different lagged values into a final set of overall change point estimators.
Usage
multilag.cpts.merge(
x.c,
eta.merge = 1,
merge.type = c("sequential", "bottom-up")[1]
)
Arguments
x.c |
A |
eta.merge |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, used to merge change point estimators across different lags. |
merge.type |
String indicating the method used to merge change point estimators from different lags. Possible choices are
|
Details
See McGonigle and Cho (2025) for further details.
Value
A list object which contains the following fields:
cpts |
A matrix with rows corresponding to final change point estimators, with estimated change point location and associated lag and importance score given in columns. |
cpt.clusters |
A |
References
McGonigle, E.T., Cho, H. (2025). Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear). doi:10.1093/biomet/asaf024.
Messer, M., Kirchner, M., Schiemann, J., Roeper, J., Neininger, R., Schneider, G. (2014). A Multiple Filter Test for the Detection of Rate Changes in Renewal Processes with Varying Variance. The Annals of Applied Statistics, 8(4), 2027-2067.
See Also
Examples
set.seed(1)
n <- 500
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
x.c0 <- np.mojo(x, G = 83, lag = 0)
x.c1 <- np.mojo(x, G = 83, lag = 1)
x.c <- multilag.cpts.merge(list(x.c0, x.c1))
x.c
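Since eta.merge is expressed relative to the bandwidth, the corresponding distance in time points is eta.merge multiplied by G. The lines below work through that arithmetic in base R with hypothetical estimates; they illustrate one reading of the argument description and are not the package's merging code.

```r
G <- 83
eta.merge <- 1
merge.dist <- eta.merge * G          # distance in time points: 83

# Hypothetical estimates from two lags (not package output)
cpt.lag0 <- 100
cpt.lag1 <- 104

# TRUE here means the two estimates are closer than eta.merge * G
abs(cpt.lag0 - cpt.lag1) < merge.dist
```

Under this reading, a larger eta.merge treats estimators that are further apart as referring to the same change.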
Multiscale Nonparametric Multiple Lag Change Point Detection
Description
For a given set of bandwidths and lagged values of the time series, performs multiscale nonparametric change point detection of a possibly multivariate time series.
Usage
multiscale.np.mojo(
x,
G,
lags = c(0, 1),
kernel.f = c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1],
kern.par = 1,
data.driven.kern.par = TRUE,
threshold = c("bootstrap", "manual")[1],
threshold.val = NULL,
alpha = 0.1,
reps = 200,
boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)),
parallel = FALSE,
boot.method = c("mean.subtract", "no.mean.subtract")[1],
criterion = c("eta", "epsilon", "eta.and.epsilon")[3],
eta = 0.4,
epsilon = 0.02,
use.mean = FALSE,
eta.merge = 1,
merge.type = c("sequential", "bottom-up")[1],
eta.bottom.up = 0.8
)
Arguments
x |
Input data (a |
G |
A numeric vector containing the moving sum bandwidths;
all values in the vector |
lags |
A |
kernel.f |
String indicating which kernel function to use when calculating the NP-MOJO detector statistics; with
|
kern.par |
The tuning parameter that appears in the expression for the kernel function, which acts as a scaling parameter. |
data.driven.kern.par |
A |
threshold |
String indicating how the threshold is computed. Possible values are
|
threshold.val |
The value of the threshold used to declare change points, only to be used if |
alpha |
A numeric value for the significance level with
|
reps |
An integer value for the number of bootstrap replications performed, if |
boot.dep |
A positive value for the strength of dependence in the multiplier bootstrap sequence, if |
parallel |
A |
boot.method |
A string indicating the method for creating bootstrap replications. It is not recommended to change this. Possible choices are
|
criterion |
String indicating how to determine whether each point
|
eta |
A positive numeric value for the minimal mutual distance of
changes, relative to bandwidth (if |
epsilon |
A numeric value in (0,1] for the minimal size of exceeding
environments, relative to moving sum bandwidth (if |
use.mean |
|
eta.merge |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, used to merge change point estimators across different lags. |
merge.type |
String indicating the method used to merge change point estimators from different lags. Possible choices are
|
eta.bottom.up |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, for use in bottom-up merging of change point estimators across multiple bandwidths. |
Details
The multi-lag NP-MOJO algorithm for nonparametric change point detection is described in McGonigle, E. T. and Cho, H. (2025) Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear). The multiscale version uses bottom-up merging to combine the results of the multi-lag NP-MOJO algorithm performed over a given set of bandwidths.
Value
A list object that contains the following fields:
G |
Set of moving window bandwidths |
lags |
Lags used to detect changes |
kernel.f, data.driven.kern.par, use.mean |
Input parameters |
threshold, alpha, reps, boot.dep, boot.method, parallel |
Input parameters |
criterion, eta, epsilon |
Input parameters |
cpts |
A matrix with rows corresponding to final change point estimators, with estimated change point location and associated detection bandwidth, lag and importance score given in columns. |
References
McGonigle, E.T., Cho, H. (2025). Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear). doi:10.1093/biomet/asaf024.
Fan, Y., de Micheaux, P.L., Penev, S., Salopek, D. (2017). Multivariate nonparametric test of independence. Journal of Multivariate Analysis, 153, 189-210.
Messer, M., Kirchner, M., Schiemann, J., Roeper, J., Neininger, R., Schneider, G. (2014). A Multiple Filter Test for the Detection of Rate Changes in Renewal Processes with Varying Variance. The Annals of Applied Statistics, 8(4), 2027-2067.
See Also
Examples
set.seed(1)
n <- 500
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
x.c <- multiscale.np.mojo(x, G = c(50, 80), lags = c(0, 1))
x.c$cpts
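The Details for this function describe the multiscale procedure as multi-lag NP-MOJO run over each bandwidth, followed by bottom-up merging. To inspect the first stage on its own, one option is to call np.mojo.multilag once per bandwidth, as sketched below. This does not reproduce the bottom-up merging step, the names G.set and per.G are illustrative choices, and the package call is guarded so that the data construction still runs where CptNonPar is not installed.

```r
set.seed(1)
n <- 500
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise

G.set <- c(50, 80)   # same bandwidths as in the example above

if (requireNamespace("CptNonPar", quietly = TRUE)) {
  # One multi-lag fit per bandwidth; results are kept in a named list
  per.G <- lapply(G.set, function(G) {
    CptNonPar::np.mojo.multilag(x, G = G, lags = c(0, 1))
  })
  names(per.G) <- paste0("G=", G.set)
  # Change point estimators found at each bandwidth, before any merging across bandwidths
  print(lapply(per.G, function(res) res$cpts))
}
```

Comparing these per-bandwidth results with the cpts field returned by multiscale.np.mojo can help show which bandwidth detected each final estimator.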
Nonparametric Single Lag Change Point Detection
Description
For a given lagged value of the time series, performs nonparametric change point detection of a possibly multivariate time series. If lag $\ell = 0$, then only marginal changes are detected. If lag $\ell \neq 0$, then changes in the pairwise distribution of $(X_t, X_{t+\ell})$ are detected.
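To make the lagged pairs concrete, the sketch below builds the pairs of each observation with the observation lag steps later, in base R. It only illustrates what is being compared; the helper name lagged.pairs is hypothetical and not part of the package, and nothing here is meant to describe how np.mojo forms these pairs internally.

```r
# Hypothetical helper (not part of CptNonPar): pair each observation with
# the observation `lag` steps later.
lagged.pairs <- function(x, lag) {
  x <- as.matrix(x)                          # rows are time points, columns are variables
  n <- nrow(x)
  stopifnot(lag >= 0, lag < n)
  first <- x[1:(n - lag), , drop = FALSE]    # X_t
  second <- x[(1 + lag):n, , drop = FALSE]   # X_{t + lag}
  cbind(first, second)
}

x <- c(1, 2, 3, 4, 5)
lagged.pairs(x, lag = 1)   # 4 rows: (1,2), (2,3), (3,4), (4,5)
lagged.pairs(x, lag = 0)   # 5 rows: each observation paired with itself
```

With lag 0 the second column duplicates the first, which matches the statement that only marginal changes are detected in that case.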
Usage
np.mojo(
x,
G,
lag = 0,
kernel.f = c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1],
kern.par = 1,
data.driven.kern.par = TRUE,
alpha = 0.1,
threshold = c("bootstrap", "manual")[1],
threshold.val = NULL,
reps = 200,
boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)),
parallel = FALSE,
boot.method = c("mean.subtract", "no.mean.subtract")[1],
criterion = c("eta", "epsilon", "eta.and.epsilon")[3],
eta = 0.4,
epsilon = 0.02,
use.mean = FALSE
)
Arguments
x |
Input data (a |
G |
An integer value for the moving sum bandwidth;
|
lag |
The lagged values of the time series used to detect changes. If |
kernel.f |
String indicating which kernel function to use when calculating the NP-MOJO detector statistics; with
|
kern.par |
The tuning parameter that appears in the expression for the kernel function, which acts as a scaling parameter,
only to be used if |
data.driven.kern.par |
A |
alpha |
A numeric value for the significance level with
|
threshold |
String indicating how the threshold is computed. Possible values are
|
threshold.val |
The value of the threshold used to declare change points, only to be used if |
reps |
An integer value for the number of bootstrap replications performed, if |
boot.dep |
A positive value for the strength of dependence in the multiplier bootstrap sequence, if |
parallel |
A |
boot.method |
A string indicating the method for creating bootstrap replications. It is not recommended to change this. Possible choices are
|
criterion |
String indicating how to determine whether each point
|
eta |
A positive numeric value for the minimal mutual distance of
changes, relative to bandwidth (if |
epsilon |
A numeric value in (0,1] for the minimal size of exceeding
environments, relative to moving sum bandwidth (if |
use.mean |
|
Details
The single-lag NP-MOJO algorithm for nonparametric change point detection is described in McGonigle, E. T. and Cho, H. (2025) Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear).
Value
A list object that contains the following fields:
x |
Input data |
G |
Moving window bandwidth |
lag |
Lag used to detect changes |
kernel.f, data.driven.kern.par, use.mean |
Input parameters |
kern.par |
The value of the kernel tuning parameter |
threshold, alpha, reps, boot.dep, boot.method, parallel |
Input parameters |
threshold.val |
Threshold value for declaring change points |
criterion, eta, epsilon |
Input parameters |
test.stat |
A vector containing the NP-MOJO detector statistics computed from the input data |
cpts |
A vector containing the estimated change point locations |
scores |
The corresponding importance scores of the estimated change points. The larger the score, the more likely it is that a change point exists close to the estimated location. If the bootstrap method is used, this is a value between 0 and 1 corresponding to the proportion of times the observed detector statistic was larger than the bootstrapped detector statistics. Otherwise, the importance score is simply the value of the detector statistic at the estimated change point location (which is not necessarily less than 1). |
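The bootstrap-based score described for the scores field is a simple proportion. The sketch below illustrates the arithmetic in base R with made-up numbers; obs.stat and boot.stats are hypothetical values, not output of the package, and the exact quantities the package compares are not spelled out here.

```r
set.seed(1)
# Hypothetical values for illustration only
obs.stat <- 2.5                      # observed detector statistic at an estimated change point
boot.stats <- rexp(200, rate = 1)    # 200 stand-in bootstrapped statistics (cf. reps = 200)

# Proportion of bootstrap replicates that the observed statistic exceeds
score <- mean(obs.stat > boot.stats)
score
```

A score close to 1 means the observed statistic exceeded nearly all bootstrap replicates.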
References
McGonigle, E.T., Cho, H. (2025). Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear). doi:10.1093/biomet/asaf024.
Fan, Y., de Micheaux, P.L., Penev, S., Salopek, D. (2017). Multivariate nonparametric test of independence. Journal of Multivariate Analysis, 153, 189-210.
See Also
Examples
set.seed(1)
n <- 500
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
x.c <- np.mojo(x, G = 83, lag = 0)
x.c$cpts
x.c$scores
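The default boot.dep shown in the Usage for this function depends only on the number of observations. As a quick check of the arithmetic, the following base R lines evaluate that default expression for a univariate series and for a hypothetical two-column matrix of the same length.

```r
n <- 500
x <- rnorm(n)                       # univariate series
X <- cbind(rnorm(n), rnorm(n))      # hypothetical bivariate series, one row per time point

# as.matrix() turns a vector into an n x 1 matrix, so nrow() is the
# series length in both cases
boot.dep.x <- 1.5 * (nrow(as.matrix(x))^(1/3))
boot.dep.X <- 1.5 * (nrow(as.matrix(X))^(1/3))
c(boot.dep.x, boot.dep.X)           # both approximately 11.9 for n = 500
```

The default therefore grows slowly with the series length and does not depend on the number of variables.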
Nonparametric Multiple Lag Change Point Detection
Description
For a given set of lagged values of the time series, performs nonparametric change point detection of a possibly multivariate time series.
Usage
np.mojo.multilag(
x,
G,
lags = c(0, 1),
kernel.f = c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1],
kern.par = 1,
data.driven.kern.par = TRUE,
threshold = c("bootstrap", "manual")[1],
threshold.val = NULL,
alpha = 0.1,
reps = 200,
boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)),
parallel = FALSE,
boot.method = c("mean.subtract", "no.mean.subtract")[1],
criterion = c("eta", "epsilon", "eta.and.epsilon")[3],
eta = 0.4,
epsilon = 0.02,
use.mean = FALSE,
eta.merge = 1,
merge.type = c("sequential", "bottom-up")[1]
)
Arguments
x |
Input data (a |
G |
An integer value for the moving sum bandwidth;
|
lags |
A |
kernel.f |
String indicating which kernel function to use when calculating the NP-MOJO detector statistics; with
|
kern.par |
The tuning parameter that appears in the expression for the kernel function, which acts as a scaling parameter. |
data.driven.kern.par |
A |
threshold |
String indicating how the threshold is computed. Possible values are
|
threshold.val |
The value of the threshold used to declare change points, only to be used if |
alpha |
A numeric value for the significance level with
|
reps |
An integer value for the number of bootstrap replications performed, if |
boot.dep |
A positive value for the strength of dependence in the multiplier bootstrap sequence, if |
parallel |
A |
boot.method |
A string indicating the method for creating bootstrap replications. It is not recommended to change this. Possible choices are
|
criterion |
String indicating how to determine whether each point
|
eta |
A positive numeric value for the minimal mutual distance of
changes, relative to bandwidth (if |
epsilon |
A numeric value in (0,1] for the minimal size of exceeding
environments, relative to moving sum bandwidth (if |
use.mean |
|
eta.merge |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, used to merge change point estimators across different lags. |
merge.type |
String indicating the method used to merge change point estimators from different lags. Possible choices are
|
Details
The multi-lag NP-MOJO algorithm for nonparametric change point detection is described in McGonigle, E. T. and Cho, H. (2025) Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear).
Value
A list object that contains the following fields:
G |
Moving window bandwidth |
lags |
Lags used to detect changes |
kernel.f, data.driven.kern.par, use.mean |
Input parameters |
threshold, alpha, reps, boot.dep, boot.method, parallel |
Input parameters |
criterion, eta, epsilon |
Input parameters |
cpts |
A matrix with rows corresponding to final change point estimators, with estimated change point location and associated lag and importance score given in columns. |
cpt.clusters |
A |
References
McGonigle, E.T., Cho, H. (2025). Nonparametric data segmentation in multivariate time series via joint characteristic functions. Biometrika (to appear). doi:10.1093/biomet/asaf024.
Fan, Y., de Micheaux, P.L., Penev, S., Salopek, D. (2017). Multivariate nonparametric test of independence. Journal of Multivariate Analysis, 153, 189-210.
Messer, M., Kirchner, M., Schiemann, J., Roeper, J., Neininger, R., Schneider, G. (2014). A Multiple Filter Test for the Detection of Rate Changes in Renewal Processes with Varying Variance. The Annals of Applied Statistics, 8(4), 2027-2067.
See Also
Examples
set.seed(1)
n <- 500
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
x.c <- np.mojo.multilag(x, G = 83, lags = c(0, 1))
x.c$cpts
x.c$cpt.clusters
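The examples in this manual use a univariate series. As a sketch of the multivariate case, the code below simulates a bivariate series with a mean change in the first component and passes it as a matrix with one row per time point. The row-per-observation layout is an assumption inferred from the default boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)), and the object names x1, x2, X and X.c are illustrative. The package call is guarded so that the data construction still runs where CptNonPar is not installed.

```r
set.seed(1)
n <- 500

# Two AR(1) components; the first has a mean shift of size 2 at time 250
x1 <- as.numeric(stats::arima.sim(model = list(ar = 0.3), n = n)) +
  c(rep(0, 250), rep(2, 250))
x2 <- as.numeric(stats::arima.sim(model = list(ar = 0.3), n = n))

X <- cbind(x1, x2)   # n x 2 matrix: rows are time points (assumed layout)

if (requireNamespace("CptNonPar", quietly = TRUE)) {
  X.c <- CptNonPar::np.mojo.multilag(X, G = 83, lags = c(0, 1))
  print(X.c$cpts)
}
```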