Type: | Package |
Title: | Wasserstein Regression and Inference |
Version: | 0.2.0 |
Author: | Xi Liu [aut, cre], Chao Zhang [aut], Matthew Coleman [aut], Alexander Petersen [aut] |
Description: | Implementation of the methodologies described in 1) Alexander Petersen, Xi Liu and Afshin A. Divani (2021) <doi:10.1214/20-aos1971>, including global F tests, partial F tests, intrinsic Wasserstein-infinity bands and Wasserstein density bands, and 2) Chao Zhang, Piotr Kokoszka and Alexander Petersen (2022) <doi:10.1111/jtsa.12590>, including estimation, prediction, and inference of the Wasserstein autoregressive models. |
License: | GPL-2 |
Encoding: | UTF-8 |
LazyData: | true |
LazyDataCompression: | xz |
LinkingTo: | Rcpp, RcppArmadillo |
Depends: | R (≥ 3.6.0) |
Imports: | fdapace (≥ 0.2.0), fdadensity (≥ 0.1.2), Rfast (≥ 1.9.8), CVXR (≥ 0.99.7), expm (≥ 0.999-4), ggplot2 (≥ 3.2.1), gridExtra (≥ 2.3), stats, Rcpp (≥ 1.0.3), locfit (≥ 1.5-9.1), mvtnorm (≥ 1.1-0), locpol (≥ 0.7), modeest (≥ 2.4.0), methods, rlang, polynom |
RoxygenNote: | 7.2.0 |
Suggests: | knitr, rmarkdown, testthat (≥ 2.1.0) |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2022-07-08 22:56:17 UTC; czhang |
Maintainer: | Xi Liu <xiliu@ucsb.edu> |
Repository: | CRAN |
Date/Publication: | 2022-07-08 23:30:11 UTC |
Numerical implementation of the Exponential map
Description
This function implements the Exponential map to calculate \hat{F}_t(u)
for all u
in the cdf/pdf support
Usage
Exp_Map_Barycenter_Method(density.grid, forecast, cdf)
Arguments
density.grid |
The values where the cdf |
forecast |
A vector that contains the WAR(p) model forecast result |
cdf |
The quantile grid used in forecasting |
Value
A numeric vector that contains \hat{F}_t(u)
evaluated over density.grid
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
Numerical implementation of the pointwise Exponential map
Description
This function implements the Exponential map to calculate \hat{F}_t(u)
for a fixed u
Usage
Exp_Map_Barycenter_Method_pw(u, forecast, cdf)
Arguments
u |
The value where the cdf |
forecast |
A vector that contains the WAR(p) model forecast result |
cdf |
The quantile grid used in forecasting |
Value
A numeric value \hat{F}_t(u)
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
Function for calculating sample autocovariance
Description
Calculating sample autocovariance of a specified lag for a centered time series
Usage
Sample_ACV(ts, lag)
Arguments
ts |
A numeric vector consisting of sequentially collected data |
lag |
A positive integer indicates the lag of the autocovariance function |
Value
Sample autocovariance
WAR(p) models: estimation and forecast
Description
this function produces an object of the WARp class which includes WAR(p) model parameter estimates and relevant quantities (see output list)
Usage
WARp(quantile, quantile.grid, p)
Arguments
quantile |
A matrix containing all the sample quantile functions. Columns represent time indices and rows represent evaluation grid. |
quantile.grid |
A numeric vector, the grid over which quantile functions are evaluated. |
p |
A positive integer, the order of the fitted WAR(p) model. |
Details
This function takes in a density time series in the form of the corresponding quantile functions as the main input. If the quantile series is not readily available, a general practice is to estimate density functions from samples, then use dens2quantile
from the fdadensity
package to convert density time series to quantile series.
Value
A WARp
object of:
coef |
estimated AR parameters of the fitted WAR(p) model |
coef.cov |
covariance matrix of |
acvf |
Wasserstein autocovariance function values |
Wass.mean |
Wasserstein mean quantile function |
quantile |
a matrix containing all the sample quantile functions (columns represent time indices and rows represent evaluation grid) |
quantile.grid |
quantile function grid that is utilized in calculation |
order |
a positive integer, the order of the fitted WAR(p) model |
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
Examples
# Simulate a density time series represented in quantile functions
# warSimData$sample.ts: A sample TS of quantile functions of length 100, taken from
# the simulation experiments in Section 4 of Zhang et al. 2022.
# warSimData$quantile.grid: The grid over which quantile functions in sample.ts are evaluated.
warSimData <- warSim()
p <- 3
dSup <- seq(-2, 2, 0.02)
expSup <- seq(-2, 2, 0.1)
# Estimation: fit a WAR(3) model
WARp_obj <- WARp(warSimData$sample.ts, warSimData$quantile.grid, p)
# Forecast: one-step-ahead forecast
forecast_1 <- predict(WARp_obj) # dSup and expSup are chosen automatically
forecast_2 <- predict(WARp_obj, dSup, expSup) # dSup and expSup are chosen by user
# Plots
par(mfrow=c(1,2))
plot(forecast_1$dSup, forecast_1$pred.cdf, type="l", xlab="dSup", ylab="cdf")
plot(forecast_1$dSup, forecast_1$pred.pdf, type="l", xlab="dSup", ylab="pdf")
plot(forecast_2$dSup, forecast_2$pred.cdf, type="l", xlab="dSup", ylab="cdf")
plot(forecast_2$dSup, forecast_2$pred.pdf, type="l", xlab="dSup", ylab="pdf")
Function for calculating sample Wasserstein autocovariance functions
Description
This function uses a time series of quantile functions to calculate the sample Wasserstein autocovariance functions from order 0
to p
with a specified training window
Usage
WARp_acvfs(end.day, training.size, quantile, quantile.grid, p)
Arguments
end.day |
A positive integer, the last index of the training window. |
training.size |
A positive integer, the size of the training widnows. |
quantile |
A matrix containing all the available quantile functions. Columns represent time indices and rows represent evaluation grid. |
quantile.grid |
A numeric vector, the grid over which quantile functions are evaluated. |
p |
A positive integer, the maximum order of the sample Wasserstein autocovariance functions. |
Value
A list with
acvfs - The sample Wasserstein autocovariance functions from order
0
top
barycenter - The sample average of the quantile functions in the training window
quantile.pred - The quantile functions from
end.day - p + 1
toend.day
Forecast using WAR(p) models
Description
This is a simplified function that produces the one-step ahead forecasts from WAR(p) models in the tangent space
Usage
WARp_forecast_tangent(p, acvfs)
Arguments
p |
A positive integer, the order of the WAR(p) model. |
acvfs |
A list that is the output of |
Value
A numeric vector, the one-step ahead forecast produced by the WAR(p) model in the tangent space.
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
Confidence Bands for Wasserstein Regression
Description
Confidence Bands for Wasserstein Regression
Usage
confidenceBands(
wass_regress_res,
Xpred_df,
level = 0.95,
delta = 0.01,
type = "density",
figure = TRUE,
fig_num = NULL
)
Arguments
wass_regress_res |
an object returned by the |
Xpred_df |
k-by-p matrix (or dataframe, or named vector) used for prediction. Note that Xpred_df should have the same column names with Xfit_df used in wass_regress_res |
level |
confidence level |
delta |
boundary control value in density band computation. Must be a value in the interval (0, 1/2) (default: 0.01) |
type |
'density', 'quantile' or 'both'
|
figure |
logical; if TRUE, return a sampled plot (default: TRUE) |
fig_num |
the fig_num-th row of |
Details
This function computes intrinsic confidence bands for Xpred_df
if type
= 'quantile' and density bands if type
= 'density', and visualizes the confidence and/or density bands when figure
= TRUE.
Value
a list containing the following lists:
den_list: |
|
quan_list: |
|
cdf_list: |
|
Examples
alpha = 2
beta = 1
n = 50
x1 = runif(n)
t_vec = unique(c(seq(0, 0.05, 0.001), seq(0.05, 0.95, 0.05), seq(0.95, 1, 0.001)))
set.seed(1)
quan_obs = simulate_quantile_curves(x1, alpha, beta, t_vec)
Xfit_df = data.frame(x1 = x1)
res = wass_regress(rightside_formula = ~., Xfit_df = Xfit_df,
Ytype = 'quantile', Ymat = quan_obs, Sup = t_vec)
confidence_Band = confidenceBands(res, Xpred_df = data.frame(x1 = c(-0.5,0.5)),
type = 'both', fig_num = 2)
data(strokeCTdensity)
predictor = strokeCTdensity$predictors
dSup = strokeCTdensity$densitySupport
densityCurves = strokeCTdensity$densityCurve
xpred = predictor[2:3, ]
res = wass_regress(rightside_formula = ~., Xfit_df = predictor,
Ytype = 'density', Ymat = densityCurves, Sup = dSup)
confidence_Band = confidenceBands(res, Xpred_df = xpred, type = 'density', fig_num = 1)
convert density function to quantile and quantile density function
Description
convert density function to quantile and quantile density function
Usage
den2Q_qd(densityCurves, dSup, t_vec)
Arguments
densityCurves |
n-by-m matrix of density curves |
dSup |
length m vector contains the common support grid of the density curves |
t_vec |
common grid for quantile functions |
Calculating innovations in WAR(p) models
Description
This function calculates innovations in WAR(p) models
Usage
getInnovation(quantile, quantile.grid, p)
Arguments
quantile |
A matrix containing all the available quantile functions. Columns represent time indices and rows represent evaluation grid. |
quantile.grid |
A numeric vector, the grid over which quantile functions are evaluated. |
p |
A positive integer, the order of the fitted WAR(p) model. |
Value
A list with
innovation - The tangent space innovations evaluated over the quantile grid. Fitting a WAR(p) model for
n
observations will producen-2p
innovations.cov.surf - The covariance surface
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
An internal function used in bootstrap global F test
Description
An internal function used in bootstrap global F test
quick wasserstein regression fitting function with smoothed input
Usage
globalFstat(xfit, Qobs, t_vec)
short_wass_regress(Xfit_df, smoothY)
global F test for Wasserstein regression
Description
global F test for Wasserstein regression
Usage
globalFtest(
wass_regress_res,
alpha = 0.05,
permutation = FALSE,
numPermu = 200,
bootstrap = FALSE,
numBoot = 200
)
Arguments
wass_regress_res |
an object returned by the |
alpha |
type one error rate |
permutation |
logical; perform permutation global F test (default: FALSE) |
numPermu |
number of permutation samples if permutation = TRUE |
bootstrap |
logical; bootstrap global F test (default: FALSE) |
numBoot |
number of bootstrap samples if bootstrap = TRUE |
Details
four methods used to compute p value of global F test
truncated: asymptotic inference, p-value is obtained by truncating the infinite summation of eigenvalues into the first K terms, where the first K terms explain more than 99.99% of the variance.
satterthwaite: asymptotic inference, p-value is computed using Satterthwaite's approximation method of mixtures of chi-square.
permutation: resampling technique; Wasserstein SSR is used as the F statistic.
bootstrap: resampling technique; Wasserstein SSR is used as the F statistic.
Value
a list containing the following fields:
wasserstein.F_stat |
the Wasserstein F statistic value in Satterthwaite method . |
chisq_df |
the degree of freedom of the null chi-square distribution. |
summary_df |
a dataframe containing the following columns: |
method: methods used to compute p value, see details
statistic: the test statistics
critical_value: critical value
p_value: p value of global F test
Examples
data(strokeCTdensity)
predictor = strokeCTdensity$predictors
dSup = strokeCTdensity$densitySupport
densityCurves = strokeCTdensity$densityCurve
res = wass_regress(rightside_formula = ~., Xfit_df = predictor,
Ytype = 'density', Ymat = densityCurves, Sup = dSup)
globalF_res = globalFtest(res, alpha = 0.05, permutation = TRUE, numPermu = 200)
partial F test for Wasserstein regression
Description
partial F test for Wasserstein regression
Usage
partialFtest(reduced_res, full_res, alpha = 0.05)
Arguments
reduced_res |
a reduced model list returned by the |
full_res |
a full model list returned by the |
alpha |
type one error rate |
Details
two methods used to compute p value using asymptotic distribution of F statistic
truncated: asymptotic inference, p-value is obtained by truncating the infinite summation of eigenvalues into the first K terms, where the first K terms explain more than 99.99% of the variance.
satterthwaite: asymptotic inference, p-value is computed using Satterthwaite approximation method of mixtures of chi-square.
Value
a dataframe containing the following columns:
method |
methods used to compute p value, see details |
statistic |
the test statistics |
critical_value |
critical value |
p_value |
p value of global F test |
Examples
data(strokeCTdensity)
predictor = strokeCTdensity$predictors
dSup = strokeCTdensity$densitySupport
densityCurves = strokeCTdensity$densityCurve
full_res <- wass_regress(rightside_formula = ~., Xfit_df = predictor,
Ymat = densityCurves, Ytype = 'density', Sup = dSup)
reduced_res <- wass_regress(~ log_b_vol + b_shapInd + midline_shift + B_TimeCT, Xfit_df = predictor,
Ymat = densityCurves, Ytype = 'density', Sup = dSup)
partialFtable = partialFtest(reduced_res, full_res, alpha = 0.05)
Prediction by WAR(p) models
Description
a method of the WARp class which produces a one-step ahead prediction by WAR(p) models
Usage
## S3 method for class 'WARp'
predict(object, dSup, expSup, ...)
Arguments
object |
A WARp object, the output of |
dSup |
Optional, a numeric vector, the grid over which forecasted cdf/pdf is evaluated. Should be supplied/ignored with |
expSup |
Optional, a numeric vector, the grid over the Exponential map is applied, |
... |
Further arguments passed to or from other methods. |
Value
A list of:
pred.cdf |
predicted cdf |
pred.pdf |
predicted pdf |
dSup |
support of the predicted cdf/pdf |
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
See Also
print the summary of WRI object
Description
print the summary of WRI object
Usage
## S3 method for class 'summary.WRI'
print(x, ...)
Arguments
x |
a 'summary.WRI' object |
... |
further arguments passed to or from other methods. |
An internal function to do quadratic program in order to make Qfitted nondescreasing
Description
An internal function to do quadratic program in order to make Qfitted nondescreasing
Usage
quadraticQ(Qmat, t)
convert density function to quantile and quantile density function
Description
convert density function to quantile and quantile density function
Usage
quan2den_qd(quantileCurves, t_vec)
Arguments
quantileCurves |
n-by-m matrix of quantile curves |
t_vec |
length m vector contains the common support grid of the quantile curves |
Simulate quantile curves
Description
This function simulates quantile curves used as a toy example
Usage
simulate_quantile_curves(x1, alpha, beta, t_vec)
Arguments
x1 |
n-by-1 predictor vector |
alpha |
parameter in location transformation |
beta |
parameter in variance transformation |
t_vec |
a length m vector - common grid for all quantile functions |
Value
quan_obs n-by-m matrix of quantile functions
References
Wasserstein F-tests and confidence bands for the Frechet regression of density response curves, Alexander Petersen, Xi Liu and Afshin A. Divani, 2019
Examples
alpha = 2
beta = 1
n = 100
x1 = runif(n)
t_vec = unique(c(seq(0, 0.05, 0.001), seq(0.05, 0.95, 0.05), seq(0.95, 1, 0.001)))
quan_obs = simulate_quantile_curves(x1, alpha, beta, t_vec)
Stroke data: clinical, radiological scalar variables and density curves of the hematoma of 393 stroke patients
Description
Stroke data: clinical, radiological scalar variables and density curves of the hematoma of 393 stroke patients
Format
a list of the following three fields:
- densityCurve:
393-by-101 head CT hematoma densities as distributional response
- densitySupport:
length 101 common support vector
- predictors:
393-by-9 matrix containing 9 scalar predictors
References
Wasserstein F-tests and confidence bands for the Frechet regression of density response curves, Alexander Petersen, Xi Liu and Afshin A. Divani, 2019
Summary Function of Wasserstein Regression Model
Description
Summary Function of Wasserstein Regression Model
Usage
## S3 method for class 'WRI'
summary(object, ...)
Arguments
object |
an object returned by the |
... |
further arguments passed to or from other methods. |
Value
a list containing the following fields:
call |
function call of the Wasserstein regression |
r.square |
Wasserstein |
global_wasserstein_F_stat |
Wasserstein global F test statistic from the Satterthwaite method |
global_F_pvalue |
p value of global F test |
global_wasserstein_F_df |
degrees of freedom of satterthwaite approximated sampling distribution used in global F test |
partial_F_table |
Partial F test for individual effects |
Examples
data(strokeCTdensity)
predictor = strokeCTdensity$predictors
dSup = strokeCTdensity$densitySupport
densityCurves = strokeCTdensity$densityCurve
res <- wass_regress(rightside_formula = ~., Xfit_df = predictor,
Ymat = densityCurves, Ytype = 'density', Sup = dSup)
summary(res)
Generate simulation data
Description
Generate WAR(p) simulation data sets: samples simulated from a WAR(3) model similar to the specification in Section 4 of the referenced paper.
Usage
warSim()
Value
A list of:
sample.ts |
one simulation run chosen from |
sample.ts.full |
1000 simulation runs, each of which consists of a sample time series (of length 100) of quantile functions generated by a WAR(3) model as specified by the reference paper |
quantile.grid |
the grid over which the quantile functions in sample.ts.full are evaluated |
References
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
Compute Wasserstein Coefficient of Determination
Description
Compute Wasserstein Coefficient of Determination
Usage
wass_R2(wass_regress_res)
Arguments
wass_regress_res |
an object returned by the |
Value
Wasserstein R^2
, the Wasserstein coefficient of determination
References
Frechet regression for random objects with Euclidean predictors, Alexander Petersen and Hans-Georg Müller, 2019
Examples
data(strokeCTdensity)
predictor = strokeCTdensity$predictors
dSup = strokeCTdensity$densitySupport
densityCurves = strokeCTdensity$densityCurve
res = wass_regress(rightside_formula = ~., Xfit_df = predictor,
Ymat = densityCurves, Ytype = 'density', Sup = dSup)
wass_r2 = wass_R2(res)
Perform Frechet Regression with the Wasserstein Distance
Description
Perform Frechet Regression with the Wasserstein Distance
Usage
wass_regress(rightside_formula, Xfit_df, Ytype, Ymat, Sup = NULL)
Arguments
rightside_formula |
a right-side formula |
Xfit_df |
n-by-p matrix (or dataframe) of predictor values for fitting (do not include a column for the intercept) |
Ytype |
'quantile' or 'density' |
Ymat |
one of the following matrices:
|
Sup |
one of the following vectors:
|
Value
a list containing the following objects:
call |
function call |
rformula |
|
predictor_names |
names of predictors as the colnames given in the xfit matrix or dataframe. |
Qfit |
n-by-m matrix of fitted quantile functions. |
xfit |
design matrix in quantile fitting. |
Xfit_df |
n-by-p matrix (or dataframe) of predictor values for fitting |
Yobs |
a list containing the following matrices:
|
t_vec |
a length m vector - common grid for all quantile functions in Qobs. |
References
Wasserstein F-tests and confidence bands for the Frechet regression of density response curves, Alexander Petersen, Xi Liu and Afshin A. Divani, 2019
Examples
data(strokeCTdensity)
predictor = strokeCTdensity$predictors
dSup = strokeCTdensity$densitySupport
densityCurves = strokeCTdensity$densityCurve
res1 = wass_regress(rightside_formula = ~., Xfit_df = predictor,
Ytype = 'density', Ymat = densityCurves, Sup = dSup)
res2 = wass_regress(rightside_formula = ~ log_b_vol * weight, Xfit_df = predictor,
Ytype = 'density', Ymat = densityCurves, Sup = dSup)