
QuantileNPCI

Nicholas Hutson

8/20/2019

library(QuantileNPCI)
library(dplyr)
library(kableExtra)

quantCI()

This function calculates nonparametric confidence intervals for quantiles using fractional order statistics, following Hutson (1999).

As an example, we use the flood data presented in Hutson (1999). The data are included in this package as the dataset flood.

##The consecutive annual flood discharge rates of the Feather River at Oroville, CA
data1 <- flood[flood$loc=="Feather", "discharge"]

##The consecutive annual discharge rates of the Blackstone River at Woonsocket, RI
data2 <- flood[flood$loc=="Blackstone", "discharge"]

Exact method

quant <- .5
alpha <- .05
q1 <- quantCI(data1,quant,alpha, method = "exact")
q1
#> $u1
#> [1] 0.3750191
#> 
#> $u2
#> [1] 0.6249809
#> 
#> $lower.ci
#> 37.5019086556212th percentile  
#>                          42400 
#> 
#> $qx
#> 50th percentile  
#>            59200 
#> 
#> $upper.ci
#> 62.4980913443788th percentile  
#>                       80699.31
q2 <- quantCI(data2,quant,alpha, method = "exact")
q2
#> $u1
#> [1] 0.3441421
#> 
#> $u2
#> [1] 0.6558579
#> 
#> $lower.ci
#> 34.4142116028878th percentile  
#>                       4511.548 
#> 
#> $qx
#> 50th percentile  
#>             5300 
#> 
#> $upper.ci
#> 65.5857883971122th percentile  
#>                       5763.746

Reproduce Table 8: the 95% confidence intervals for the median flood rates

df <- cbind(as.data.frame(table(flood$loc)), 
            rbind(unlist(q1),unlist(q2))) %>% 
  dplyr::rename(River=1, n=2, u1=3, u2=4, lower=5, middle=6, upper=7)

df %>% 
  dplyr::mutate(u1=round(u1,5), u2=round(u2,5)) %>% 
  dplyr::mutate(CI=paste("(", round(lower,2), ", ", round(upper,2), ")", sep = "")) %>% 
  dplyr::select(River:u2, CI) %>% 
  knitr::kable(align=rep('c', 5)) %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE, position = "center", font_size = 10)

River        n    u1        u2        CI
Feather      59   0.37502   0.62498   (42400, 80699.31)
Blackstone   37   0.34414   0.65586   (4511.55, 5763.75)

Approximate method

quantCI(data1,quant,alpha, method = "approximate")
#> $u1
#> [1] 0.3749825
#> 
#> $u2
#> [1] 0.6250175
#> 
#> $lower.ci
#> 37.4982549734976th percentile  
#>                          42400 
#> 
#> $qx
#> 50th percentile  
#>            59200 
#> 
#> $upper.ci
#> 62.5017450265024th percentile  
#>                       80700.63
quantCI(data2,quant,alpha, method = "approximate")
#> $u1
#> [1] 0.3439968
#> 
#> $u2
#> [1] 0.6560032
#> 
#> $lower.ci
#> 34.3996815665766th percentile  
#>                       4511.438 
#> 
#> $qx
#> 50th percentile  
#>             5300 
#> 
#> $upper.ci
#> 65.6003184334234th percentile  
#>                       5764.905

Method summary

quantCI

The quantCI function offers two methods for calculating the confidence interval. The “exact” method solves for the percentiles numerically, while the “approximate” method uses an approximation that may be faster on large data sets.

If the “approximate” method is specified, let \(n\) be the number of non-missing values for a variable, and let \(x_{1},x_{2},\ldots,x_{n}\) represent the ordered values of the variable. Let the \(t^{th}\) percentile be \(y\), let \(p = \frac{t}{100}\), and let \((n+1)p = j + g\), where \(j\) is the integer part of \((n+1)p\) and \(g\) is the fractional part of \((n+1)p\). Then:

\[y = (1-g)x_{j} + gx_{j+1}\]
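The interpolation above can be sketched in a few lines of base R. The helper name frac_percentile is hypothetical and is not part of QuantileNPCI; clamping to the smallest or largest observation when \(j\) falls outside \(1,\ldots,n-1\) is an assumption made here so the sketch is defined for every \(p\), and the package may handle those edge cases differently.

```r
# Hypothetical helper (not part of QuantileNPCI): percentile at level p
# using (n + 1) p = j + g and y = (1 - g) x_j + g x_{j+1}.
frac_percentile <- function(x, p) {
  x <- sort(x[!is.na(x)])      # ordered non-missing values x_1 <= ... <= x_n
  n <- length(x)
  h <- (n + 1) * p
  j <- floor(h)                # integer part of (n + 1) p
  g <- h - j                   # fractional part of (n + 1) p
  if (j < 1) return(x[1])      # assumed: clamp at the smallest value
  if (j >= n) return(x[n])     # assumed: clamp at the largest value
  (1 - g) * x[j] + g * x[j + 1]
}

x <- c(40, 10, 30, 20)
# (4 + 1) * 0.5 = 2.5, so j = 2, g = 0.5 and y = 0.5 * 20 + 0.5 * 30 = 25
frac_percentile(x, 0.5)
```

For levels where no clamping occurs, this should agree with base R's quantile(x, p, type = 6), which uses the same \((n+1)p\) positioning.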

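If the “exact” method is specified, let \(u_{1}\) be the lower percentile level and \(u_{2}\) the upper percentile level, with \(0 < u_{1} < u_{2} < 1\), and let \(n^{'} = n + 1\). With \(I_{u}(a,b)\) denoting the incomplete beta function, \(u_{1}\) and \(u_{2}\) solve:

\[I_{u}[n^{'}u_{1},n^{'}(1-u_{1})] = 1 - \alpha/2\]

\[I_{u}[n^{'}u_{2},n^{'}(1-u_{2})] = \alpha/2\]

The confidence limits are the percentiles at \(u_{1}\) and \(u_{2}\), each interpolated between adjacent order statistics:

\[y = (1-g)x_{j} + gx_{j+1}\]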
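As a rough illustration of how the two incomplete beta equations might be solved numerically, the sketch below uses base R's uniroot and pbeta. It rests on two assumptions that the text above does not spell out: that the incomplete beta function is the regularized form, and that it is evaluated at the target quantile \(p\) (so the left-hand side is pbeta(p, a, b)). The function name exact_levels is hypothetical, and this is not the package's own implementation.

```r
# Hypothetical sketch (not the QuantileNPCI source): solve for the
# percentile levels u1 and u2 of the "exact" method.
# Assumption: the incomplete beta function is regularized and evaluated
# at the target quantile p, i.e. pbeta(p, n' u, n' (1 - u)).
exact_levels <- function(n, p = 0.5, alpha = 0.05) {
  n_prime <- n + 1
  # Difference between the incomplete beta value at level u and the target
  f <- function(u, target) {
    pbeta(p, n_prime * u, n_prime * (1 - u)) - target
  }
  eps <- 1e-6                  # keep both beta shape parameters positive
  u1 <- uniroot(f, c(eps, 1 - eps), target = 1 - alpha / 2, tol = 1e-10)$root
  u2 <- uniroot(f, c(eps, 1 - eps), target = alpha / 2, tol = 1e-10)$root
  c(u1 = u1, u2 = u2)
}

# n = 59 observations (Feather River), median, 95% interval
exact_levels(59)
```

If the reading of the equations assumed here is right, the result for \(n = 59\) should land close to the u1 and u2 values printed for the Feather River in the exact-method output. For the median the two levels are symmetric about 0.5, so they sum to 1.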

The function returns a list of five values: u1 and u2, the lower and upper percentile levels that define the interval; qx, the estimated data value at the requested quantile; and lower.ci and upper.ci, the lower and upper limits of its confidence interval.
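To show how such a list can be handled, the sketch below builds a stand-in by hand from the values printed for the Feather River in the exact-method output; it does not call the package. It then pulls out the interval and flattens the list with unlist(), as the Table 8 code does.

```r
# Stand-in for a quantCI() result, typed in from the printed Feather River
# output; it is an illustration only and is not produced by the package here.
res <- list(
  u1       = 0.3750191,
  u2       = 0.6249809,
  lower.ci = c("37.5019086556212th percentile" = 42400),
  qx       = c("50th percentile" = 59200),
  upper.ci = c("62.4980913443788th percentile" = 80699.31)
)

# Individual elements are reached with $; unname() drops the percentile label
ci <- unname(c(res$lower.ci, res$upper.ci))
ci

# unlist() flattens the list into one named numeric vector of length 5,
# which rbind() can then stack row by row
flat <- unlist(res)
names(flat)
```

The flattened names combine the element name with the percentile label (for example "qx.50th percentile"), which is presumably why the table code renames the columns by position rather than by name.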

References

SAS Institute. 2013. “The UNIVARIATE Procedure.” Base SAS(R) 9.4 Procedures Guide: Statistical Procedures, Second Edition. https://support.sas.com/documentation/cdl/en/procstat/66703/HTML/default/viewer.htm#procstat_univariate_details13.htm.

Hutson, Alan D. 1999. “Calculating Nonparametric Confidence Intervals for Quantiles Using Fractional Order Statistics.” Journal of Applied Statistics 26 (3): 343–53. https://doi.org/10.1080/02664769922458.
