Randomness Tests for Linear Data

Shriya Gehlot and Arnab Kumar Laha

2025-08-25

Introduction

The GTRT package aims to provide an efficient and user-friendly framework for conducting the randomness tests for linear and circular data proposed in Gehlot and Laha (2025a) and Gehlot and Laha (2025b), respectively. This vignette explains how the randomness tests for linear data work. Load the package as follows.

library(GTRT)

Statistical inference and decision-making often rely on the assumption that the data are random, that is, mutually independent. How can this assumption be verified? Gehlot and Laha (2025a) introduce two randomness tests based on Random Interval Graphs (RIGs) and report high accuracy and practical relevance in real-world applications.

These tests leverage two key properties of RIGs: edge probability and vertex degree distribution. Using these properties, the authors develop two tests of randomness:

RIG-Edge Probability (RIG-EP)
RIG-Degree Distribution (RIG-DD)

This vignette describes how both tests work in detail.

RIG-EP Test

Let \(G\) be an RIG formed using the given observations and \(p\) be the probability that an edge between two randomly chosen vertices in \(G\) does not exist. If the observations are mutually independent, then \(p=\frac{1}{3}\). Thus, we test \(H_0 : p=\frac{1}{3}\) against the alternative \(H_1: p \neq \frac{1}{3}\), to test for randomness. Let \(\hat{p}\) be the proportion of pairs whose vertices are not joined by an edge, i.e., the proportion of non-intersecting pairs (nip).
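The value \(\frac{1}{3}\) can be checked with a small simulation. The sketch below assumes that each interval has two independent, identically distributed continuous observations as its end points; the Uniform(0, 1) distribution and the variable names are arbitrary choices for illustration and are not part of GTRT. Of the 24 equally likely orderings of four such observations, 8 place both end points of one interval below both end points of the other, which gives 8/24 = 1/3.

```r
set.seed(1)                      # fixed seed so the sketch is reproducible
n_pairs <- 100000                # number of simulated interval pairs

# Four i.i.d. observations per row: columns 1-2 are the end points of the
# first interval, columns 3-4 are the end points of the second interval.
x <- matrix(runif(4 * n_pairs), ncol = 4)

lo1 <- pmin(x[, 1], x[, 2]); hi1 <- pmax(x[, 1], x[, 2])
lo2 <- pmin(x[, 3], x[, 4]); hi2 <- pmax(x[, 3], x[, 4])

# Two intervals are disjoint when one ends before the other begins.
disjoint <- (hi1 < lo2) | (hi2 < lo1)

p_sim <- mean(disjoint)
p_sim                            # should be close to 1/3
```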

The nip.rig() function calculates the value of \(\hat{p}\) for a given set of observations. It takes the following parameters:

s - Start points of intervals
t - End points of intervals
e1 - Vector of indices for the first interval in each pair.
e2 - Vector of indices for the second interval in each pair.

s <- runif(10,0,1) # Start points of 10 intervals
t <- runif(10,0,1) # End points of the 10 intervals
e1 <- c(2,10,6,1,5) # Indices of the first interval in each of 5 pairs formed from the 10 intervals above
e2 <- c(4,3,8,7,9) # Indices of the second interval in each of the 5 pairs
nip.rig(s,t,e1,e2)
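To make the quantity concrete, here is a minimal base R sketch of a proportion of non-intersecting pairs. The helper nip_sketch() is hypothetical and is not the GTRT implementation; it assumes that each interval runs from the smaller to the larger of its two end points and that intervals sharing only an end point count as intersecting. nip.rig() may differ in such details.

```r
# Hypothetical helper, not part of GTRT: proportion of the listed pairs
# whose intervals do not intersect.
nip_sketch <- function(s, t, e1, e2) {
  lo <- pmin(s, t)               # left end of each interval
  hi <- pmax(s, t)               # right end of each interval
  # Pair k is disjoint when one interval ends before the other begins.
  disjoint <- (hi[e1] < lo[e2]) | (hi[e2] < lo[e1])
  mean(disjoint)
}

s  <- c(0.0, 2.0, 0.5)           # intervals: [0, 1], [2, 3], [0.5, 2.5]
t  <- c(1.0, 3.0, 2.5)
e1 <- c(1, 1, 2)                 # pairs (1,2), (1,3), (2,3)
e2 <- c(2, 3, 3)

p_hat <- nip_sketch(s, t, e1, e2)
p_hat                            # only pair (1,2) is disjoint, so 1/3
```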

The rigep.test() function takes a vector \(\vec{y}\) and a level of significance \(\alpha\) as input and performs the RIG-EP randomness test. It returns the estimated probability of non-intersection (\(\hat{p}\)), the cutoff \(C\) such that the null hypothesis of randomness is rejected at level \(\alpha\) when \(|\hat{p}-\frac{1}{3}| > C\), and the adjusted p-values obtained using the Benjamini-Hochberg correction for multiple testing.

y <- arima.sim(model = list(ar=0.9), 1000) ## AR(1) model
rigep.test(y,0.05)

## 
## RIG-EP Test
## 
## data: y
## prob. of non-intersection: 0.836 
## Adj p-values: 8.866e-64 
## Cutoff: 0.05843
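The decision rule can be applied by hand to the printed values. The sketch below copies the two numbers from the output above into plain variables (the variable names are chosen here for illustration); because arima.sim() is random, a fresh run will print somewhat different values.

```r
p_hat  <- 0.836     # "prob. of non-intersection" from the output above
cutoff <- 0.05843   # "Cutoff" from the output above

deviation <- abs(p_hat - 1/3)
deviation           # about 0.503

reject <- deviation > cutoff
reject              # TRUE: randomness is rejected at the 5% level
```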

RIG-DD Test

Let \(G\) be an RIG formed using the given observations, let \(\hat{F_n}\) be the empirical vertex degree distribution of this graph \(G\), and let \(F^*\) be the theoretical vertex degree distribution of an RIG with \(n\) vertices. The RIG-DD test compares \(\hat{F_n}\) with \(F^*\).

The cdf.rig() function takes the number of observations (\(m\)) and calculates the theoretical vertex degree distribution (\(F^*\)) of an RIG with \(n\) vertices, where \(m=2n\) or \(m=2n+1\).

cdf.rig(1000)

The deg.rig() function calculates the degree of each vertex of the RIG \(G\) constructed from the given observations.

y <- arima.sim(model = list(ar=0.7), 1000) ## AR(1) model
deg.rig(y)
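As an illustration of what vertex degrees in an interval graph look like, the base R sketch below builds intervals from a vector and counts, for each interval, how many others it intersects. The helper deg_sketch() is hypothetical and is not the GTRT implementation. In particular, pairing consecutive observations (y[1], y[2]), (y[3], y[4]), ... and dropping a trailing odd observation is an assumption made here, and deg.rig() may construct the graph differently.

```r
# Hypothetical helper, not part of GTRT. ASSUMPTION: interval i is built
# from the consecutive pair (y[2i-1], y[2i]); a trailing odd observation
# is dropped.
deg_sketch <- function(y) {
  n  <- length(y) %/% 2
  a  <- y[seq(1, 2 * n, by = 2)]
  b  <- y[seq(2, 2 * n, by = 2)]
  lo <- pmin(a, b)               # left end of each interval
  hi <- pmax(a, b)               # right end of each interval
  # Intervals i and j intersect when each starts before the other ends.
  starts_before_end <- outer(lo, hi, "<=")
  adj <- starts_before_end & t(starts_before_end)
  diag(adj) <- FALSE             # no self-loops
  rowSums(adj)                   # degree of each vertex
}

y   <- c(0, 1, 2, 3, 0.5, 2.5)   # intervals: [0, 1], [2, 3], [0.5, 2.5]
deg <- deg_sketch(y)
deg                              # 1 1 2

F_hat <- ecdf(deg)               # empirical vertex degree distribution
F_hat(0:2)                       # 0, 2/3, 1
```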

The rigdd.test() function takes a vector \(\vec{y}\) and performs the RIG-DD randomness test. It computes the distance between \(\hat{F_n}\) and \(F^*\) as the test statistic, following the procedure described in Gehlot and Laha (2025a). The function returns the value(s) of the test statistic. The null hypothesis of randomness is rejected if any of the test statistics exceeds \(C_\alpha\), where \(C_\alpha\) is the threshold at the level of significance \(\alpha\), calculated using the thrsd.rigdd() function.

y <- arima.sim(model = list(ar=c(0.7,0.2)), 1000) ## AR(2) model
rigdd.test(y)

## 
## RIG-DD Test
## 
## Statistic = 6.8129 
## Reject null hypothesis of randomness if the value(s) of any of the test statistic > C.
##     Calculate C using thrsd.rigdd() function.

The thrsd.rigdd() function uses simulations to calculate the threshold \(C_\alpha\) for the RIG-DD test at the level of significance \(\alpha\). It takes the following parameters:

m - number of observations
n_iter - number of simulation iterations
alpha - level of significance

A table of threshold values (\(C_\alpha\)) of the RIG-DD test at the level of significance \(\alpha\) for various sample sizes \(m\) can be found in Gehlot and Laha (2025a).

thrsd.rigdd(500,1000,0.05)
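The general idea of a simulation-based threshold can be sketched in base R: simulate many samples under the null hypothesis, compute the statistic on each, and take an upper quantile of the simulated values. The function threshold_sketch() and the stand-in statistic toy_stat() below are hypothetical and are not the GTRT implementation; the stand-in is deliberately simple and is not the RIG-DD statistic.

```r
# Hypothetical sketch, not the GTRT implementation: Monte Carlo threshold
# for a user-supplied statistic under the null hypothesis of randomness.
threshold_sketch <- function(m, n_iter, alpha, stat_fun) {
  # Simulate n_iter i.i.d. samples of size m and evaluate the statistic.
  null_stats <- replicate(n_iter, stat_fun(runif(m)))
  # The (1 - alpha) quantile(s) of the simulated null distribution.
  quantile(null_stats, probs = 1 - alpha)
}

# Stand-in statistic (an assumption for illustration only): distance of
# the sample mean from 1/2, the mean of a Uniform(0, 1) observation.
toy_stat <- function(y) abs(mean(y) - 0.5)

set.seed(1)
thr <- threshold_sketch(m = 200, n_iter = 500, alpha = c(0.05, 0.01),
                        stat_fun = toy_stat)
thr   # the threshold for alpha = 0.01 is the larger of the two
```

Because the threshold is a quantile of simulated values, it changes slightly from run to run unless a seed is set, and a larger number of iterations gives a more stable estimate.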

Real World Example

We use the elecdaily_mts dataset from the timeSeriesDataSets package (Rossi, 2024), which reports daily electricity demand (in megawatts, MW) for Victoria, Australia, in 2014, and apply both the RIG-EP and RIG-DD tests to this data.

library(timeSeriesDataSets)
data(elecdaily_mts)
x <- elecdaily_mts[,1]
rigep.test(x,0.05)

## 
## RIG-EP Test
## 
## data: x
## prob. of non-intersection: 0.67032967032967, 0.681318681318681 
## Adj p-values: 9.1362e-12, 3.7936e-12 
## Cutoff: 0.09685

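The output of rigep.test() includes the estimated probability of non-intersection (\(\hat{p}\)), the cutoff value \(C\) for rejecting the null hypothesis of randomness when \(|\hat{p} - \tfrac{1}{3}| > C\), and the corresponding adjusted p-values. Here both values of \(|\hat{p} - \tfrac{1}{3}|\) (about 0.337 and 0.348) exceed the cutoff 0.09685, so the null hypothesis of randomness is rejected at the 5% level. For a detailed explanation, see Section RIG-EP Test.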
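For readers unfamiliar with the Benjamini-Hochberg adjustment, base R provides it through p.adjust(). The raw p-values below are hypothetical and chosen only to show the arithmetic; they are not taken from rigep.test(), and this sketch makes no claim about how the package computes its adjusted values internally.

```r
# Hypothetical raw p-values, for illustration only.
raw_p <- c(0.01, 0.04)

# Benjamini-Hochberg: the k-th smallest of n p-values is multiplied by n / k
# (then made monotone), so here 0.01 * 2 / 1 = 0.02 and 0.04 * 2 / 2 = 0.04.
adj_p <- p.adjust(raw_p, method = "BH")
adj_p              # 0.02 0.04

adj_p < 0.05       # both remain significant at the 5% level
```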

rigdd.test(x)

## 
## RIG-DD Test
## 
## Statistic = 3.0851, 2.9029 
## Reject null hypothesis of randomness if the value(s) of any of the test statistic > C.
##     Calculate C using thrsd.rigdd() function.

The output of rigdd.test() includes the value(s) of the test statistic and the corresponding decision rule for rejecting the null hypothesis of randomness. For a detailed explanation, see Section RIG-DD Test.

thrsd.rigdd(length(x),500,c(0.05,0.01))

##       95%       99% 
## 0.6767826 0.8558129
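The decision rule can then be applied to the printed values. The sketch below copies the statistics and thresholds from the outputs above into plain vectors (the variable names are chosen here for illustration). Since the thresholds come from a simulation, rerunning thrsd.rigdd() will give slightly different numbers.

```r
stat <- c(3.0851, 2.9029)                        # from rigdd.test(x) above
C    <- c("95%" = 0.6767826, "99%" = 0.8558129)  # from thrsd.rigdd() above

# Reject when any statistic exceeds the threshold for that level.
reject <- sapply(C, function(thr) any(stat > thr))
reject   # TRUE for both thresholds
```

Both statistics exceed both thresholds, so under this rule the null hypothesis of randomness is rejected for the electricity demand series at the 5% and 1% levels.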

References

Gehlot, S. and Laha, A. K. (2025a). Evaluating randomness assumption: A novel graph theoretic approach. arXiv preprint arXiv:2506.21157.
Gehlot, S. and Laha, A. K. (2025b). New tests of randomness for circular data. arXiv preprint arXiv:2506.23522.
Rossi, R. C. (2024). timeSeriesDataSets: Time Series Data Sets for R. R package version 0.1.0, URL: https://github.com/lightbluetitan/timeseriesdatasets_R.
