
EDOIF demo

C. Amornbunchornvej

2025-04-28

EXAMPLE#1 Simple Simulation & ordering inference

In the first step, we generate a simple dataset where C1 and C2 are dominated by C3, C3 is dominated by C4, and C4 is dominated by C5. There is no dominant-distribution relation between C1 and C2.

# Simulation section
nInv<-100
initMean=10
stepMean=20
std=8
simData1<-list() # container for the Values and Group vectors
simData1$Values<-rnorm(nInv,mean=initMean,sd=std)
simData1$Group<-rep(c("C1"),times=nInv)
simData1$Values<-c(simData1$Values,rnorm(nInv,mean=initMean,sd=std) )
simData1$Group<-c(simData1$Group,rep(c("C2"),times=nInv))
simData1$Values<-c(simData1$Values,rnorm(nInv,mean=initMean+2*stepMean,sd=std) )
simData1$Group<-c(simData1$Group,rep(c("C3"),times=nInv) )
simData1$Values<-c(simData1$Values,rnorm(nInv,mean=initMean+3*stepMean,sd=std) )
simData1$Group<-c(simData1$Group, rep(c("C4"),times=nInv) )
simData1$Values<-c(simData1$Values,rnorm(nInv,mean=initMean+4*stepMean,sd=std) )
simData1$Group<-c(simData1$Group, rep(c("C5"),times=nInv) )
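
As a quick sanity check on the simulation, the group sample means should follow the intended ordering. The sketch below rebuilds the same five groups in a more compact form and summarizes them with base R. The seed and the names `popMeans`, `simCheck`, and `groupMeans` are illustrative assumptions, not part of the original demo.

```r
set.seed(1)  # assumed seed, only so the check is reproducible

nInv <- 100
initMean <- 10
stepMean <- 20
std <- 8

# Population means used in the simulation above: C1 and C2 share the same mean
popMeans <- c(C1 = initMean, C2 = initMean,
              C3 = initMean + 2 * stepMean,
              C4 = initMean + 3 * stepMean,
              C5 = initMean + 4 * stepMean)

# Draw nInv values per group in one call by recycling the group means
simCheck <- data.frame(
  Values = rnorm(length(popMeans) * nInv,
                 mean = rep(popMeans, each = nInv), sd = std),
  Group  = rep(names(popMeans), each = nInv)
)

# Sample mean per group: C3, C4, and C5 should sit well above C1 and C2
groupMeans <- tapply(simCheck$Values, simCheck$Group, mean)
print(round(groupMeans, 2))
```

With a step of 20 between means and a standard deviation of 8, the groups C3, C4, and C5 should be clearly separated, while C1 and C2 should differ only by sampling noise.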

We then use the framework to analyze the simulated data.

# Simple ordering inference section
library(EDOIF)
## Loading required package: boot
# parameter setting
bootT=1000 # Number of times of sampling with replacement
alpha=0.05 # significance level

#======= input
Values=simData1$Values
Group=simData1$Group
#=============
A1<-EDOIF(Values,Group,bootT = bootT, alpha=alpha )

We print the result of our framework below.

print(A1) # print results in text
## EDOIF (Empirical Distribution Ordering Inference Framework)
## =======================================================
## Alpha = 0.050000, Number of bootstrap resamples = 1000, CI type = perc
## Using Mann-Whitney test to report whether A ≺ B
## A dominant-distribution network density:0.900000
## Distribution: C2
## Mean:9.482822 95CI:[ 8.006487,11.059085]
## Distribution: C1
## Mean:10.602326 95CI:[ 9.098160,12.081722]
## Distribution: C3
## Mean:51.147217 95CI:[ 49.619133,52.593155]
## Distribution: C4
## Mean:70.054967 95CI:[ 68.498561,71.702038]
## Distribution: C5
## Mean:89.786445 95CI:[ 88.274864,91.203818]
## =======================================================
## Mean difference of C1 (n=100) minus C2 (n=100): C2 ⊀ C1
##  :p-val 0.1459
## Mean Diff:1.119504 95CI:[ -0.893697,3.413313]
## 
## Mean difference of C3 (n=100) minus C2 (n=100): C2 ≺ C3
##  :p-val 0.0000
## Mean Diff:41.664395 95CI:[ 39.696096,43.744448]
## 
## Mean difference of C4 (n=100) minus C2 (n=100): C2 ≺ C4
##  :p-val 0.0000
## Mean Diff:60.572145 95CI:[ 58.410548,62.736461]
## 
## Mean difference of C5 (n=100) minus C2 (n=100): C2 ≺ C5
##  :p-val 0.0000
## Mean Diff:80.303623 95CI:[ 78.221824,82.315763]
## 
## Mean difference of C3 (n=100) minus C1 (n=100): C1 ≺ C3
##  :p-val 0.0000
## Mean Diff:40.544891 95CI:[ 38.556311,42.745753]
## 
## Mean difference of C4 (n=100) minus C1 (n=100): C1 ≺ C4
##  :p-val 0.0000
## Mean Diff:59.452641 95CI:[ 57.287014,61.693098]
## 
## Mean difference of C5 (n=100) minus C1 (n=100): C1 ≺ C5
##  :p-val 0.0000
## Mean Diff:79.184119 95CI:[ 77.030834,81.361752]
## 
## Mean difference of C4 (n=100) minus C3 (n=100): C3 ≺ C4
##  :p-val 0.0000
## Mean Diff:18.907750 95CI:[ 16.712224,21.072367]
## 
## Mean difference of C5 (n=100) minus C3 (n=100): C3 ≺ C5
##  :p-val 0.0000
## Mean Diff:38.639228 95CI:[ 36.470177,40.608937]
## 
## Mean difference of C5 (n=100) minus C4 (n=100): C4 ≺ C5
##  :p-val 0.0000
## Mean Diff:19.731478 95CI:[ 17.459604,21.995552]
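
Each pairwise block above reports a Mann-Whitney p-value together with a bootstrap confidence interval of the mean difference (CI type = perc). The base-R sketch below illustrates both ideas on two hypothetical samples `a` and `b`. It is not the package's internal code: the seed, the sample parameters, and the choice of a one-sided alternative are assumptions, and the percentile interval here uses plain `quantile()`, so its endpoints can differ slightly from those of the `boot` package.

```r
set.seed(2)  # assumed seed for reproducibility

# Two illustrative samples (hypothetical stand-ins for two categories)
a <- rnorm(100, mean = 10, sd = 8)
b <- rnorm(100, mean = 50, sd = 8)

bootT <- 1000   # number of bootstrap resamples
alpha <- 0.05   # significance level

# Resample each group with replacement and record the difference of means
bootDiff <- replicate(
  bootT,
  mean(sample(b, replace = TRUE)) - mean(sample(a, replace = TRUE))
)

# Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the
# bootstrap statistics
ci <- quantile(bootDiff, probs = c(alpha / 2, 1 - alpha / 2))

# One-sided Mann-Whitney (Wilcoxon rank-sum) test of whether values in a
# tend to be smaller than values in b (assumed direction)
mw <- wilcox.test(a, b, alternative = "less")

cat(sprintf("Mean diff: %.3f  CI: [%.3f, %.3f]  p-value: %.4g\n",
            mean(b) - mean(a), ci[[1]], ci[[2]], mw$p.value))
```

An interval that excludes zero together with a p-value below `alpha` corresponds to the "A ≺ B" lines in the output; an interval that straddles zero corresponds to the "A ⊀ B" case seen for C1 and C2.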

The first plot shows the mean-difference confidence intervals.

plot(A1,options =1)

plot of chunk Fig1

The second plot shows the mean confidence intervals.

plot(A1,options =2)

plot of chunk Fig2

The third plot is a dominant-distribution network.

out<-plot(A1,options =3)

plot of chunk Fig3
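
The network density of 0.9 in the printed summary is consistent with counting dominance edges over all unordered pairs of categories: 9 of the 10 pairs are reported as dominated. The sketch below reproduces that arithmetic from the relations printed earlier. The adjacency matrix `adj` is built by hand, and the density formula is an assumption that matches the printed value rather than code taken from the package.

```r
cats <- c("C1", "C2", "C3", "C4", "C5")

# adj[i, j] = 1 means "category i is dominated by category j"
adj <- matrix(0, nrow = length(cats), ncol = length(cats),
              dimnames = list(cats, cats))
adj["C1", c("C3", "C4", "C5")] <- 1
adj["C2", c("C3", "C4", "C5")] <- 1
adj["C3", c("C4", "C5")] <- 1
adj["C4", "C5"] <- 1
# C1 and C2 get no edge: the output reports no dominance between them

nPairs <- choose(length(cats), 2)   # 10 unordered pairs of categories
density <- sum(adj) / nPairs        # 9 edges / 10 pairs
print(density)
# [1] 0.9
```

The same arithmetic applied to a network with only four dominance edges among five categories gives 4 / 10 = 0.4.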

EXAMPLE#2 Non-normal-Distribution Simulation & ordering inference

We generate a more complicated dataset of mixture distributions. C1, C2, C3, and C4 are dominated by C5. There is no dominant-distribution relation among C1, C2, C3, and C4.

library(EDOIF)
# parameter setting
bootT=1000
alpha=0.05
nInv<-1200

start_time <- Sys.time()
#======= input
simData3<-SimNonNormalDist(nInv=nInv,noisePer=0.01)
Values=simData3$Values
Group=simData3$Group
#=============
A3<-EDOIF(Values,Group, bootT=bootT, alpha=alpha, methodType ="perc")
A3
## EDOIF (Empirical Distribution Ordering Inference Framework)
## =======================================================
## Alpha = 0.050000, Number of bootstrap resamples = 1000, CI type = perc
## Using Mann-Whitney test to report whether A ≺ B
## A dominant-distribution network density:0.400000
## Distribution: C1
## Mean:81.595459 95CI:[ 78.669686,84.224662]
## Distribution: C2
## Mean:82.015780 95CI:[ 79.979614,83.526797]
## Distribution: C3
## Mean:82.717153 95CI:[ 81.161702,84.228813]
## Distribution: C4
## Mean:83.856352 95CI:[ 79.893810,88.627838]
## Distribution: C5
## Mean:141.992477 95CI:[ 140.284726,143.506157]
## =======================================================
## Mean difference of C2 (n=1200) minus C1 (n=1200): C1 ⊀ C2
##  :p-val 0.1943
## Mean Diff:0.420322 95CI:[ -2.686553,3.689832]
## 
## Mean difference of C3 (n=1200) minus C1 (n=1200): C1 ⊀ C3
##  :p-val 0.2110
## Mean Diff:1.121694 95CI:[ -1.892079,4.309357]
## 
## Mean difference of C4 (n=1200) minus C1 (n=1200): C1 ⊀ C4
##  :p-val 0.7774
## Mean Diff:2.260893 95CI:[ -2.429935,7.494195]
## 
## Mean difference of C5 (n=1200) minus C1 (n=1200): C1 ≺ C5
##  :p-val 0.0000
## Mean Diff:60.397018 95CI:[ 57.479791,63.729265]
## 
## Mean difference of C3 (n=1200) minus C2 (n=1200): C2 ⊀ C3
##  :p-val 0.5158
## Mean Diff:0.701372 95CI:[ -1.581631,3.069064]
## 
## Mean difference of C4 (n=1200) minus C2 (n=1200): C2 ⊀ C4
##  :p-val 0.9496
## Mean Diff:1.840572 95CI:[ -2.506555,6.869621]
## 
## Mean difference of C5 (n=1200) minus C2 (n=1200): C2 ≺ C5
##  :p-val 0.0000
## Mean Diff:59.976697 95CI:[ 57.627067,62.462598]
## 
## Mean difference of C4 (n=1200) minus C3 (n=1200): C3 ⊀ C4
##  :p-val 0.9441
## Mean Diff:1.139199 95CI:[ -2.865575,6.386407]
## 
## Mean difference of C5 (n=1200) minus C3 (n=1200): C3 ≺ C5
##  :p-val 0.0000
## Mean Diff:59.275324 95CI:[ 56.992640,61.560018]
## 
## Mean difference of C5 (n=1200) minus C4 (n=1200): C4 ≺ C5
##  :p-val 0.0000
## Mean Diff:58.136125 95CI:[ 52.704495,62.225454]
plot(A3)

plot of chunk Fig4 (three plots)

end_time <- Sys.time()
end_time - start_time
## Time difference of 1.545051 secs

Uniform noise

Here we generate data in which \(A\) dominates \(B\) under different degrees of uniform noise.

library(ggplot2)

nInv<-1000
simData3<-SimNonNormalDist(nInv=nInv,noisePer=0.01)
#plot(density(simData3$V3))

dat <- data.frame(dens = c(simData3$V3, simData3$V5)
                   , lines = rep(c("B", "A"), each = nInv))
#Plot.
p1<-ggplot(dat, aes(x = dens, fill = lines)) +
  geom_density(alpha = 0.5) +
  xlim(-400, 400) + ylim(0, 0.07) +
  ylab("Density [0,1]") + xlab("Values") +
  theme(axis.text.x = element_text(face="bold", size=12))
theme_update(text = element_text(face="bold", size=12)  )
p1$labels$fill<-"Categories"
plot(p1)
## Warning: Removed 4 rows containing non-finite outside the scale range
## (`stat_density()`).

plot of chunk Fig5
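
To make "different degrees of uniform noise" concrete, here is a minimal base-R sketch that contaminates two samples with a chosen fraction of uniform draws and reports the resulting gap between their means. The helper `addUniformNoise`, the noise range of [-400, 400] (borrowed from the x-axis limits of the plot above), and the two base distributions are all assumptions for illustration. This is not how `SimNonNormalDist` is implemented.

```r
set.seed(3)  # assumed seed for reproducibility

# Hypothetical helper: replace a fraction `noisePer` of the values in x
# with draws from Uniform(lo, hi)
addUniformNoise <- function(x, noisePer, lo = -400, hi = 400) {
  nNoise <- round(length(x) * noisePer)
  idx <- sample(length(x), nNoise)
  x[idx] <- runif(nNoise, min = lo, max = hi)
  x
}

nInv <- 1000
B <- rnorm(nInv, mean = 80, sd = 16)    # assumed base distribution for B
A <- rnorm(nInv, mean = 140, sd = 16)   # assumed base distribution for A

# Compare the mean gap under increasing contamination
for (noisePer in c(0.01, 0.10, 0.30)) {
  An <- addUniformNoise(A, noisePer)
  Bn <- addUniformNoise(B, noisePer)
  cat(sprintf("noisePer = %.2f: mean(A) - mean(B) = %.2f\n",
              noisePer, mean(An) - mean(Bn)))
}
```

Because the assumed noise is centered on zero, a larger `noisePer` pulls both means toward zero and widens both distributions, which makes the dominance of A over B harder to detect.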
