The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Introduction to RoBTT

Maximilian Maier & František Bartoš

2024-11-09

Introduction and Example

This vignette introduces the basic functionality of the RoBTT package using a classic example from Ronald Fishers Design of Experiments (Fisher, 1971). Fisher illustrates the Student’s \(t\)-test on an example of self fertilized and cross fertilized plants as coded by Darwin. In other words, one group of plants was fertilized by gametes (sex-cells) from the same individual (self), while the other group of plants was fertilized by gametes from the same species (cross). Darwin himself concludes:

I may premise that if we took by chance a dozen men belonging to two nations and measured them, it would I presume be very rash to form any judgment from such small number on their average heights. But the case is somewhat different with crossed and self-fertilized plants, as they were exactly the same age, were subjected from first to last to the same conditions, and were descended from the same parents. p. 28 (Fisher, 1971).

Not knowing much about statistics he gives his data to Francis Galton, who proceeds to order the plants by height as a method of statistical analysis and concludes:

The observations as I received them […] certainly have no prima facie appearance of regularity. But as soon as we arrange them in the order of their magnitudes […] the case is materially altered. We now see, with few exceptions, that the largest plant on the crossed side in each pot exceeds the largest plant on the self-fertilised side, that the second exceeds the second, the third the third, and so on. . p. 29 (Fisher, 1971).

Galton then forwarded the data to Fisher, who later analyzed it with a frequentist Student’s \(t\)-test and states:

The observed value of \(t\), 2.148, thus just exceeds the 5 per cent point, and the experimental result may be judged significant, though barely so. .

We can load the included data with the code below

library(RoBTT)
data("fertilization", package = "RoBTT")
head(fertilization)
#>   Crossed   Self
#> 1  23.500 17.375
#> 2  12.000 20.375
#> 3  21.000 20.000
#> 4  22.000 20.000
#> 5  19.125 18.375
#> 6  21.500 18.625

Equal Variance t-test

We reanalyze the data using a Bayesian equal variance \(t\)-test as follows.

## fit equal variance t-test
fit_01 <- RoBTT(
    x1       = fertilization$Self,
    x2       = fertilization$Crossed,
    parallel = TRUE,
    prior_delta = prior("cauchy", list(0, 1/sqrt(2))),
    prior_rho   = NULL, # this indicates no prior on the variance allocation factor -> equal variance test 
    prior_nu    = NULL, # this indicates no prior on the degrees of freedom -> normal distribution test
    seed        = 0
)

We can retrieve a summary of the estimates with the code below. This function gives us the posterior mean and median estimate for the effect in terms of Cohen’s \(d\) and 95% central credible interval. In addition, it shows the estimates for mean and variance in each of the groups.

# get overall summary
summary(fit_01)
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = NULL, prior_nu = NULL, parallel = TRUE, 
#>     seed = 0)
#> 
#> Robust Bayesian t-test
#> Components summary:
#>        Models Prior prob. Post. prob. Inclusion BF
#> Effect    1/2       0.500       0.744        2.902
#> 
#> Model-averaged estimates:
#>        Mean Median 0.025 0.975
#> delta 0.529  0.537 0.000 1.398

We can also look at the estimates for the individual models and retrieve their prior distributions. The Bayesian equal variance \(t\)-test uses two models: One assuming absence of an effect (as indicated by prior \(\delta = \text{Spike}(0)\)) assuming presence of an effect (as indicated by prior \(\delta = \text{Cauchy}(0, 0.71)\)). Both models assume equal variances. The inclusion Bayes factor shows the evidence for the inclusion of the individual models. In this case the Bayes factor for model 2 (assuming the difference in mean) is equivalent to the Bayes factor for the presence of an effect.

# get individual model summaries
summary(fit_01, type = "models")
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = NULL, prior_nu = NULL, parallel = TRUE, 
#>     seed = 0)
#> 
#> Robust Bayesian t-test
#> Models overview:
#>  Model Distribution   Prior delta   Prior prob. log(marglik) Post. prob.
#>      1       normal        Spike(0)       0.500       -76.82       0.256
#>      2       normal Cauchy(0, 0.71)       0.500       -75.76       0.744
#>  Inclusion BF
#>         0.345
#>         2.902

Finally, we can use the conditional = TRUE argument to retrieve the estimates only for the models assuming an effect (i.e., without the spike).

summary(fit_01, conditional = TRUE)
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = NULL, prior_nu = NULL, parallel = TRUE, 
#>     seed = 0)
#> 
#> Robust Bayesian t-test
#> Components summary:
#>        Models Prior prob. Post. prob. Inclusion BF
#> Effect    1/2       0.500       0.744        2.902
#> 
#> Model-averaged estimates:
#>        Mean Median 0.025 0.975
#> delta 0.529  0.537 0.000 1.398
#> 
#> Conditional estimates:
#>        Mean Median 0.025 0.975
#> delta 0.713  0.703 0.008 1.454

Unequal Variance t-test

We can also analyze the example using an unequal variance \(t\)-test as follows.

## fit equal unvariance t-test

fit_10 <- RoBTT(
    x1       = fertilization$Self,
    x2       = fertilization$Crossed,
    parallel = TRUE,
    prior_delta    = prior("cauchy", list(0, 1/sqrt(2))),
    prior_rho      = prior("beta",   list(3, 3)), #prior on variance allocation
    prior_rho_null = NULL, # remove models assuming equal variance
    prior_nu       = NULL,
    seed           = 0
)

We again retrieve the summary; however, this time we also get a Bayes factor for unequal variances (heterogeneity) and an estimate for \(\rho\) (the precision proportion).

summary(fit_10)
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = prior("beta", list(3, 3)), 
#>     prior_nu = NULL, prior_rho_null = NULL, parallel = TRUE, 
#>     seed = 0)
#> 
#> Robust Bayesian t-test
#> Components summary:
#>               Models Prior prob. Post. prob. Inclusion BF
#> Effect           1/2       0.500       0.746        2.940
#> Heterogeneity    2/2       1.000       1.000          Inf
#> 
#> Model-averaged estimates:
#>        Mean Median 0.025 0.975
#> delta 0.539  0.546 0.000 1.421
#> rho   0.682  0.693 0.458 0.851

We can again receive the individual model estimates. The Prior rho column now contains a beta distribution rather than spike at zero. This indicates that we used the models assuming differences in variances.

summary(fit_10, type = "models")
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = prior("beta", list(3, 3)), 
#>     prior_nu = NULL, prior_rho_null = NULL, parallel = TRUE, 
#>     seed = 0)
#> 
#> Robust Bayesian t-test
#> Models overview:
#>  Model Distribution   Prior delta    Prior rho Prior prob. log(marglik)
#>      1       normal        Spike(0) Beta(3, 3)       0.500       -75.96
#>      2       normal Cauchy(0, 0.71) Beta(3, 3)       0.500       -74.88
#>  Post. prob. Inclusion BF
#>        0.254        0.340
#>        0.746        2.940

Model-Averaging Over Equal and Unequal Variances

The benefit of RoBTT in comparison to other commonly used \(t\)-tests is that it does not require us to rely on a single model but instead allows researchers to take different models into account at the same time. In other words, RoBTT uses Bayesian model-averaging to allow the data to guide the inference to be based most strongly on those models that predicted the data best. We can fit a \(t\)-test that model-averages over equal and unequal variances with the code provided below.

# fit t-test model averaging over equal and unequal variances

fit_1 <- RoBTT(
    x1       = fertilization$Self,
    x2       = fertilization$Crossed,
    parallel = TRUE,
    prior_delta = prior("cauchy", list(0, 1/sqrt(2))),
    prior_rho   = prior("beta",   list(3, 3)),
    prior_nu    = NULL, 
    seed        = 1,
    control     = set_control(adapt_delta = 0.95)
)

We retrieve the overall estimates as before using the summary function.

summary(fit_1)
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = prior("beta", list(3, 3)), 
#>     prior_nu = NULL, parallel = TRUE, control = set_control(adapt_delta = 0.95), 
#>     seed = 1)
#> 
#> Robust Bayesian t-test
#> Components summary:
#>               Models Prior prob. Post. prob. Inclusion BF
#> Effect           2/4       0.500       0.746        2.936
#> Heterogeneity    2/4       0.500       0.705        2.393
#> 
#> Model-averaged estimates:
#>        Mean Median 0.025 0.975
#> delta 0.540  0.553 0.000 1.414
#> rho   0.630  0.632 0.486 0.845

When looking at the individual models, we see that four models (instead of two) are incorporated. These are the two equal variance models (from the first test in this vignette) and the two unequal variance models (from the second test in this vignette). The inclusion Bayes factors tell us the evidence for each of these individual models. We see that the model assuming an effect and unequal variances has the highest predictive accuracy.

summary(fit_1, type = "models")
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = prior("beta", list(3, 3)), 
#>     prior_nu = NULL, parallel = TRUE, control = set_control(adapt_delta = 0.95), 
#>     seed = 1)
#> 
#> Robust Bayesian t-test
#> Models overview:
#>  Model Distribution   Prior delta    Prior rho Prior prob. log(marglik)
#>      1       normal        Spike(0) Spike(0.5)       0.250       -76.82
#>      2       normal        Spike(0) Beta(3, 3)       0.250       -75.96
#>      3       normal Cauchy(0, 0.71) Spike(0.5)       0.250       -75.76
#>      4       normal Cauchy(0, 0.71) Beta(3, 3)       0.250       -74.88
#>  Post. prob. Inclusion BF
#>        0.076        0.245
#>        0.179        0.652
#>        0.219        0.842
#>        0.527        3.338

Model-Averaging Over Robust t-likelihoods

A common problem for t-tests that assume the normality of residuals is outliers. When outliers are present, the estimates of these methods can be driven overly strongly by those outliers. A common way to accommodatee outliers is by also model-averaging over robust \(t\)-likelihoods. The t-distribution has thicker tails than the normal distribution; therefore, the estimate under the \(t\)-likelihood will be affected less due to outliers than the estimate using a normal likelihood. We can fit RoBTT that also model-averages over \(t\)-likelihoods as follows.


fit_2 <- RoBTT(
    x1       = fertilization$Self,
    x2       = fertilization$Crossed,
    parallel = TRUE,
    prior_delta = prior("cauchy", list(0, 1/sqrt(2))),
    prior_rho   = prior("beta",   list(3, 3)),
    prior_nu    = prior("exp",    list(1)), # prior on degrees of freedom
    seed        = 2
)

When looking at the individual model estimates incorporating \(t\)-likelihoods, we find that the four models from the previous t-test are incorporated a second time, however, with t-likelihood in the likelihood column. In other words, we now use eight models overall.

summary(fit_2, type = "models")
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2))), prior_rho = prior("beta", list(3, 3)), 
#>     prior_nu = prior("exp", list(1)), parallel = TRUE, seed = 2)
#> 
#> Robust Bayesian t-test
#> Models overview:
#>  Model Distribution   Prior delta    Prior rho    Prior nu    Prior prob.
#>      1       normal        Spike(0) Spike(0.5)                      0.125
#>      2            t        Spike(0) Spike(0.5) Exponential(1)       0.125
#>      3       normal        Spike(0) Beta(3, 3)                      0.125
#>      4            t        Spike(0) Beta(3, 3) Exponential(1)       0.125
#>      5       normal Cauchy(0, 0.71) Spike(0.5)                      0.125
#>      6            t Cauchy(0, 0.71) Spike(0.5) Exponential(1)       0.125
#>      7       normal Cauchy(0, 0.71) Beta(3, 3)                      0.125
#>      8            t Cauchy(0, 0.71) Beta(3, 3) Exponential(1)       0.125
#>  log(marglik) Post. prob. Inclusion BF
#>        -76.82       0.006        0.042
#>        -77.80       0.002        0.016
#>        -75.96       0.014        0.101
#>        -77.33       0.004        0.025
#>        -75.76       0.017        0.124
#>        -72.35       0.525        7.751
#>        -74.88       0.042        0.305
#>        -72.65       0.389        4.463
#> Model (6): There were 2 divergent transitions.

Perinull testing

It is sometimes argued that point null hypotheses are not meaningful to test as evidence for the alternative could already be provided by deviations from the null that would be considered small or uninteresting. RoBTT addresses this concern by allowing researchers to specify perinull hypotheses (i.e., a prior distribution with small sd instead of the point null).

fit_3 <- RoBTT(
    x1       = fertilization$Self,
    x2       = fertilization$Crossed,
    parallel = TRUE,
    prior_delta      = prior("cauchy", list(0, 1/sqrt(2)), list(0, Inf)),
    prior_rho        = prior("beta",   list(3, 3)),
    prior_nu         = prior("exp",    list(1)),
    prior_delta_null = prior("normal", list(0, 0.15), list(0, Inf)), #prior distribution and truncation
    seed             = 3
)

When looking at the individual models we see that the models that previously had Spike(0.5) are now using normal(0, 0.15) prior from 0 to infinity. This indicates the shift from point null to perinull testing.

summary(fit_3, type = "models")
#> Call:
#> RoBTT(x1 = fertilization$Self, x2 = fertilization$Crossed, prior_delta = prior("cauchy", 
#>     list(0, 1/sqrt(2)), list(0, Inf)), prior_rho = prior("beta", 
#>     list(3, 3)), prior_nu = prior("exp", list(1)), prior_delta_null = prior("normal", 
#>     list(0, 0.15), list(0, Inf)), parallel = TRUE, seed = 3)
#> 
#> Robust Bayesian t-test
#> Models overview:
#>  Model Distribution       Prior delta        Prior rho    Prior nu   
#>      1       normal Normal(0, 0.15)[0, Inf] Spike(0.5)               
#>      2            t Normal(0, 0.15)[0, Inf] Spike(0.5) Exponential(1)
#>      3       normal Normal(0, 0.15)[0, Inf] Beta(3, 3)               
#>      4            t Normal(0, 0.15)[0, Inf] Beta(3, 3) Exponential(1)
#>      5       normal Cauchy(0, 0.71)[0, Inf] Spike(0.5)               
#>      6            t Cauchy(0, 0.71)[0, Inf] Spike(0.5) Exponential(1)
#>      7       normal Cauchy(0, 0.71)[0, Inf] Beta(3, 3)               
#>      8            t Cauchy(0, 0.71)[0, Inf] Beta(3, 3) Exponential(1)
#>  Prior prob. log(marglik) Post. prob. Inclusion BF
#>        0.125       -76.06       0.006        0.040
#>        0.125       -73.76       0.058        0.428
#>        0.125       -75.20       0.014        0.097
#>        0.125       -73.91       0.049        0.364
#>        0.125       -75.10       0.015        0.107
#>        0.125       -71.65       0.473        6.274
#>        0.125       -74.21       0.037        0.266
#>        0.125       -71.95       0.349        3.757

Conclusions

This vignette showcased the applications of the RoBTT R package. For methodological background see Maier et al. (2024) (https://doi.org/10.3758/s13423-024-02590-5).

References

Fisher, R. A. (1971). The design of experiments. Oliver & Boyd, Edinburgh & London.
Maier, M., Bartoš, F., Quintana, D. S., Bergh, D. van den, Marsman, M., Ly, A., & Wagenmakers, E.-J. (2024). Model-averaged Bayesian t-tests. Psychonomic Bulletin & Review. https://doi.org/10.3758/s13423-024-02590-5

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.