The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Pairwise Comparisons with Significance Brackets

Dave Armstrong and William Poirier

2025-12-03

Significance brackets are lines or brackets that connect two estimates with a notation indicating the significance (or lack thereof) of the difference between the two estimates. These are commonly used in plots and can make comparisons easier to understand. When considering any pair of estimates with confidence intervals, there are three states the pair of intervals could be in:

  1. Non-overlapping: non-overlapping intrevals indicate a significant difference between the two estimates.
  2. Overlapping, but Significant: in some cases, estimates with overlapping intervals can nonetheless be significantly different from each other. The amount of overlap is not necessarily indicative of a difference or not - that depends on the standard errors and covariance of the estimates.
  3. Overlapping and Not Significant: in other cases, estimates with overlapping confidence intervals are not significantly different from each other.

We propose that significance brackets could be used to identify either the second or third groups. First, no significance brackets are needed for intervals that do not overlap, their significance is obvious. So, significance brackets only need to be used on pairs of estimates with overlapping intervals. To minimize visual clutter, we could flag overlapping pairs of intervals that are statistically significantly different from each other noting that all un-flagged pairs are not statistically different from each other. Likewise, the converse could be done - flagging insignificant differences and noting that all other overlapping pairs are significant. The idea would be to use whichever approach generates fewer intervals. That said, even this method does not scale particularly well as the number of intervals gets large. We demonstrate this idea on several examples below.

Chick Weights

We first demonstrate the procedure using the chickwts built-in dataset. The first step is to generate the estimates, we predict chicken weight (weight) with feed type (feed). To make the display easiest to read, we re-rorder the feed type factor by the average weight, which will make the intervals decreasing in their average.

data(chickwts)
chickwts$feed <- reorder(chickwts$feed, 
                         chickwts$weight, 
                         FUN=mean)
chick_mod <- lm(weight~ feed, data=chickwts)

In the package, we have a function called make_annotations() that makes a list of annotations amenable for use in geom_signif() from the package. First, the user must execute the viztest() function on the estimates.

library(marginaleffects)
library(VizTest)
## make predictions 
chick_preds <- predictions(chick_mod, variables="feed", by="feed")

## save predicted values in chick_b
chick_b <- coef(chick_preds)

## set names of predicted values
names(chick_b) <- chick_preds$feed

## make into visual testing data
chick_vt_data <- make_vt_data(est=chick_b, vcov(chick_preds))

## execute viztest function on predictions
chick_vt <- viztest(chick_vt_data, include_zero=FALSE)

The default for make_annotations() is to figure out which approach will produce fewer annotations - flagging overlapping insignificant differences or flagging overlapping significant differences and then return the one with fewer annotations. This is the type="auto" option. If you choose type="significant" the function will return annotations for the overlapping significant differences and if you choose type="insignificant" it will return annotations for the overlapping insignificant differences. Here, we use the default type="auto" option.

chick_annots <- make_annotations(chick_vt)
chick_annots
#> $annotations
#> [1] "NS" "NS" "NS"
#> 
#> $y_position
#> sunflower  meatmeal   soybean 
#>  371.6379  321.0103  286.8477 
#> 
#> $xmin
#> [1] "casein"  "soybean" "linseed"
#> 
#> $xmax
#> [1] "sunflower" "meatmeal"  "soybean"

Now, we can use the annotations as input to geom_signif() to add the significance brackets to a plot of the estimates. Note that make_annotations() makes a list that has named elements with names that are the same as the arguments to geom_signif(). The easiest way to use those as arguments is to use do.call() as shown below. For the uninitiated, do.call() takes as its first argument an unevaluated function (like geom_signif) and as its second argument a named list of arguments to be passed to that function.

library(ggplot2)
library(ggsignif)
ggplot(chick_preds, aes(x=feed, y=estimate)) +
  geom_pointrange(aes(ymin=conf.low, ymax=conf.high)) +
  do.call(geom_signif, chick_annots) +
  labs(
    x = "Feed Type",
    y = "Predicted Weight"
  ) + 
  theme_bw()
Predicted Chick Weights with 95% Confidence Intervals and Insignificance Brackets.  The estimates in pairs marked 'NS' are not significantly different from each other.  All other pairs are statistically different from each other whether the intervals overlap or not.

Predicted Chick Weights with 95% Confidence Intervals and Insignificance Brackets. The estimates in pairs marked ‘NS’ are not significantly different from each other. All other pairs are statistically different from each other whether the intervals overlap or not.

Ornstein Data

The Ornstein data in the package contains measures of the assets of ten different sectors in four different nations along with the number of interlocking director and executive positions shared with other firms. We estimate a generalized linear model of interlocks as a function of assets, sector and nation. We then generate predictions for nation and make the letter display.

## Load Data
data(Ornstein, package="carData")

## Estimate Model
orn_mod <- glm(interlocks ~ log2(assets) + sector + nation, data=Ornstein, 
    family=poisson)

## Generate Predictions
orn_preds <- predictions(orn_mod, variables = "sector", by = "sector")
orn_b <- coef(orn_preds)
names(orn_b) <- orn_preds$sector
orn_vt_data <- make_vt_data(est=orn_b, vcov(orn_preds))
orn_vt <- viztest(orn_vt_data, include_zero=FALSE)
orn_annots <- make_annotations(orn_vt, adjust="none")

ggplot(orn_preds, aes(x=sector, y=estimate)) +
  geom_pointrange(aes(ymin=conf.low, ymax=conf.high)) +
  do.call(geom_signif, orn_annots) +
  labs(
    x = "Sector",
    y = "Predicted Number of Interlocks"
  ) + 
  theme_bw()
Predicted Number of Interlocks by Sector with 95% Confidence Intervals and Significance Brackets.  The estimates in pairs marked with asterisks are significantly different from each other.  All other overlapping pairs are not significantly different from each other.

Predicted Number of Interlocks by Sector with 95% Confidence Intervals and Significance Brackets. The estimates in pairs marked with asterisks are significantly different from each other. All other overlapping pairs are not significantly different from each other.

Sometimes, particularly when there are more than a few estimates, the y-position of the annotations will need a bit of nudging to make the gaps between brackets more visually appealing. This can be done either by directly adjusting the y_position element of the annotations list or by using the nudge argument in the call to make_annotations(). To do the latter, you would need to use some trial and error to ensure the right amount of nudging is applied. The call to make_annotations() below is the result of such trial and error.

orn_annots <- make_annotations(orn_vt, adjust="none", 
                               nudge=c(2, 1, 2, 0,0) )

ggplot(orn_preds, aes(x=sector, y=estimate)) +
  geom_pointrange(aes(ymin=conf.low, ymax=conf.high)) +
  do.call(geom_signif, orn_annots) +
  labs(
    x = "Sector",
    y = "Predicted Number of Interlocks"
  ) + 
  theme_bw()
Predicted Number of Interlocks by Sector with 95% Confidence Intervals and Nudged Significance Brackets.

Predicted Number of Interlocks by Sector with 95% Confidence Intervals and Nudged Significance Brackets.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.