Significance brackets are lines or brackets that connect two estimates, with a notation indicating the significance (or lack thereof) of the difference between them. They are commonly used in plots and can make comparisons easier to understand. Any pair of estimates with confidence intervals can be in one of three states: (1) the intervals do not overlap; (2) the intervals overlap and the estimates are significantly different from each other; or (3) the intervals overlap and the estimates are not significantly different from each other.
We propose that significance brackets could be used to identify either the second or the third group. No significance brackets are needed for intervals that do not overlap, because their significance is obvious, so brackets only need to be used on pairs of estimates with overlapping intervals. To minimize visual clutter, we could flag the overlapping pairs that are statistically significantly different from each other and note that all unflagged overlapping pairs are not. Conversely, we could flag the insignificant differences and note that all other overlapping pairs are significant. The idea is to use whichever approach generates fewer brackets. That said, even this method does not scale particularly well as the number of intervals gets large. We demonstrate this idea on several examples below.
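The three states and the "fewer brackets" rule can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the estimates and standard errors are made up, the estimates are treated as independent (so the covariance between them is ignored), and breaking ties in favor of flagging significant pairs is an arbitrary choice.

```r
# Hypothetical estimates and standard errors (illustrative values only)
est <- c(a = 10, b = 12, c = 13, d = 20)
se  <- c(a = 1,  b = 1,  c = 1,  d = 1)

# 95% confidence intervals
z   <- qnorm(0.975)
lwr <- est - z * se
upr <- est + z * se

# All unordered pairs of estimates, one pair per row
pairs <- t(combn(names(est), 2))

classify_pair <- function(i, j) {
  # Two intervals overlap when each lower bound is below the other's upper bound
  overlap <- lwr[[i]] <= upr[[j]] && lwr[[j]] <= upr[[i]]
  # Two-sided z-test on the difference, ASSUMING independent estimates
  z_stat <- (est[[i]] - est[[j]]) / sqrt(se[[i]]^2 + se[[j]]^2)
  signif <- abs(z_stat) > z
  if (!overlap) {
    "no overlap"
  } else if (signif) {
    "overlap, significant"
  } else {
    "overlap, not significant"
  }
}

state <- mapply(classify_pair, pairs[, 1], pairs[, 2])
data.frame(est1 = pairs[, 1], est2 = pairs[, 2], state = unname(state))

# Flag whichever overlapping group is smaller (tie rule is an assumption)
n_sig <- sum(state == "overlap, significant")
n_ns  <- sum(state == "overlap, not significant")
flag  <- if (n_sig <= n_ns) "significant" else "insignificant"
flag
```

With these made-up numbers, one overlapping pair is significantly different and two are not, so flagging the significant pair would need fewer brackets.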
We first demonstrate the procedure using the built-in chickwts
dataset. The first step is to generate the estimates: we
predict chicken weight (weight) with feed type
(feed). To make the display easier to read, we reorder
the feed type factor by average weight, so that the
estimates appear in increasing order in the plot.
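As a quick check of what reorder() does to the factor, the following base R sketch stores the reordered factor in a separate object (feed_sorted, a name introduced here only for illustration) and compares its levels with the sorted group means.

```r
# Built-in dataset; the reordered factor is stored separately,
# so chickwts itself is not modified here
data(chickwts)
feed_sorted <- reorder(chickwts$feed, chickwts$weight, FUN = mean)

# Levels are sorted from the lowest to the highest mean weight
levels(feed_sorted)

# The group means that determine that ordering
sort(tapply(chickwts$weight, chickwts$feed, mean))
```

Because ggplot2 places factor levels along the axis in level order, the feed types will appear from lowest to highest average weight.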
data(chickwts)
chickwts$feed <- reorder(chickwts$feed,
                         chickwts$weight,
                         FUN=mean)
chick_mod <- lm(weight ~ feed, data=chickwts)

The VizTest package provides a function called
make_annotations() that builds a list of annotations
suitable for use in geom_signif() from the ggsignif package. First,
the user must run the viztest() function on the
estimates.
library(marginaleffects)
library(VizTest)
## make predictions
chick_preds <- predictions(chick_mod, variables="feed", by="feed")
## save predicted values in chick_b
chick_b <- coef(chick_preds)
## set names of predicted values
names(chick_b) <- chick_preds$feed
## make into visual testing data
chick_vt_data <- make_vt_data(est=chick_b, vcov(chick_preds))
## execute viztest function on predictions
chick_vt <- viztest(chick_vt_data, include_zero=FALSE)

By default, make_annotations() works out which approach
produces fewer annotations (flagging overlapping
insignificant differences or flagging overlapping significant
differences) and returns the one with fewer annotations. This is the
type="auto" option. If you choose
type="significant", the function returns annotations for
the overlapping significant differences; if you choose
type="insignificant", it returns annotations for the
overlapping insignificant differences. Here, we use the default
type="auto" option.
chick_annots <- make_annotations(chick_vt)
chick_annots
#> $annotations
#> [1] "NS" "NS" "NS"
#>
#> $y_position
#> sunflower meatmeal soybean
#> 371.6379 321.0103 286.8477
#>
#> $xmin
#> [1] "casein" "soybean" "linseed"
#>
#> $xmax
#> [1] "sunflower" "meatmeal" "soybean"

Now we can use the annotations as input to
geom_signif() to add the significance brackets to a plot of
the estimates. Note that make_annotations() returns a list
whose element names match the argument names of
geom_signif(). The easiest way to pass those elements as arguments is
to use do.call(), as shown below. For the uninitiated,
do.call() takes as its first argument an unevaluated
function (like geom_signif) and as its second argument a
named list of arguments to be passed to that function.
library(ggplot2)
library(ggsignif)
ggplot(chick_preds, aes(x=feed, y=estimate)) +
  geom_pointrange(aes(ymin=conf.low, ymax=conf.high)) +
  do.call(geom_signif, chick_annots) +
  labs(
    x = "Feed Type",
    y = "Predicted Weight"
  ) +
  theme_bw()

Predicted Chick Weights with 95% Confidence Intervals and Insignificance Brackets. The estimates in pairs marked ‘NS’ are not significantly different from each other. All other pairs are statistically different from each other whether the intervals overlap or not.
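For readers who have not used do.call() before, here is a minimal base R sketch of the same mechanism with a simpler function. The list name seq_args is introduced only for this illustration.

```r
# A named list whose element names match the arguments of seq()
seq_args <- list(from = 1, to = 10, by = 3)

# do.call() passes each list element to seq() as a named argument
via_do_call <- do.call(seq, seq_args)

# The equivalent direct call
direct <- seq(from = 1, to = 10, by = 3)

identical(via_do_call, direct)

# Individual arguments can be changed before the call with modifyList()
do.call(seq, modifyList(seq_args, list(by = 2)))
```

The same pattern applies to a list of annotations: because each element is named after a function argument, the list can be handed to the function without spelling out each argument by hand.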
The Ornstein data in the carData package contain the assets of firms in ten different sectors and four different nations, along with the number of interlocking director and executive positions each firm shares with other firms. We estimate a Poisson generalized linear model of interlocks as a function of assets, sector and nation. We then generate predictions for sector and add the significance brackets.
## Load Data
data(Ornstein, package="carData")
## Estimate Model
orn_mod <- glm(interlocks ~ log2(assets) + sector + nation, data=Ornstein,
               family=poisson)
## Generate Predictions
orn_preds <- predictions(orn_mod, variables = "sector", by = "sector")
orn_b <- coef(orn_preds)
names(orn_b) <- orn_preds$sector
orn_vt_data <- make_vt_data(est=orn_b, vcov(orn_preds))
orn_vt <- viztest(orn_vt_data, include_zero=FALSE)
orn_annots <- make_annotations(orn_vt, adjust="none")
ggplot(orn_preds, aes(x=sector, y=estimate)) +
  geom_pointrange(aes(ymin=conf.low, ymax=conf.high)) +
  do.call(geom_signif, orn_annots) +
  labs(
    x = "Sector",
    y = "Predicted Number of Interlocks"
  ) +
  theme_bw()

Predicted Number of Interlocks by Sector with 95% Confidence Intervals and Significance Brackets. The estimates in pairs marked with asterisks are significantly different from each other. All other overlapping pairs are not significantly different from each other.
Sometimes, particularly when there are more than a few estimates, the
y-position of the annotations will need a bit of nudging to make the
gaps between brackets more visually appealing. This can be done either
by directly adjusting the y_position element of the
annotations list or by using the nudge argument in the call
to make_annotations(). To do the latter, you would need to
use some trial and error to ensure the right amount of nudging is
applied. The call to make_annotations() below is the result
of such trial and error.
orn_annots <- make_annotations(orn_vt, adjust="none",
                               nudge=c(2, 1, 2, 0, 0))
ggplot(orn_preds, aes(x=sector, y=estimate)) +
  geom_pointrange(aes(ymin=conf.low, ymax=conf.high)) +
  do.call(geom_signif, orn_annots) +
  labs(
    x = "Sector",
    y = "Predicted Number of Interlocks"
  ) +
  theme_bw()

Predicted Number of Interlocks by Sector with 95% Confidence Intervals and Nudged Significance Brackets.
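The other route mentioned above, adjusting the y_position element directly, can be sketched as follows. To keep the example self-contained, the list is rebuilt by hand from the chick annotations printed earlier rather than produced by make_annotations(), and the offsets are arbitrary values chosen only for illustration.

```r
# Annotation list rebuilt by hand from the printed chick_annots output
annots <- list(
  annotations = c("NS", "NS", "NS"),
  y_position  = c(sunflower = 371.6379, meatmeal = 321.0103, soybean = 286.8477),
  xmin        = c("casein", "soybean", "linseed"),
  xmax        = c("sunflower", "meatmeal", "soybean")
)

# Raise the second and third brackets; the first stays where it is
# (offsets are illustrative, in the units of the y-axis)
annots$y_position <- annots$y_position + c(0, 10, 5)
annots$y_position
```

The modified list can then be passed to geom_signif() through do.call() exactly as before.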