The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Reporting MAIHDA results: tidy output and publication tables

Hamid Bulut

2026-06-18

From a fitted model to a manuscript

This vignette covers the three reporting helpers that turn a maihda() analysis into publication-ready output without you recomputing anything: glance() (the one-row headline), tidy() (estimates as a tidy tibble), and maihda_table() (the two canonical write-up tables). They read the quantities summary() already computed, so the tables always agree with print()/plot().

library(MAIHDA)
data("maihda_health_data")

a <- maihda(
  BMI ~ Age + Gender + Race + Education + (1 | Gender:Race:Education),
  data = maihda_health_data
)

glance() – the one-row headline

glance() returns the whole MAIHDA headline as a one-row tibble: the VPC/ICC (with its interval when bootstrapped), the PCV, the additive/interaction shares, the discriminatory accuracy (AUC/MOR) for a binary outcome, and the bookkeeping (n_strata, nobs, engine, family, mode).

glance(a)
#> # A tibble: 1 × 16
#>      vpc vpc.conf.low vpc.conf.high   pcv pcv.conf.low pcv.conf.high
#>    <dbl>        <dbl>         <dbl> <dbl>        <dbl>         <dbl>
#> 1 0.0627           NA            NA 0.766           NA            NA
#> # ℹ 10 more variables: additive.share <dbl>, interaction.share <dbl>,
#> #   auc <dbl>, auc.adjusted <dbl>, mor <dbl>, n_strata <int>, nobs <int>,
#> #   engine <chr>, family <chr>, mode <chr>

This is the row no generic mixed-model tool emits, because the PCV needs the null + adjusted pair that only a maihda() analysis holds. It is ideal for rbind()-ing many analyses into a comparison table, or for pulling a single number into inline text – e.g. the VPC is 6.3% and the additive share (PCV) is 76.6%.

tidy() and glance() come from the lightweight generics package – the same generics broom/broom.mixed re-export – so they dispatch whether you have broom, generics, or just MAIHDA loaded. There is no hard broom dependency.

tidy() – estimates as a tidy tibble

tidy() returns MAIHDA estimates in broom’s estimate/std.error/conf.low/ conf.high shape, ready for dplyr, ggplot2, or a table package. Three components are available.

The stratum estimates (the default) – one row per intersection, with the human-readable label:

strata <- tidy(a, component = "strata")
head(strata)
#> # A tibble: 6 × 6
#>   stratum label                          estimate std.error conf.low conf.high
#>   <chr>   <chr>                             <dbl>     <dbl>    <dbl>     <dbl>
#> 1 1       male × Hispanic × Some College   -0.245     1.04   -2.29      1.80  
#> 2 2       male × Black × College Grad       0.772     1.11   -1.41      2.96  
#> 3 3       female × White × College Grad    -1.82      0.352  -2.51     -1.13  
#> 4 4       male × Hispanic × 8th Grade       0.921     1.32   -1.66      3.51  
#> 5 5       female × Mexican × 8th Grade      1.82      0.951  -0.0409    3.69  
#> 6 6       male × White × College Grad      -0.743     0.357  -1.44     -0.0431

The variance components:

tidy(a, component = "variance")
#> # A tibble: 3 × 4
#>   component                 variance    sd proportion
#>   <chr>                        <dbl> <dbl>      <dbl>
#> 1 Between-stratum (random)      2.90  1.70     0.0627
#> 2 Within-stratum (residual)    43.4   6.59     0.937 
#> 3 Total                        46.3   6.80     1

And the fixed effects (component = "fixed"). For a two-model analysis, which = "adjusted" reads the adjusted model instead of the default null:

tidy(a, component = "fixed", which = "adjusted")
#> # A tibble: 11 × 5
#>    term                    estimate std.error conf.low conf.high
#>    <chr>                      <dbl>     <dbl>    <dbl>     <dbl>
#>  1 (Intercept)              30.1           NA       NA        NA
#>  2 Age                       0.0148        NA       NA        NA
#>  3 Gendermale               -0.431         NA       NA        NA
#>  4 RaceHispanic             -1.62          NA       NA        NA
#>  5 RaceMexican              -1.06          NA       NA        NA
#>  6 RaceWhite                -1.67          NA       NA        NA
#>  7 RaceOther                -3.92          NA       NA        NA
#>  8 Education9 - 11th Grade   0.129         NA       NA        NA
#>  9 EducationHigh School      1.08          NA       NA        NA
#> 10 EducationSome College    -0.177         NA       NA        NA
#> 11 EducationCollege Grad    -0.960         NA       NA        NA

Because the output is a plain tibble, you can build your own figure instead of using the built-in plot() – here a caterpillar of the stratum interaction estimates, ordered by magnitude:

library(ggplot2)

strata_ord <- strata[order(strata$estimate), ]
strata_ord$label <- factor(strata_ord$label, levels = strata_ord$label)

ggplot(strata_ord, aes(x = estimate, y = label)) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey60") +
  geom_pointrange(aes(xmin = conf.low, xmax = conf.high)) +
  labs(x = "Stratum random effect (BLUP)", y = NULL,
       title = "Intersectional strata, ordered by estimated deviation") +
  theme_minimal()

maihda_table() – the two canonical write-up tables

maihda_table() assembles the two deliverables a MAIHDA paper reports (cf. Evans et al. 2024): a model-results table contrasting the null (Model 1) and adjusted (Model 2) fits, and a ranked-strata table ordering the intersections by predicted outcome. The print() method shows both in the familiar “estimate [low, high]” layout:

tab <- maihda_table(a)
tab
#> MAIHDA Results Table
#> ====================
#> 
#> Engine: lme4 | Family: gaussian | Mode: two-model
#> Observations: 3000 | Strata: 50
#> 
#> Model results:
#>                 Statistic Null (Model 1) Adjusted (Model 2)
#>                 Intercept         28.208             30.050
#>  Between-stratum variance          2.900              1.172
#>        Between-stratum SD          1.703              1.083
#>                   VPC/ICC          0.063              0.026
#>    PCV (null -> adjusted)                             0.766
#> 
#> Strata ranked by predicted value (null model):
#>  Rank                          Stratum   N               Predicted
#>     1     female × Black × High School  46 33.274 [31.621, 34.927]
#>     2    female × Black × Some College  76 31.413 [30.060, 32.766]
#>     3  female × Black × 9 - 11th Grade  29 31.124 [29.177, 33.071]
#>     4     female × Mexican × 8th Grade  33 30.759 [28.895, 32.623]
#>     5  male × Mexican × 9 - 11th Grade  34 30.206 [28.361, 32.051]
#>     6     female × Other × High School   9 30.099 [27.462, 32.736]
#>     7    female × Black × College Grad  30 30.044 [28.119, 31.969]
#>     8    male × Mexican × College Grad  11 30.036 [27.503, 32.570]
#>     9   female × Mexican × High School  20 30.007 [27.824, 32.190]
#>    10      male × Hispanic × 8th Grade  10 29.929 [27.345, 32.513]
#>   ...                              ... ...                     ...
#>    41 female × Hispanic × Some College  26 27.762 [25.745, 29.779]
#>    42       female × Other × 8th Grade  12 27.741 [25.255, 30.227]
#>    43  female × Mexican × Some College  25 27.739 [25.697, 29.781]
#>    44  female × Other × 9 - 11th Grade   7 27.610 [24.855, 30.365]
#>    45         male × Other × 8th Grade  11 27.304 [24.771, 29.838]
#>    46    female × White × College Grad 335 27.073 [26.383, 27.763]
#>    47    female × Other × Some College  37 26.795 [25.004, 28.585]
#>    48    male × Other × 9 - 11th Grade  14 26.758 [24.359, 29.157]
#>    49      male × Other × College Grad  44 26.669 [24.988, 28.351]
#>    50    female × Other × College Grad  46 25.913 [24.259, 27.566]
#>               Stratum RE
#>     4.367 [2.714, 6.020]
#>     2.594 [1.241, 3.947]
#>     2.174 [0.227, 4.121]
#>    1.823 [-0.041, 3.687]
#>    1.370 [-0.475, 3.215]
#>    1.101 [-1.537, 3.738]
#>    1.115 [-0.810, 3.040]
#>    1.279 [-1.254, 3.813]
#>    1.195 [-0.989, 3.378]
#>    0.921 [-1.663, 3.505]
#>                      ...
#>   -0.957 [-2.974, 1.060]
#>   -1.231 [-3.717, 1.255]
#>   -0.935 [-2.977, 1.107]
#>   -1.240 [-3.995, 1.515]
#>   -1.695 [-4.229, 0.839]
#>  -1.820 [-2.510, -1.130]
#>  -2.066 [-3.857, -0.275]
#>   -2.125 [-4.524, 0.273]
#>  -2.145 [-3.826, -0.464]
#>  -2.865 [-4.518, -1.212]
#>   (30 strata between ranks 11 and 40 not shown; see $strata)
#>   Predicted intervals are conditional (random-effect) only; Stratum RE is the stratum BLUP.
#> 
#> Estimates are point values unless a [low, high] interval is shown (VPC/PCV).

The $models element is a numeric, export-ready data frame (statistics in rows, a column per model with *_lower/*_upper interval columns), so it drops straight into knitr::kable() or write.csv():

knitr::kable(tab$models, digits = 3,
             caption = "MAIHDA model-results table (null vs. adjusted).")
MAIHDA model-results table (null vs. adjusted).
statistic null null_lower null_upper adjusted adjusted_lower adjusted_upper
Intercept 28.208 NA NA 30.050 NA NA
Between-stratum variance 2.900 NA NA 1.172 NA NA
Between-stratum SD 1.703 NA NA 1.083 NA NA
VPC/ICC 0.063 NA NA 0.026 NA NA
PCV (null -> adjusted) NA NA NA 0.766 NA NA
write.csv(tab$models, "maihda_results.csv", row.names = FALSE)

The $strata element holds every stratum ranked by predicted BMI (the data behind the printed top/bottom rows):

head(tab$strata)
#>   rank stratum                           label  n predicted predicted_lower
#> 1    1      26    female × Black × High School 46  33.27402        31.62075
#> 2    2      29   female × Black × Some College 76  31.41301        30.05958
#> 3    3      45 female × Black × 9 - 11th Grade 29  31.12401        29.17711
#> 4    4       5    female × Mexican × 8th Grade 33  30.75902        28.89508
#> 5    5      34 male × Mexican × 9 - 11th Grade 34  30.20626        28.36146
#> 6    6      32    female × Other × High School  9  30.09881        27.46154
#>   predicted_upper random_effect    re_lower re_upper
#> 1        34.92729      4.366992  2.71372474 6.020259
#> 2        32.76643      2.594013  1.24058985 3.947436
#> 3        33.07091      2.174211  0.22730689 4.121115
#> 4        32.62295      1.823058 -0.04088002 3.686996
#> 5        32.05106      1.369987 -0.47481559 3.214789
#> 6        32.73607      1.100737 -1.53652679 3.738002

If you use gt or flextable for formatted tables, the same numeric frame feeds them directly (shown only if gt is installed):

gt::gt(tab$models)

Choosing a model structure with maihda_ic()

The VPC and PCV do not tell you whether a different model specification (another covariate set, strata definition, or family) fits better. maihda_ic() answers that with information criteria – AIC/BIC for the likelihood engines, WAIC/LOOIC for brms – and a delta column (gap from the best model):

maihda_ic(a)
#> MAIHDA Information Criteria
#> ===========================
#> 
#>              model    n            estimator df logLik   AIC   BIC delta
#>      Model1 (Null) 3000 ML (refit from REML)  4  -9940 19889 19913  15.5
#>  Model1 (Adjusted) 3000 ML (refit from REML) 13  -9924 19873 19951   0.0
#> 
#> delta = difference from the best model on AIC (lower is better).
#> REML lmer fit(s) were refitted with ML so AIC/BIC are comparable across different fixed effects.
#> Information criteria are only comparable across models fitted to the same analytic sample (and, for AIC/BIC, the same family).

See also

References

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.