Regression Examples

Josie Athens

2018-07-04

1 Introduction

The aim of this vignette is to illustrate the use of the glm_coef function, which displays model coefficients with confidence intervals and p-values. The advantages and limitations of glm_coef are:

  1. Recognises the main models used in epidemiology/public health.
  2. Automatically back transforms estimates and confidence intervals, when the model requires it.
  3. Can use robust standard errors for the calculation of confidence intervals.
    • Robust standard errors are used by default.
    • The use of robust standard errors is restricted to the following classes of objects (models): gee, glm and survreg.
  4. Can display nice labels for the names of the parameters.
  5. Returns a data frame that can be modified and/or exported as tables for publications (with further editing).

We start by loading relevant packages and setting alignment in pander tables (as suggested in the Template of this package):

library(pubh, warn.conflicts = FALSE)
library(car, warn.conflicts = FALSE)
library(descr, warn.conflicts = FALSE)
library(effects, warn.conflicts = FALSE)
library(multcomp, warn.conflicts = FALSE)
library(pander, warn.conflicts = FALSE)

set.alignment("right", row.names = "left", permanent = TRUE)

2 Multiple Linear Regression

For continuous outcomes there is no need to exponentiate the results.

data(birthwt)
birthwt$smoke <- factor(birthwt$smoke, labels=c("Non-smoker", "Smoker"))
birthwt$race <- factor(birthwt$race > 1, labels=c("White", "Non-white"))
model_norm <- glm(bwt ~ smoke + race, data = birthwt)

Traditional output from the model:

pander(Anova(model_norm))
Analysis of Deviance Table (Type II tests)
        LR Chisq   Df   Pr(>Chisq)
smoke      15.83    1    6.917e-05
race       18.49    1    1.706e-05
pander(summary(model_norm))
                Estimate   Std. Error   t value    Pr(>|t|)
(Intercept)         3335        91.16     36.58   6.831e-87
smokeSmoker       -428.5        107.7    -3.979   9.904e-05
raceNon-white     -452.1        105.1      -4.3   2.751e-05

(Dispersion parameter for gaussian family taken to be 471136.9 )

Null deviance: 99969656 on 188 degrees of freedom
Residual deviance: 87631472 on 186 degrees of freedom

Table of coefficients:

glm_coef(model_norm)
#>               Estimate Std. Error Lower CI Upper CI Pr(>|t|)
#> (Intercept)    3334.82     162.77  3013.71  3655.93  < 0.001
#> smokeSmoker    -428.49     126.34  -677.74  -179.24  < 0.001
#> raceNon-white  -452.10     155.59  -759.05  -145.15    0.004

Once we know the order in which the parameters are displayed, we can add labels to our final table:

Note: Compare results using naive and robust standard errors.

pander(glm_coef(model_norm, labels=c("Constant", "Smoker - Non-smoker", "Non-white - White"),
                se.rob = FALSE), split.table=Inf, caption="Table of coefficients using naive
       standard errors.")
Table of coefficients using naive standard errors.
                      Estimate   Std. Error   Lower CI   Upper CI   Pr(>|t|)
Constant                  3335        91.16       3155       3515    < 0.001
Smoker - Non-smoker     -428.5        107.7     -640.9     -216.1    < 0.001
Non-white - White       -452.1        105.1     -659.5     -244.7    < 0.001
pander(glm_coef(model_norm, labels=c("Constant", "Smoker - Non-smoker", "Non-white - White")),
       split.table = Inf, caption="Table of coefficients using robust standard errors.")
Table of coefficients using robust standard errors.
                      Estimate   Std. Error   Lower CI   Upper CI   Pr(>|t|)
Constant                  3335        162.8       3014       3656    < 0.001
Smoker - Non-smoker     -428.5        126.3     -677.7     -179.2    < 0.001
Non-white - White       -452.1        155.6       -759     -145.2      0.004
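
To make the difference between naive and robust standard errors concrete, the following is a minimal base R sketch of a "sandwich" estimator for the same linear model. It assumes the simplest variant (HC0); glm_coef may use a different robust estimator, so the numbers need not match the table above. The object names (fit, vcov_hc0, se_table) are illustrative only.

```r
# Naive versus HC0 "sandwich" standard errors for the birth weight model.
# HC0 is an assumption made for illustration; glm_coef may use another variant.
data(birthwt, package = "MASS")
birthwt$smoke <- factor(birthwt$smoke, labels = c("Non-smoker", "Smoker"))
birthwt$race <- factor(birthwt$race > 1, labels = c("White", "Non-white"))
fit <- lm(bwt ~ smoke + race, data = birthwt)

X <- model.matrix(fit)          # design matrix
e <- residuals(fit)             # raw residuals
bread <- solve(crossprod(X))    # (X'X)^-1
meat <- crossprod(X * e)        # X' diag(e^2) X, each row of X scaled by its residual
vcov_hc0 <- bread %*% meat %*% bread

se_table <- cbind(Naive = sqrt(diag(vcov(fit))),
                  Robust_HC0 = sqrt(diag(vcov_hc0)))
round(se_table, 2)
```

The naive column relies on a constant residual variance, whereas the sandwich form lets each observation contribute its own squared residual, which is why the two columns can differ.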

Effect plot:

plot(Effect(c("smoke", "race"), model_norm), multiline = TRUE, main = NULL, 
     ylab = "Birth weight (g)", xlab = "Smoking status", symbols = list(pch=16),
     confint = list(style="auto"), aspect = 3/4, lines = list(col=c(2,4), lwd=1.5))

3 Logistic Regression

For logistic regression we are interested in the odds ratios.

data(diet, package = "Epi")
model_binom <- glm(chd ~ fibre, data = diet, family = binomial)
pander(glm_coef(model_binom, labels = c("Constant", "Fibre intake (g/day)")), split.table = Inf,
       caption = "Parameter estimates from logistic regression.")
Parameter estimates from logistic regression.
                         OR   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Constant               0.95         0.59        0.3       3.01      0.934
Fibre intake (g/day)   0.33         0.37       0.16       0.67      0.002
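
The back-transformation behind a table like this can be sketched in base R. The example below is only an illustration: it uses the built-in infert data instead of the diet data, and naive (model-based) standard errors with Wald intervals, so it is not a reproduction of what glm_coef does internally.

```r
# Back-transforming a logistic regression to odds ratios with Wald intervals.
# The built-in `infert` data stand in for the diet data used in this section.
fit <- glm(case ~ spontaneous + induced, data = infert, family = binomial)

est <- coef(summary(fit))    # estimates and standard errors on the log-odds scale
z <- qnorm(0.975)            # normal quantile for a 95% interval
or_table <- data.frame(
  OR = exp(est[, "Estimate"]),
  `Lower CI` = exp(est[, "Estimate"] - z * est[, "Std. Error"]),
  `Upper CI` = exp(est[, "Estimate"] + z * est[, "Std. Error"]),
  check.names = FALSE
)
round(or_table, 2)
```

The interval is built on the log-odds scale and exponentiated afterwards, which keeps both limits positive and makes the interval asymmetric around the odds ratio.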

Effect plot:

plot(Effect("fibre", model_binom), type = "response", rug = FALSE, aspect = 3/4,
       ylab = "P (CHD)", xlab = "Fibre (g/day)", lwd = 2, confint = list(style="none"),
     main = NULL)

3.1 Matched Case-Control Studies: Conditional Logistic Regression

data(bdendo, package = "Epi")
levels(bdendo$gall) <- c("No GBD", "GBD")
levels(bdendo$est) <- c("No oestrogen", "Oestrogen")
model_clogit <- clogit(d ~ est * gall + strata(set), data = bdendo)
glm_coef(model_clogit)
#>                         OR Std. Error Lower CI Upper CI Pr(>|z|)
#> estOestrogen         14.88      14.88     4.49    49.36  < 0.001
#> gallGBD              18.07      18.07     3.20   102.01    0.001
#> estOestrogen:gallGBD  0.13       0.13     0.02     0.90    0.039
pander(glm_coef(model_clogit, labels = c("Oestrogen/No oestrogen", "GBD/No GBD", 
                                         "Oestrogen:GBD Interaction")), 
       split.table = Inf, caption = "Parameter estimates from conditional logistic regression.")
Parameter estimates from conditional logistic regression.
                                OR   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Oestrogen/No oestrogen       14.88        14.88       4.49      49.36    < 0.001
GBD/No GBD                   18.07        18.07        3.2        102      0.001
Oestrogen:GBD Interaction     0.13         0.13       0.02        0.9      0.039

Creating data frame for effect plot:

bdendo_grid <- with(bdendo, expand.grid(
  gall = levels(gall),
  est = levels(est),
  set = sample(1:63, 1)
))

Predictions:

bdendo_grid$pred <- inv_logit(predict(model_clogit, bdendo_grid, type = "lp"))
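
The inverse logit used for these predictions maps a linear predictor (log-odds) to a probability. A minimal sketch, assuming inv_logit behaves like the standard logistic function; inv_logit_sketch is a hypothetical stand-in, not the package's code:

```r
# Inverse logit: maps a linear predictor (log-odds) to a probability.
# `inv_logit_sketch` is a hypothetical stand-in for the inv_logit call above.
inv_logit_sketch <- function(x) 1 / (1 + exp(-x))

lp <- c(-2, 0, 2)                               # example log-odds values
inv_logit_sketch(lp)
all.equal(inv_logit_sketch(lp), plogis(lp))     # plogis is the base R equivalent
```

Because the conditional logistic model shown above has no intercept term, the resulting values are probably best read as relative rather than absolute probabilities.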

Effect plot:

xyplot(pred  ~ gall, data = bdendo_grid, groups = est, type = "l", 
    lwd = 2, xlab = "Gall bladder disease", ylab = "P (cancer)", 
    auto.key = list(title = "Oestrogen", space = "right", cex = 0.8))

3.2 Ordinal Logistic Regression

library(ordinal, warn.conflicts = FALSE)
data(housing)
model_clm <- clm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
glm_coef(model_clm)
#>               Ordinal OR Lower CI Upper CI Std. Error Pr(>|Z|)
#> Low|Medium          0.61     0.48     0.78       0.12  < 0.001
#> Medium|High         2.00     1.56     2.55       0.13  < 0.001
#> InflMedium          1.76     1.44     2.16       0.10  < 0.001
#> InflHigh            3.63     2.83     4.66       0.13  < 0.001
#> TypeApartment       0.56     0.45     0.71       0.12  < 0.001
#> TypeAtrium          0.69     0.51     0.94       0.16    0.018
#> TypeTerrace         0.34     0.25     0.45       0.15  < 0.001
#> ContHigh            1.43     1.19     1.73       0.10  < 0.001
labs_ord <- c("Constant: Low/Medium satisfaction",
              "Constant: Medium/High satisfaction",
              "Perceived influence: Medium/Low",
              "Perceived influence: High/Low",
              "Accommodation: Apartment/Tower",
              "Accommodation: Atrium/Tower",
              "Accommodation: Terrace/Tower",
              "Afforded: High/Low")
pander(glm_coef(model_clm, labels = labs_ord), split.table = Inf,
       caption = "Parameter estimates on satisfaction of householders.")
Parameter estimates on satisfaction of householders.
                                     Ordinal OR   Lower CI   Upper CI   Std. Error   Pr(>|Z|)
Constant: Low/Medium satisfaction          0.61       0.48       0.78         0.12    < 0.001
Constant: Medium/High satisfaction            2       1.56       2.55         0.13    < 0.001
Perceived influence: Medium/Low            1.76       1.44       2.16          0.1    < 0.001
Perceived influence: High/Low              3.63       2.83       4.66         0.13    < 0.001
Accommodation: Apartment/Tower             0.56       0.45       0.71         0.12    < 0.001
Accommodation: Atrium/Tower                0.69       0.51       0.94         0.16      0.018
Accommodation: Terrace/Tower               0.34       0.25       0.45         0.15    < 0.001
Afforded: High/Low                         1.43       1.19       1.73          0.1    < 0.001

Effect plot:

plot(Effect(c("Infl", "Type", "Cont"), model_clm), main = NULL, aspect = 3/4, rotx = 45, 
     ylab = "Satisfaction (probability)", lines = list(lwd = 1.5, multiline = TRUE),
     confint = list(style="none"), symbols = list(pch = rep(20, 3)),
     ylim = c(0, 1))

Note: In the previous table, parameter estimates and confidence intervals for Perceived influence and Accommodation were not adjusted for multiple comparisons. See the Poisson Regression example for how to include adjusted parameters.

3.3 Multinomial Regression

library(nnet)
model_multi <- multinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
#> # weights:  24 (14 variable)
#> initial  value 1846.767257 
#> iter  10 value 1747.045232
#> final  value 1735.041933 
#> converged
glm_coef(model_multi)
#> 
#> [1] "Medium  vs  Low"
#>               Multinomial OR lower95ci upper95ci z value Pr(>|z|)
#> (Intercept)               NA     -0.76     -0.08   -2.42    0.015
#> InflMedium              1.56      1.18      2.06    3.15    0.002
#> InflHigh                1.94      1.35      2.80    3.57  < 0.001
#> TypeApartment           0.65      0.46      0.91   -2.53    0.012
#> TypeAtrium              1.14      0.74      1.77    0.59    0.556
#> TypeTerrace             0.51      0.34      0.77   -3.23    0.001
#> ContHigh                1.43      1.11      1.86    2.73    0.006
#> 
#> 
#> [1] "High  vs  Low"
#>               Multinomial OR lower95ci upper95ci z value Pr(>|z|)
#> (Intercept)               NA     -0.45      0.17   -0.87    0.384
#> InflMedium              2.09      1.59      2.73    5.37  < 0.001
#> InflHigh                5.02      3.61      6.96    9.65  < 0.001
#> TypeApartment           0.48      0.35      0.65   -4.74  < 0.001
#> TypeAtrium              0.66      0.44      1.01   -1.93    0.054
#> TypeTerrace             0.24      0.16      0.36   -7.06  < 0.001
#> ContHigh                1.62      1.27      2.06    3.88  < 0.001

Effect plot:

plot(Effect(c("Infl", "Type", "Cont"), model_multi), main = NULL, aspect = 3/4, rotx = 45, 
     ylab = "Satisfaction (probability)", lines = list(lwd = 1.5, multiline = TRUE),
     confint = list(style="none"), symbols = list(pch = rep(20, 3)),
     ylim = c(0, 1))

4 Poisson Regression

For Poisson regression we are interested in incidence rate ratios.

data(quine)
levels(quine$Eth) <- list(White = "N", Aboriginal = "A")
levels(quine$Sex) <- list(Male = "M", Female = "F")
model_pois <- glm(Days ~ Eth + Sex + Age, family = poisson, data = quine)
glm_coef(model_pois)
#>                 IRR Std. Error Lower CI Upper CI Pr(>|z|)
#> (Intercept)   11.53       0.28     6.63    20.06  < 0.001
#> EthAboriginal  1.70       0.21     1.14     2.54     0.01
#> SexFemale      0.90       0.18     0.63     1.28    0.556
#> AgeF1          0.80       0.32     0.43     1.48    0.475
#> AgeF2          1.42       0.26     0.85     2.36     0.18
#> AgeF3          1.35       0.28     0.78     2.32    0.284

4.1 Negative-binomial

The Poisson model assumes that the mean is equal to the variance. Is that the case?

pander(estat(~ Days|Eth, data = quine, label = "Days of school absences"), split.table=Inf)
                          Eth           N   Min.   Max.    Mean   Median      SD     CV
Days of school absences   White        77      0     69   12.18        7   13.56   1.11
                          Aboriginal   69      0     81   21.23       15   17.72   0.83

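Note: Look at the relative dispersion (coefficient of variation). For the variance to equal the mean, the CV would have to be 1/sqrt(mean), that is, about 0.29 for the White group and 0.22 for the Aboriginal group, well below the observed values of 1.11 and 0.83.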
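
The CV that a Poisson distribution would imply can be computed directly from the group means, since a variance equal to the mean gives SD = sqrt(mean) and hence CV = 1/sqrt(mean). A short sketch using the quine data from MASS with its original level codes (A and N); the object names are illustrative:

```r
# Under a Poisson model the variance equals the mean, so SD = sqrt(mean)
# and the implied coefficient of variation is 1 / sqrt(mean).
data(quine, package = "MASS")

obs_mean <- tapply(quine$Days, quine$Eth, mean)        # mean days absent by group
obs_cv <- tapply(quine$Days, quine$Eth, sd) / obs_mean # observed CV by group
poisson_cv <- 1 / sqrt(obs_mean)                       # CV implied by a Poisson model

round(cbind(Mean = obs_mean, `Observed CV` = obs_cv,
            `Poisson CV` = poisson_cv), 2)
```

An observed CV well above the Poisson-implied one points to more variability than the Poisson model allows.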

More formally, if the Poisson assumption holds, the following ratio should be close to 1:

deviance(model_pois) / df.residual(model_pois)
#> [1] 12.44646
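
The deviance-based ratio above has a Pearson-based counterpart, which is the quantity that the quasi-Poisson family reports as its dispersion parameter. The sketch below refits the same model on the quine data from MASS (the factor relabelling is skipped because it does not affect the dispersion); fit_pois, fit_quasi and pearson_disp are illustrative names.

```r
data(quine, package = "MASS")
fit_pois <- glm(Days ~ Eth + Sex + Age, family = poisson, data = quine)

# Pearson-based dispersion: sum of squared Pearson residuals over residual df
pearson_disp <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# The quasi-Poisson family reports the same quantity as its dispersion parameter
fit_quasi <- update(fit_pois, family = quasipoisson)
c(pearson = pearson_disp, quasipoisson = summary(fit_quasi)$dispersion)
```

Both versions of the ratio answer the same question; values far above 1 suggest that Poisson standard errors are too small.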

Thus, we have over-dispersion. One option is to use a negative binomial distribution.

model_negbin <- glm.nb(Days ~ Eth + Sex + Age, data = quine)
unadj <- glm_coef(model_negbin, labels=c("Constant",
                                   "Race: Aboriginal/White",
                                   "Sex: Female/Male",
                                   "F1/Primary",
                                   "F2/Primary",
                                   "F3/Primary"))

Notice that age group is a factor with more than two levels and is significant:

pander(Anova(model_negbin))
Analysis of Deviance Table (Type II tests)
      LR Chisq   Df   Pr(>Chisq)
Eth      12.66    1    0.0003743
Sex     0.1486    1       0.6999
Age      9.484    3       0.0235

Thus, we want to report confidence intervals and p-values adjusted for multiple comparisons.

The unadjusted CIs:

pander(unadj, split.table = Inf, caption = "Parameter estimates with unadjusted CIs and p-values.")
Parameter estimates with unadjusted CIs and p-values.
                           IRR   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Constant                 12.24         0.27       7.28      20.58    < 0.001
Race: Aboriginal/White    1.76          0.2       1.19       2.62      0.005
Sex: Female/Male          0.94         0.18       0.66       1.33      0.722
F1/Primary                0.69         0.29       0.39       1.22      0.204
F2/Primary                 1.2         0.26       0.71       2.01      0.496
F3/Primary                1.29         0.27       0.75        2.2      0.357

Effect plot:

plot(Effect(c("Age", "Eth"), model_negbin), lines = list(lwd = 1.5, multiline = TRUE),
     confint = list(style="none"), symbols = list(pch = rep(20, 2)), main = NULL, 
     aspect = 3/4)

4.2 Adjusting CIs and p-values for multiple comparisons

We adjust for multiple comparisons:

model_glht <- glht(model_negbin, linfct  = mcp(Age = "Tukey"))
age_glht <- xymultiple(model_glht, Exp = TRUE, plot = FALSE)

We can see the comparison graphically with:

xymultiple(model_glht, Exp = TRUE)
#>   Comparison Ratio  lwr  upr Pr(>|Z|)
#> 1    F1 - F0  0.69 0.38 1.26    0.220
#> 2    F2 - F0  1.20 0.66 2.17    0.550
#> 3    F3 - F0  1.29 0.69 2.40    0.550
#> 4    F2 - F1  1.73 1.02 2.92    0.022
#> 5    F3 - F1  1.86 1.07 3.21    0.020
#> 6    F3 - F2  1.08 0.62 1.88    0.737
Parameter estimates on the effect of age group on the number of days absent from school. Bars represent 95% CIs adjusted by the method of Westfall for multiple comparisons.

We use this information to construct the final table:

final <- unadj
final[, 5] <- as.character(final[, 5])
age_glht[, 5] <- as.character(age_glht[, 5])
final[4:6, 3:5] <- age_glht[1:3, 3:5]
pander(final, split.table = Inf, caption = "Parameter estimates. CIs and
       p-values for age group were adjusted for multiple comparisons by the 
       method of Westfall.")
Parameter estimates. CIs and p-values for age group were adjusted for multiple comparisons by the method of Westfall.
                           IRR   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Constant                 12.24         0.27       7.28      20.58    < 0.001
Race: Aboriginal/White    1.76          0.2       1.19       2.62      0.005
Sex: Female/Male          0.94         0.18       0.66       1.33      0.722
F1/Primary                0.69         0.29       0.38       1.26       0.22
F2/Primary                 1.2         0.26       0.66       2.17       0.55
F3/Primary                1.29         0.27       0.69        2.4       0.55
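
As a rough cross-check on the idea of adjustment, the unadjusted Wald p-values for the three age contrasts against the reference level can be corrected with the Holm method, which is available in base R. This is a deliberately simpler technique swapped in for illustration; it is not the Westfall method used by glht above, it ignores the remaining pairwise comparisons, and its results need not agree with the table. The names fit_nb, p_raw and p_holm are illustrative.

```r
library(MASS)   # glm.nb and the quine data
fit_nb <- glm.nb(Days ~ Eth + Sex + Age, data = quine)

# Unadjusted Wald p-values for the three age contrasts against the reference level F0
p_raw <- coef(summary(fit_nb))[c("AgeF1", "AgeF2", "AgeF3"), "Pr(>|z|)"]

# Holm step-down adjustment (a stand-in here, not the Westfall method)
p_holm <- p.adjust(p_raw, method = "holm")
round(cbind(Unadjusted = p_raw, Holm = p_holm), 3)
```

Adjusted p-values can never be smaller than the unadjusted ones, which is the general pattern also visible when comparing the two tables above.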

5 Survival Analysis

data(bladder)
bladder$times <- bladder$stop
bladder$rx <- factor(bladder$rx, labels=c("Placebo", "Thiotepa"))

5.1 Parametric method

model_surv <- survreg(Surv(times, event) ~ rx, data = bladder)

Using robust standard errors (default):

glm_coef(model_surv)
#>            Survival time ratio Std. Error Lower CI Upper CI Pr(>|z|)
#> rxThiotepa                1.64       0.31     0.89     3.04    0.116
#> Scale                     1.00       0.08     0.85     1.18    0.992
pander(glm_coef(model_surv, labels = c("Treatment: Thiotepa/Placebo", "Scale")),
       split.table = Inf)
                              Survival time ratio   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Treatment: Thiotepa/Placebo                  1.64         0.31       0.89       3.04      0.116
Scale                                           1         0.08       0.85       1.18      0.992

In this example the scale parameter is not statistically different from one, meaning that the hazard is constant; thus, we can use the exponential distribution:

model_exp <- survreg(Surv(times, event) ~ rx, data = bladder, dist = "exponential")
pander(glm_coef(model_exp, labels = "Treatment: Thiotepa/Placebo"),
       split.table = Inf)
                              Survival time ratio   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Treatment: Thiotepa/Placebo                  1.64         0.33       0.85       3.16      0.139

Interpretation: Survival times of patients receiving Thiotepa are on average 64% longer than those of patients in the Placebo group.
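
The step from a model coefficient to a statement such as "64% longer" can be sketched as follows. The example uses the lung data shipped with the survival package rather than the bladder data, purely for illustration, and the object names are hypothetical. It also relies on a property specific to the exponential model: the hazard ratio is the reciprocal of the survival time ratio.

```r
library(survival)   # survreg, Surv and the lung data

# Exponential accelerated failure time model on the built-in lung data
fit_exp <- survreg(Surv(time, status) ~ sex, data = lung, dist = "exponential")

time_ratio <- exp(coef(fit_exp)[["sex"]])      # multiplicative effect on survival time
hazard_ratio <- exp(-coef(fit_exp)[["sex"]])   # exponential model only: HR = 1 / time ratio

c(time_ratio = time_ratio,
  percent_longer = 100 * (time_ratio - 1),
  hazard_ratio = hazard_ratio)
```

Applied to the estimate above, a survival time ratio of 1.64 corresponds to 100 * (1.64 - 1) = 64% longer survival times and, under the exponential model, to a hazard ratio of about 1/1.64 = 0.61.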

Using naive standard errors:

pander(glm_coef(model_exp, se.rob = FALSE, labels = "Treatment: Thiotepa/Placebo"),
       split.table = Inf)
                              Survival time ratio   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Treatment: Thiotepa/Placebo                  1.64          0.2       1.11       2.41      0.012

Data for predictions:

bladder_grid <- with(bladder, expand.grid(
  rx = levels(rx)
))

Predictions:

bladder_pred <- predict(model_exp, bladder_grid, se.fit = TRUE, type = "response")
bladder_grid$fit <- bladder_pred$fit
bladder_grid$se <- bladder_pred$se
bladder_grid$lo <- bladder_pred$fit - 1.96 * bladder_pred$se
bladder_grid$up <- bladder_pred$fit + 1.96 * bladder_pred$se

Effect plot:

xyplot(cbind(fit, lo, up) ~ rx, data = bladder_grid, pch = 20, panel = panel.errbars,
       ylab = "Survival time", xlab = "Treatment", aspect = 3/4)

5.2 Cox proportional hazards regression

model_cox <-  coxph(Surv(times, event) ~ rx, data = bladder)
pander(glm_coef(model_cox, labels = c("Treatment: Thiotepa/Placebo")), split.table = Inf)
                              Hazard ratio   Std. Error   Lower CI   Upper CI   Pr(>|z|)
Treatment: Thiotepa/Placebo           0.64          0.2       0.44       0.94      0.024

Interpretation: At any given time, the hazard for patients receiving Thiotepa is 36% lower than for those in the Placebo group (hazard ratio of 0.64).

Data for predictions:

cox_grid <- with(bladder, expand.grid(
  rx = levels(rx)
))

Predictions:

cox_pred <- predict(model_cox, cox_grid, se.fit = TRUE, type = "risk")
cox_grid$fit <- cox_pred$fit
cox_grid$se <- cox_pred$se
cox_grid$lo <- cox_pred$fit - 1.96 * cox_pred$se
cox_grid$up <- cox_pred$fit + 1.96 * cox_pred$se

Effect plot:

xyplot(cbind(fit, lo, up) ~ rx, data = cox_grid, pch = 20, panel = panel.errbars,
       ylab = "Hazard", xlab = "Treatment", aspect = 3/4)

6 Mixed Linear Regression Models

6.1 Continuous outcomes

library(nlme, warn.conflicts = FALSE)
data(Orthodont)
model_lme <- lme(distance ~ Sex * I(age - mean(age, na.rm = TRUE)), random = ~ 1|Subject, 
                 method = "ML", data = Orthodont)
glm_coef(model_lme)
#>                                            Coeff Lower CI Upper CI   SE DF
#> (Intercept)                                24.97    24.03    24.03 0.48 79
#> SexFemale                                  -2.32    -3.78    -3.78 0.75 25
#> I(age - mean(age, na.rm = TRUE))            0.78     0.63     0.63 0.08 79
#> SexFemale:I(age - mean(age, na.rm = TRUE)) -0.30    -0.54    -0.54 0.12 79
#>                                            t value Pr(>|t|)
#> (Intercept)                                  52.39  < 0.001
#> SexFemale                                    -3.11    0.005
#> I(age - mean(age, na.rm = TRUE))             10.06  < 0.001
#> SexFemale:I(age - mean(age, na.rm = TRUE))   -2.49    0.015
pander(glm_coef(model_lme, labels = c("Constant", "Sex: female-male", "Age (years)", 
                                      "Sex:Age interaction")), split.table = Inf)
                      Coeff   Lower CI   Upper CI     SE   DF   t value   Pr(>|t|)
Constant              24.97      24.03      24.03   0.48   79     52.39    < 0.001
Sex: female-male      -2.32      -3.78      -3.78   0.75   25     -3.11      0.005
Age (years)            0.78       0.63       0.63   0.08   79     10.06    < 0.001
Sex:Age interaction    -0.3      -0.54      -0.54   0.12   79     -2.49      0.015
plot(Effect(c("age", "Sex"), model_lme, residuals = TRUE), rug = FALSE, xlab = "Age (years)", 
     ylab = "Distance (mm)", main = NULL, aspect = 3/4, partial.residuals = list(pch = 20),
     lines = list(col = c(2, 4), lwd = 1.5))

library(gee, warn.conflicts = FALSE)
model_gee_norm <- gee(distance ~ Sex * I(age - mean(age, na.rm = TRUE)), id = Subject, 
                      data = Orthodont, corstr = "AR-M")
#>                                (Intercept) 
#>                                 24.9687500 
#>                                  SexFemale 
#>                                 -2.3210227 
#>           I(age - mean(age, na.rm = TRUE)) 
#>                                  0.7843750 
#> SexFemale:I(age - mean(age, na.rm = TRUE)) 
#>                                 -0.3048295

For GEE models, robust standard errors are used by default:

pander(glm_coef(model_gee_norm, labels = c("Constant", "Sex: female-male", "Age (years)", 
                                      "Sex:Age interaction")), split.table = Inf)
                      Coeff   Lower CI   Upper CI     SE   Pr(>|z|)
Constant              25.06       24.2      25.92   0.44    < 0.001
Sex: female-male      -2.42      -3.89      -0.94   0.75      0.001
Age (years)            0.77       0.56       0.98    0.1    < 0.001
Sex:Age interaction   -0.29      -0.53      -0.05   0.12       0.02

Data for prediction:

Orthodont$fit <- model_gee_norm$fitted.values

Effect plot:

xyplot(distance ~ age|Sex, data = Orthodont, groups = Subject, pch = 20, xlab = "Age (years)", 
     ylab = "Distance (mm)", aspect = 3/4) +
  xyplot(fit ~ age|Sex, data = Orthodont, type = "l", lwd = 2)

6.2 Count outcomes

data(Thall)

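# Stack the four visit counts into long format (one row per subject and visit)
c1 <- cbind(Thall[, 1:4], count = Thall$y1)
c2 <- cbind(Thall[, 1:4], count = Thall$y2)
c3 <- cbind(Thall[, 1:4], count = Thall$y3)
# NOTE: this fourth block also takes y3; the fourth visit is presumably y4,
# and using it would change the estimates reported below.
c4 <- cbind(Thall[, 1:4], count = Thall$y3)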
epilepsy <- rbind(c1, c2, c3, c4)
model_gee <- gee(count ~ treat + base + I(age - mean(age, na.rm = TRUE)), id = factor(id), 
                 data = epilepsy, family = poisson, corstr = "exchangeable", scale.fix = TRUE)
pander(glm_coef(model_gee, labels = c("Constant", "Treatment (Prograbide/Control)", 
                               "Baseline count", "Age (years)")), split.table = Inf)
                                 Coeff   Exp(Coeff)   Lower CI   Upper CI     SE   Pr(>|z|)
Constant                          1.23           NA          1       1.45   0.11    < 0.001
Treatment (Prograbide/Control)   -0.13         0.88       0.68       1.14   0.13       0.33
Baseline count                    0.02         1.02       1.02       1.02      0    < 0.001
Age (years)                       0.02         1.02       1.01       1.04   0.01      0.003
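
The wide-to-long stacking done with cbind and rbind at the start of this section can also be expressed with base R's reshape. The sketch below uses a small hypothetical data frame (wide) in place of Thall; its column names and values are invented for illustration.

```r
# Hypothetical wide data: one row per subject, one column per visit count
wide <- data.frame(id = 1:3,
                   treat = c("Control", "Control", "Treated"),
                   y1 = c(5, 3, 2), y2 = c(3, 5, 4),
                   y3 = c(3, 3, 0), y4 = c(3, 3, 5))

# Stack the four visit columns into a single `count` column
long <- reshape(wide, direction = "long",
                varying = c("y1", "y2", "y3", "y4"), v.names = "count",
                timevar = "visit", times = 1:4, idvar = "id")
long <- long[order(long$id, long$visit), ]   # sort by subject, then visit
long
```

Naming the visit columns once in varying, and keeping a visit index in the result, makes it easier to see which column each stacked row came from.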

Using glmer:

library(lme4, warn.conflicts = FALSE)
model_glmer <- glmer(count ~ treat + base + I(age - mean(age, na.rm = TRUE)) + 
                       (1|id), data=epilepsy, family=poisson)
pander(glm_coef(model_glmer, labels = c("Constant", "Treatment (Prograbide/Control)", 
                               "Baseline count", "Age (years)")), split.table = Inf)
                                 Coeff   Exp(Coeff)   Lower CI   Upper CI     SE   z value   Pr(>|z|)
Constant                          0.85           NA       0.53       1.18   0.16      5.21    < 0.001
Treatment (Prograbide/Control)   -0.22          0.8       0.57       1.13   0.18     -1.27      0.203
Baseline count                    0.03         1.03       1.02       1.03      0       8.7    < 0.001
Age (years)                       0.01         1.01       0.99       1.04   0.01      0.91      0.364

Effect plot:

plot(Effect(c("age", "treat"), model_glmer), rug = FALSE, lwd = 2, main = NULL,
     xlab = "Age (years)", ylab = "Events", aspect = 3/4, multiline = TRUE)

Might we have over-dispersion?

pander(estat(~ count|treat, data = epilepsy, label = "Number of seizures"))
                     treat          N   Min.   Max.   Mean   Median      SD     CV
Number of seizures   Control      112      0     76    8.8        5   12.09   1.37
                     Prograbide   124      0    102   8.31        4   14.48   1.74

Scaling the variance:

model_quasi <- gee(count ~ treat + base + I(age - mean(age, na.rm = TRUE)), id = factor(id), 
                 data = epilepsy, family = quasi(variance = "mu^2", link = "log"), 
                 corstr = "exchangeable")
pander(glm_coef(model_quasi, labels = c("Constant", "Treatment (Prograbide/Control)", 
                               "Baseline count", "Age (years)")), split.table = Inf)
                                 Coeff   Exp(Coeff)   Lower CI   Upper CI     SE   Pr(>|z|)
Constant                          0.97           NA        0.8       1.15   0.09    < 0.001
Treatment (Prograbide/Control)   -0.17         0.85       0.67       1.08   0.12      0.175
Baseline count                    0.03         1.03       1.03       1.03      0    < 0.001
Age (years)                       0.02         1.02          1       1.03   0.01      0.041