The ebrahim.gof package implements the Ebrahim-Farrington goodness-of-fit test for logistic regression models. This test is particularly effective for binary data and sparse datasets, providing an improved alternative to the traditional Hosmer-Lemeshow test.
Goodness-of-fit testing is crucial in logistic regression to assess whether the fitted model adequately describes the data. The most commonly used test is the Hosmer-Lemeshow test, but it has several limitations:
The Ebrahim-Farrington test addresses these limitations by using a modified Pearson chi-square statistic based on Farrington’s (1996) theoretical framework, but simplified for practical implementation with binary data.
The main function ef.gof() performs the goodness-of-fit test:
# Simulate binary data
set.seed(123)
n <- 500
x <- rnorm(n)
linpred <- 0.5 + 1.2 * x
prob <- plogis(linpred) # Convert to probabilities
y <- rbinom(n, 1, prob)
# Fit logistic regression
model <- glm(y ~ x, family = binomial())
predicted_probs <- fitted(model)
# Perform Ebrahim-Farrington test
result <- ef.gof(y, predicted_probs, G = 10)
print(result)
#>                 Test Test_Statistic   p_value
#> 1 Ebrahim-Farrington      -1.250567 0.8944537
For binary data with automatic grouping, the Ebrahim-Farrington test statistic is:
\[Z_{EF} = \frac{T_{EF} - (G - 2)}{\sqrt{2(G-2)}}\]
Where:
- \(T_{EF}\) is the modified Pearson chi-square statistic
- \(G\) is the number of groups
- Under \(H_0\), the test statistic \(Z_{EF}\) approximately follows a standard normal distribution
The null hypothesis is that the model fits the data adequately.
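As a quick check on the formula, the printed result above can be reproduced by hand in base R. This is only a sketch: the helper name standardise_ef is hypothetical, the value of \(T_{EF}\) is backed out from the reported statistic rather than returned by the package, and the use of an upper-tail p-value is an assumption inferred from the printed output (Z = -1.250567 paired with p = 0.8944537).

```r
# Standardise a chi-square-type statistic as in the Z_EF formula.
# The centring G - 2 and scale sqrt(2 * (G - 2)) are the mean and
# standard deviation of a chi-square variable with G - 2 df.
standardise_ef <- function(t_ef, G) {
  (t_ef - (G - 2)) / sqrt(2 * (G - 2))
}

G <- 10
z_reported <- -1.250567   # Test_Statistic printed above

# Back out the implied T_EF (illustrative; not an output of ef.gof())
t_ef_implied <- (G - 2) + z_reported * sqrt(2 * (G - 2))
t_ef_implied              # about 3.0, well below G - 2 = 8

# Upper-tail p-value: large positive Z is taken as evidence of lack of fit
# (assumption inferred from the printed p-value)
p_upper <- pnorm(standardise_ef(t_ef_implied, G), lower.tail = FALSE)
round(p_upper, 4)
```

A negative statistic therefore means the observed discrepancy is smaller than expected under \(H_0\), which is why the p-value is close to 1.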
The number of groups \(G\) can affect the test’s performance:
# Test with different numbers of groups
group_sizes <- c(4, 8, 10, 15, 20)
results <- data.frame(
  Groups = group_sizes,
  P_value = sapply(group_sizes, function(g) {
    ef.gof(y, predicted_probs, G = g)$p_value
  })
)
print(results)
#>   Groups   P_value
#> 1      4 0.7597797
#> 2      8 0.3993666
#> 3     10 0.8944537
#> 4     15 0.7542151
#> 5     20 0.3920783
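The p-values above change with \(G\) because the group membership of each observation changes. This document does not say how ef.gof() forms its groups, so the sketch below only illustrates one common scheme, Hosmer-Lemeshow-style binning by quantiles of the fitted probabilities. The helper group_by_quantile is hypothetical and is not part of the package.

```r
# Simulate and fit a simple logistic model (base R only)
set.seed(123)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + 1.2 * x))
p_hat <- fitted(glm(y ~ x, family = binomial()))

# Hypothetical helper: split observations into G groups of roughly equal
# size using quantiles of the fitted probabilities
group_by_quantile <- function(p, G) {
  breaks <- quantile(p, probs = seq(0, 1, length.out = G + 1))
  cut(p, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

G <- 10
grp <- group_by_quantile(p_hat, G)

# Observed and expected event counts per group
summary_tab <- data.frame(
  group    = seq_len(G),
  n        = as.vector(table(grp)),
  observed = as.vector(tapply(y, grp, sum)),
  expected = as.vector(tapply(p_hat, grp, sum))
)
print(summary_tab)
```

Re-running the last steps with a different G moves the cut points, so the per-group observed and expected counts, and any statistic built from them, shift as well.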
Let’s compare the Ebrahim-Farrington test with the traditional Hosmer-Lemeshow test:
# Hosmer-Lemeshow test (requires ResourceSelection package)
if (requireNamespace("ResourceSelection", quietly = TRUE)) {
  library(ResourceSelection)
  # Perform both tests
  ef_result <- ef.gof(y, predicted_probs, G = 10)
  hl_result <- hoslem.test(y, predicted_probs, g = 10)
  # Compare results
  comparison <- data.frame(
    Test = c("Ebrahim-Farrington", "Hosmer-Lemeshow"),
    P_value = c(ef_result$p_value, hl_result$p.value),
    Test_Statistic = c(ef_result$Test_Statistic, hl_result$statistic)
  )
  print(comparison)
} else {
  cat("ResourceSelection package not available for comparison\n")
}
#> ResourceSelection 0.3-6 2023-06-27
#>                         Test   P_value Test_Statistic
#>           Ebrahim-Farrington 0.8944537      -1.250567
#> X-squared    Hosmer-Lemeshow 0.9431075       2.855296
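The two Test_Statistic values in this comparison are on different scales: the Ebrahim-Farrington value is a standardised Z, while hoslem.test() reports a chi-square statistic referred to g - 2 degrees of freedom. The base R sketch below reproduces the Hosmer-Lemeshow p-value from the printed statistic and, purely for illustration, places that statistic on the same standardised scale as \(Z_{EF}\).

```r
# Values printed in the comparison above
hl_stat <- 2.855296   # Hosmer-Lemeshow X-squared
g       <- 10         # number of groups passed to hoslem.test()

# Upper-tail chi-square probability with g - 2 degrees of freedom
hl_p <- pchisq(hl_stat, df = g - 2, lower.tail = FALSE)
round(hl_p, 4)

# Same statistic after centring by g - 2 and scaling by sqrt(2 * (g - 2));
# illustrative only, this is not how hoslem.test() computes its p-value
hl_z <- (hl_stat - (g - 2)) / sqrt(2 * (g - 2))
round(hl_z, 3)
```

When comparing the tests, the p-values are the quantities to read side by side, not the raw statistics.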
Let’s examine the power of the test to detect model misspecification:
# Function to simulate power under model misspecification
simulate_power <- function(n, beta_quad = 0.1, n_sims = 100, G = 10) {
  rejections_ef <- 0
  rejections_hl <- 0
  for (i in 1:n_sims) {
    # Generate data with quadratic term (true model)
    x <- runif(n, -2, 2)
    linpred_true <- 0 + x + beta_quad * x^2
    prob_true <- plogis(linpred_true)
    y <- rbinom(n, 1, prob_true)
    # Fit misspecified linear model (omitting quadratic term)
    model_mis <- glm(y ~ x, family = binomial())
    pred_probs <- fitted(model_mis)
    # Ebrahim-Farrington test
    ef_test <- ef.gof(y, pred_probs, G = G)
    if (ef_test$p_value < 0.05) rejections_ef <- rejections_ef + 1
    # Hosmer-Lemeshow test (if available)
    if (requireNamespace("ResourceSelection", quietly = TRUE)) {
      hl_test <- ResourceSelection::hoslem.test(y, pred_probs, g = G)
      if (hl_test$p.value < 0.05) rejections_hl <- rejections_hl + 1
    }
  }
  power_ef <- rejections_ef / n_sims
  power_hl <- if (requireNamespace("ResourceSelection", quietly = TRUE)) {
    rejections_hl / n_sims
  } else {
    NA
  }
  return(list(power_ef = power_ef, power_hl = power_hl))
}
# Calculate power for different sample sizes
sample_sizes <- c(200, 500, 1000)
power_results <- data.frame(
  n = sample_sizes,
  EbrahimFarrington_Power = sapply(sample_sizes, function(n) {
    simulate_power(n, beta_quad = 0.15, n_sims = 50)$power_ef
  })
)
if (requireNamespace("ResourceSelection", quietly = TRUE)) {
  power_results$HosmerLemeshow_Power <- sapply(sample_sizes, function(n) {
    simulate_power(n, beta_quad = 0.15, n_sims = 50)$power_hl
  })
}
print(power_results)
#>      n EbrahimFarrington_Power HosmerLemeshow_Power
#> 1  200                    0.08                 0.12
#> 2  500                    0.20                 0.14
#> 3 1000                    0.26                 0.22
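With only 50 simulations per cell, these power estimates carry noticeable Monte Carlo error. The sketch below, in base R, attaches a binomial standard error to each printed value; the helper name mc_se is hypothetical and the numbers are copied from the table above.

```r
n_sims <- 50
power_tab <- data.frame(
  n  = c(200, 500, 1000),
  ef = c(0.08, 0.20, 0.26),   # EbrahimFarrington_Power printed above
  hl = c(0.12, 0.14, 0.22)    # HosmerLemeshow_Power printed above
)

# Binomial standard error of an estimated rejection rate
mc_se <- function(p, n_sims) sqrt(p * (1 - p) / n_sims)

power_tab$ef_se <- round(mc_se(power_tab$ef, n_sims), 3)
power_tab$hl_se <- round(mc_se(power_tab$hl, n_sims), 3)
print(power_tab)
```

The standard errors are roughly 0.04 to 0.06, about the same size as the largest gap between the two tests, and the two columns come from separate calls to simulate_power() on different simulated datasets. A larger n_sims would be needed before ranking the tests from this table.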
For datasets with grouped observations (multiple trials per covariate pattern), you can use the original Farrington test:
# Simulate grouped data
set.seed(456)
n_groups <- 30
m_trials <- sample(5:20, n_groups, replace = TRUE)
x_grouped <- rnorm(n_groups)
prob_grouped <- plogis(0.2 + 0.8 * x_grouped)
y_grouped <- rbinom(n_groups, m_trials, prob_grouped)
# Create data frame and fit model
data_grouped <- data.frame(
  successes = y_grouped,
  trials = m_trials,
  x = x_grouped
)
model_grouped <- glm(
  cbind(successes, trials - successes) ~ x,
  data = data_grouped,
  family = binomial()
)
predicted_probs_grouped <- fitted(model_grouped)
# Original Farrington test for grouped data
result_grouped <- ef.gof(
  y_grouped,
  predicted_probs_grouped,
  model = model_grouped,
  m = m_trials,
  G = NULL # No automatic grouping for original test
)
print(result_grouped)
#>                  Test Test_Statistic   p_value
#> 1 Farrington-Original      -1.476122 0.9300444
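Since the test is described above as a modification of the Pearson chi-square statistic, it may help to see the unmodified statistic for grouped data. The base R sketch below computes only the plain Pearson statistic and its naive chi-square reference; it does not implement Farrington's correction, whose terms are not given in this document.

```r
# Simulate grouped binomial data and fit a logistic model (base R only)
set.seed(456)
n_groups <- 30
m_trials <- sample(5:20, n_groups, replace = TRUE)
x_grouped <- rnorm(n_groups)
y_grouped <- rbinom(n_groups, m_trials, plogis(0.2 + 0.8 * x_grouped))

fit <- glm(cbind(y_grouped, m_trials - y_grouped) ~ x_grouped,
           family = binomial())
p_hat <- fitted(fit)   # fitted success probability per group

# Unmodified Pearson chi-square: squared (observed - expected) counts
# divided by the binomial variance of each group
pearson_x2 <- sum((y_grouped - m_trials * p_hat)^2 /
                    (m_trials * p_hat * (1 - p_hat)))

# Cross-check against the Pearson residuals reported by glm
all.equal(pearson_x2, sum(residuals(fit, type = "pearson")^2))

# Naive reference distribution: chi-square with (groups - parameters) df.
# This approximation is the one that degrades when the data are sparse.
df_resid <- n_groups - length(coef(fit))
pchisq(pearson_x2, df = df_resid, lower.tail = FALSE)
```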
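The examples above use ef.gof() in two modes: binary data with automatic grouping (G specified), and grouped data with the original Farrington test (m provided, G = NULL).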
Farrington, C. P. (1996). On Assessing Goodness of Fit of Generalized Linear Models to Sparse Data. Journal of the Royal Statistical Society. Series B (Methodological), 58(2), 349-360.
Ebrahim, Khaled Ebrahim (2024). Goodness-of-Fits Tests and Calibration Machine Learning Algorithms for Logistic Regression Model with Sparse Data. Master’s Thesis, Alexandria University.
Hosmer, D. W., & Lemeshow, S. (2000). Applied Logistic Regression, Second Edition. New York: Wiley.
Hosmer, D. W., & Lemeshow, S. (1980). A goodness-of-fit test for the multiple logistic regression model. Communications in Statistics - Theory and Methods, 9(10), 1043–1069. https://doi.org/10.1080/03610928008827941
The Ebrahim-Farrington test provides a powerful and practical tool for assessing goodness-of-fit in logistic regression, particularly for binary data and sparse datasets. Its simplified implementation makes it accessible for routine use while maintaining strong theoretical foundations.