Type: Package
Title: Ebrahim-Farrington Goodness-of-Fit Test for Logistic Regression
Version: 1.0.0
Date: 2025-08-20
Maintainer: Ebrahim Khaled Ebrahim <ebrahimkhaled@alexu.edu.eg>
Description: Implements the Ebrahim-Farrington goodness-of-fit test for logistic regression models, particularly effective for sparse data and binary outcomes. This test provides an improved alternative to the traditional Hosmer-Lemeshow test by using a modified Pearson chi-square statistic with data-dependent grouping. The test is based on Farrington's (1996) theoretical framework but simplified for practical implementation with binary data. Includes functions for both the original Farrington test (for grouped data) and the new Ebrahim-Farrington test (for binary data with automatic grouping). For more details, see Hosmer (1980) <doi:10.1080/03610928008827941> and Farrington (1996) <doi:10.1111/j.2517-6161.1996.tb02086.x>.
License: GPL-3
URL: https://github.com/ebrahimkhaled/ebrahim.gof
BugReports: https://github.com/ebrahimkhaled/ebrahim.gof/issues
Depends: R (≥ 3.5.0)
Imports: stats
Suggests: testthat (≥ 3.0.0), knitr, rmarkdown, ResourceSelection, ggplot2
Encoding: UTF-8
RoxygenNote: 7.3.2
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2025-08-27 09:05:01 UTC; ebrah
Author: Ebrahim Khaled Ebrahim [aut, cre]
Repository: CRAN
Date/Publication: 2025-09-01 17:10:22 UTC

Ebrahim-Farrington Goodness-of-Fit Test for Logistic Regression

Description

Performs the Ebrahim-Farrington goodness-of-fit test for logistic regression models. This test is particularly effective for binary data and sparse datasets, providing an improved alternative to the traditional Hosmer-Lemeshow test.

Usage

ef.gof(y, predicted_probs, model = NULL, m = NULL, G = 10)

Arguments

y

Numeric vector of binary responses (0/1) for binary data, or counts of successes for grouped data.

predicted_probs

Numeric vector of predicted probabilities from the logistic regression model. Must be the same length as y.

model

Optional glm object. Required only for the original Farrington test with grouped data (when m is provided and G is NULL).

m

Optional numeric vector of trial counts for each observation (for grouped data). If NULL, data is assumed to be binary.

G

Optional integer specifying the number of groups for binary data grouping. Default is 10. If NULL, no grouping is performed and m must be provided.

Details

The Ebrahim-Farrington test is based on Farrington's (1996) theoretical framework but simplified for practical implementation with binary data. The test uses a modified Pearson chi-square statistic with data-dependent grouping, where observations are grouped by their predicted probabilities.
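
The grouping step described above can be sketched in base R. This is only an illustration: `group_by_probability` is a hypothetical helper, not a function exported by the package, and the rank-based equal-size split is an assumption borrowed from the usual Hosmer-Lemeshow convention. The package's own grouping rule may differ.

```r
# Hypothetical helper (not part of ebrahim.gof): split observations into G
# groups of nearly equal size, ordered by predicted probability.
group_by_probability <- function(y, p, G = 10) {
  stopifnot(length(y) == length(p), G >= 2, length(p) >= G)
  # Assumption: rank-based split, so ties in p do not produce empty groups.
  grp <- ceiling(rank(p, ties.method = "first") * G / length(p))
  data.frame(
    group    = seq_len(G),
    m        = as.vector(tapply(y, grp, length)),  # observations per group
    observed = as.vector(tapply(y, grp, sum)),     # observed successes
    expected = as.vector(tapply(p, grp, sum))      # sum of predicted probabilities
  )
}

# Simulated inputs, for illustration only
set.seed(1)
p <- runif(200)
y <- rbinom(200, 1, p)
tab <- group_by_probability(y, p, G = 10)
print(tab)
```

Each row holds the group size, the observed number of successes, and the expected number under the model, which are the ingredients of a Pearson-type statistic.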

For binary data (when G is specified), the test automatically groups observations into G groups based on predicted probabilities and applies the simplified Ebrahim-Farrington statistic:

Z_{EF} = \frac{T_{EF} - (G - 2)}{\sqrt{2(G-2)}}

where T_{EF} is the modified Pearson chi-square statistic, and G is the number of groups.
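
The centering and scaling constants in this formula coincide with the mean, G - 2, and the variance, 2(G - 2), of a chi-square variable with G - 2 degrees of freedom. A minimal sketch of the standardization follows, assuming T_{EF} has already been computed. `standardize_ef` is a hypothetical name, the input value is purely illustrative, and the upper-tail normal p-value is an assumption rather than something the documentation states.

```r
# Hypothetical helper (not part of ebrahim.gof): standardize a given T_EF.
standardize_ef <- function(T_EF, G) {
  stopifnot(is.numeric(T_EF), G > 2)
  df <- G - 2
  z <- (T_EF - df) / sqrt(2 * df)
  # Assumption: p-value taken from the upper tail of the standard normal.
  c(Z_EF = z, p_upper = pnorm(z, lower.tail = FALSE))
}

# Illustrative input only: a T_EF equal to G - 2 standardizes to zero.
out <- standardize_ef(T_EF = 8, G = 10)
print(out)
```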

For grouped data (when m is provided), the test applies the original Farrington test with full variance calculations.

Value

A data frame with the following columns:

Test

Character string identifying the test performed

Test_Statistic

Numeric value of the standardized test statistic

p_value

Numeric p-value for the test

Author(s)

Ebrahim Khaled Ebrahim <ebrahimkhaled@alexu.edu.eg>

References

Farrington, C. P. (1996). On Assessing Goodness of Fit of Generalized Linear Models to Sparse Data. *Journal of the Royal Statistical Society. Series B (Methodological)*, 58(2), 349–360.

Ebrahim, K. E. (2025). Goodness-of-Fits Tests and Calibration Machine Learning Algorithms for Logistic Regression Model with Sparse Data. *Master's Thesis*, Alexandria University.

Hosmer, D. W., & Lemeshow, S. (1980). A goodness-of-fit test for the multiple logistic regression model. *Communications in Statistics - Theory and Methods*, 9(10), 1043–1069. https://doi.org/10.1080/03610928008827941

See Also

hoslem.test in the ResourceSelection package for the Hosmer-Lemeshow test
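
For a side-by-side check, the Hosmer-Lemeshow test can be run on the same kind of fit. The sketch below assumes the ResourceSelection package (listed under Suggests) is installed and skips the call otherwise. The simulated data and the object names `fit` and `hl` are illustrative only.

```r
# Simulated binary data, for illustration only
set.seed(123)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial())

# Run the Hosmer-Lemeshow test only if ResourceSelection is available
if (requireNamespace("ResourceSelection", quietly = TRUE)) {
  hl <- ResourceSelection::hoslem.test(y, fitted(fit), g = 10)
  print(hl)
}
```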

Examples

# Example 1: Binary data with automatic grouping (Ebrahim-Farrington test)
set.seed(123)
n <- 500
x <- rnorm(n)
linpred <- 0.5 + 1.2 * x
prob <- 1 / (1 + exp(-linpred))
y <- rbinom(n, 1, prob)

# Fit logistic regression
model <- glm(y ~ x, family = binomial())
predicted_probs <- fitted(model)

# Perform Ebrahim-Farrington test with 10 groups
result <- ef.gof(y, predicted_probs, G = 10)
print(result)

# Example 2: Compare results for different numbers of groups
result_4 <- ef.gof(y, predicted_probs, G = 4)
result_20 <- ef.gof(y, predicted_probs, G = 20)
print(result_4)
print(result_20)

# Example 3: Grouped data (original Farrington test)
# Note: This requires grouped data with more than one trial per group;
# such data are simulated here
n_groups <- 50
m_trials <- sample(5:20, n_groups, replace = TRUE)
x_grouped <- rnorm(n_groups)
linpred_grouped <- -0.5 + 1.0 * x_grouped
prob_grouped <- 1 / (1 + exp(-linpred_grouped))
y_grouped <- rbinom(n_groups, m_trials, prob_grouped)

# Fit model for grouped data
data_grouped <- data.frame(successes = y_grouped, trials = m_trials, x = x_grouped)
model_grouped <- glm(cbind(successes, trials - successes) ~ x, 
                     data = data_grouped, family = binomial())
predicted_probs_grouped <- fitted(model_grouped)

# Original Farrington test
result_grouped <- ef.gof(y_grouped, predicted_probs_grouped, 
                         model = model_grouped, m = m_trials)
print(result_grouped)

