Custom Contributions and Model Comparison

The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

When contr_name() Is Not Enough

contr_name() covers standard R distributions, but sometimes you need full control: a non-standard parameterization, a truncated distribution, or a model with constraints. contr_fn() wraps arbitrary R functions into a contribution.

Analytical Derivatives

contr_name() contributions rely on numDeriv for the score and Hessian. For performance-critical applications, you can supply analytical derivatives via contr_fn():

library(likelihood.contr)
library(likelihood.model)
#> Loading required package: algebraic.mle

# Exponential exact: log f(t; lambda) = log(lambda) - lambda * t
exp_exact <- contr_fn(
  loglik = function(df, par, ...) {
    sum(dexp(df$t, rate = par[1], log = TRUE))
  },
  score = function(df, par, ...) {
    c(rate = nrow(df) / par[1] - sum(df$t))
  },
  hess = function(df, par, ...) {
    matrix(-nrow(df) / par[1]^2, 1, 1)
  }
)

You can mix analytical and numerical contributions in the same model. Here the exact contribution uses analytical derivatives while the right-censored contribution falls back to numDeriv:

model <- likelihood_contr(
  obs_type = "status",
  exact = exp_exact,
  right = contr_name("exp", "right", ob_col = "t")
)

set.seed(42)
raw <- rexp(200, rate = 2)
df <- data.frame(
  t      = pmin(raw, 1.0),
  status = ifelse(raw <= 1.0, "exact", "right")
)

result <- fit(model)(df, par = c(rate = 1))
summary(result)
#> Maximum Likelihood Estimate (Fisherian)
#> ----------------------------------------
#> 
#> Coefficients:
#>      Estimate Std. Error   2.5% 97.5%
#> rate   1.8421     0.1413 1.5652 2.119
#> 
#> Log-likelihood: -66.14 
#> AIC: 134.3 
#> Number of observations: 200

Model Comparison with LRT

The likelihood ratio test compares nested models. Is the data better explained by a Weibull (2 parameters) than an exponential (1 parameter)?

set.seed(99)
df_test <- data.frame(
  t      = rweibull(200, shape = 2, scale = 3),
  status = "exact"
)

null_model <- likelihood_contr(
  obs_type = "status",
  exact = contr_name("exp", "exact", ob_col = "t")
)
alt_model <- likelihood_contr(
  obs_type = "status",
  exact = contr_name("weibull", "exact", ob_col = "t")
)

null_fit <- suppressWarnings(fit(null_model)(df_test, par = c(rate = 0.5)))
alt_fit  <- suppressWarnings(fit(alt_model)(df_test, par = c(shape = 1.5, scale = 2)))

lrt(
  null     = null_model,
  alt      = alt_model,
  data     = df_test,
  null_par = coef(null_fit),
  alt_par  = coef(alt_fit),
  dof      = 1
)
#> Likelihood Ratio Test
#> ---------------------
#> Test statistic: 105.8 
#> Degrees of freedom: 1 
#> P-value: 8.001e-25

The data was generated from a Weibull with shape = 2, so the test correctly rejects the exponential null.

Fisher Information via Monte Carlo

When the model has a data-generating function, fim() estimates the expected Fisher information by simulation:

model_with_rdata <- likelihood_contr(
  obs_type = "status",
  exact = contr_name("exp", "exact", ob_col = "t"),
  rdata_fn = function(theta, n, ...) {
    data.frame(t = rexp(n, rate = theta[1]), status = "exact")
  }
)

set.seed(1)
I <- fim(model_with_rdata)(theta = c(rate = 2), n_obs = 100, n_samples = 500)
I
#>      rate
#> rate   25

For n = 100 exponential observations with rate = 2, the analytical Fisher information is n / rate^2 = 25. The Monte Carlo estimate should be close.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.