| Type: | Package |
| Title: | Information Assessment for Individual Modalities in Multimodal Regression Models |
| Version: | 1.0 |
| Description: | Provides methods for quantifying the information gain contributed by individual modalities in multimodal regression models. Information gain is measured using Expected Relative Entropy (ERE) or pseudo-R² metrics, with corresponding p-values and confidence intervals. Currently supports linear and logistic regression models, with plans for extension to additional Generalized Linear Models and the Cox proportional hazards model. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Imports: | tidyverse, MASS, SIS, glmnet, ncvreg, MBESS, survival, dplyr |
| Depends: | R (≥ 3.6.0) |
| NeedsCompilation: | no |
| Packaged: | 2025-08-29 20:33:59 UTC; 10518 |
| Author: | Wanting Jin [aut, cre], Quefeng Li [aut] |
| Maintainer: | Wanting Jin <jinwanting5@gmail.com> |
| LazyData: | true |
| Repository: | CRAN |
| Date/Publication: | 2025-09-03 21:30:02 UTC |
Example Dataset
Description
A toy dataset to demonstrate running this package on multimodal linear models.
Usage
data_linear_model
Format
A data object that contains:
y |
A vector of 200 observations of continuous outcomes. |
X |
A 200 × 600 matrix containing all training data. |
mod.idx |
A list of modality indices. |
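To make the structure of these three components concrete, the following base-R sketch builds an object of the same shape from simulated data. The split into three modalities of 200 columns each is a hypothetical layout chosen for illustration, not the layout of the packaged dataset, and the non-overlap check is an assumption about what a sensible modality index looks like.

```r
# Simulated stand-in for the packaged data (values are random noise).
set.seed(1)
n <- 200
p <- 600
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rnorm(n)

# Hypothetical layout: three modalities of 200 columns each.
# Each list element holds the column indices of one modality in X.
mod.idx <- list(1:200, 201:400, 401:600)

# Sanity checks one might run before fitting:
# indices stay within X, and no column is assigned to two modalities.
all.idx <- unlist(mod.idx)
stopifnot(all(all.idx >= 1), all(all.idx <= ncol(X)))
stopifnot(!anyDuplicated(all.idx))

# Extract the columns that belong to the second modality.
X2 <- X[, mod.idx[[2]], drop = FALSE]
dim(X2)
```

Storing indices rather than separate matrices keeps a single concatenated X while still letting each modality be pulled out by position.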
Example Dataset
Description
A toy dataset to demonstrate running this package on multimodal logistic models.
Usage
data_logistic_model
Format
A data object that contains:
y |
A vector of 200 observations of binary outcomes (0 or 1). |
X |
A 200 × 600 matrix containing all training data. |
mod.idx |
A list of modality indices. |
Modality Assessment in Multimodal Generalized Linear Models
Description
Provides statistical inference for modality-specific information gain in multimodal GLMs. Estimates ERE and pseudo-R² with confidence intervals and p-values using Sure Independence Screening for variable selection and penalized likelihood for inference.
Usage
mglm.test(
X,
y,
mod.idx,
family = c("gaussian", "binomial"),
iter = TRUE,
penalty = c("SCAD", "MCP", "lasso"),
tune = c("bic", "ebic", "aic"),
lambda = NULL,
nlambda = 100,
conf.level = 0.95,
CI.type = c("two.sided", "one.sided"),
trace = FALSE
)
Arguments
X |
The concatenated data matrix, with observations in rows and the features of all modalities in columns. |
y |
The vector of outcomes, one per observation. |
mod.idx |
A list of column indices for all modalities in the concatenated data matrix X. |
family |
A description of the error distribution and link function to be used in the model. Currently, we allow the Binomial ("binomial") and Gaussian ("gaussian") families with canonical links only. |
iter |
Specifies whether to perform iterative SIS. The default is TRUE. |
penalty |
Specifies the type of penalty to be used in the variable selection and inference procedure. Options include "SCAD", "MCP", and "lasso". |
tune |
Specifies the method for selecting the optimal tuning parameters in the (I)SIS and penalized likelihood procedures. Options include "bic", "ebic", and "aic". |
lambda |
A user-specified decreasing sequence of lambda values for the penalized likelihood procedure. By default, a sequence of values of length nlambda is generated. |
nlambda |
The number of lambda values. The default is 100. |
conf.level |
Level of the confidence interval. The default is 0.95. |
CI.type |
A string specifying the type of the confidence interval. Options include "two.sided" and "one.sided". |
trace |
Specifies whether to print out logs of iterations in the SIS procedure. The default is FALSE. |
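Several arguments above list their allowed values as a character vector in the Usage block. In R, such defaults are conventionally resolved with match.arg(), which selects the first element when the argument is omitted. The sketch below illustrates that convention with a toy function named pick (hypothetical, not part of this package); whether mglm.test resolves its arguments this way is an assumption, since the source does not say. It also shows one way to build a decreasing lambda sequence of the kind the lambda argument expects; the endpoints 1 and 0.01 are arbitrary illustrative values.

```r
# Toy function (hypothetical) showing how match.arg() resolves choice arguments.
pick <- function(penalty = c("SCAD", "MCP", "lasso"),
                 tune = c("bic", "ebic", "aic")) {
  penalty <- match.arg(penalty)  # first element is used when omitted
  tune <- match.arg(tune)
  c(penalty = penalty, tune = tune)
}

res.default <- pick()                                  # nothing supplied
res.custom <- pick(penalty = "lasso", tune = "ebic")   # explicit choices

# A decreasing sequence of 100 lambda values, equally spaced on the log scale.
# The range (1 down to 0.01) is an arbitrary choice for illustration.
lambda <- exp(seq(log(1), log(0.01), length.out = 100))

# Confirm the sequence is strictly decreasing, as the argument requires.
stopifnot(all(diff(lambda) < 0))
```

Passing the choice explicitly, as the Examples section does, avoids relying on any default-resolution behaviour.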
Value
An object with S3 class "mglm.test" containing:
sel.idx |
List of indices of selected features by (I)SIS in each modality. |
num.nonzeros |
Number of selected features by (I)SIS in each modality. |
ERE |
Point estimate of ERE for each modality. |
ERE.CI.L |
Lower bound of the confidence interval of ERE for each modality. |
ERE.CI.U |
Upper bound of the confidence interval of ERE for each modality. |
R2 |
Point estimate of pseudo-R² for each modality. |
R2.CI.L |
Lower bound of the confidence interval of pseudo-R² for each modality. |
R2.CI.U |
Upper bound of the confidence interval of pseudo-R² for each modality. |
conf.level |
Level of confidence intervals. |
Examples
## Example 1: Linear model
data(data_linear_model)
X <- data_linear_model$X
y <- data_linear_model$y
mod.idx <- data_linear_model$mod.idx
test <- mglm.test(X = X, y = y, mod.idx = mod.idx, family = "gaussian",
iter = TRUE, penalty = "SCAD", tune = "bic",
conf.level = 0.95, CI.type = "one.sided")
summary(test)
## Example 2: Logistic regression
data(data_logistic_model)
X <- data_logistic_model$X
y <- data_logistic_model$y
mod.idx <- data_logistic_model$mod.idx
test <- mglm.test(X = X, y = y, mod.idx = mod.idx, family = "binomial",
iter = TRUE, penalty = "SCAD", tune = "bic",
conf.level = 0.95, CI.type = "two.sided")
sum.test <- summary(test)
Summary method for objects of class "mglm.test"
Description
Summary method for objects of class "mglm.test"
Usage
## S3 method for class 'mglm.test'
summary(object, ...)
## S3 method for class 'summary.mglm.test'
print(x, ...)
Arguments
object |
An object of class "mglm.test". |
... |
Additional arguments that could be passed to |
x |
A "summary.mglm.test" object. |
Value
An object with S3 class "summary.mglm.test". The class has its own print
method and contains the following list of elements.
sum.ERE |
The summary table of the point estimate and confidence interval of ERE for each modality. |
sum.R2 |
The summary table of the point estimate and confidence interval of pseudo-R² for each modality. |
conf.level |
Level of confidence intervals. |
sel.mod |
Index of the most informative modality. |
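The summary and print pair described above follows the usual S3 pattern in R: summary() returns an object of a dedicated summary class, and that class has its own print method. The sketch below reproduces the pattern with a toy class named toy.test. The class name, the field layout of the toy object, and all numbers are made up for illustration; this is not the package's implementation.

```r
# Toy object with made-up numbers (hypothetical class, illustration only).
fit <- structure(
  list(ERE = c(0.40, 0.10),
       ERE.CI.L = c(0.25, 0.00),
       ERE.CI.U = c(0.55, 0.20),
       conf.level = 0.95),
  class = "toy.test"
)

# summary method: collect estimates and interval bounds into one table.
summary.toy.test <- function(object, ...) {
  tab <- cbind(Estimate = object$ERE,
               Lower = object$ERE.CI.L,
               Upper = object$ERE.CI.U)
  rownames(tab) <- paste0("Modality ", seq_along(object$ERE))
  structure(list(sum.ERE = tab, conf.level = object$conf.level),
            class = "summary.toy.test")
}

# print method for the summary class: show a header, then the table.
print.summary.toy.test <- function(x, ...) {
  cat("ERE with ", 100 * x$conf.level, "% confidence intervals\n", sep = "")
  print(x$sum.ERE, ...)
  invisible(x)
}

s <- summary(fit)  # dispatches to summary.toy.test
s                  # auto-printing dispatches to print.summary.toy.test
```

Because the summary is an ordinary list, its tables can be extracted directly (for example s$sum.ERE) instead of being parsed from printed output.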