README

The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

dfr

Implementation of the Dual Feature Reduction (DFR) approach for the Sparse Group Lasso (SGL) and the Adaptive Sparse Group Lasso (aSGL). The DFR approach is a feature reduction approach that applies strong screening to reduce the feature space before optimisation, leading to speed-up improvements for fitting SGL models. DFR is implemented using the Adaptive Three Operator Splitting (ATOS) algorithm, with linear and logistic SGL models supported, both of which can be fit using k-fold cross-validation. Dense and sparse input matrices are supported.

Installation

install.packages("dfr")

Your R configuration must allow for a working Rcpp. To install a develop the development version from GitHub run

library(devtools)
install_github("ff1201/dfr")

Example

library(dfr)
groups = c(rep(1:20, each=3),
           rep(21:40, each=4),
           rep(41:60, each=5),
           rep(61:80, each=6),
           rep(81:100, each=7))

data = sgs::gen_toy_data(p=500, n=400, groups = groups, seed_id=3)

model = dfr_sgl(X = data$X, y = data$y, groups = groups, alpha = 0.95)

where X is the input matrix, y the response vector, groups a vector containing indices for the groups of the predictors, and alpha determines the convex balance between the lasso and group lasso.

no_screen = system.time(model <- dfr_sgl(X = data$X, y = data$y, groups = groups, alpha = 0.95,screen=FALSE))
screen = system.time(model_screen <- dfr_sgl(X = data$X, y = data$y, groups = groups, alpha = 0.95,screen=TRUE))
c(no_screen[3], screen[3])

library(dfr)
groups = c(rep(1:20, each=3),
           rep(21:40, each=4),
           rep(41:60, each=5),
           rep(61:80, each=6),
           rep(81:100, each=7))

data = sgs::gen_toy_data(p=500, n=400, groups = groups, seed_id=3)

model = dfr_adap_sgl(X = data$X, y = data$y, groups = groups, alpha = 0.95, gamma_1 = 0.1, gamma_2 = 0.1)

where gamma_1 and gamma_2 determine the shape of the adaptive penalties. Again, we can see the impact of screening

no_screen = system.time(model <- dfr_adap_sgl(X = data$X, y = data$y, groups = groups, alpha = 0.95, gamma_1 = 0.1, gamma_2 = 0.1, screen=FALSE))
screen = system.time(model_screen <- dfr_adap_sgl(X = data$X, y = data$y, groups = groups, alpha = 0.95, gamma_1 = 0.1, gamma_2 = 0.1, screen=TRUE))
c(no_screen[3], screen[3])

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.