---
title: "Sparse Linear Regression with Quadrupen"
subtitle: "A tour of standard analysis (fit, cross-validation, information criteria, stability) "
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{quadrupen-basics}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4
)
```

## Setup

```{r setup}
library(quadrupen)
data("Birthwt", package = "grpreg")
y <- Birthwt$bwt[-130] ## outlier
X <- Birthwt$X[-130, ]
```

The `Birthwt` dataset contains birth weight data with `r nrow(Birthwt$X)` observations and `r ncol(Birthwt$X)` predictors. We use it throughout this vignette to illustrate sparse linear regression.

## Fitting sparse linear models

The main entry point is `sparse_lm()`, which fits a regularization path for several sparse penalties: `"l1"` (Lasso/Elastic-net), `"mcp"` (Minimax Concave Penalty) and `"scad"` (Smoothly Clipped Absolute Deviation). Convenience wrappers `lasso()`, `mcp()`, `scad()` and `elastic_net()` are also available.

### Lasso

```{r lasso-fit}
fit_lasso <- lasso(X, y, minratio = 1e-2)
fit_lasso
```

### SCAD

SCAD is a non-convex penalty that matches the Lasso near the origin and tapers to a constant for large coefficients, thereby eliminating shrinkage bias for strong signals. The shape parameter `eta` must be greater than 2.

```{r scad-fit}
fit_scad <- scad(X, y, eta = 4, minratio = 1e-2)
fit_scad
```

### MCP

MCP applies a linearly diminishing marginal penalty that vanishes once a coefficient exceeds `eta × lambda1`, yielding near-unbiased estimates for large signals. The shape parameter `eta` must be greater than 1.

```{r mcp-fit}
fit_mcp <- mcp(X, y, eta = 3, minratio = 1e-2)
fit_mcp
```

### Elastic-Net

The Elastic-net combines an $\ell_1$ and an $\ell_2$ penalty, trading off sparsity against grouping of correlated predictors. The `lambda2` argument controls the ridge component.

```{r enet-fit}
fit_enet <- elastic_net(X, y, lambda2 = 1, minratio = 1e-2)
fit_enet
```

### Fit objects

All objects returned by `sparse_lm()` and its wrappers are R6 instances of the class `SparseFit`, inheriting from `QuadrupenFit`. These objects (see [`QuadrupenFit`]) store all data related to the fit, accessible directly via named fields (e.g., `fit_lasso$coefficients`, `fit_lasso$deviance`, `fit_lasso$degrees_freedom`; see `str(fit_lasso)` and the documentation).

They also provide methods for visualizing and analyzing the fit, and for pursuing complementary analyses such as model selection, cross-validation, and stability selection. For users unfamiliar with R6 classes, S3 methods are exported for the most common operations (e.g., `plot(fit_lasso)` is equivalent to `fit_lasso$plot()`), but the R6 methods expose more options.

## Regularization paths

`$plot_path()` (or `plot(fit, type = "path")`) displays how coefficients evolve along the penalty path.

```{r path-plots, fig.show='hold', out.width='50%'}
fit_lasso$plot_path(xvar="fraction", log_scale = FALSE)
fit_mcp$plot_path(labels = colnames(X))
fit_scad$plot_path()
fit_enet$plot_path(standardize=FALSE)
```

## Information criteria

`$criteria()` computes AIC, BIC, mBIC, eBIC and GCV from the estimated degrees of freedom, storing the result as an [`InformationCriteria`] R6 object in the field `$information_criteria` of the current fit. This method is called automatically during fitting (`sparse_lm()` or any wrapper) at no additional computational cost, so users rarely need to invoke it manually — the main exception being when a custom penalty term is desired.

```{r criteria}
fit_lasso$criteria()
```

```{r criteria-plots, fig.show='hold', out.width='50%'}
fit_lasso$plot(type = "criteria")
```

For greater flexibility, use the `plot` method of the `InformationCriteria` object directly to select which criteria to display:

```{r criteria-plots-2, fig.show='hold', out.width='50%'}
fit_lasso$information_criteria$plot(c("AIC", "BIC", "mBIC"))
fit_mcp$information_criteria$plot("GCV")
```

## Model extraction

`$get_model()` extracts the coefficient vector selected by a given criterion.

```{r get-model}
coef_lasso_bic <- fit_lasso$get_model("BIC")
coef_mcp_bic   <- fit_mcp$get_model("BIC")

cat("Non-zero Lasso coefficients (BIC):\n")
print(coef_lasso_bic[coef_lasso_bic != 0])

cat("\nNon-zero MCP coefficients (BIC):\n")
print(coef_mcp_bic[coef_mcp_bic != 0])
```

An existing lambda value can also be passed directly:

```{r get-model-lambda}
lambda_lasso <- fit_lasso$major_tuning
coef_lasso_lambda <- fit_lasso$get_model(lambda_lasso[18])

cat("Non-zero Lasso coefficients for lambda = ", round(lambda_lasso[18], 2), "\n")
print(coef_lasso_lambda[coef_lasso_lambda != 0])
```

## Cross-validation

`$cross_validate()` performs K-fold cross-validation over the penalty grid, storing the result as a [`CrossValidation`] R6 object in the field `$cross_validation` of the current fit.

```{r cv, message=FALSE}
set.seed(42)
fit_lasso$cross_validate(K = 10, verbose = FALSE)
fit_mcp$cross_validate(K = 10, verbose = FALSE)
```

```{r cv-plots, fig.show='hold', out.width='50%'}
fit_lasso$plot(type = "crossval")
fit_mcp$plot(type = "crossval")
```

Model selection using the CV-minimizing penalty:

```{r cv-model}
coef_lasso_cv <- fit_lasso$get_model("CV_min")
cat("Non-zero Lasso coefficients (CV min):\n")
print(coef_lasso_cv[coef_lasso_cv != 0])
```

## Cross-validation on $\lambda_1 \times \lambda_2$

For regularizers with two penalties (typically the Elastic-net), cross-validation can be run on a two-dimensional grid:

```{r cv2D, message=FALSE}
set.seed(42)
fit_enet$cross_validate(K = 5, lambda2 = 10^seq(1, -3, len=20), verbose = FALSE)
```

```{r cv-plots-2D, fig.show='hold', out.width='50%'}
fit_enet$plot(type = "crossval")
fit_enet$cross_validation$plotCV_1D(se =  FALSE) + ggplot2::ylim(c(0.4, 0.55))
```

## Prediction

`$predict()` returns fitted values for the whole path, or for a specific model selected by a criterion or a lambda value.

```{r predict}
## Predictions at the CV_min-selected model
y_hat_lasso <- fit_lasso$predict(selection = "CV_min")
y_hat_mcp   <- fit_mcp$predict(selection = "CV_min")
y_hat_enet  <- fit_enet$predict(selection = "CV_min")

## R² at the selected model
r2_lasso <- fit_lasso$r_squared[fit_lasso$get_model("CV_min", type = "index")]
r2_mcp   <- fit_mcp$r_squared[fit_mcp$get_model("CV_min", type = "index")]
r2_enet  <- fit_enet$r_squared[fit_enet$get_model("CV_min", type = "index")]
cat(sprintf("R² Lasso (CV_min): %.3f\nR² MCP   (CV_min): %.3f\nR² Enet  (CV_min): %.3f\n", r2_lasso, r2_mcp, r2_enet))
```

## Stability selection

`$stability()` estimates selection probabilities via repeated sub-sampling (Meinshausen & Bühlmann, 2010). Variables that appear consistently across sub-samples receive high probabilities and are considered robustly selected. The stability path is stored as a [`StabilityPath`] R6 object in the field `$stability_path` of the current fit.

```{r stability, message=FALSE}
set.seed(42)
fit_lasso$stability(n_subsamples = 200, verbose = FALSE)
```

```{r stability-plot}
fit_lasso$plot(type = "stability")
fit_lasso$stability_path$plot(nvarsel = 7)
colnames(X)[fit_lasso$stability_path$selection(nvarsel = 7) ]
```

## Debiased Estimators

Every $\ell_1$-based regularizer is known to be biased due to the shrinkage effect. A standard remedy is to refit the model on the selected support without penalization. This is implemented in **quadrupen** at no extra computational cost — the refitted coefficients are computed alongside the regularization path — and can be activated by setting the `$debias` field to `TRUE`.


```{r debias}
fit_lasso$debias <- TRUE
fit_lasso$plot_path()
```

Since both versions are stored in the same object, it is straightforward to compare the regularized and debiased solutions — for the path as well as for CV error — by toggling `$debias`.

```{r debias-cv}
fit_lasso$cross_validate(verbose = FALSE)
fit_lasso$plot(type = "crossval")
fit_lasso$debias <- FALSE
fit_lasso$plot_path()
```

Setting `$debias` back to `FALSE` restores the original regularized coefficients.