---
title: "Get started with mixqr"
author: "Kailas Venkitasubramanian"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 2
bibliography: mixqr.bib
vignette: >
  %\VignetteIndexEntry{Get started with mixqr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE,
  fig.width = 7, fig.height = 4.2, dpi = 150, fig.align = "center"
)
set.seed(1)
```

**mixqr** is an extensible framework for finite mixtures of quantile (and
expectile) regressions: at its core it finds hidden subgroups in your data and
fits a separate quantile regression in each. This page is a five-minute tour of
that core; the [Tutorial](mixqr-tutorial.html) is the full guide, and the
[Extensions article](https://kvenkita.github.io/mixqr/articles/mixqr-extensions.html)
covers the expectile/M-quantile families, penalized selection, and non-crossing
multi-quantile estimation built on the same platform.

```{r}
library(mixqr)
```

## A two-regime example

The `engine` data [@brinkman1981] record the equivalence ratio (richness of the
air/fuel mix) against nitrogen-oxide (NOx) concentration for a test engine. A
single regression line fits badly because the points fall into **two regimes**.

```{r fit}
fit <- mixqr(equivalence ~ nox, data = engine, tau = 0.5, m = 2,
             variance = "stochEM")
fit
```

`mixqr()` has jointly (i) split the observations into two groups and (ii)
estimated a **median** regression (`tau = 0.5`) in each. `summary()` adds
standard errors:

```{r summary}
summary(fit)
```
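
If quantile regression is new to you, it may help to see what a single median fit optimises. A $\tau$-quantile regression minimises the summed check (pinball) loss $\rho_\tau(u) = u\,(\tau - \mathbb{1}\{u < 0\})$ over the residuals $u$. The sketch below is plain base R on simulated data and is not how mixqr estimates its components; `check_loss`, `objective`, `x` and `y` are hypothetical names introduced only for illustration.

```r
# Check (pinball) loss: positive residuals are weighted by tau,
# negative residuals by (1 - tau).
check_loss <- function(u, tau) u * (tau - (u < 0))

# Simulated data with a known line: intercept 1, slope 2.
set.seed(42)
x <- runif(200)
y <- 1 + 2 * x + rnorm(200, sd = 0.3)

# Summed check loss of the residuals for a candidate (intercept, slope).
objective <- function(beta, tau) {
  sum(check_loss(y - beta[1] - beta[2] * x, tau))
}

# Start from the least-squares fit and minimise with a general-purpose
# optimiser; tau = 0.5 gives a median regression.
start <- unname(coef(lm(y ~ x)))
median_fit <- optim(start, objective, tau = 0.5)$par
median_fit
```

Changing `tau` to, say, `0.9` targets the upper tail instead of the centre. Dedicated solvers are the better choice in practice; `optim()` is used here only to keep the sketch dependency-free.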

## A first picture

A little `ggplot2` shows the two recovered regimes and their median lines.

```{r plot, fig.alt = "Engine data coloured by recovered regime with two median regression lines."}
library(ggplot2)

# Hard regime label for each observation
dat <- transform(engine, regime = factor(predict(fit, type = "class")))
grid <- data.frame(nox = seq(min(engine$nox), max(engine$nox), length.out = 100))
# Fitted median line per component: (1, nox) %*% beta_j, evaluated on the grid
lines <- do.call(rbind, lapply(1:2, function(j) {
  data.frame(nox = grid$nox,
             equivalence = drop(cbind(1, grid$nox) %*% fit$beta[, j]),
             regime = factor(j))
}))

ggplot(dat, aes(nox, equivalence, colour = regime)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_line(data = lines, linewidth = 1.1) +
  scale_colour_manual(values = c("#1b6ca8", "#e07b39")) +
  labs(x = "Nitrogen oxides (NOx)", y = "Equivalence ratio",
       title = "Two median regimes recovered by mixqr") +
  theme_minimal(base_size = 12)
```
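
The colours in the plot come from hard class labels. A common convention in finite mixture models, assumed here rather than taken from the mixqr source, is to assign each observation to the component with the highest posterior membership probability. The sketch below shows that rule in base R on a made-up `posterior` matrix; `posterior`, `hard_class` and `uncertainty` are hypothetical names and not part of the package.

```r
# Made-up posterior membership probabilities: one row per observation,
# one column per component (each row sums to 1).
posterior <- rbind(
  c(0.92, 0.08),
  c(0.35, 0.65),
  c(0.50, 0.50),
  c(0.10, 0.90)
)

# Hard classification: index of the largest posterior in each row.
# which.max() returns the first maximum, so exact ties go to component 1.
hard_class <- apply(posterior, 1, which.max)
hard_class

# A simple per-observation uncertainty: 1 minus the winning probability.
uncertainty <- 1 - apply(posterior, 1, max)
uncertainty
```

Observations with high `uncertainty` sit near the boundary between regimes, so their colour in a plot like the one above deserves less weight than a point that is clearly in one group.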

## Where to next

* The **[Tutorial](mixqr-tutorial.html)** walks through a full applied analysis:
  interpreting estimates, classifying observations, reading diagnostics, fitting
  several quantiles, choosing the number of components, and reporting results,
  all with publication-ready graphics.
* The **[Validation article](https://kvenkita.github.io/mixqr/articles/mixqr-validation.html)**
  documents the simulation evidence behind the point estimates and the standard errors.

To cite the package in publications, run:

```{r cite, eval = FALSE}
citation("mixqr")
```

## References