Three method extensions built on the core EM, each closing a documented gap in the mixture-of-quantile-regressions toolkit.
Expectile and M-quantile components (family = "expectile" / "mquantile").
Asymmetric-least-squares (Newey & Powell 1987) and asymmetric-Huber
(Breckling & Chambers 1988) component losses, fitted by IRLS through
new registry engines. Expectile components are crossing-free in the
asymmetry level by construction; M-quantile dials between the quantile
and the expectile.
Penalized variable selection (mixqr_pen()). A SCAD / adaptive-LASSO / LASSO / MCP
penalty on the weighted check-loss M-step (the quantile analogue of
Khalili & Chen 2007), with each component getting its own sparse
support, a mixture BIC path for tuning, and component pruning. The inner
solve reuses rqPen; selectedVars() reports the
active set per component.
Non-crossing joint fit (mixqr_nc()). Fits a vector of
quantile levels jointly with one latent classification shared across all
levels (a coupled E-step), closing the two problems Wu & Yao (2016,
sec. 5) leave open: cross-level classification ambiguity and
within-component crossing (repaired by monotone rearrangement,
Chernozhukov, Fernandez-Val & Galichon 2010).
sim_mixqr_cross() provides a crossing-exhibiting design for
demonstrations.
First CRAN release.
Exported two extension-API building blocks,
weighted_rq() and constrained_kde(), so
companion packages (location-varying gating, non-crossing) can reuse the
component and error-density machinery without forking the core.
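To make the weighted check-loss idea behind the component machinery concrete, here is a minimal sketch in Python rather than R. It covers only the intercept-only case, where minimising the weighted check loss reduces to a weighted quantile. The function names and the example weights are hypothetical and do not reflect the signature of weighted_rq().

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_quantile(y, w, tau):
    """Minimise sum_i w_i * rho_tau(y_i - q) over q (intercept-only case).

    A minimiser is the smallest y whose cumulative weight reaches
    tau times the total weight.
    """
    total = sum(w)
    cum = 0.0
    for yi, wi in sorted(zip(y, w)):
        cum += wi
        if cum >= tau * total:
            return yi
    return max(y)


# Hypothetical responses and E-step responsibilities for one component.
y = [1.0, 2.0, 3.0, 4.0, 10.0]
w = [0.9, 0.8, 0.5, 0.1, 0.05]
q = weighted_quantile(y, w, tau=0.5)
```

Observations with small weights (here the outlying value 10.0) barely move the fitted quantile, which is the behaviour an EM M-step relies on.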
Post-review refinements (correctness and performance) addressing two independent adversarial peer reviews:
Constraint integrity (R1). The constrained KDE
now preserves the tau-quantile = 0 constraint in every feasible case:
the two-constant Hall-Presnell weights are used only when non-negative
and well-conditioned, otherwise a per-point empirical-likelihood tilt
(Hall & Presnell 1999) enforces the constraint (verified for tau in
{0.05, 0.5, 0.9, 0.95}). Genuinely infeasible (one-sided) components are
flagged via fit$diagnostics$constraint and a warning, never
silently mis-calibrated.
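The role of the tilt and of the infeasible case can be illustrated with a simplified sketch in Python. It assumes the constraint is placed on the empirical mass at or below zero rather than on the kernel-smoothed estimate the entry describes. Under that assumption the empirical-likelihood weights are constant within each sign group. The function name is hypothetical.

```python
def quantile_tilt_weights(residuals, tau):
    """Weights w_i >= 0 with sum(w) = 1 and mass tau on residuals <= 0.

    Returns None when the residuals are one-sided, because no
    non-negative weights can then satisfy the constraint.
    """
    n_neg = sum(1 for r in residuals if r <= 0)
    n_pos = len(residuals) - n_neg
    if n_neg == 0 or n_pos == 0:
        return None  # infeasible: all residuals on one side of zero
    return [tau / n_neg if r <= 0 else (1.0 - tau) / n_pos for r in residuals]


# Hypothetical residuals for one component.
res = [-2.0, -0.5, 0.3, 1.0, 2.5]
weights = quantile_tilt_weights(res, tau=0.9)
one_sided = quantile_tilt_weights([0.2, 0.7, 1.1], tau=0.9)
```

The None return mirrors the one-sided case that the entry says is flagged rather than silently mis-calibrated.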
Faithful Algorithm 3.1 (R2). The stochastic-EM P-step now draws the mixing probabilities (rejection-sampled, eq. 3.4) and the error density (bootstrap), not only the regression coefficients.
Calibrated standard errors (R3). Sparsity SEs
are disclosed as classification-conditional (in summary());
under-supported components and rank-deficient weighted designs now warn;
se_method/se_conditional recorded.
kdEM performance (R4). The E-step uses O(n) grid
interpolation and the grid is built by a binned/FFT KDE
(stats::density), removing the O(n^2) cost. kdEM now takes
~3x the ALD engine's runtime (was ~220x), meeting the speed target.
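The grid-interpolation idea can be sketched as follows, in Python. The density is tabulated once on a fixed grid and then looked up per observation. This sketch evaluates the grid directly instead of using a binned/FFT estimate, and the grid size, bandwidth, and function names are assumptions chosen for illustration.

```python
import bisect
import math


def gaussian_kde_on_grid(data, grid, h):
    """Direct Gaussian KDE at each grid point (cost O(n * G))."""
    c = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [c * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in data)
            for g in grid]


def interp(x, grid, values):
    """Linear interpolation of tabulated values; clamps outside the grid."""
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    j = bisect.bisect_right(grid, x)
    t = (x - grid[j - 1]) / (grid[j] - grid[j - 1])
    return (1.0 - t) * values[j - 1] + t * values[j]


# Hypothetical residuals, a 513-point grid on [-4, 4], bandwidth 0.5.
data = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
grid = [-4.0 + i / 64.0 for i in range(513)]
dens = gaussian_kde_on_grid(data, grid, h=0.5)
approx = [interp(x, grid, dens) for x in data]
```

Each lookup costs a binary search instead of a sum over all n points, so evaluating the density at every observation no longer scales quadratically.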
Bounded separability diagnostic (R6).
mi_fraction is now a bounded trace ratio in [0, 1]
(previously could return ~1e14 on imbalanced clusters).
New responsibility-based overlap diagnostic
(fit$diagnostics$overlap), independent of the stochastic-EM
path.
Real data (R5). Ships the engine
dataset (Brinkman 1981 ethanol-combustion data, the Wu & Yao Fig. 5
example); the README example now uses it. Added a golden test
reproducing the Wu & Yao Table 1 simulation means.
Selection rigor (R7).
mixqr_select() gains criterion = "cv" (K-fold
cross-validated held-out predictive log-likelihood) that PENALISES
complexity and works for either engine; AIC/BIC selection now emits the
mixture-boundary caveat and the ALD likelihood is labelled a working
likelihood.
Slope-based identifiability (R8). Default label ordering is now by slope (aligned with Wu & Yao Thm 2.1’s distinct-slope condition), and the distinctness guard uses a scale-relative threshold.
Robustness / UX (R10). Rank-deficient
(collinear) designs now error clearly; added
confint.mixqr() (Wald intervals).
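For reference, a Wald interval is the estimate plus or minus a normal quantile times the standard error. A minimal Python sketch follows, with hypothetical numbers. It illustrates the formula only and is not the internals of confint.mixqr().

```python
from statistics import NormalDist


def wald_interval(estimate, se, level=0.95):
    """Two-sided Wald interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se


# Hypothetical slope estimate and standard error.
lo, hi = wald_interval(1.8, 0.25)
```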
Calibrated standard errors. The sparsity
variance now reads f(0) off a kernel density estimate of
the component residuals (Wu & Yao 2016, p.166) rather than the ALD
working density. A Monte-Carlo benchmark
(inst/benchmarks/se_coverage.R) shows
variance = "stochEM" now achieves ~95% (near-nominal)
coverage for the regression coefficients, up from ~67-77%; the
mixing-probability intervals reflect the documented finite-sample
pi-bias.
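The role of f(0) in the sparsity variance can be seen in the simplest setting. For an intercept-only tau-quantile fit, the asymptotic standard error is sqrt(tau (1 - tau) / n) / f(0). The Python sketch below reads f(0) off a Gaussian kernel density estimate of the residuals. It assumes unweighted residuals and a normal-reference bandwidth, ignores the design matrix, and uses hypothetical function names.

```python
import math
import random
from statistics import stdev


def kde_at_zero(residuals, h=None):
    """Gaussian kernel density estimate of the residual density at 0."""
    n = len(residuals)
    if h is None:
        h = 1.06 * stdev(residuals) * n ** (-0.2)  # normal-reference rule
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * (r / h) ** 2) for r in residuals)


def quantile_se(residuals, tau):
    """Sparsity-based SE of an intercept-only tau-quantile estimate."""
    f0 = kde_at_zero(residuals)
    return math.sqrt(tau * (1.0 - tau) / len(residuals)) / f0


# Simulated standard-normal residuals, for illustration only.
random.seed(1)
res = [random.gauss(0.0, 1.0) for _ in range(2000)]
se = quantile_se(res, tau=0.5)
```

A density estimate that is too small at zero inflates the standard error and one that is too large deflates it, which is why the choice of density for f(0) affects coverage.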
Diagnostics & docs. New mixqr()
help sections on the Wu & Yao sec.6 semiparametric bias and on
standard-error validity;
predict(type = "quantile_byclass"); component-collapse
and ALD non-monotonicity warnings; fit$total_iter (total EM
iterations across starts). Removed the dead package URL.
Documentation site. A full pkgdown website with
a comprehensive applied tutorial (“A Tutorial on Mixtures of Quantile
Regressions”) featuring publication-ready ggplot2 visualizations, a
get-started vignette, and a validation & diagnostics article. Added
inst/CITATION, author/affiliation metadata, and documented
every exported method.
First release. The frequentist EM substrate (sub-project 01 of the QMM suite).
mixqr() fits finite mixtures of tau-quantile
regressions with two engines: "ald" (fast parametric
asymmetric-Laplace mixture, genuine likelihood + AIC/BIC) and
"kdEM" (Wu & Yao 2016 kernel-density EM with
nonparametric component error densities, unequal or pooled), via a
generic pluggable EM driver mixqr_em().
Stochastic-EM variance estimation (V_W + (1 + 1/B) V_B) with a cluster-separability
diagnostic.
mixqr_select() for component-count selection
(AIC/BIC).
sim_mixqr2() / sim_mixqr3() reproducing the Wu & Yao 2- and
3-component designs.
An engine registry (register_mixqr_engine())
and reserved diagnostics$crossing /
diagnostics$class_stability slots: the integration channel
for QMM sub-projects 03 (gating) and 04 (non-crossing).
Note: this v0.1 is pure R. Rcpp acceleration of the KDE/E-step hot loops is planned.
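The V_W + (1 + 1/B) V_B combination listed for this release has the form of Rubin's multiple-imputation rule. The scalar Python sketch below assumes V_W is the average within-draw variance and V_B is the sample variance of the B per-draw estimates. That reading, the function name, and the numbers are assumptions.

```python
from statistics import mean, variance


def pooled_variance(estimates, within_variances):
    """Total variance V_W + (1 + 1/B) * V_B for a scalar parameter."""
    B = len(estimates)
    v_w = mean(within_variances)   # average within-draw variance
    v_b = variance(estimates)      # between-draw variance, divisor B - 1
    return v_w + (1.0 + 1.0 / B) * v_b


# Hypothetical estimates and variances from B = 3 stochastic-EM draws.
draw_estimates = [1.0, 2.0, 3.0]
draw_variances = [0.5, 0.5, 0.5]
total = pooled_variance(draw_estimates, draw_variances)
```

The (1 + 1/B) factor inflates the between-draw term to account for using a finite number of draws.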