
Streaming Kernel PLS in bigPLSR: XX^T and Column-Chunked Variants

Frédéric Bertrand

Cedric, Cnam, Paris
frederic.bertrand@lecnam.net

2025-11-26

Overview

This vignette documents bigPLSR’s kernel PLS streaming backends for bigmemory::big.matrix inputs. It provides two complementary streaming strategies: a row-chunked XX^T variant and a column-chunked variant.

Both strategies produce the same model up to floating-point round-off. Selection is automatic (see ?pls_fit) or can be forced by setting options(bigPLSR.kpls_gram = ...) to "rows", "cols", or "auto".

Math sketch

Let X in R^{n x p}, Y in R^{n x m} be centered.

At component h, kernel-PLS uses the NIPALS-like fixed-point update

  1. Start with u in R^n (e.g., a column of Y).
  2. Compute a = X^T u.
  3. Normalize w = a / ||a||_2.
  4. Scores: t = X w.
  5. Loadings:
    • p = (X^T t)/(t^T t),
    • q = (Y^T t)/(t^T t).
  6. Deflate: X <- X - t p^T, Y <- Y - t q^T, and set u <- Y q.
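Steps 1 to 6 can be sketched in base R for a single component. This is a minimal in-memory illustration, not the package's streaming code: the function name kpls_component and the returned list layout are hypothetical, X and Y are assumed to be ordinary centered matrices, and u is updated from the deflated Y, following the order in which step 6 is written.

```r
# One kernel-PLS component following steps 1-6 (in-memory sketch).
# `kpls_component` is a hypothetical helper, not a bigPLSR function.
kpls_component <- function(X, Y, u = Y[, 1]) {
  a   <- crossprod(X, u)            # step 2: a = X^T u            (p x 1)
  w   <- a / sqrt(sum(a^2))         # step 3: w = a / ||a||_2
  t_h <- X %*% w                    # step 4: scores t = X w       (n x 1)
  tt  <- sum(t_h^2)                 # t^T t
  p   <- crossprod(X, t_h) / tt     # step 5: X loadings           (p x 1)
  q   <- crossprod(Y, t_h) / tt     # step 5: Y loadings           (m x 1)
  X_new <- X - tcrossprod(t_h, p)   # step 6: X <- X - t p^T
  Y_new <- Y - tcrossprod(t_h, q)   # step 6: Y <- Y - t q^T
  u_new <- Y_new %*% q              # step 6: u <- Y q (deflated Y)
  list(w = w, t = t_h, p = p, q = q, X = X_new, Y = Y_new, u = u_new)
}
```

By construction w has unit norm, and the deflated X is orthogonal to the scores t, which is a convenient sanity check when experimenting with the update.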

Coefficients after H components are

beta = W (P^T W)^{-1} Q^T,

yhat = 1 * mu_Y + (x - mu_X) beta.
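The two formulas above can be illustrated with a dense, in-memory fit that collects W, P and Q over H components and then forms beta. The names kpls_fit_dense and predict_dense are hypothetical and exist only for this sketch; it assumes X fits in memory and does not reproduce the streaming backends.

```r
# Dense sketch: H components, then beta = W (P^T W)^{-1} Q^T.
# `kpls_fit_dense` and `predict_dense` are hypothetical names.
kpls_fit_dense <- function(X, Y, H) {
  X <- as.matrix(X); Y <- as.matrix(Y)
  mu_X <- colMeans(X); mu_Y <- colMeans(Y)
  Xc <- sweep(X, 2, mu_X)                 # center X
  Yc <- sweep(Y, 2, mu_Y)                 # center Y
  W <- matrix(0, ncol(X), H)
  P <- matrix(0, ncol(X), H)
  Q <- matrix(0, ncol(Y), H)
  u <- Yc[, 1]                            # step 1: start from a column of Y
  for (h in seq_len(H)) {
    a   <- crossprod(Xc, u)               # a = X^T u
    w   <- a / sqrt(sum(a^2))             # normalize
    t_h <- Xc %*% w                       # scores
    tt  <- sum(t_h^2)
    p   <- crossprod(Xc, t_h) / tt        # X loadings
    q   <- crossprod(Yc, t_h) / tt        # Y loadings
    Xc  <- Xc - tcrossprod(t_h, p)        # deflate X
    Yc  <- Yc - tcrossprod(t_h, q)        # deflate Y
    u   <- Yc %*% q
    W[, h] <- w; P[, h] <- p; Q[, h] <- q
  }
  # Solve (P^T W) Z = Q^T rather than forming the inverse explicitly.
  beta <- W %*% solve(crossprod(P, W), t(Q))
  list(beta = beta, mu_X = mu_X, mu_Y = mu_Y)
}

# yhat = 1 * mu_Y + (x - mu_X) beta, applied row-wise to Xnew.
predict_dense <- function(fit, Xnew) {
  sweep(sweep(as.matrix(Xnew), 2, fit$mu_X) %*% fit$beta, 2, fit$mu_Y, "+")
}
```

For a single response and H equal to the number of (full-rank) predictors, the resulting beta should coincide with the ordinary least-squares slopes, which gives a simple correctness check.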

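The row-chunked implementation keeps X on disk and performs steps (2) and (4) with two passes over row blocks X_B (with matching sub-vectors u_B and t_B):

  • Pass A: accumulate a = sum_B X_B^T u_B.
  • Pass B: for each block, compute t_B = X_B w.

Loadings p are accumulated exactly as in Pass A, with t in place of u, followed by division by t^T t.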
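The two row-block passes (accumulating X^T u, then computing X w) can be sketched in base R as follows. The helper names row_blocks, xt_times and x_times are hypothetical, an ordinary matrix stands in for the big.matrix, rows are assumed to be already centered, and the chunk size of 1024 is an arbitrary illustration rather than a package default.

```r
# Index ranges for row blocks of at most `chunk` rows.
row_blocks <- function(n, chunk) {
  starts <- seq(1L, n, by = chunk)
  Map(function(s) s:min(s + chunk - 1L, n), starts)
}

# Accumulation pass: X^T v summed over row blocks.
# Use v = u for step (2), or v = t when accumulating the loadings.
xt_times <- function(X, v, chunk = 1024L) {
  acc <- numeric(ncol(X))
  for (idx in row_blocks(nrow(X), chunk)) {
    acc <- acc + drop(crossprod(X[idx, , drop = FALSE], v[idx]))
  }
  acc
}

# Score pass: t = X w, filled in one row block at a time.
x_times <- function(X, w, chunk = 1024L) {
  out <- numeric(nrow(X))
  for (idx in row_blocks(nrow(X), chunk)) {
    out[idx] <- drop(X[idx, , drop = FALSE] %*% w)
  }
  out
}
```

Only one block of rows is held in memory at a time, so peak memory is governed by chunk times the number of columns rather than by the full size of X.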

APIs

pls_fit() chooses the variant via options(bigPLSR.kpls_gram) or heuristics when "auto" is set (the default).
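One way to force a variant for a single call and restore the previous setting afterwards is a small wrapper around options(). The helper with_kpls_gram is hypothetical and not part of bigPLSR; only the option name and its three values come from the text above, and the arguments of pls_fit() are deliberately left out (see ?pls_fit).

```r
# Temporarily set bigPLSR.kpls_gram while evaluating `expr`.
# `with_kpls_gram` is a hypothetical convenience wrapper.
with_kpls_gram <- function(mode = c("auto", "rows", "cols"), expr) {
  mode <- match.arg(mode)
  old <- options(bigPLSR.kpls_gram = mode)
  on.exit(options(old), add = TRUE)   # restore the previous value on exit
  force(expr)                         # evaluated lazily, with the option set
}

# Example: inspect the option as seen inside the wrapped expression.
with_kpls_gram("rows", getOption("bigPLSR.kpls_gram"))
```

In practice the wrapped expression would be the pls_fit() call itself, so that the forced variant does not leak into later fits.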

When to prefer each variant

References
