The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
grid_weights doesMetaHunt represents every study-level function by its values on a
shared grid x_1, ..., x_G. Distances between two functions
f and h are computed in a weighted L^2
sense:
||f - h||^2 = sum_j w_j * ( f(x_j) - h(x_j) )^2
The vector w = grid_weights is exactly that vector of
weights — one non-negative number per grid point. It defines the measure
mu that the inner product
<f, h>_{L^2(mu)} = sum_j w_j * f(x_j) * h(x_j)
is taken with respect to. Every routine in MetaHunt that talks about
“distance between functions” or “norm of a function” — basis hunting,
constrained projection, conformal pointwise scores — uses the same
grid_weights you supply.
The default is grid_weights = NULL, which inside the
package becomes the uniform vector rep(1 / G, G). Every
grid point counts equally.
This is the right choice when the grid itself is a
representative sample from the population whose functional you
care about. For example, if you sub-sampled the grid from a held-out
target-population cohort with build_grid(), the empirical
distribution of the grid already encodes the target measure, and uniform
grid_weights simply averages over that empirical
distribution.
In short: if your grid was sampled from population P and
you want distances to reflect P, leave
grid_weights = NULL.
grid_weights matterYou should override the default when the grid distribution does not match the population whose distances you care about. Two common scenarios:
grid_weights proportional
to the target density evaluated at each grid point.age > 65). Set grid_weights[j] to
0 outside the sub-region and to the target density
inside.grid_weights does not need to sum to 1
— only its relative magnitudes matter inside dfspa() and
project_to_simplex(). The default wrapper in
apply_wrapper() divides by sum(grid_weights),
so weighted means come out on the natural scale regardless of
normalisation.
Here we generate the same data twice — m = 40 studies,
G = 30 grid points — and run dfspa() with two
different grid_weights. The first run is uniform; the
second up-weights the right half of the grid by a factor of about 5.
m <- 40
G <- 30
x <- seq(0, 1, length.out = G)
# True bases on the grid.
basis <- rbind(sin(pi * x), cos(pi * x), x)
# Mixing weights driven by two latent covariates.
W <- data.frame(w1 = rnorm(m), w2 = rnorm(m))
beta <- cbind(c(1, -0.8), c(-0.5, 1.2), c(0, 0))
eta <- as.matrix(W) %*% beta
pi_true <- exp(eta) / rowSums(exp(eta))
F_hat <- pi_true %*% basis + matrix(rnorm(m * G, sd = 0.05), m, G)
dim(F_hat)
#> [1] 40 30Now the two grid_weights choices:
gw_uniform <- rep(1 / G, G)
# Up-weight x > 0.5 by ~5x. Renormalisation is optional but tidy.
gw_right <- ifelse(x > 0.5, 5, 1)
gw_right <- gw_right / sum(gw_right) * G # comparable scale, optional
fit_unif <- dfspa(F_hat, K = 3, grid_weights = gw_uniform, denoise = FALSE)
fit_right <- dfspa(F_hat, K = 3, grid_weights = gw_right, denoise = FALSE)
# Index sets selected can differ.
fit_unif$original_indices
#> [1] 28 18 14
fit_right$original_indices
#> [1] 18 28 14A single overlay plot of the two recovered first basis functions makes the qualitative effect visible — the right-weighted run pays more attention to the right half of the grid when ranking which study is the “extreme” one to anchor:
matplot(
x,
cbind(fit_unif$bases[1, ], fit_right$bases[1, ]),
type = "l", lty = 1, lwd = 2,
col = c("#0072B2", "#D55E00"),
xlab = "x",
ylab = "first recovered basis",
main = expression(
"Recovered first-vertex basis under uniform vs. target-weighted " ~ L^2 * (mu)
)
)
abline(v = 0.5, lty = 2, col = "grey60")
legend("topright",
legend = c("uniform", "up-weight x > 0.5", "x = 0.5 boundary"),
col = c("#0072B2", "#D55E00", "grey60"),
lty = c(1, 1, 2), lwd = c(2, 2, 1), bty = "n")The two curves usually agree closely on the well-sampled regions and
diverge where the weights differ most. The takeaway is not that
one choice is right and the other wrong — it is that
grid_weights quietly determines which features of the
function get prioritised.
grid_weights strictly positive on the support you
care about. Zero weight at a grid point removes it from every distance,
norm, and mean computation in the package. This includes the d-fSPA
denoising step, which uses the same L²(μ) inner product to define
neighbourhoods.grid_weights across dfspa(),
project_to_simplex(), and the conformal routines for a
given analysis. Mixing different grid_weights between steps
is rarely what you want — it changes the meaning of “distance”
mid-pipeline.wrapper is often a cleaner solution than re-weighting the
grid. See vignette("wrapper-scalar").grid_weightsThe argument has the same meaning everywhere it appears:
dfspa(F_hat, K, grid_weights = NULL, ...)project_to_simplex(F_hat, bases, grid_weights = NULL, ...)apply_wrapper(F_mat, wrapper = NULL, grid_weights = NULL)
— used for the default weighted-mean wrapper.metahunt(F_hat, W, K, ..., grid_weights = NULL) —
passes the weights through to dfspa() and
project_to_simplex().split_conformal(...),
cross_conformal(...), conformal_from_fit(...)
— all accept grid_weights for the pointwise conformity
score and for the default wrapper when none is supplied.reconstruction_error_curve(),
cv_error_curve(), select_denoising_params() —
propagate grid_weights to the underlying
dfspa() calls.If a function does not accept grid_weights directly, it
is because it operates on already-projected weights pi_hat
(e.g. fit_weight_model()) and therefore does not need the
grid measure.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.