The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
This example gives a complete, runnable bpgmm analysis.
It is small enough to build quickly on CRAN and pkgdown, but it uses the
same functions as a larger analysis. The focus is the structure of one
fitted object; model-selection interpretation is handled separately.
pgmm_rjmcmc() expects a numeric matrix with variables in
rows and observations in columns. The code below creates two
well-separated clusters in two dimensions.
library(bpgmm)
set.seed(2026)
X <- cbind(
matrix(rnorm(8, mean = -2, sd = 0.2), nrow = 2),
matrix(rnorm(8, mean = 2, sd = 0.2), nrow = 2)
)
known_labels <- rep(1:2, each = 4)
dim(X)
#> [1] 2 8
known_labels
#> [1] 1 1 1 1 2 2 2 2cluster_cols <- c("#0072B2", "#D55E00", "#009E73")
plot(
X[1, ], X[2, ],
col = cluster_cols[known_labels],
pch = 19,
xlab = "Variable 1",
ylab = "Variable 2",
main = "Simulated data",
asp = 1
)
legend(
"topleft",
legend = paste("Known", sort(unique(known_labels))),
col = cluster_cols[sort(unique(known_labels))],
pch = 19,
bty = "n"
)This fit keeps \(m\) and the covariance model fixed, so the output structure is explicit. The fitted likelihood contribution is
\[ p(x_i \mid z_i = k, \Theta) = N_p(x_i \mid \mu_k, \Lambda_k\Lambda_k^\top + \Psi_k). \]
For model selection across \(m\) and covariance labels, use the separate model-selection vignette.
fit_log <- capture.output({
fit <- pgmm_rjmcmc(
X = X,
m_init = 2,
m_range = c(1, 3),
q_new = 1,
burn = 1,
niter = 3,
constraint = "UUU",
m_step = 0,
v_step = 0,
verbose = FALSE
)
})
tail(fit_log, 1)
#> character(0)The returned object stores posterior samples for allocations, covariance constraints, component parameters, and hyperparameters.
summarize_pgmm_rjmcmc() returns the modal allocation,
posterior counts for the number of clusters, posterior counts for
covariance models, and optionally an adjusted Rand index.
fit_summary <- summarize_pgmm_rjmcmc(fit, true_cluster = known_labels)
fit_summary$allocation
#> [1] 2 2 2 2 1 1 1 1
fit_summary$n_clusters
#>
#> 2
#> 3
fit_summary$n_constraints
#>
#> UUU
#> 3
fit_summary$ari
#> [1] 1The modal allocation is computed coordinate-wise from the saved posterior allocations:
\[ \hat{z}_i = \operatorname{mode}\{z_i^{(1)},\ldots,z_i^{(S)}\}, \]
where \(S\) is the number of saved posterior samples.
old_par <- par(mfrow = c(1, 2), mar = c(4, 4, 3, 1))
plot(
X[1, ], X[2, ],
col = cluster_cols[known_labels],
pch = 19,
xlab = "Variable 1",
ylab = "Variable 2",
main = "Known labels",
asp = 1
)
plot(
X[1, ], X[2, ],
col = cluster_cols[fit_summary$allocation],
pch = 19,
xlab = "Variable 1",
ylab = "Variable 2",
main = "Posterior allocation",
asp = 1
)The sampler stores draws of \(\tau_k\), \(\mu_k\), \(\Lambda_k\), and \(\Psi_k\). For this fixed-model example, the final saved draw is enough to show the object shape.
last <- length(fit$tau_samples)
fit$tau_samples[[last]]
#> [1] 0.1844265 0.8155735 0.0000000
fit$mean_samples[[last]]
#> [[1]]
#> [,1] [,2]
#> [1,] 1.285546 1.767347
#>
#> [[2]]
#> [,1] [,2]
#> [1,] -2.023394 -2.231323
#>
#> [[3]]
#> [1] NA
lapply(fit$lambda_samples[[last]], dim)
#> [[1]]
#> [1] 2 1
#>
#> [[2]]
#> [1] 2 1
#>
#> [[3]]
#> NULL
lapply(fit$psi_samples[[last]], dim)
#> [[1]]
#> [1] 2 2
#>
#> [[2]]
#> [1] 2 2
#>
#> [[3]]
#> NULLFor an applied analysis, keep the same sequence of steps but use larger sampler settings. The exact values depend on data size and convergence diagnostics.
For a concise analysis report, include:
m_range, q_new, starting covariance model,
and whether m_step, v_step, and
split_combine were enabled;The diagnostics vignette gives examples of trace plots and co-clustering summaries for multiple chains.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.