The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Resubmission of the 1.13.0 release. The 1.13.0 submission was
rejected by the CRAN auto-check service for a single
Overall checktime NOTE on r-devel-windows-x86_64 (11 min
vs. the 10 min limit). The package itself passed cleanly on both Windows
and Debian (Status: OK / OK).
R CMD check comfortably under the 10-minute Windows
limit. The same tests continue to run locally and on R-hub / win-devel
via the NOT_CRAN environment variable that testthat sets.
test-grm.R: the J15S3810 regression (a
15-item / 3,810-respondent fit, ~60s) and the
nitems >= 8 underflow regression (~8s) are skipped on
CRAN. Default-coverage GRM tests on small data continue to run on
CRAN.test-irm.R: the J35S515 shared-fixture
Gibbs run (~23s) and the two reproducibility tests that re-run Gibbs
(~22s combined) are skipped on CRAN. The
Default seed is 123 structural check still runs on
CRAN.This release also incorporates the development changes from 1.12.0–1.12.2 (never published to CRAN). Their entries are retained below for traceability.
Graphical Lasso (Glasso): New
function for sparse precision matrix estimation from ordinal item
response data. The polychoric correlation matrix is computed internally
and the optimal regularization parameter is selected by Extended
Bayesian Information Criterion (EBIC; Foygel and Drton 2010). Implements
block coordinate descent (Friedman, Hastie, Tibshirani 2008; Algorithm
17.2 of Hastie, Tibshirani, Friedman 2009) with cyclical coordinate
descent for the inner lasso step. Warm-starting across the lambda grid
accelerates the search.
Internal helpers glasso_one() (single-lambda solver) and
compute_EBIC_glasso() (EBIC computation) are available but
not exported.
Returns a list with theta (selected precision matrix),
lambda_opt (selected lambda), ebic_opt,
n_edge, and path (data frame of lambda, ebic,
n_edge over the search grid).
print.exametrika Glasso method:
Added a Glasso branch to the shared print.exametrika()
dispatcher to summarize the estimated model (optimal lambda, EBIC value,
edge count, precision matrix).
Chatterjee’s xi correlation (chatterjee_xi,
xi_stable, chatterjee_matrix): New
family of functions implementing Chatterjee’s (2021) rank-based
correlation coefficient. chatterjee_xi() computes the
single-shot value with random tie-breaking. xi_stable()
averages B replications (default B = 1000) to stabilize against
tie-induced variability and returns a list with the mean, standard
deviation, standard error, and B. chatterjee_matrix()
produces the p x p asymmetric pairwise xi matrix from ordinal data with
pairwise-complete handling of missing values; the asymmetry between
xi(j, k) and xi(k, j) enables direction detection in graphical-model
construction.
Glasso() and
chatterjee_matrix() in backticks (e.g.,
`[0, 1]`, `[j, k]`, `Q[i, j]`) so
that roxygen2 no longer rewrites them as \link{}
cross-references. Resolves R CMD check WARNINGs about missing link
targets.Glasso() ... documented:
Added @param ... to Glasso() so that R CMD
check’s “Undocumented arguments” WARNING is cleared.compute_EBIC_glasso() partial-match
fix: Replaced determinant(Theta, log = TRUE) with
the full argument name logarithm = TRUE, removing the
partial-argument-match NOTE.dataFormat(response.type = "rated") no longer
errors when a CA category is unobserved: previously, if no
respondent chose item j’s correct-answer category (e.g.,
everyone got it wrong on a hard item, or the sample was small enough
that the CA category did not appear by chance),
dataFormat() aborted with
"CA for item j is not a valid response category". This
rejected legitimate test data where an item is uniformly difficult or
where the sample size is small. The check has been relaxed to a
warning(); the affected items are encoded as all-incorrect
(U[, j] == 0), which is the correct downstream behavior.
This applies to both the matrix-input path (dataFormat) and
the long-format path (longdataFormat). Mistyped CAs (e.g.,
CA[j] = 8 when categories are 1–7) still surface as
warnings, so the diagnostic value is preserved.Biclustering.nominal() and
Biclustering.ordinal() no longer abort with
"missing value where TRUE/FALSE needed" when extreme grid
configurations (e.g., very small ncls combined with very
large nfld) leave fields or classes empty during EM. The
two log-likelihood checks inside the EM loop — the relative convergence
test and the monotonicity guard — now treat a non-finite
test_log_lik as a non-converged exit instead of throwing.
Affected cells are returned with converge = FALSE so that
GridSearch() skips them automatically when comparing
criteria. This also resolves the same crash in
Biclustering.rated(), which calls
Biclustering.nominal() internally.Class-side Confirmatory Biclustering:
Biclustering() now accepts a conf_class
argument that fixes class memberships during EM. Like conf
(which fixes field memberships), conf_class accepts either
a vector of class labels (one per respondent) or a 0/1 membership matrix
(respondents x classes). When supplied, the class-side E-step is
overridden every iteration. For Ranklustering
(method = "R"), the neighbour-smoothing step is skipped
(smoothed_memb <- clsmemb) since smoothing pre-fixed
labels would defeat the purpose of fixing them. Available for
binary, ordinal, nominal, and
rated data; can be combined with conf to fix
both fields and classes simultaneously. Note: rated
re-orders classes by correct rate after estimation, so the output class
labels may not match the input labels (the individual-to-class mapping
is preserved up to relabeling).
Number of model parameters (nparam) is intentionally not
adjusted when conf or conf_class is in effect:
only PiFR / BCRM cell probabilities count as parameters in this
implementation, and membership matrices are latent posteriors, not
parameters.
Confirmatory ordinal Biclustering: field membership now
stays fixed during EM: Biclustering.ordinal() was
overwriting fldmemb with the E-step estimate every
iteration, so the conf/conf_mat argument only
affected the initial value. Aligned the implementation with
Biclustering.binary() by re-applying
fldmemb <- conf_mat immediately after the field-side
E-step. With this fix result$FieldEstimated matches the
user-supplied assignment exactly.
Confirmatory nominal Biclustering: conf argument is now
actually honored: Biclustering.nominal() validated
length(conf) against NCOL(U), but
U is an exametrika list object so
NCOL(U) always returned 1. The check rejected every
well-formed conf vector with “conf vector size does NOT
match with data.”, making confirmatory nominal Biclustering unreachable
since v1.10.0. Replaced with NCOL(U$Q) (and the matrix-form
NROW(conf) check, plus the conf_mat
allocation, in the same way).
Confirmatory ordinal Biclustering: conf length check
actually validates input: same NCOL(U) issue as
nominal, but in Biclustering.ordinal(). Fixed by switching
to NCOL(U$Q).
Confirmatory binary Biclustering: conf size checks now
reference the formatted response matrix explicitly:
Biclustering.binary() happened to work because
U is rebound to tmp$U * tmp$Z before the conf
block, so NCOL(U) returned the item count. Switched to
NCOL(tmp$U) for consistency with the ordinal/nominal fixes
and to make the intent obvious to readers.
Confirmatory Biclustering: matrix-form conf
is no longer silently dropped: when conf was
supplied as a membership matrix (items x fields) instead of a vector,
all three implementations (binary, ordinal,
nominal) validated the matrix but never copied it into
conf_mat, so the subsequent
nfld <- NCOL(conf_mat) failed with
object 'conf_mat' not found. Added
conf_mat <- as.matrix(conf) in the matrix
branch.
Biclustering.ordinal() now honors the
maxiter argument: The inner EM iteration cap was
hardcoded to 100 (maxemt <- 100) and ignored the
user-supplied maxiter. This mirrors the bug that was fixed
for Biclustering.nominal() in 1.11.0. Callers that pass a
larger maxiter (e.g. 2000 in Monte Carlo
studies) now actually use that ceiling.
GRM() fit indices no longer return
NaN for moderate or larger item counts: The
benchmark model used const <- exp(-nitems * 100) as a
log-domain epsilon inside
sum(cat_counts * log(cat_probs + const)). For about 8 or
more items this expression underflows to exactly 0 in IEEE-754 double
precision, so log(0) = -Inf propagates through
0 * -Inf = NaN and poisons every downstream index
(model_Chi_sq, NFI, CFI, RMSEA, AIC, CAIC, BIC). The benchmark and null
loops now skip zero-count categories explicitly, removing the need for
an additive epsilon.
GRM() degrees of freedom were computed
incorrectly: The model df was set to
n_pattern * (ncat - 1) + 1 and the null df to
n_pattern * (ncat - 1), which is neither
(# bench params) - (# model params) nor
(# bench params) - (# null params). The inflated df then
clamped CFI, TLI, IFI, and
RMSEA to zero whenever chi^2 < df (and the
previous df_A > df_B inequality was the wrong direction
to begin with). The convention now matches
Biclustering.ordinal():
bench_nparam_j = n_pattern * (ncat_j - 1)null_nparam_j = ncat_j - 1model_nparam_j = ncat_j (one slope plus
ncat_j - 1 thresholds)df_A_j = bench_nparam_j - model_nparam_jdf_B_j = bench_nparam_j - null_nparam_jGRM() now accepts any integer-coded ordinal
responses, not just 1..K: Category counts were derived from
apply(dat, 2, max), which assumes responses are already
1-indexed. Data coded from 0 (e.g. 0..3) or with gaps (e.g. 1, 2, 4)
undercounted ncat[j] by one or more and then indexed
grm_prob()[resp] / v[resp] out of range,
producing either an outright invalid subscript type 'list'
error (on J15S3810-style data) or silently truncated
threshold columns. Each item’s responses are now remapped to contiguous
1..K codes via match() against the sorted unique values on
entry, with ncat[j] derived from the mapped levels; both
the R model-fit blocks and the C++ log-likelihood receive 1-based input
regardless of the user’s coding.
GridSearch() now tolerates per-cell fit
errors: Previously, a single Biclustering() (or
LCA() / LRA()) call that raised an error at a
grid corner (for example, empty-cluster edge cases at large
ncls/nfld with small
nobs/nitems) would propagate out of
GridSearch() and abort the entire grid. The call to the
underlying analysis function is now wrapped in tryCatch,
and errors are handled the same way as non-convergence: the cell is
marked NA in the index matrix and the
(ncls, nfld) pair is recorded in
failed_settings. GridSearch() still raises
only when all grid cells fail, preserving the existing
“all-failed” error.
C++ implementation of the IRM Gibbs sampler
core: The collapsed Gibbs sampler shared by
Biclustering_IRM.nominal(),
Biclustering_IRM.ordinal(), and
Biclustering_IRM.rated() is now implemented in C++ via Rcpp
(src/irm_gibbs_core.cpp). The R reference implementation in
R/00_IRM_Gibbs_CORE.R is preserved behind
irm_gibbs_core(..., use_cpp = FALSE) for
cross-checking.
Numerical reproducibility (revised 2026-05-07):
RNG calls are routed through R-level sample.int() and
rmultinom() (via Rcpp::Function) so the
unif_rand() consumption order matches base R exactly. The
C-level entry points (Rcpp::sample,
R::rmultinom) consume RNG differently from base R and were
explicitly avoided.
The C++ and R reference paths are bit-identical for the first Gibbs
iterations on the bundled test datasets and on real polytomous data
(verified through iteration 2 on J143S32 ordinal), but diverge
from iteration 3 onward on real data due to sub-LSB
floating-point ordering differences accumulating through the
lmvbeta() calls in the CRP likelihood. Once the divergence
flips a single rmultinom() outcome, the two chains follow
different sample paths.
Empirically the two paths still target the same
posterior: on a 50-seed comparison (M = 66 ordinal items, N =
143), the marginal distributions of n_field and
n_class are statistically indistinguishable between paths
(Wilcoxon p = 0.56 / 0.81). Use either path for inference; do not rely
on a single seed reproducing the same (n_field, n_class)
across the two paths.
The parity tests in tests/testthat/test-irm-gibbs-cpp.R
cover only short runs (<= 10 iterations) and therefore did not catch
this. A longer-iteration test on real-data-like configurations is on the
roadmap.
Wall-clock: roughly 4x speedup on the inner
Gibbs loop on the bundled test datasets (J20S600 nominal: 82 to 20
ms/iter; J35S500 ordinal: 76 to 19 ms/iter). The end-to-end
Biclustering_IRM() call also benefits proportionally on
larger iteration counts (e.g. simulation runs).
No new dependencies: Rcpp is already in
LinkingTo:; the implementation uses only
Rcpp::Function and
<vector>/<cmath>.
Vectorized EM hot path in
Biclustering.ordinal(): The per-iteration cost of
the ordinal EM loop has been reduced by rewriting five hot spots in base
R without introducing new package dependencies.
apply(Ufcq_prior, c(1, 2), function(x) rev(cumsum(rev(x))))
with an in-place reverse cumulative sum loop over the category axis. The
previous form dispatched nfld * ncls R-level calls per EM
iteration; the new form performs maxQ - 1 vectorized array
additions.apply(X, 1, min) and
apply(X, 1, max) with
do.call(pmin.int, as.data.frame(X)) /
do.call(pmax.int, as.data.frame(X)). These compute the
row-wise reduction in a small number of C-level calls rather than one
R-level call per row. apply(X, 1, which.max) was similarly
replaced by max.col(X, ties.method = "first").log(BBRM[,,q] - BBRM[,,q+1] + const) once
per EM iteration instead of independently in the class-side and
field-side E-steps for every q.Z * Uq[,,q] once per fit (call it
ZU) via column-major recycling
(Uq * as.vector(Z)), replacing about eight separate
elementwise products per EM iteration and reusing the same array in the
post-EM fit-index blocks. The replicate(maxQ, Z) allocation
that produced Zrep has been removed as a byproduct.apply(X, c(1, 2), sum) and
apply(X, c(2, 3), sum) on 3-D arrays with
rowSums(X, dims = 2) / colSums(X, dims = 1),
and apply(M, 2, sum) on matrices with
colSums(M).Output is bit-identical to 1.11.0 on every tested configuration
(max(abs(old - new)) == 0 for BCRM,
BBRM, ClassMembership,
FieldMembership, TestFitIndices, and the full
EM trajectory). Single-fit wall-clock on typical sizes (ncls, nfld up to
20 by 15 on J35S500) improves by roughly 1.2 to 1.3 times; the
per-EM-iteration speedup compounds across GridSearch()
grids.
Vectorized 3-D apply reductions in
Biclustering.nominal(): The same
rowSums/colSums(X, dims = ...) substitutions were applied
to the nominal M-step and null-model blocks, and Zrep was
eliminated in favor of the ZU precomputation. Output is
bit-identical to 1.11.0.
Vectorized Uq one-hot encoding in
Biclustering.ordinal() and
Biclustering.nominal(): The construction of the
one-hot response array Uq[i, j, tmp$Q[i,j]] = 1 previously
used a nobs * nitems nested R loop. It now writes all ones
in one C-level matrix-index assignment, restricted to non-missing cells
(tmp$Z == 1). Single-fit wall-clock on the validation
matrix improves by 1.2-5.3x over 1.11.0 (up to 5x on small cases where
the loop overhead dominated). Output is bit-identical to 1.11.0 at every
downstream location (the missing entries of the old Uq were
never read because every consumer applies the tmp$Z
mask).
Biclustering.ordinal() now returns
SmoothedMembership: The smoothed
(Ranklustering-filtered) class membership matrix is now part of the
return value, mirroring Biclustering.binary(). This fills a
long-standing gap (only the binary form previously exposed it) and makes
Ranklustering with conf_class introspectable: callers can
verify that the smoothing step is correctly skipped when class labels
are pre-fixed (SmoothedMembership equals
ClassMembership in that case).print() smoke test for
Biclustering_IRM.rated() results
(test-irm-rated.R).nrank = 4 LRA.rated regression
test for DistractorAnalysis() to cover varying rank counts
(test-distractor.R).Imports or Depends.
The new C++ Gibbs sampler uses Rcpp, which was already declared in
LinkingTo:.conf_class argument
on Biclustering() is additive and defaults to
NULL, preserving the previous behavior for all existing
callers.Biclustering_IRM.rated(): Rated IRM
(Infinite Relational Model for multiple-choice data): Performs
IRM-based biclustering for rated data (items with correct answers and
multiple response categories). Internally calls
Biclustering_IRM.nominal() for estimation via
Dirichlet-Multinomial collapsed Gibbs sampler, then post-processes the
results: classes are sorted by correct response rate (ranklustering),
and two layers of fit indices are reported — binary (item
correct/incorrect, with full benchmark model including CFI/RMSEA) and
nominal (category-level, AIC/BIC/CAIC only). Returns
TestFitIndices (binary, default) and
TestFitIndices_nominal (nominal) in the output.
DistractorAnalysis(): Distractor analysis
for rated models: S3 generic function for analyzing response
category distributions by rank/class. Computes observed frequency
tables, proportion tables, chi-square tests against chance level, and
Cramer’s V effect sizes for each item-by-rank cell. Works with
LRA.rated and Biclustering.rated /
Biclustering_IRM.rated results. For Biclustering models,
output is grouped by field. Supports items and
ranks filtering in both print() and
plot() methods.
J21S300: New rated sample dataset (21
items, 300 students, 4 response categories) for testing rated
Biclustering and IRM models. Smaller and faster than J35S5000 (35 items,
5000 students).Fixed maxiter parameter ignored in
Biclustering.nominal(): The maxiter
parameter was accepted but internally overridden by a hardcoded
maxemt <- 100. Now correctly uses maxiter
as the iteration limit.
Fixed spurious “No ID column detected” message in
Biclustering.rated() and
Biclustering_IRM.rated(): The crr()
function was called with a raw matrix (tmp$Z * tmp$U)
instead of an exametrika object, triggering an unnecessary
dataFormat() call. Now creates a temporary binary-typed
exametrika object to avoid the message.
Biclustering.rated(): Rated (multiple-choice)
Biclustering: Performs biclustering for rated data (items with
correct answers and multiple response categories). Internally calls
Biclustering.nominal() for estimation, then post-processes
the results: classes are sorted by correct response rate
(ranklustering), and two layers of fit indices are reported — binary
(item correct/incorrect, with full benchmark model) and nominal
(category-level, AIC/BIC/CAIC only). The binary layer uses item-level
correct response rates per class without field constraints
(nparam = nitems * ncls). Supports both Biclustering
(method = "B") and Ranklustering
(method = "R") modes. Returns TestFitIndices
(binary) and TestFitIndices_nominal (nominal) in the
output. Added 16 tests for rated Biclustering using J35S5000.ItemFRP to quasiFRP in
Biclustering.rated() output: The item-level
correct response rate matrix (J x C) is computed without field
constraints because the field assignments come from nominal analysis and
have different meaning in the binary context. The name
quasiFRP (quasi Field Reference Profile) clarifies this
distinction from the standard field-constrained FRP.subscript out of bounds in
nominal/ordinal Biclustering with large nfld: The
initial field assignment
ceiling(1:nitems / (nitems / nfld)) could exceed
nfld due to floating-point rounding. Added
pmin(..., nfld) clamping, consistent with the binary
Biclustering fix already in place.dataFormat() call in
Biclustering.rated(): The function was
unnecessarily re-calling dataFormat() on already-formatted
data, causing repeated “No ID column detected” messages during
GridSearch(). Now reuses the existing exametrika object
directly.x$U (binary correctness matrix) was used instead of
x$Q (polytomous responses). Now correctly uses
x$Q for polytomous models.invalid graphics state
error: par(mfrow=...) and layout()
conflicted when plotting FCRP/FCBR. Now skips
par(mfrow=...) for plot types that use
layout() internally.id parameter validation in
dataFormat(): When a vector (e.g.,
id = tmp$ID) was passed instead of a column number, the
error message was cryptic
(the condition has length > 1). Now validates that
id is a single integer and provides a clear error
message.ell_B = 0) and
chi-square based indices (NFI, RFI, IFI, TLI, CFI, RMSEA) meaningless.
These indices are now reported as NA for nominal data.
Information criteria (AIC, BIC, CAIC) are computed directly from the
model log-likelihood and remain available. Also fixed a
length(benchGroup) →
length(unique(benchGroup)) bug in
Biclustering.nominal() (the unique() call was
missing).nfld. Affects Biclustering()
(binary/ordinal/nominal), Biclustering_IRM()
(binary/ordinal/nominal).field to fld in
Biclustering_IRM.binary(): Internal variable name
changed from field to fld for consistency with
other Biclustering models. No change to the public API
(FieldEstimated output name is unchanged).@format field incorrectly stated “nominal responses” but
the dataset is ordinal (response.type=ordinal). Also fixed
“A ordinal” to “An ordinal” in @description.eval=FALSE to computationally
intensive code chunks in vignettes (GridSearch, IRM, LDLRA, LDLRA_PBIL,
LDB, BINET, ordinal/nominal Biclustering, ordinal/rated LRA). The
Japanese guide (guide-ja) now uses eval=FALSE
for all model estimation chunks to avoid duplicating computation from
English vignettes. Full rendered output is available on the pkgdown site.log(n + 1)
but should be log(n) + 1 per Bozdogan (1987, Psychometrika,
52(3), p.358, Proposition 2, Eq.44). The original Mathematica
implementation had this error (Log[nobs + 1]), and the R
port inherited it. Both the R version
(R/00_ModelFitModule.R) and the Mathematica version
(develop/mtmk15forVer13/mod/Module_ModelFit.nb) have been
corrected. The numerical difference is approximately 1 (constant), but
the corrected formula now matches the published definition:
CAIC(k) = -2 log L + k * (log(n) + 1). This affects all
models that compute fit indices: IRT, LCA, LRA (binary/ordinal/rated),
Biclustering (binary/ordinal/nominal), IRM, BNM, LDLRA, LDB, BINET, and
GRM. All 873 tests pass with the corrected formula.\donttest to
\dontrun: GRM’s multi-panel plot examples caused
“invalid graphics state” errors in non-interactive environments
(pkgdown, CI). Changed to \dontrun to prevent build
failures.CA
parameter: When CA (correct answer vector) was
provided but response.type was not explicitly specified,
dataFormat() incorrectly classified the data as
"ordinal" instead of "rated". This caused
$U (binary scoring matrix) to be NULL. The
auto-detection now checks for CA before ordinal/nominal
detection: binary → rated (if CA provided) → ordinal → nominal.id parameter not working for column
1: Changed id default from 1 to
NULL. Previously, id = 1 (both default and
explicit) triggered auto-detection heuristics, making it impossible to
explicitly specify the first column as the ID column. Now
id = NULL triggers auto-detection, and any numeric value
(including 1) forces that column to be used as ID.1:N) as ID regardless of
whether they also look like valid response values. Previously,
1:10 was misclassified as response data because
looks_like_response_data() returned TRUE.U
matrix: When the response matrix contained missing values
(-1), the binary scoring matrix U incorrectly
scored them as 0 (incorrect) instead of -1
(missing). Now U[i,j] = -1 when
Q[i,j] = -1.drop=FALSE missing in item
exclusion: When items were excluded due to invalid variance
(e.g., containing Inf), the column subsetting
response.matrix[, mask] dropped matrix dimensions if only
one item remained, causing a crash. Added
drop = FALSE.dataFormat() now reports via message() when it
detects problematic data:
g_list
/ adj_list Input Path Fixg_csv variable undefined error when using
g_list or adj_list input:
BINET() crashed with an undefined variable error at the
graph construction step when the DAG was specified via
g_list or adj_list parameters. The internal
variable g_csv (used to build the integrated graph object
all_g) was only defined in the adj_file code
path. Added logic to reconstruct g_csv from
adj_list for the g_list and
adj_list input paths, ensuring all_g is
correctly built with Field edge attributes regardless of input
method.g_list / adj_list length
validation: The length check for g_list and
adj_list incorrectly compared against ncls
(number of classes) instead of nfld (number of fields). In
BINET, each element of adj_list represents the DAG
structure at a specific field, so the list length should equal
nfld. Also fixed g_list type check from
g_list[[1]] (always checking the first element) to
g_list[[j]] (checking each element).... to
Biclustering_IRM.binary() signature: The S3
generic Biclustering_IRM(U, ...) requires all methods to
include ... in their formal arguments. The
.binary method was missing it, causing an R CMD check
WARNING. Added ... to match the generic and the
.nominal/.ordinal methods.msg Field Fixmsg field to correctly distinguish
Rank vs Class models: Binary IRM and Ordinal IRM are
Ranklustering models (ordered latent classes), so
msg = "Rank" is correct. Nominal IRM has no Ranklustering
concept (unordered latent classes), so msg = "Class" is
correct. Previously all three methods had inconsistent values; now
Biclustering_IRM.binary and
Biclustering_IRM.ordinal return msg = "Rank",
while Biclustering_IRM.nominal returns
msg = "Class". Also updated internal column names of
cls01 and FRP from "Class" to
"Rank" for binary/ordinal IRM. The shared Gibbs core
(irm_gibbs_core()) output column names were also
updated.LRD AliasLRD (Latent Rank Distribution) alias to
Biclustering_IRM.binary() return value: The binary
IRM was the only Biclustering model that did not include
LRD in its return value (it only had LCD).
Other Biclustering models (binary Biclustering, nominal IRM, ordinal
IRM) all return both LRD and LCD. Added
LRD = clsdist as an alias for LCD to ensure
consistency across all Biclustering-family models. This improves
interoperability with downstream packages (ggExametrika) that look up
LRD when msg == "Rank".Biclustering_IRM() seed default back
to 123: Ensures reproducibility by default.t(apply()) Dimension Drop Fixt(apply(log_S, 1, irm_log_to_prob))
dimension drop in Nominal/Ordinal IRM: When the Gibbs sampler
converged to ncls=1 or nfld=1,
apply() on a single-column matrix returned a vector instead
of a matrix, causing t() to produce a 1-row matrix instead
of an N-row matrix. This led to incorrect class/field membership
assignment. Replaced all 6 instances of
t(apply(mat, 1, fun)) with explicit for-loops in both
Biclustering_IRM.nominal() (3 locations: EM E-step, final
class membership, final field membership) and
Biclustering_IRM.ordinal() (3 locations: same). Consistent
with the project’s apply() caution policy.Biclustering_IRM() Gibbs sampler: Changed
exp(ptab - min(ptab)) to exp(ptab - max(ptab))
for numerical stability. The previous implementation subtracted the
minimum log-probability, which could cause overflow
(exp(large positive) = Inf) when the range of
log-probabilities was large, resulting in NaN after
normalization. Subtracting the maximum ensures the largest value becomes
exp(0) = 1 and all others underflow harmlessly to near-zero
values.Biclustering.nominal() using stale
log-likelihood in model fit indices: When the EM algorithm
exited due to log-likelihood decrease, BCRM was correctly
reverted to the previous iteration’s value, but
test_log_lik was not. The model fit section recalculated
the correct log-likelihood as testell, but
chi_A, model_log_like, log_lik,
and LogLik all referenced the stale
test_log_lik instead of testell. Also removed
a duplicated model fit code block (copy-paste error).J20S400 response type from
nominal to binary: The
J20S400.rda dataset was incorrectly stored with
response.type = "nominal" despite being a binary (0/1)
dataset with -99 as missing values. Missing values were not properly
handled (Z was all 1s, Q contained raw -99 values). Regenerated from the
original CSV source (develop/sampleData/J20S400.csv) with
dataFormat(..., na = -99), now correctly typed as
binary with 86 missing values properly masked in Z.Biclustering_IRM.nominal() for
nominal/polytomous data: Extends the Infinite Relational Model
(IRM) from binary to nominal scale data using a Dirichlet-Multinomial
collapsed Gibbs sampler. The Chinese Restaurant Process (CRP)
automatically determines the optimal number of classes and fields. After
the Gibbs sampling phase, small classes are consolidated and refined
with an EM algorithm. The Dirichlet prior concentration parameter
alpha controls smoothing of category probabilities.Biclustering_IRM.ordinal() for
ordinal/polytomous data: Extends IRM to ordinal scale data.
Shares the same Dirichlet-Multinomial collapsed Gibbs sampler as nominal
IRM (Phase 1), then applies ordinal-specific EM refinement (Phase 2)
with cumulative normalization to enforce monotonic category boundaries.
The mic parameter (default TRUE) enforces
monotone increasing class ordering by expected score sum. Reports BFRP
(Bicluster Field Reference Profile), FRPIndex, TRP, and Strongly/Weakly
Ordinal Alignment Conditions (SOACflg/WOACflg).Biclustering_IRM() is now an S3
generic: Dispatches to Biclustering_IRM.binary
(existing binary IRM), Biclustering_IRM.nominal (new), and
Biclustering_IRM.ordinal (new). Raw data is automatically
formatted and dispatched based on response.type.irm_gibbs_core() (in R/00_IRM_Gibbs_CORE.R),
used by both nominal and ordinal IRM. Helper functions
(irm_calc_Ufcq, irm_lmvbeta,
irm_log_to_prob, irm_bic_calc, etc.) are also
shared.U_fcq, computing only the contribution of the target
student/item rather than recalculating the full array at each step.LCA() and LRA() now support a
conf parameter for confirmatory analysis: The
conf argument accepts an IRP matrix (ncls/nrank x
testlength) where non-NA values are held fixed throughout EM estimation
and NA values are freely estimated. This enables test equating scenarios
where anchor items retain their known IRPs while new items are
calibrated against them.conf
has column names, items are matched by label rather than by position.
This allows conf to contain a subset of items (anchor items
only) or items in a different order than the data. Items in the data but
not in conf are automatically set to freely estimated.
Unknown labels in conf produce an informative error.somclus()
internal function: The Self-Organizing Maps estimation code
previously inlined in LRA.binary() (~100 lines) has been
extracted into a standalone internal function somclus() in
00_EMclus.R. This parallels emclus() (GTM/EM)
and returns the same structure. Also fixed a pre-existing typo
(h_cout → h_count) and another
(oldsBIC → oldBIC).tidyverse
and readxl for reading Excel-based Mathematica reference
data. Replaced with 24 modern test files using base R
read.csv() and test_path() to load CSV
fixtures from
tests/testthat/fixtures/mathematica_reference/. The new
test suite has zero external package dependencies beyond
testthat.readxl/tidyverse from test
dependencies: DESCRIPTION Suggests no longer
requires any packages beyond knitr, rmarkdown,
and testthat.test-irm-nominal.R): 18 test blocks for
Biclustering_IRM.nominal() using J20S600 data — basic
execution, dimensions, FRP validity, membership, fit indices, backward
compatibility, seed reproducibility, alpha validation.test-irm-ordinal.R): 25 test blocks for
Biclustering_IRM.ordinal() using J35S500 data — basic
execution, dimensions, FRP validity, expected scores, TRP, BFRP,
FRPIndex, SOAC/WOAC flags, mic parameter, fit indices, backward
compatibility, seed reproducibility, alpha validation.stem(nrs(dataFormat(J15S500))) to demonstrate
score distribution visualization.^tests$
and ^inst$ entries that were incorrectly excluding tests
and vignettes from the built package.index
Parameter FixesGridSearch() now
accepts common aliases for fit indices. "loglik",
"log_lik", "LogLik", and "LL" are
mapped to "model_log_like". "Chi_sq" and
"chi_sq" are mapped to "model_Chi_sq".
Previously, using these aliases caused silent NULL
extraction and eventual errors.GridSearch(), before the
computationally expensive grid search loop runs. The error message lists
all valid options including available aliases.model_log_like is now correctly treated as a maximization
target (larger log-likelihood = better fit). Previously, it was
incorrectly placed in the minimization group, which would have selected
the worst-fitting model.tolerance = 1e-2 in
expect_equal) to absolute difference comparison (threshold
0.005). Q3 residual correlations include values near zero where relative
tolerance comparisons are unreliable.layout and
plot.new from graphics to NAMESPACE. These
functions are used by the legend strip layout helpers for polytomous
Biclustering plots.apply(U$Q, 2, unique) returning matrix
instead of list: When all items have the same number of
response categories (e.g., all 5-point Likert), apply()
returns a matrix rather than a list. The subsequent
lapply() then iterates over individual elements instead of
per-column vectors, causing ncat to be a vector of 1s and
crashing the algorithm. Replaced with
lapply(seq_len(nitems), function(j) sort(unique(U$Q[, j])))
which always returns a proper list. Same fix applied to
catfreq999 computation. Both LRA.ordinal and
LRA.rated were affected.apply(tmp$Q, 2, table) returning matrix
instead of list: Same class of bug as the LRA.ordinal/LRA.rated
fix above. In GRM(), the null model log-likelihood
computation used apply(tmp$Q, 2, table) to compute per-item
category frequencies. When all items have the same number of response
categories (e.g., all 5-point Likert), apply() returns a
matrix instead of a list, causing response_list[[j]] to
extract a single number rather than the full frequency table. This
produced incorrect null_log_like values and consequently
wrong chi-square statistics, RMSEA, TLI, and CFI in
ItemFitIndices. Replaced with
lapply(seq_len(nitems), function(j) table(tmp$Q[, j]))
which always returns a proper list.LRA.ordinal() now raises an informative error when items
have different numbers of response categories (e.g., some items with 3
categories and others with 5). The internal matrix algebra uses
fixed-stride indexing that assumes uniform category counts. The error
message suggests alternatives (LRA.rated,
Biclustering.ordinal) that support mixed category counts
via list-based designs.plot(lca_result, type = "FRP") passed validation but failed
at runtime because the $FRP field does not exist in LCA/LRA
return values. Now properly rejected with an informative error message
at the validation stage.stat parameter supporting
"mean" (default), "median", and
"mode".style parameter supporting
"line" (default) and "bar" (stacked bar
chart).stat
parameter.FRPIndex (Field
Reference Profile indices) including location parameters (Alpha, Beta),
slope parameters (A, B), and monotonicity indices (Gamma, C).plot.exametrika()stat: Controls the summary statistic for FRP and RRV
plots ("mean", "median",
"mode").style: Controls the display style for FCRP plots
("line", "bar").#808080) to distinguish from white (incorrect) and black
(correct). Polytomous data uses black (#000000) to
distinguish from the category color palette.print() now
displays FRPIndex (Field Reference Profile Indices) with a note that the
values are based on normalized expected scores
(E[score]-1)/(maxQ-1).?Biclustering, including detailed definitions and
polytomous adaptation logic.Systematic unification of return value structures across all analysis functions for consistency and interoperability.
n_class (retaining
Nclass for backward compatibility)n_rank,
n_field (retaining Nrank,
Nfield)n_class,
n_field (retaining Nclass,
Nfield)n_class,
n_field, n_cycle (retaining
Nclass, Nfield, N_Cycle)n_class,
n_field, n_cycle (retaining
Nclass, Nfield, N_Cycle)n_cycle,
N_Cycle (retaining em_cycle,
EM_Cycle as IRM-specific aliases)log_lik Added to All Functionslog_lik
at the top level of the return object, matching
TestFitIndices$model_log_like.ModelFit class, matching the format used by IRT/LCA/LRA and
other functions. Previously these used bare
calcFitIndices() output (9 fields, no class). Added fields:
model_log_like, bench_log_like,
null_log_like, model_Chi_sq,
null_Chi_sq, model_df,
null_df.ModelFit class.TestFitIndices added as primary
name for multigroup fit indices (previously only
MG_FitIndices). MG_FitIndices retained as
backward-compatible alias. SM_FitIndices (saturated model)
remains unchanged.Estimate
column (most probable class assignment) to the Students matrix.Estimate
column (most probable class assignment) to the Students matrix.FRPIndex
(Field Reference Profile indices) for consistency with
Biclustering.binary and LDB.TRP from
matrix(1×ncls) to numeric vector, consistent
with all other functions.class = c("exametrika", "GridSearch") to return value for
method dispatch support.msg field assignment: Fixed
msg <- "Class" to msg = "Class" inside
structure() call (line 150 of 05_LCA.R). The
<- operator was being interpreted as a standalone
assignment rather than a named list element, causing the
msg field name to be empty.plot(r, type="RMP", students=1)). Added
drop = FALSE to prevent matrix-to-vector coercion when
extracting a single row from the Students matrix.TestFitIndices
log-likelihood fields: Fixed null_log_like which
incorrectly stored the saturated model log-likelihood
(log_lik_satu) instead of the null model log-likelihood
(sum(null_itemll)). Added missing
bench_log_like field containing the saturated model
log-likelihood (sum(satu_itemll2)). Note: the
chi-square-based fit indices (NFI, CFI, TLI, RMSEA, AIC, BIC, etc.) were
always computed correctly; only the stored log-likelihood labels were
affected.TestFitIndices and ItemFitIndices to
the standard 16-field structure with ModelFit class
(c("exametrika", "ModelFit")), matching all other analysis
functions. Added model_log_like,
bench_log_like, null_log_like to
ItemFitIndices. Removed ScoreRankCorr /
RankQuantCorr from TestFitIndices (already
available at the top level of the return object).test-12OLR.R and test-13NLR.R to handle the
new ModelFit class (add unclass() before
as.data.frame()) and adjusted column/index references to
match the unified 16-field structure.layout() with a
thin dedicated row (height ratio 0.2) to reduce visual clutter in data
panels. Added setup_legend_layout() and
draw_legend_strip() internal helper functions.P(Q>=1)=1.0 reference line at the top of FCBR plots for
visual completeness.dataFormat() now correctly identifies factor-type ID
columns before converting factors to numeric. Previously, the
factor-to-numeric conversion occurred before ID detection, causing
factor ID columns with many levels (>=20) to trigger a “Too many
categories” error instead of being recognized as IDs.is_response_data()) that was defined but never
called.obj$Q instead of
obj$U for test length calculationpmax(Ufcq_prior, 1e-10) to prevent division by zero and NaN
errorsThis release improves naming consistency across the package while maintaining full backward compatibility through a deprecation path.
All analysis functions now return results with snake_case field names for better consistency:
n_class - Number of latent classes (replaces
Nclass)n_field - Number of latent fields (replaces
Nfield)n_rank - Number of latent ranks (replaces
Nrank)n_cycle - Number of EM iterations (replaces
N_Cycle)log_lik - Log-likelihood value (replaces
LogLik)The old field names continue to work for backward compatibility:
Nclass, Nfield, Nrank,
N_Cycle, LogLik - Deprecated but
functional# Old code (deprecated but still works)
result <- LCA(data, ncls = 3)
n_classes <- result$Nclass
# New code (recommended)
result <- LCA(data, ncls = 3)
n_classes <- result$n_classProgress messages now display properly in R Markdown documents:
verbose parameter
(default: TRUE)
ncls = 2: nfld=2 nfld=3 nfld=4 ...verbose = FALSE to suppress all progress
messagesiter 1: match=0 nfld=15 ncls=30Adjusting classes: BIC=-99592.5 ncls=21 (min size < 20)verbose = TRUE to
verbose = FALSE (consistent with binary/rated
versions)Saturation Model - iter 1: log_lik=-1.234567Restricted Model - iter 1: log_lik=-0.987654verbose = TRUE to
verbose = FALSE (consistent with binary/ordinal
versions)All progress messages now use proper line breaks instead of carriage returns, ensuring clean output in R Markdown/knitr documents and web documentation.
sort(unique(...)) to ensure consistent ordering: 0
(white/incorrect) → 1 (black/correct)testEll →
test_log_lik)log_lik
terminologyconverge variable to all EM-based functions to
indicate algorithm convergence status
Added implementation of latent rank model for ordinal scale data
New function LRA() now supports ordinal response data
Bug fixes and improvements
Standardized terminology: unified the usage of “class” and “rank” throughout the package
Various minor bug fixes
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.