group_imp() enforces stricter data validation. The
requested feature subset must be a subset of the object’s column names,
which must in turn be a subset of the features listed in the mapping
data.frame. Set allow_unmapped = TRUE to bypass errors when
intersections are incomplete.
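The nested-subset rule can be illustrated in base R. This is a minimal sketch, not the package's validation code: check_nested() and its argument names are hypothetical, and it assumes that allow_unmapped relaxes only the check against the mapping.

```r
# Hypothetical helper illustrating the nested-subset rule (not slideimp code).
check_nested <- function(subset, obj_cols, map_features, allow_unmapped = FALSE) {
  # Every requested feature must be a column of the object.
  missing_cols <- setdiff(subset, obj_cols)
  if (length(missing_cols) > 0) {
    stop("Not in the object's columns: ", paste(missing_cols, collapse = ", "))
  }
  # Every column of the object must appear in the mapping,
  # unless the caller opts out (assumed behaviour of allow_unmapped).
  unmapped <- setdiff(obj_cols, map_features)
  if (length(unmapped) > 0 && !allow_unmapped) {
    stop("Columns missing from the mapping: ", paste(unmapped, collapse = ", "))
  }
  invisible(unmapped)
}

# Passes: "a" is a column, and both columns are mapped.
check_nested("a", obj_cols = c("a", "b"), map_features = c("a", "b", "c"))
```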
group_imp() and tune_imp() now error
when arguments are supplied that do not apply to the chosen imputation
method, rather than silently ignoring them.
knn_imp() now uses a logical tree
argument to toggle between Ball tree (TRUE) and brute force
(FALSE). KD tree is no longer supported.
knn_imp() and pca_imp() gain more early
errors and early exits.
pca_imp() gains the same colmax and
post_imp arguments as knn_imp().
prep_groups() (formerly
group_features()) is the new name for the grouping
function. It now accepts a column name vector instead of a full
matrix.
sample_na_loc() (formerly inject_na())
is now exported. The original remains accessible via
slideimp:::inject_na() for legacy code.
sim_mat() now returns a matrix in sample-by-column
format for immediate compatibility with other package functions.
perc_NA is renamed to perc_total_na, and
dimensions are now specified via n (rows) and
p (columns).
tune_imp() gains a unified method
argument that applies to both pca_imp() and
knn_imp(), replacing pca_method and
knn_method. The rep argument is renamed to
n_reps.
tune_imp() results from v0.5.4 are no longer
reproducible because internal NA generation now uses
sample_na_loc().
The khanmiss1 dataset has been removed.
compute_metrics() now supports data frames with a
result list column containing truth and estimate columns,
similar to {yardstick}.
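As a rough illustration of that input shape, the base R sketch below builds a data frame with a result list column and summarises each element. The k column, the rmse() and mae() helpers, and the choice of metrics are assumptions made for the example; the actual output of compute_metrics() may differ.

```r
# One row per tuning run; `k` is a hypothetical parameter column.
res <- data.frame(k = c(5, 10))

# Each element of the list column holds truth and estimate columns.
res$result <- list(
  data.frame(truth = c(1, 2, 3), estimate = c(1, 2, 5)),
  data.frame(truth = c(1, 2, 3), estimate = c(2, 3, 4))
)

# Hypothetical metric helpers operating on one truth/estimate data frame.
rmse <- function(d) sqrt(mean((d$estimate - d$truth)^2))
mae  <- function(d) mean(abs(d$estimate - d$truth))

# Summarise every element of the list column.
res$rmse <- vapply(res$result, rmse, numeric(1))
res$mae  <- vapply(res$result, mae, numeric(1))
res[, c("k", "rmse", "mae")]
```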
group_imp() and prep_groups()
automatically look up Illumina manifests using the register-on-load
pattern for {slideimp.extra}.
knn_imp() gains max_cache to control
the internal cache size (defaults to 4 GB).
sim_mat() gains a rho argument to
support compound symmetry correlation structures in simulated
matrices.
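For reference, a compound symmetry correlation matrix has 1 on the diagonal and a single value rho everywhere else. The base R sketch below builds one and draws correlated columns from it; cs_corr() is a hypothetical helper, and how sim_mat() uses rho internally is not shown here.

```r
# Hypothetical helper: p x p compound symmetry correlation matrix.
cs_corr <- function(p, rho) {
  # Positive definiteness requires -1 / (p - 1) < rho < 1.
  stopifnot(p >= 2, rho < 1, rho > -1 / (p - 1))
  m <- matrix(rho, nrow = p, ncol = p)
  diag(m) <- 1
  m
}

# Draw n rows whose p columns share pairwise correlation rho.
set.seed(42)
n <- 200
p <- 4
sigma <- cs_corr(p, rho = 0.5)
z <- matrix(rnorm(n * p), nrow = n, ncol = p)
x <- z %*% chol(sigma)  # chol() returns R with t(R) %*% R equal to sigma
round(cor(x), 2)
```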
sim_mat() and tune_imp() gain dedicated
print methods that provide concise summaries instead of dumping raw data
to the console.
slide_imp() gains location,
flank, and dry_run arguments. These enable fixed-window
imputation, a “flank mode” for features surrounding a subset, and
inspection of window statistics before any computation is run.
tune_imp() gains granular control over NA injection
via n_cols, n_rows, num_na, and
na_col_subset. Pre-calculated locations can also be passed
to na_loc to compare methods using identical NA
patterns.
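The value of reusing one NA pattern is that every method is scored against the same held-out cells. The base R sketch below shows the idea with two simple column-wise imputers; the linear-index format and the helper names are assumptions for illustration and do not reflect the format that na_loc expects.

```r
set.seed(1)
x <- matrix(rnorm(50), nrow = 10, ncol = 5)

# Fix the NA locations once (linear indices, assumed format) and keep the truth.
na_idx <- sample(length(x), size = 8)
truth <- x[na_idx]
x_miss <- x
x_miss[na_idx] <- NA

# Hypothetical imputer: fill each column's NAs with a column summary.
impute_col <- function(m, fun) {
  for (j in seq_len(ncol(m))) {
    miss <- is.na(m[, j])
    m[miss, j] <- fun(m[!miss, j])
  }
  m
}

# Score both methods on exactly the same cells.
rmse <- function(est) sqrt(mean((est - truth)^2))
scores <- c(
  mean   = rmse(impute_col(x_miss, mean)[na_idx]),
  median = rmse(impute_col(x_miss, median)[na_idx])
)
scores
```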
col_vars() and mean_imp_col() have been
overhauled to use the faster {RcppArmadillo} backend and
now support parallel computation with OpenMP.
Dependencies are streamlined. {tibble} and
{purrr} are removed as hard dependencies,
{cli} is added for more informative messaging, and
{carrier} is added as an explicit dependency.
Documentation is thoroughly overhauled with numerous consistency improvements and bug fixes.
{RhpcBLASctl} is added as a suggested package to
allow pinning BLAS cores and avoid thrashing during parallel
runs.
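A minimal sketch of pinning BLAS to one thread before starting parallel workers, assuming {RhpcBLASctl} may or may not be installed. pin_blas() is a hypothetical wrapper; only blas_set_num_threads() comes from {RhpcBLASctl}.

```r
# Hypothetical wrapper: pin BLAS threads if RhpcBLASctl is available.
# Returns TRUE when the thread count was set, FALSE otherwise.
pin_blas <- function(n_threads = 1) {
  if (!requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    return(FALSE)
  }
  RhpcBLASctl::blas_set_num_threads(n_threads)
  TRUE
}

# Call once in each worker before running imputation in parallel, so that
# the workers do not each spawn a full set of BLAS threads.
pinned <- pin_blas(1)
```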
group_imp() and tune_imp() prioritize
process-level parallelization via {mirai}.
knn_imp() supports OpenMP-controlled parallelization via
the cores argument when {mirai} daemons are
not active.
knn_imp() and pca_imp() use optimized
internal Rcpp functions for better performance.
CRAN resubmission.
group_features() is added to help with creating the
group tibble needed for group_imp().
pca_imp() now allows row.w = "n_miss"
to scale row weights by the number of missing values per row.