Renamed from cooccur (GitHub-only development name) to cooccure to avoid
collision with an unrelated archived CRAN package of the same name.

Sparse rewrite using Matrix::sparseMatrix / Matrix::crossprod. The
previous implementation allocated dense n x k and
k x k matrices, which hit R’s vector memory limit on
corpora with many documents and items (e.g. citation networks > ~100k
unique references). The sparse rewrite stays in triplet form end-to-end
and scales linearly with the number of non-zero co-occurrences.

attr(x, "matrix") and attr(x, "raw_matrix") are now sparse Matrix objects.
as_matrix() densifies them on demand so existing downstream
code keeps working.

Matrix to Imports.

Parsers (.co_parse_delimited, .co_parse_multi_delimited) vectorised:
trimws(), NA/empty filtering, and per-row deduplication now
run as single C-level calls over the flattened token vector rather than
as per-row R calls. Cuts overall runtime on a 166k-row x
20-items-per-row citation corpus from ~6.2 s to ~3.4 s (~1.8x
faster).

cooccur on GitHub).

cooccurrence() (alias co()) builds
co-occurrence networks from six input formats: delimited fields,
multi-column delimited, long/bipartite, binary matrices, wide sequences
(field = "all"), and lists of character vectors.

none, jaccard, cosine, inclusion, association, dice, equivalence, relative.

minmax, log, log10, binary, zscore, sqrt, proportion, none.

counting = "fractional" implements the
Perianes-Rodriguez et al.

weight_by parameter supports weighted long-format input
(e.g. LDA topic-document probabilities).

split_by computes a separate network per group and
returns a combined edge data frame.

output argument returns edges in default, Gephi,
igraph, cograph, or matrix form.

as_matrix(), as_igraph(),
as_tidygraph(), as_cograph(),
as_netobject().

launch_app().
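
The sparse approach and the vectorised parsing described in the entries above can be illustrated with a minimal sketch. This is not the package's internal code: the toy input `docs`, the ";" delimiter, and all variable names are assumptions made for illustration. It only shows the general technique of flattening tokens once, building a sparse document-by-item incidence matrix with Matrix::sparseMatrix, and taking Matrix::crossprod so that no dense n x k or k x k matrix is allocated.

```r
library(Matrix)

# Hypothetical toy input: one delimited string per document.
docs <- c("a; b; c", "a; b", "b; c; c", NA, "")

# Split every row in one call, then work on the flattened token vector.
parts  <- strsplit(docs, ";", fixed = TRUE)
doc_id <- rep(seq_along(parts), lengths(parts))
token  <- trimws(unlist(parts))

# Drop NA and empty tokens, then deduplicate within each document.
keep  <- !is.na(token) & nzchar(token)
pairs <- unique(data.frame(doc = doc_id[keep], item = token[keep],
                           stringsAsFactors = FALSE))

# Sparse document-by-item incidence matrix (n x k), built from triplets.
items <- sort(unique(pairs$item))
X <- sparseMatrix(
  i = pairs$doc,
  j = match(pairs$item, items),
  x = rep(1, nrow(pairs)),
  dims = c(length(docs), length(items)),
  dimnames = list(NULL, items)
)

# Sparse k x k co-occurrence matrix: t(X) %*% X.
# The diagonal holds document frequencies, the off-diagonal holds
# the number of documents in which two items appear together.
C <- crossprod(X)

# Edge list from the strict upper triangle (each pair once, no self-loops).
s <- summary(triu(C, k = 1))
edges <- data.frame(from = items[s$i], to = items[s$j], weight = s$x)
print(edges)
```

Memory use in this sketch grows with the number of non-zero entries rather than with n x k, which is the property the entries above attribute to the sparse rewrite. Calling as.matrix(C) densifies the result, so it is best reserved for small item sets.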