The package offers the following main functions:

stringdist computes pairwise distances between two input character vectors (shorter one is recycled).
stringdistmatrix computes the distance matrix for one or two vectors.
stringsim computes a string similarity between 0 and 1, based on stringdist.
amatch is a fuzzy matching equivalent of R’s native match function.
ain is a fuzzy matching equivalent of R’s native %in% operator.
seq_dist, seq_distmatrix, seq_amatch and seq_ain for distances between, and matching of, integer sequences.

These functions are built upon C code that re-implements some common (weighted) string distance functions. Distance functions include:
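
As a minimal sketch of how the main functions fit together, the calls below run each of them on small made-up inputs. The input strings are purely illustrative, and method = "lv" (Levenshtein distance) is just one of the available distances, chosen here because its results are easy to check by hand.

```r
library(stringdist)

# Pairwise distances; the single string on the left is recycled.
d <- stringdist("kitten", c("sitting", "mitten"), method = "lv")
d  # 3 1

# Distance matrix between two character vectors (2 x 2 here).
m <- stringdistmatrix(c("foo", "bar"), c("baz", "boo"), method = "lv")

# Similarity between 0 and 1, derived from the distance.
s <- stringsim("kitten", "sitting", method = "lv")

# Fuzzy counterparts of match() and %in%: accept matches up to maxDist.
idx <- amatch("leia", c("uhura", "leela"), maxDist = 2)  # 2
hit <- ain("leia", c("uhura", "leela"), maxDist = 2)     # TRUE

# Distances between integer sequences, passed as lists of integer vectors.
sd <- seq_dist(list(1:3, 2:4), list(1:3), method = "lv")
```

With maxDist = 2, "leia" matches "leela" because two edits (one substitution and one insertion) turn one into the other, whereas "uhura" is further away.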
Also, there are some utility functions:

qgrams() tabulates the qgrams in one or more character vectors.
seq_qgrams() tabulates the qgrams (sometimes called ngrams) in one or more integer vectors.
phonetic() computes phonetic codes of strings (currently only soundex).
printable_ascii() is a utility function that detects non-printable ASCII or non-ASCII characters.

Some of stringdist’s underlying C functions can be called directly from C code in other packages. The description of the API can be found by either typing ?stringdist_api in the R console or opening the vignette directly as follows:
vignette("stringdist_C-Cpp_api", package="stringdist")
Examples of packages that link to stringdist can be found here and here.
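
For the utility functions listed earlier, a short sketch of typical calls may help. The inputs are made up for illustration, and q = 2 is an arbitrary choice of qgram size.

```r
library(stringdist)

# Tabulate the 2-grams of a string: one row per input, one column per qgram.
tab <- qgrams("banana", q = 2)
tab[1, "an"]  # 2, since "an" occurs twice in "banana"

# The same idea for an integer sequence.
seq_tab <- seq_qgrams(c(1L, 2L, 1L, 2L), q = 2)

# Soundex codes: similar-sounding names share a code.
codes <- phonetic(c("Robert", "Rupert"))
codes  # "R163" "R163"

# Flag strings that contain non-printable ASCII or non-ASCII characters.
ok <- printable_ascii(c("cafe", "caf\u00e9"))
ok  # TRUE FALSE
```

Checking inputs with printable_ascii() before calling phonetic() can be useful, since soundex is defined on ASCII letters.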