With strings encoded as a vector of characters, we can perform vector operations over the actual characters. All {charcuterie} functions aim to return a new object of class “chars”, so the result still prints as a string and can be passed to other vector-handling functions.
library(charcuterie)
#>
#> Attaching package: 'charcuterie'
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, union
To convert a regular string into a chars object, use chars(). This prints as a string, but is actually a vector
chars("string")
#> [1] "string"
# but it's a vector
unclass(chars("string"))
#> [1] "s" "t" "r" "i" "n" "g"
Only a single string can be converted this way, so if you want to produce more than one of these, I suggest applying chars() to each string with lapply()
many_chars <- lapply(c("foo", "bar", "baz"), chars)
many_chars
#> [[1]]
#> [1] "foo"
#>
#> [[2]]
#> [1] "bar"
#>
#> [[3]]
#> [1] "baz"
unclass(many_chars[[2]])
#> [1] "b" "a" "r"
A regular string can be recovered using string(), which pastes the characters back together; this can optionally take a separator
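The round trip described above can be sketched in base R alone. This is only an analogue, assuming that chars() behaves like splitting on the empty string and that string() behaves like paste() with a collapse argument; the package's actual implementation and argument names may differ. The variable names are illustrative.

```r
# Base-R sketch only; NOT charcuterie's implementation.
# strsplit() with an empty pattern stands in for chars().
split_chars <- strsplit("string", "")[[1]]
split_chars
#> [1] "s" "t" "r" "i" "n" "g"

# paste(collapse = ...) stands in for string(): glue the characters back together
joined <- paste(split_chars, collapse = "")
joined
#> [1] "string"

# the same call with a separator between the characters
with_sep <- paste(split_chars, collapse = "|")
with_sep
#> [1] "s|t|r|i|n|g"
```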
Because the chars object is a vector, we can do vector things, such as indexing
"string"[3] # doesn't work
#> [1] NA
chars("string")[3]
#> [1] "r"
chars("banana")[seq(2, 6, 2)]
#> [1] "aaa"
subsetting
head("string", 3) # doesn't work
#> [1] "string"
head(chars("string"), 3)
#> [1] "str"
tail(chars("string"), 3)
#> [1] "ing"
substituting
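As a rough illustration of what substituting means for a vector of characters, here is a base-R sketch on plain character vectors. It assumes only standard vector assignment; whether a chars object keeps its class after assignment is not shown here, so the example rebuilds the string with paste().

```r
# Base-R sketch on plain character vectors; not package code.
word <- strsplit("string", "")[[1]]

# replace a single character by position
word[3] <- "u"
by_position <- paste(word, collapse = "")
by_position
#> [1] "stuing"

# replace every occurrence of one character by value
fruit <- strsplit("banana", "")[[1]]
fruit[fruit == "a"] <- "o"
by_value <- paste(fruit, collapse = "")
by_value
#> [1] "bonono"
```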
tabulating
table("mississippi") # doesn't work
#>
#> mississippi
#> 1
table(chars("mississippi"))
#>
#> i m p s
#> 4 1 2 4
sorting
sort("string") # doesn't work
#> [1] "string"
sort(chars("string"))
#> [1] "ginrst"
sort(chars("string"), decreasing = TRUE)
#> [1] "tsrnig"
reversing
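Reversing follows the same pattern: rev() on a length-one string has nothing to reorder, while rev() on a vector of characters reverses them. The sketch below uses base R only, with strsplit() as an assumed stand-in for chars().

```r
# Base-R sketch; strsplit() stands in for chars() here.
whole <- rev("string")            # a single element, so nothing changes
whole
#> [1] "string"

letters_vec <- strsplit("string", "")[[1]]
reversed <- paste(rev(letters_vec), collapse = "")
reversed
#> [1] "gnirts"
```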
Since these are vectors, we no longer need nchar to determine the length
length("string") # just the one 'string'
#> [1] 1
length(chars("string")) == nchar("string")
#> [1] TRUE
Membership tests can now determine if a given character is in the ‘string’
"i" %in% "rhythm" # doesn't work
#> [1] FALSE
"y" %in% "rhythm" # doesn't work
#> [1] FALSE
"i" %in% chars("rhythm")
#> [1] FALSE
"y" %in% chars("rhythm")
#> [1] TRUE
is.element("y", "rhythm") # doesn't work
#> [1] FALSE
is.element("y", chars("rhythm"))
#> [1] TRUE
chars objects can be concatenated; combining two strings produces a longer string
c("butter", "fly") # doesn't work in the character sense
#> [1] "butter" "fly"
c(chars("butter"), chars("fly"))
#> [1] "butterfly"
c(chars("butter"), chars("fly"))[c(1, 9)]
#> [1] "by"
Set operations can be useful
setdiff(chars("javascript"), chars("script"))
#> [1] "jav"
union(chars("bunny"), chars("rabbit"))
#> [1] "bunyrait"
intersect(chars("bob"), chars("rob"))
#> [1] "bo"
setequal(chars("stop"), chars("post"))
#> [1] TRUE
setequal(chars("stop"), chars("posit"))
#> [1] FALSE
unique(chars("mississippi"))
#> [1] "misp"
Since chars objects are regular vectors, they continue to work with other vectorised operations
rev(toupper(chars("string")))
#> [1] "GNIRTS"
toString(chars("abc"))
#> [1] "a, b, c"
Filter(\(x) x != "a", "banana")
#> [1] "banana"
Filter(\(x) x != "a", chars("banana"))
#> [1] "bnn"
This last example motivates a non-set-wise way to exclude some characters, so this package introduces a new except function
except(chars("javascript"), chars("script"))
#> [1] "java"
except(chars("carpet"), chars("car"))
#> [1] "pet"
except(chars("banana"), "a")
#> [1] "bnn"
except(chars("banana"), chars("a"))
#> [1] "bnn"
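The difference between set-wise and non-set-wise exclusion can be seen in a small base-R sketch. The helper drop_chars() below is hypothetical and is not charcuterie's except(); it assumes the behaviour shown above amounts to dropping every element found in the second vector while keeping duplicates among the rest.

```r
# Hypothetical helper for illustration; NOT the package's except().
drop_chars <- function(x, y) x[!x %in% y]

js <- strsplit("javascript", "")[[1]]
sc <- strsplit("script", "")[[1]]

# non-set-wise: the repeated "a" survives
kept <- paste(drop_chars(js, sc), collapse = "")
kept
#> [1] "java"

# set-wise: base::setdiff() also removes duplicates from the result
set_wise <- paste(base::setdiff(js, sc), collapse = "")
set_wise
#> [1] "jav"
```

Calling base::setdiff() explicitly avoids any ambiguity with the masked version when the package is attached.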
Anywhere a vector of individual characters works, a chars object should also work.