The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
The string (or character) data type typically requires more
manipulation to be helpful for data analysts. Thus, there is a need for
a robust package that is up to the task. forstringr is
a new package built on top of ‘stringr’ to execute various string
manipulations in R programming. The main aim of ‘forstringr’ is to
simplify string manipulation for R beginners. This package combines its
power with the adaptability of other manipulation tools such as tidyr
and dplyr. Like in the stringr package, most functions in
forstringr
begin with str_
. For a quick video
tutorial, I gave a talk at Africa R users meetup, which you can find here.
You can install forstringr
package from CRAN with:
install.packages("forstringr")
or the development version from GitHub with
if(!require("devtools")){
install.packages("devtools")
}
::install_github("gbganalyst/forstringr") devtools
This section provides a concise overview of the different functions
available in the forstringr
package. These functions serve
various purposes and are designed to aid in string manipulation
tasks.
length_omit_na()
length_omitna()
counts only non-missing elements of a
vector.
library(forstringr)
#> Loading required package: stringr
<- c("Hausa", NA, "Yoruba", "Ijaw", "Igbo", NA, "Ibibio", "Tiv", "Fulani", "Kanuri", "Others")
ethnicity
# count all the observations, including NAs.
length(ethnicity)
#> [1] 11
# count all the observations, without NAs.
length_omit_na(ethnicity)
#> [1] 9
str_title_case()
str_title_case()
converts string to title case,
capitalizing only the first letter of each word while ignoring articles,
prepositions, and conjunctions.
Please note that str_title_case()
is different from
stringr::str_to_title()
which converts to title case, where
only the first letter of each word is capitalized.
<- "the quick brown fox jumps over a lazy dog"
words
str_title_case(words) # from forstringr package
#> [1] "The Quick Brown Fox Jumps over a Lazy Dog"
str_to_title(words) # from stringr package
#> [1] "The Quick Brown Fox Jumps Over A Lazy Dog"
str_left()
Given a character vector, str_left()
returns the left
side of a string. For examples:
str_left("Nigeria")
#> [1] "N"
str_left("Nigeria", n = 3)
#> [1] "Nig"
str_left(c("Female", "Male", "Male", "Female"))
#> [1] "F" "M" "M" "F"
str_right()
Given a character vector, str_right()
returns the right
side of a string. For examples:
str_right("July 20, 2022", 4)
#> [1] "2022"
str_right("Sale Price", n = 5)
#> [1] "Price"
str_mid()
Like in Microsoft Excel, the str_mid()
returns a specific
number of characters from a text string, starting at the position you
specify, based on the number of characters you select.
str_mid("Super Eagle", 7, 5)
#> [1] "Eagle"
str_mid("Oyo Ibadan", 5, 6)
#> [1] "Ibadan"
str_split_extract()
If you want to split up a string into pieces and extract the results
using a specific index position, then, you will use
str_split_extract()
. You can interpret it as follows:
Given a character string, S
, extract the element at a
given position, k
, from the result of splitting
S
by a given pattern, m
. For example:
<- c("Aliko Dangote", "Mike Adenuga", "Femi Otedola", "Arthur Eze", "Abdulsamad Rabiu", "Cletus Ibeto", "Orji Uzor Kalu", "ABC Orjiakor", "Jimoh Ibrahim", "Tony Elumelu")
top_10_richest_nig
<- str_split_extract(top_10_richest_nig, " ", 1)
first_name
first_name#> [1] "Aliko" "Mike" "Femi" "Arthur" "Abdulsamad"
#> [6] "Cletus" "Orji" "ABC" "Jimoh" "Tony"
str_extract_part()
Extract strings before or after a given pattern. For example:
<- str_extract_part(top_10_richest_nig, pattern = " ", before = TRUE)
first_name
first_name#> [1] "Aliko" "Mike" "Femi" "Arthur" "Abdulsamad"
#> [6] "Cletus" "Orji Uzor" "ABC" "Jimoh" "Tony"
<- c("$159", "$587", "$891", "$207", "$793")
revenue
str_extract_part(revenue, pattern = "$", before = FALSE)
#> [1] "159" "587" "891" "207" "793"
str_englue()
You can dynamically label ggplot2 plots with the glue operators
{}
or {{}}
using str_englue()
.
For example, any value wrapped in { }
will be inserted into
the string and you automatically inserts a given variable name using
{{ }}
.
It is important to note that str_englue()
must be used
inside a function. str_englue("{{ var }}")
defuses the
argument var
and transforms it to a string using the
default name operation.
library(ggplot2)
<- function(df, var, binwidth) {
histogram_plot %>%
df ggplot(aes(x = {{ var }})) +
geom_histogram(binwidth = binwidth) +
labs(title = str_englue("A histogram of {{var}} with binwidth {binwidth}"))
}
%>%
iris histogram_plot(Sepal.Length, binwidth = 0.1)
str_rm_whitespace_df()
Extra spaces are accidentally entered when working with survey data,
and problems can arise when evaluating such data because of extra
spaces. Therefore, the function str_rm_whitespace_df()
eliminates your data frame unnecessary leading, trailing, or other
whitespaces.
# a dataframe with whitespaces
richest_in_nigeria#> # A tibble: 10 × 5
#> Rank Name `Net worth` Age `Source of Wealth`
#> <dbl> <chr> <chr> <dbl> <chr>
#> 1 1 " Aliko Dangote" "$14 Billion" 64 " Cement and Sugar "
#> 2 2 "Mike Adenuga" "$7.9 Billion " 68 "Telecommunication, …
#> 3 3 "Femi Otedola" "$5.9 Billion" 59 "Oil and Gas"
#> 4 4 " Arthur Eze" "$5 Billion" 73 "Oil and Gas"
#> 5 5 "Abdulsamad Rabiu" "$3.7 Billion" 61 "Cement and Sugar"
#> 6 6 " Cletus Ibeto " " $3.5 Billion" 69 "Automobile, Real Estat…
#> 7 7 "Orji Uzor Kalu" "$3.2 Billion" 61 "Furniture, Publishi…
#> 8 8 "ABC Orjiakor " " $1.2 Billion" 63 "Oil and Gas"
#> 9 9 " Jimoh Ibrahim" "$1 Billion " 54 "Insurance, Oil and Gas…
#> 10 10 "Tony Elumelu" "$900 Million" 58 " Banking "
# a dataframe with no whitespaces
str_rm_whitespace_df(richest_in_nigeria)
#> # A tibble: 10 × 5
#> Rank Name `Net worth` Age `Source of Wealth`
#> <dbl> <chr> <chr> <dbl> <chr>
#> 1 1 Aliko Dangote $14 Billion 64 Cement and Sugar
#> 2 2 Mike Adenuga $7.9 Billion 68 Telecommunication, Oil, and Gas
#> 3 3 Femi Otedola $5.9 Billion 59 Oil and Gas
#> 4 4 Arthur Eze $5 Billion 73 Oil and Gas
#> 5 5 Abdulsamad Rabiu $3.7 Billion 61 Cement and Sugar
#> 6 6 Cletus Ibeto $3.5 Billion 69 Automobile, Real Estate
#> 7 7 Orji Uzor Kalu $3.2 Billion 61 Furniture, Publishing
#> 8 8 ABC Orjiakor $1.2 Billion 63 Oil and Gas
#> 9 9 Jimoh Ibrahim $1 Billion 54 Insurance, Oil and Gas, Real Estate
#> 10 10 Tony Elumelu $900 Million 58 Banking
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.