faux-naïf (/ˌfoʊ.naɪˈif/): a person who pretends to be simple or innocent
fauxnaif: an R package for simplifying data by pretending values are NA
fauxnaif provides an extension to dplyr::na_if(). Unlike dplyr’s na_if(), na_if_in() allows you to specify multiple values to be replaced with NA using a single function. fauxnaif also includes a complementary function na_if_not() to specify values to keep.
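To make the contrast with dplyr::na_if() concrete, here is a minimal sketch. The vector `x` and the sentinel codes 99 and 999 are made up for illustration; the example assumes both dplyr and fauxnaif are installed.

```r
library(dplyr)
library(fauxnaif)

# Hypothetical data: 99 and 999 stand in for "missing" codes
x <- c(1, 99, 2, 999, 3)

# dplyr::na_if() replaces one value per call, so calls must be nested
nested <- na_if(na_if(x, 99), 999)

# na_if_in() accepts both values in a single call
single <- na_if_in(x, 99, 999)

# Check that the two approaches agree
all.equal(nested, single)
```

Both calls should mark the second and fourth elements as missing; the difference is only in how many calls are needed.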
You can install fauxnaif from CRAN:
install.packages("fauxnaif")
Or the development version from GitHub:
# install.packages("remotes")
remotes::install_github("rossellhayes/fauxnaif")
library(dplyr)
library(fauxnaif)
Let’s say we want to remove an unwanted negative value from a vector of numbers:
-1:10
#> [1] -1 0 1 2 3 4 5 6 7 8 9 10
We can replace -1…
… explicitly:
na_if_in(-1:10, -1)
#> [1] NA 0 1 2 3 4 5 6 7 8 9 10
… by specifying values to keep:
na_if_not(-1:10, 0:10)
#> [1] NA 0 1 2 3 4 5 6 7 8 9 10
… using a formula:
na_if_in(-1:10, ~ . < 0)
#> [1] NA 0 1 2 3 4 5 6 7 8 9 10
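For readers who want to see what the three forms above amount to, the following base R sketch reproduces the same results with `replace()`. It is only an illustration of the behaviour shown in the outputs, not the package's actual implementation, and the variable names are hypothetical.

```r
x <- -1:10

# Explicit value: anything matching -1 becomes NA
explicit <- replace(x, x %in% c(-1), NA)

# Values to keep: anything NOT in 0:10 becomes NA
kept <- replace(x, !(x %in% 0:10), NA)

# Predicate: anything for which the condition is TRUE becomes NA
by_condition <- replace(x, x < 0, NA)

# All three produce the same vector
identical(explicit, kept) && identical(kept, by_condition)
```

The point of na_if_in() and na_if_not() is that the value list, the keep list, and the predicate can all be passed to one function instead of being written out as logical indexing.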
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")
We can replace unwanted values…
… one at a time:
na_if_in(messy_string, "")
#> [1] "abc" NA "def" "NA" "ghi" "42" "jkl" "NULL" "mno"
… or all at once:
na_if_in(messy_string, "", "NA", "NULL", 1:100)
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
na_if_in(messy_string, c("", "NA", "NULL", 1:100))
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
na_if_in(messy_string, list("", "NA", "NULL", 1:100))
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
… or using a clever formula:
grepl("[a-z]{3,}", messy_string)
#> [1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
na_if_not(messy_string, ~ grepl("[a-z]{3,}", .))
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
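One detail in the "all at once" calls is worth spelling out: `messy_string` is a character vector, so the bare `42` is stored as `"42"`, and yet the integer range `1:100` still removes it. The base R sketch below shows how ordinary coercion gives the same result. It assumes the matching behaves like `%in%`, which is consistent with the outputs above but is not confirmed by this document.

```r
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")

# c() coerces everything to character, so 42 is stored as "42"
class(messy_string)

# Combining strings with 1:100 also yields a character vector ("1", ..., "100")
targets <- c("", "NA", "NULL", 1:100)

# The string "NA" is an ordinary value here, distinct from a real missing value
cleaned <- replace(messy_string, messy_string %in% targets, NA)
cleaned
```

Under that assumption, `"42"` is matched because `1:100` is compared as text, and the literal string `"NA"` is replaced with a true missing value.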
faux_census
#> # A tibble: 5 × 4
#>   state    age  income gender
#>   <chr>  <dbl>   <dbl> <chr>
#> 1 TX        57 9999999 Gender is a social construct
#> 2 Canada    49  149000 Male
#> 3 NY       557   90750 f
#> 4 LA         2   61000 Male
#> 5 TN        64 9999999 M
na_if_in() is particularly useful inside dplyr::mutate():
faux_census %>%
  mutate(
    income = na_if_in(income, 9999999),
    age = na_if_in(age, ~ . < 18, ~ . > 120),
    state = na_if_not(state, ~ grepl("^[A-Z]{2,}$", .)),
    gender = na_if_in(gender, ~ nchar(.) > 20)
  )
#> # A tibble: 5 × 4
#>   state   age income gender
#>   <chr> <dbl>  <dbl> <chr>
#> 1 TX       57     NA <NA>
#> 2 <NA>     49 149000 Male
#> 3 NY       NA  90750 f
#> 4 LA       NA  61000 Male
#> 5 TN       64     NA M
Or you can use dplyr::across() on data frames:
faux_census %>%
  mutate(
    across(age, na_if_in, ~ . < 18, ~ . > 120),
    across(state, na_if_not, ~ grepl("^[A-Z]{2,}$", .)),
    across(where(is.character), na_if_in, ~ nchar(.) > 20),
    across(everything(), na_if_in, 9999999)
  )
#> # A tibble: 5 × 4
#>   state   age income gender
#>   <chr> <dbl>  <dbl> <chr>
#> 1 TX       57     NA <NA>
#> 2 <NA>     49 149000 Male
#> 3 NY       NA  90750 f
#> 4 LA       NA  61000 Male
#> 5 TN       64     NA M
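Since dplyr 1.1.0, passing extra arguments through the `...` of across() is deprecated, so the calls above may emit a warning on recent dplyr versions. A sketch of the same idea with explicit anonymous functions follows. The `census` tibble is a hand-built stand-in for `faux_census`, copied from the printed table, and the example assumes dplyr and fauxnaif are installed.

```r
library(dplyr)
library(fauxnaif)

# Hypothetical copy of the data, rebuilt from the printed output above
census <- tibble(
  state  = c("TX", "Canada", "NY", "LA", "TN"),
  age    = c(57, 49, 557, 2, 64),
  income = c(9999999, 149000, 90750, 61000, 9999999),
  gender = c("Gender is a social construct", "Male", "f", "Male", "M")
)

cleaned <- census %>%
  mutate(
    # Wrap each call in a function instead of passing arguments via `...`
    across(age, function(x) na_if_in(x, ~ . < 18, ~ . > 120)),
    across(state, function(x) na_if_not(x, ~ grepl("^[A-Z]{2,}$", .))),
    across(where(is.character), function(x) na_if_in(x, ~ nchar(.) > 20)),
    across(where(is.numeric), function(x) na_if_in(x, 9999999))
  )

cleaned
```

Restricting the last step to `where(is.numeric)` is a design choice of this sketch: the sentinel 9999999 only makes sense for numeric columns, so there is no need to scan the character ones.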
Hex sticker fonts are Bodoni* by indestructible type* and Source Code Pro by Adobe.
Image adapted from icon made by Freepik from flaticon.com.
Please note that fauxnaif is released with a Contributor Code of Conduct.