
pudu


Overview

The goal of pudu is to provide function declarations and inline function definitions that facilitate cleaning strings in C++ code before passing them to R. It works with cpp11::strings and std::vector<std::string> objects.

The idea is the same as the janitor package, but for C++ code.

Why the name Pudu? The pudu is the smallest deer on Earth, and this package is tiny too. The original Pudu (unvectorized) was drawn by Pokanvas. This package emerged as a spinoff from the redatam package, where strings had to be cleaned in C++ code.

Installation

You can install the development version of pudu with:

remotes::install_github("pachadotdev/pudu")

Example

Here is how you can use the functions in this package in C++ code:

#include <cpp11.hpp>
#include <pudu.hpp>

using namespace cpp11;

// Example 1

std::vector<std::string> x = {" REGION NAME "};

tidy_std_names(x); // returns 'REGION NAME'

// Example 2

tidy_std_vars(x); // returns 'region_name'

// Example 3

// test_tidy_r_names(" REGION NAME ") returns 'REGION NAME'
[[cpp11::register]] cpp11::writable::strings test_tidy_r_names(
  const cpp11::strings& x) {
  cpp11::writable::strings res = tidy_r_names(x);
  return res;
}

// Example 4

// test_tidy_r_vars(" REGION NAME ") returns 'region_name'
[[cpp11::register]] cpp11::writable::strings test_tidy_r_vars(
  const cpp11::strings& x) {
  cpp11::writable::strings res = tidy_r_vars(x);
  return res;
}

Messy strings such as “ DEPTO. .REF_ID_ ” are converted to “depto_ref_id” (by tidy_std_vars and tidy_r_vars) or “DEPTO. .REF_ID_” (by tidy_std_names and tidy_r_names).
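To make the two kinds of output concrete, here is a minimal standalone sketch of that sort of cleaning. It is not pudu's implementation: sketch_trim and sketch_snake are hypothetical names invented for this illustration, and the sketch only handles ASCII. Any non-ASCII byte is treated as a separator, so it does not reproduce the accent handling shown in the R tests.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of pudu: removes leading and trailing
// whitespace and leaves everything in between untouched.
inline std::string sketch_trim(const std::string& s) {
  const char* ws = " \t\n\r";
  const std::size_t first = s.find_first_not_of(ws);
  if (first == std::string::npos) return "";
  const std::size_t last = s.find_last_not_of(ws);
  return s.substr(first, last - first + 1);
}

// Hypothetical helper, not part of pudu: lowercases ASCII letters and
// collapses every run of other characters into a single underscore,
// without emitting underscores at either end.
inline std::string sketch_snake(const std::string& s) {
  std::string out;
  bool pending = false;  // true while the most recent characters were separators
  for (unsigned char c : s) {
    if (std::isalnum(c)) {
      // Emit one underscore for the whole separator run, but never first.
      if (pending && !out.empty()) out += '_';
      out += static_cast<char>(std::tolower(c));
      pending = false;
    } else {
      pending = true;
    }
  }
  return out;
}
```

With these assumptions, sketch_trim(" DEPTO. .REF_ID_ ") gives "DEPTO. .REF_ID_" and sketch_snake(" DEPTO. .REF_ID_ ") gives "depto_ref_id", mirroring the two results described above.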

The following tests in R should give an idea of how the functions work:

# German
vars <- "Gau\xc3\x9f"
expect_equal(test_tidy_r_names(vars), "Gau\u00df")
expect_equal(test_tidy_r_vars(vars), "gau")

# French
vars <- "c\xc2\xb4est-\xc3\xa0-dire"
expect_equal(test_tidy_r_names(vars), "c\u00b4est-\u00e0-dire")
expect_equal(test_tidy_r_vars(vars), "c_est_a_dire")

# Spanish
vars <- "\xc2\xbfC\xc3\xb3mo est\xc3\xa1s\x3f"
expect_equal(test_tidy_r_names(vars), "\u00bfC\u00f3mo est\u00e1s\u003f")
expect_equal(test_tidy_r_vars(vars), "como_estas")

# Japanese
vars <- "Konnichiwa \xe3\x81\x93\xe3\x82\x93\xe3\x81\xab\xe3\x81\xa1\xe3\x81\xaf"
expect_equal(test_tidy_r_names(vars), "Konnichiwa \u3053\u3093\u306b\u3061\u306f")
expect_equal(test_tidy_r_vars(vars), "konnichiwa")
