
Title:

Tools for Cleaning Up Messy Files

Version:

0.16

Description:

Tools for cleaning up messy 'Excel' files so that they are suitable for R. People who have worked with 'Excel' for years have built more or less complicated sheets with names, characters and formats that are not homogeneous. To make such files usable in R, this package provides a set of functions that avoid most import problems and preserve as much of the data as possible.

License:

GPL-3

URL:

https://github.com/Thinkr-open/thinkr

BugReports:

https://github.com/Thinkr-open/thinkr/issues

Depends:

R (≥ 3.1)

Imports:

assertthat, cli, devtools, dplyr, ggplot2, lazyeval, lubridate, magrittr, methods, officer, rvg, stats, stringi, stringr, tidyr, utils, withr

Suggests:

knitr, rmarkdown, testthat

VignetteBuilder:

knitr

Encoding:

UTF-8

RoxygenNote:

7.2.0

NeedsCompilation:

Packaged:

2022-08-22 13:16:40 UTC; PC

Author:

Vincent Guyader [aut, cre], Sébastien Rochette [aut], ThinkR [cph]

Maintainer:

Vincent Guyader <vincent@thinkr.fr>

Repository:

CRAN

Date/Publication:

2022-08-22 13:30:02 UTC

thinkr: Tools for Cleaning Up Messy Files

Description

Author(s)

Maintainer: Vincent Guyader vincent@thinkr.fr (ORCID)

Authors:

Sébastien Rochette sebastien@thinkr.fr (ORCID)

Other contributors:

ThinkR [copyright holder]

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

not in

Description

Negation of %in%: tests whether values are absent from a table.

Usage

x %ni% table

Arguments

x

vector or NULL: the values to be matched

table

the values to be matched against

Examples

"a" %ni% letters
"coucou" %ni% letters

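
The idea behind a "not in" operator can be sketched in base R by negating %in%. The name %not_in% below is hypothetical and chosen so that it does not clash with the package's own %ni%; this is an illustration of the concept, not necessarily how thinkr defines its operator.

```r
# Hypothetical stand-in for a "not in" operator, built from base R only.
`%not_in%` <- Negate(`%in%`)

"a" %not_in% letters        # "a" is one of the lowercase letters
"coucou" %not_in% letters   # the whole string is not a single letter
c(1, 5, 9) %not_in% 1:5     # vectorised over the left-hand side
```

As with %in%, the result has one element per value on the left-hand side.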
Delete the .test file in the testthat folder

Description

Only useful during package development with the testthat package.

Usage

.efface_test()

Save all ggplot in a pptx

Description

Save all ggplot in a pptx

Usage

all_ggplot_to_pptx(
  out = "tous_les_graphs.pptx",
  open = TRUE,
  png = TRUE,
  folder = "dessin",
  global = TRUE
)

Arguments

out

output file name

open

logical: open the file after creation

png

logical: also save the plots as PNG

folder

folder for the PNG files

global

logical: use .GlobalEnv

Examples

## Not run: 
all_ggplot_to_pptx()

## End(Not run)

Transform a vector into numeric if meaningful, even with bad decimal, space or %

Description

Transform a vector into numeric if meaningful, even with bad decimal, space or %

Usage

as_mon_numeric(vec)

Arguments

vec

a vector

Details

Note that text and factors are not converted to numeric (except FALSE, TRUE, F, T), unlike R's default behavior with 'as.numeric(factor())'.

Value

a numeric vector

Examples

as_mon_numeric(c("1", "0", "1"))
as_mon_numeric(c("1.3", "1,5", "1;6", "16%", "17 87 "))
as_mon_numeric(c(TRUE, "A", "F"))
as_mon_numeric(c(TRUE, TRUE, FALSE))
as_mon_numeric(factor(c("toto", "tata", "toto")))
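
The kind of cleaning described above can be approximated in base R. The helper to_numeric_sketch below is hypothetical and is not the package's implementation: it assumes that ',' and ';' stand for a decimal mark, that spaces and '%' can simply be dropped (so "16%" becomes 16 here, which may differ from what as_mon_numeric returns), and it ignores the special handling of logicals and factors mentioned in Details.

```r
# Hypothetical helper: a rough base-R approximation, not thinkr's code.
to_numeric_sketch <- function(vec) {
  txt <- as.character(vec)
  txt <- gsub("[[:space:]%]", "", txt)  # drop spaces and percent signs
  txt <- gsub("[,;]", ".", txt)         # treat ',' and ';' as decimal marks
  suppressWarnings(as.numeric(txt))     # anything still non-numeric becomes NA
}

to_numeric_sketch(c("1.3", "1,5", "1;6", "16%", "17 87 "))
to_numeric_sketch(c("12", "abc"))
```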

Clean levels label

Description

Clean levels label

Usage

clean_levels(vec, verbose = FALSE, translit = FALSE, punct = FALSE)

Arguments

vec

a factor

verbose

logical: should the function be verbose

translit

logical: remove non-ASCII characters

punct

logical: remove punctuation

clean_names

Description

clean_names

Usage

clean_names(dataset, verbose = FALSE, translit = TRUE)

Arguments

dataset

a dataframe

verbose

logical

translit

logical: remove non-ASCII characters

Value

a dataframe

Examples

data(iris)
clean_names(iris)

Clean character vector

Description

Clean character vector

Usage

clean_vec(
  vec,
  verbose = FALSE,
  unique = TRUE,
  keep_number = FALSE,
  translit = TRUE,
  punct = TRUE
)

Arguments

vec

character vector to clean

verbose

logical: should the function be verbose

unique

logical: apply make_unique to the result

keep_number

logical: keep numbers at the beginning

translit

logical: remove non-ASCII characters

punct

logical: remove punctuation
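
To give a feel for what the punct and unique options might involve, here is a minimal base-R sketch. The helper clean_sketch is hypothetical; the exact rules clean_vec applies (case, separator, transliteration) are not stated in this page, so lower-casing and the "_" separator are assumptions, and transliteration is left out because its result depends on the platform.

```r
# Hypothetical helper: illustrates punctuation removal and de-duplication only.
clean_sketch <- function(vec, unique = TRUE, punct = TRUE) {
  out <- tolower(trimws(vec))
  if (punct) {
    out <- gsub("[[:punct:][:space:]]+", "_", out)  # runs of punctuation/space -> "_"
    out <- gsub("^_+|_+$", "", out)                 # trim leading/trailing "_"
  }
  if (unique) {
    out <- make.unique(out, sep = "_")              # base R de-duplication
  }
  out
}

clean_sketch(c("Sepal Length", "Sepal.Length", " Petal (cm) "))
```

The first two inputs collapse to the same cleaned label, which is why a de-duplication step is useful when cleaning column names.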

Return the R instruction to create levels

Description

Return the R instruction to create levels

Usage

dput_levels(vec)

Arguments

vec

a factor or character vector

Value

an R instruction

Examples

dput_levels(iris$Species)

Get position or excel name of column

Description

ncol_to_excel returns the Excel column name for a position number. excel_to_ncol returns the Excel column position number for a column name. excel_col returns all Excel column names.

Usage

ncol_to_excel(n)

excel_to_ncol(col_name)

excel_col()

Arguments

n

the column position

col_name

the column name

Examples

ncol_to_excel(35)
excel_to_ncol("BF")
excel_col()
ncol_to_excel(1:6)
excel_to_ncol(c('AF', 'AG', 'AH'))
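
Excel column names are a bijective base-26 numbering (A = 1, ..., Z = 26, AA = 27, ...). The two helpers below are hypothetical scalar-only sketches of that conversion in base R; they are not the package's code, and unlike the functions documented here they do not accept vectors.

```r
# Hypothetical scalar helpers illustrating the base-26 conversion.
col_name_sketch <- function(n) {
  name <- ""
  while (n > 0) {
    r <- (n - 1) %% 26                    # 0-based index of the last letter
    name <- paste0(LETTERS[r + 1], name)
    n <- (n - 1) %/% 26                   # continue with the remaining "digits"
  }
  name
}

col_pos_sketch <- function(col_name) {
  idx <- match(strsplit(toupper(col_name), "")[[1]], LETTERS)
  sum(idx * 26^(rev(seq_along(idx)) - 1)) # weight each letter by its position
}

col_name_sketch(35)
col_pos_sketch("BF")
```

The `n - 1` shift is what distinguishes this scheme from ordinary base 26: there is no zero digit, so Z is followed by AA rather than by a two-letter name starting with a "zero" letter.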

Find a pattern in the names of a dataset

Description

Find a pattern in the names of a dataset

Usage

find_name(dataset, pattern)

Arguments

dataset

a data.frame (or a list, or anything with a names attribute)

pattern

pattern we are looking for

Value

a list with position and value

Examples


find_name(iris,"Sepal")

transform the excel numeric date format into POSIXct

Description

transform the excel numeric date format into POSIXct

Usage

from_excel_to_posixt(vec, origin = "1904-01-01")

Arguments

vec

a vector

origin

a date-time object, or something which can be coerced by as.POSIXct(tz = "GMT") to such an object.
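
Excel stores dates as a number of days since an origin, with the time of day as the fractional part. The base-R sketch below shows that conversion under the assumption that vec holds such day counts; serial_to_posixct_sketch is a hypothetical name, not the package's implementation. The default origin "1904-01-01" mirrors the usage above and matches Excel's 1904 date system; for workbooks using the 1900 date system, "1899-12-30" is the origin conventionally used in R.

```r
# Hypothetical helper: days since `origin` -> POSIXct in GMT.
serial_to_posixct_sketch <- function(vec, origin = "1904-01-01") {
  seconds <- as.numeric(vec) * 86400   # 86400 seconds per day
  as.POSIXct(seconds, origin = origin, tz = "GMT")
}

serial_to_posixct_sketch(c(0, 1, 1.5))
serial_to_posixct_sketch(44795, origin = "1899-12-30")
```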

like gsub but keep a factor as factor

Description

like gsub but keep a factor as factor

Usage

gsub2(x, ...)

Arguments

x

a vector

...

the parameters of the gsub function

Value

a vector
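
Base gsub returns a character vector even when given a factor. One way to keep the factor, sketched here under the assumption that rewriting the level labels is acceptable, is to apply gsub to the levels rather than to the values. The helper gsub_keep_factor is hypothetical and not necessarily how gsub2 is written.

```r
# Hypothetical helper: apply gsub to the levels so the result stays a factor.
gsub_keep_factor <- function(x, ...) {
  if (is.factor(x)) {
    levels(x) <- gsub(x = levels(x), ...)  # rewrite labels, keep the factor
    x
  } else {
    gsub(x = x, ...)                       # plain vectors behave like gsub
  }
}

f <- factor(c("apple", "banana", "apple"))
res <- gsub_keep_factor(f, pattern = "a", replacement = "A")
res
class(gsub("a", "A", f))  # base gsub drops the factor class
```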

Does this vector contain only 0 and 1?

Description

Does this vector contain only 0 and 1?

Usage

is.01(x)

Arguments

x

a vector

Value

a boolean

Examples


is.01(c(0,1,0,0,1))
is.01(c(0,1,0,0,5))

Does this vector contain only 1 and 2?

Description

Does this vector contain only 1 and 2?

Usage

is.12(x)

Arguments

x

a vector

Value

a boolean

Examples


is.12(c(1,1,2,1,2))
is.12(c(1,1,2,1,5))

Predicate for character vector full of figures

Description

Detects whether a character vector is made only of figures. Useful when you

Usage

is_full_figures(.)

Arguments

.

a character vector (possibly with NAs)

Value

a boolean

Examples

is_full_figures(c(NA,"0","25.3"))
is_full_figures((c(NA,"0","25_3")))

Predicate for full NA vector

Description

is_full_na tests whether the vector is full of NAs.

Usage

is_full_na(.)

Arguments

.

a vector

Value

a vector of boolean

Examples

is_full_na(c(NA, NA, NA))

Is a factor a Likert scale?

Description

Is a factor a Likert scale?

Usage

is_likert(vec, lev)

Arguments

vec

a factor

lev

the scale

Value

boolean

Examples

is_likert(iris$Species,c("setosa","versicolor","virginica"))
is_likert(iris$Species,c("setosa","versicolor","virginica","banana"))
is_likert(iris$Species,c("setosa","versicolor"))

Return TRUE if this looks like a number

Description

Return TRUE if this looks like a number

Usage

look_like_a_number(vec)

Arguments

vec

a vector

Value

a boolean

make.unique improvement

Description

make.unique improvement

Usage

make_unique(vec, sep = "_")

Arguments

vec

a vector

sep

char separator to use

Value

a vector

Examples


make_unique(c("a","a","a","b","a","b","c"))
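
For comparison, base R's make.unique leaves the first occurrence of each value untouched and numbers only the later ones. The hypothetical helper number_all below sketches one alternative, numbering every occurrence of a duplicated value; whether make_unique produces exactly this output is not stated in this page, so treat it as an illustration only.

```r
x <- c("a", "a", "a", "b", "a", "b", "c")

# Base R: the first occurrence keeps its original name.
make.unique(x, sep = "_")

# Hypothetical variant: number every occurrence of a duplicated value.
number_all <- function(vec, sep = "_") {
  n <- ave(seq_along(vec), vec, FUN = seq_along)  # running count within each value
  dup <- vec %in% vec[duplicated(vec)]            # values that occur more than once
  ifelse(dup, paste(vec, n, sep = sep), vec)
}

number_all(x)
```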

peep the pipeline

Description

peep some data at one step of a pipeline.

Usage

peep(data, ..., printer = print, verbose = FALSE)

Arguments

data

some data

...

function names or expressions that use . as a placeholder for the data

printer

which function to use for printing

verbose

TRUE to include what is printed

Value

the input data

Examples

if( require(magrittr) ){
  # just symbols
  iris %>% peep(head,tail) %>% summary
  # expressions with .
  iris %>% peep(head(., n=2),tail(., n=3) ) %>% summary
  # or both
  iris %>% peep(head,tail(., n=3) ) %>% summary
  # use verbose to see what happens
  iris %>% peep(head,tail(., n=3), verbose = TRUE) %>% summary
}

Replace pattern everywhere in a data.frame

Description

Replace pattern everywhere in a data.frame

Usage

replace_pattern(dataset, pattern, replacement, exact = FALSE)

Arguments

dataset

a data.frame

pattern

Pattern to look for.

replacement

A character of replacements.

exact

a boolean; if TRUE, the whole value needs to match

Value

a data.frame

Examples

dataset <- data.frame(
  col_a = as.factor(letters)[1:7],
  col_b = letters[1:7],
  col_c = 1:7,
  col_d = paste0(letters[1:7], letters[1:7]),
  stringsAsFactors = FALSE
)

# replace pattern
replace_pattern(dataset, "a", "XXX-")

# With exact matching
replace_pattern(dataset, "a", "XXX-", exact = TRUE)

export a data.frame to csv

Description

export a data.frame to csv

Usage

save_as_csv(dataset, path, row.names = FALSE, ...)

Arguments

dataset

a data.frame

path

the path

row.names

logical: save the row names

...

other write.csv parameters

Value

file name as character

Examples


## Not run: 
iris %>% save_as_csv(file.path(tempdir(),'coucou.csv')) %>% browseURL()



## End(Not run)

set a given coltype to each column in a data.frame

Description

set a given coltype to each column in a data.frame

Usage

set_col_type(dataset, col_type)

Arguments

dataset

a data.frame

col_type

a character vector containing the class to apply

Value

a data.frame
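
A column-by-column type conversion of this kind can be sketched in base R with methods::as. The helper set_types_sketch is hypothetical, and it assumes that col_type holds exactly one class name per column, in column order; the package's function may accept other forms.

```r
# Hypothetical helper: coerce each column to the class named in col_type.
set_types_sketch <- function(dataset, col_type) {
  stopifnot(length(col_type) == ncol(dataset))
  dataset[] <- Map(
    function(column, type) methods::as(column, type),
    dataset, col_type
  )
  dataset
}

df <- data.frame(
  id = 1:3,
  score = c("1.5", "2", "3.25"),
  stringsAsFactors = FALSE
)
out <- set_types_sketch(df, c("character", "numeric"))
sapply(out, class)
```

Assigning into `dataset[]` keeps the data.frame structure and column names while replacing the column contents.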
