Type: Package
Title: Automatic Data Validation and Reporting
Version: 0.2.1
Description: Validate a dataset by columns and rows using convenient predicates inspired by the 'assertr' package. Generate a good-looking HTML report or print console output to display in the logs of your data processing pipeline.
URL: https://appsilon.github.io/data.validator/, https://github.com/Appsilon/data.validator
BugReports: https://github.com/Appsilon/data.validator/issues
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.2.3
Language: en-US
Imports: assertr (≥ 2.8), shiny.semantic (≥ 0.3.3), knitr, purrr, dplyr, tidyr, utils, R6, rlang, rmarkdown, htmltools, htmlwidgets, tibble
Suggests: covr, fixtuRes, fs, lintr, magrittr, rcmdcheck, readr, shiny, spelling, targets, testthat, visNetwork, withr
Collate: 'results_parsers.R' 'semantic_report_constructors.R' 'utils.R' 'report.R' 'assertions.R'
NeedsCompilation: no
Packaged: 2023-12-11 10:01:11 UTC; kuba
Author: Marcin Dubel [aut, cre], Paweł Przytuła [aut], Jakub Nowicki [aut], Krystian Igras [aut], Dominik Krzeminski [ctb], Servet Ahmet Çizmeli [ctb], Appsilon Sp. z o.o. [cph]
Maintainer: Marcin Dubel <opensource+marcin@appsilon.com>
Repository: CRAN
Date/Publication: 2023-12-11 10:20:06 UTC

Add validation results to the Report object

Description

This function adds validation results to the report object, together with an aggregated summary of successful, failed, and warning checks. It also parses the assertr result attributes and stores them in a usable table.

Usage

add_results(data, report)

Arguments

data

Data that was validated.

report

Report object to store validation results.
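
As a minimal sketch of typical use, the snippet below appends the results of two separate validation chains to a single report. The data sets (mtcars, iris) and the check descriptions are illustrative assumptions, not part of the package documentation.

```r
library(magrittr)
library(data.validator)

report <- data_validation_report()

# First chain: results are appended to the report at the end of the pipe.
validate(mtcars, name = "mtcars checks") %>%
  validate_if(mpg > 0, description = "mpg is positive") %>%
  add_results(report)

# The report is an R6 object, so add_results() updates it by reference and
# a second data set can be added to the same report.
validate(iris, name = "iris checks") %>%
  validate_if(Sepal.Length > 0, description = "Sepal.Length is positive") %>%
  add_results(report)

print(report)
```

Because the report is updated in place, there is no need to reassign it after each call.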


Defensive wrapper to add evaluation error to regular validation errors.

Description

Defensive wrapper to add evaluation error to regular validation errors.

Usage

check_assertr_expression(this_call, data, description, error_fun)

Arguments

this_call

Assertion command that is checked for evaluation errors.

data

A data.frame or tibble to test.

description

A character string with description of assertion.

error_fun

Function that is called when the validation fails

Value

Validation object, with any evaluation errors that occurred added to the list of errors.

See Also

validate_if

Examples

## Not run: 
library(fixtuRes)
library(magrittr)
library(assertr)
library(data.validator)

my_mock_generator <- fixtuRes::MockDataGenerator$new("fixtures_config.yml")
my_data_frame <- my_mock_generator$get_data("my_data_frame", 10)

report <- data.validator::data_validation_report()

validate(my_data_frame, name = "Verifying data uniqueness") %>%
  validate_if(has_all_names("id", "code", "test"), description = "All columns are there") %>%
  validate_if(is.character(test), description = "TEST column is string") %>%
  validate_if(is_uniq(id), description = "ID column is unique") %>%
  validate_if(!is.na(id) & id != "", description = "ID column is not empty") %>%
  validate_if(is.character(code), description = "CODE column is string") %>%
  validate_rows(col_concat, is_uniq, code, type, description = "CODE and TYPE is unique") %>%
  add_results(report)

print(report)

## End(Not run)

Convert error table column types

Description

Convert error table column types

Usage

convert_error_df(error_df)

Arguments

error_df

Table containing assertr error details.


Create summary table row.

Description

Create summary table row.

Usage

create_summary_row(id, number, color, label)

Arguments

id

ID.

number

Number to display.

color

Color of the label.

label

Label to display.

Value

Summary table row.


Create new validator object

Description

The function returns an R6 class object responsible for storing validation results.

Usage

data_validation_report()

Displays results of validations.

Description

Displays results of validations.

Usage

display_results(data, n_passes, n_fails, n_warns, df_error_head_n)

Arguments

data

Report data.

n_passes

Number of successful assertions.

n_fails

Number of failed assertions.

n_warns

Number of warning assertions.

df_error_head_n

Number of rows to display in error table.

Value

Validation report.


Constants

Description

Constants

Usage

error_class

Format

An object of class character of length 1.


Create a recursive function to find the first non-call object

Description

This function recursively descends into the provided list (the list representation of an R expression) until it finds an object that is not a function call or a complex command. At each step it takes element [[2]], because in the list representation of a function call the function itself is the first element and its arguments are the subsequent elements. Element 2 is therefore generally the first argument of the call.

Usage

find_first_noncall(object)

Arguments

object

A list representing an R expression.

Value

The first non-call object found in the list representation of an R expression.
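
The descent through [[2]] can be sketched in base R as follows. The helper first_noncall and the example expression are hypothetical and written only for illustration; they are not the package source.

```r
# Hypothetical helper written for illustration; it is not the package source.
first_noncall <- function(object) {
  # Calls (and their list form) keep the function at [[1]] and the first
  # argument at [[2]]; keep descending while there is a first argument.
  if ((is.call(object) || is.list(object)) && length(object) >= 2) {
    return(first_noncall(object[[2]]))
  }
  object
}

# quote() captures the expression without evaluating it, so neither magrittr
# nor the functions named here need to be available.
piped <- quote(my_data %>% validate() %>% add_results(report))
first_noncall(piped)
# my_data
```

A pipe chain parses as nested calls to the pipe operator, with the left-hand side as the first argument, which is why following [[2]] ends at the initial data object.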


Generate a random ID.

Description

Generate a random ID.

Usage

generate_id()

Value

A character string containing a random ID.


Match proper method depending on predicate type

Description

Match proper method depending on predicate type

Usage

get_assert_method(
  predicate,
  method = list(direct = assertr::assert, generator = assertr::insist)
)

Arguments

predicate

Predicate or predicate generator function.

method

optional list with fields direct and generator of assertions


Get assertion type

Description

Get assertion type

Usage

get_assertion_type(assertion)

Arguments

assertion

assertion object (check assertr package for details)

Value

character with id of assertion: "error", "success", "warning"


Constructs an Abstract Syntax Tree for an expression

Description

This function breaks down an R expression into a list structure, creating a tree-like representation of the code.

Usage

get_ast(exp)

Arguments

exp

An R expression to be parsed into a list structure.

Value

A list structure that represents the input R expression.
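
One way to picture the resulting list structure is the base R sketch below. The helper expression_to_list is hypothetical, and the way the package builds its tree may differ.

```r
# Hypothetical helper written for illustration; it is not the package source.
expression_to_list <- function(exp) {
  if (is.call(exp)) {
    # as.list() splits a call into its function and its arguments; recursing
    # turns nested calls into nested lists.
    lapply(as.list(exp), expression_to_list)
  } else {
    # Symbols and constants are the leaves of the tree.
    exp
  }
}

tree <- expression_to_list(quote(validate_if(validate(my_data), x > 0)))
str(tree)
```

Here the top-level list has three elements: the function name validate_if, the nested list for validate(my_data), and the nested list for x > 0.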


Extract the name of the initial data object in a magrittr pipe chain

Description

This function analyzes the call stack, identifies the first call using the magrittr pipe operator (%>%), and extracts the name of the initial object in that pipe chain.

Usage

get_first_name()

Value

A string representing the name of the initial data object in the pipe chain.


Get validation results

Description

The result contains information about successful, failed, and warning assertions, together with a table that stores important information about the validation results.

Usage

get_results(report, unnest = FALSE)

Arguments

report

Report object that stores validation results. See add_results.

unnest

If TRUE, the error_df table is unnested. As a result, the values in the remaining columns are duplicated across the unnested rows.
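
The sketch below retrieves results in both forms. The checks on mtcars are illustrative assumptions, and the filter assumes that the results table has a type column holding "error", "success" or "warning".

```r
library(magrittr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars checks") %>%
  validate_if(mpg > 0, description = "mpg is positive") %>%
  validate_if(mpg < 30, description = "mpg is below 30") %>%
  add_results(report)

# Default form: error details stay nested in the error_df column.
nested <- get_results(report)

# Unnested form: error details are expanded into regular rows.
flat <- get_results(report, unnest = TRUE)

# Keep only the failed checks (assumes a type column, see above).
failed <- nested[nested$type == "error", ]
```

The unnested form is convenient for writing to flat files, while the nested form keeps the results table compact.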


Get results number

Description

Get results number

Usage

get_results_number(results)

Arguments

results

assertion results

Value

table with results number


Generate HTML report.

Description

Generate HTML validation report.

Usage

get_semantic_report_ui(
  n_passes,
  n_fails,
  n_warns,
  validation_results,
  df_error_head_n
)

Arguments

n_passes

Number of passed validations.

n_fails

Number of failed validations.

n_warns

Number of warnings.

validation_results

Data frame with validation results.

df_error_head_n

Number of rows to display in error table.

Value

HTML validation report.


Check if a command is complex, i.e., contains any non-alphanumeric character

Description

Check if a command is complex, i.e., contains any non-alphanumeric character

Usage

is_complex_command(command_string)

Arguments

command_string

A character string representing the command to be checked.

Value

Logical value indicating whether the command_string is complex (TRUE) or not (FALSE).
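
Taking the description literally, the check can be sketched with a single regular expression. The helper looks_complex is hypothetical. Whether the package treats characters such as _ or . as part of a simple name is not stated here, so this sketch flags them as well.

```r
# Hypothetical helper written for illustration; it is not the package source.
looks_complex <- function(command_string) {
  # TRUE when at least one character is neither a letter nor a digit.
  grepl("[^[:alnum:]]", command_string)
}

looks_complex("mtcars")
# FALSE
looks_complex("mtcars[1:5, ]")
# TRUE
```

A plain object name passes as simple, while anything involving subsetting, function calls or operators is reported as complex.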


Create a UI accordion container.

Description

Create a UI accordion container.

Usage

make_accordion_container(...)

Arguments

...

Additional arguments inside accordion container.

Value

Accordion container.


Create a UI accordion element.

Description

Create a UI accordion element.

Usage

make_accordion_element(
  results,
  color = "green",
  label,
  active = FALSE,
  type,
  mark,
  df_error_head_n
)

Arguments

results

Results to display.

color

Color of the label icon.

label

Label.

active

Is active?

type

Result type.

mark

Icon to display.

df_error_head_n

Number of rows to display in error table.

Value

Accordion.


Create summary table.

Description

Create summary table.

Usage

make_summary_table(n_passes, n_fails, n_warns)

Arguments

n_passes

Number of passed validations.

n_fails

Number of failed validations.

n_warns

Number of warnings.

Value

Summary table.


Create table row.

Description

Create table row.

Usage

make_table_row(results, type, mark, df_error_head_n)

Arguments

results

Results to display in a row.

type

Result type.

mark

Icon to display.

df_error_head_n

Number of rows to display in error table.

Value

Table row.


Parse errors to data.frame

Description

Parse errors to data.frame

Usage

parse_errors_to_df(data)

Arguments

data

object of assertr error class (check assertr package for details)

Value

data.frame with errors


Parse results to data.frame

Description

Parse results to data.frame

Usage

parse_results_to_df(data)

Arguments

data

assertr object (check assertr package for details)

Value

data.frame with successes and errors


Parse successes to data.frame

Description

Parse successes to data.frame

Usage

parse_successes_to_df(data)

Arguments

data

object of assertr success class (check assertr package for details)

Value

data.frame with successes


Prepare modal content.

Description

Prepare modal content.

Usage

prepare_modal_content(error, df_error_head_n)

Arguments

error

Assertr error.

df_error_head_n

Number of rows to display in error table.

Value

Modal content.


Render simple version of report

Description

Renders content of simple report version that prints validation_results table.

Usage

render_raw_report_ui(
  validation_results,
  success = TRUE,
  warning = TRUE,
  error = TRUE
)

Arguments

validation_results

Validation results table (see get_results).

success

Should success results be presented?

warning

Should warning results be presented?

error

Should error results be presented?


Render semantic version of report

Description

Renders content of semantic report version.

Usage

render_semantic_report_ui(
  validation_results,
  success = TRUE,
  warning = TRUE,
  error = TRUE,
  df_error_head_n = 6L
)

Arguments

validation_results

Validation results table (see get_results).

success

Should success results be presented?

warning

Should warning results be presented?

error

Should error results be presented?

df_error_head_n

Number of rows to display in error table. Works in the same way as head function.
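
For reference, this is how head treats its n argument in base R. The sketch assumes that df_error_head_n is passed on to head unchanged; the error_df table below is made up for the example.

```r
# Made-up error table with ten rows, used only to show how head() behaves.
error_df <- data.frame(index = 1:10, value = seq(10, 100, by = 10))

# A positive n keeps the first n rows.
first_six <- head(error_df, n = 6L)

# A negative n drops the last |n| rows, here leaving the first two.
all_but_last_eight <- head(error_df, n = -8L)
```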


Create table with results.

Description

Create table with results.

Usage

result_table(results, type, mark, df_error_head_n)

Arguments

results

Result to display in table.

type

Result type.

mark

Icon to display.

df_error_head_n

Number of rows to display in error table.

Value

Table row.


Saving results as an HTML report

Description

Saving results as an HTML report

Usage

save_report(
  report,
  output_file = "validation_report.html",
  output_dir = getwd(),
  ui_constructor = render_semantic_report_ui,
  template = system.file("rmarkdown/templates/standard/skeleton/skeleton.Rmd", package =
    "data.validator"),
  ...
)

Arguments

report

Report object that stores validation results.

output_file

HTML file name to write the report to.

output_dir

Target report directory.

ui_constructor

Function of validation_results and optional parameters that generates HTML code or HTML widget that should be used to generate report content. See custom_report example.

template

Path to the Rmd template in which ui_constructor is rendered. See the data.validator rmarkdown template for the basic construction; it is used as the default template.

...

Additional parameters passed to ui_constructor. For example: df_error_head_n
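
A custom ui_constructor can be as small as the sketch below, which builds an HTML list with the number of results of each type. The function count_ui and the demo_results table are hypothetical, and the sketch assumes that validation_results has a type column, as in the table returned by get_results.

```r
library(htmltools)

# Hypothetical constructor: lists how many results there are of each type.
count_ui <- function(validation_results, ...) {
  counts <- table(validation_results$type)
  labels <- paste0(names(counts), ": ", as.vector(counts))
  tags$ul(lapply(labels, tags$li))
}

# Stand-in for a real results table, so the constructor can be tried alone.
demo_results <- data.frame(type = c("success", "error", "success"))
ui <- count_ui(demo_results)
as.character(ui)
```

Passing ui_constructor = count_ui to save_report should then render this list as the report content. Rendering the report itself additionally requires rmarkdown and pandoc.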


Saving results table to external file

Description

Saving results table to external file

Usage

save_results(report, file_name = "results.csv", method = utils::write.csv, ...)

Arguments

report

Report object that stores validation results. See get_results.

file_name

Name of the resulting file (including extension).

method

Function used to save the results table (write.csv by default). The function passed to method should have 'x' and 'file' arguments. Functions with different arguments can be passed by wrapping them in a function that has those arguments. See the save_results_methods example.

...

Remaining parameters passed to method.
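
The wrapper approach can be sketched with saveRDS, whose first argument is named object rather than x. The wrapper name save_as_rds, the checks and the temporary output path are illustrative assumptions.

```r
library(magrittr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars checks") %>%
  validate_if(mpg > 0, description = "mpg is positive") %>%
  validate_if(mpg < 30, description = "mpg is below 30") %>%
  add_results(report)

# saveRDS() names its first argument object rather than x, so it is wrapped
# in a function that exposes the x and file arguments expected by method.
save_as_rds <- function(x, file, ...) {
  saveRDS(object = x, file = file, ...)
}

out_file <- tempfile(fileext = ".rds")
save_results(report, file_name = out_file, method = save_as_rds)
```

The same pattern applies to any writer whose argument names differ from x and file.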


Save simple validation summary in text file

Description

Saves the print(validator) output to a text file.

Usage

save_summary(
  report,
  file_name = "validation_log.txt",
  success = TRUE,
  warning = TRUE,
  error = TRUE
)

Arguments

report

Report object that stores validation results.

file_name

Name of the resulting file (including extension).

success

Should success results be presented?

warning

Should warning results be presented?

error

Should error results be presented?


Create a UI segment element.

Description

Create a UI segment element.

Usage

segment(title, ...)

Arguments

title

Title of the segment.

...

Additional arguments inside segment.

Value

Segment.


Prepare data for validation chain

Description

Prepare data for validation and generating report. The function prepares data for chain validation and ensures all the validation results are gathered correctly. The function also attaches additional information to the data (name and description) that is then displayed in validation report.

Usage

validate(data, name, description = NULL)

Arguments

data

data.frame or tibble to test

name

name of validation object (will be displayed in the report)

description

description of validation object (will be displayed in the report)


Validation on columns

Description

Validation on columns

Usage

validate_cols(
  data,
  predicate,
  ...,
  obligatory = FALSE,
  description = NA,
  skip_chain_opts = FALSE,
  success_fun = assertr::success_append,
  error_fun = assertr::error_append,
  defect_fun = assertr::defect_append
)

Arguments

data

A data.frame or tibble to test

predicate

Predicate function or predicate generator such as in_set or within_n_sds

...

Selection of columns that the predicate should be called on. All tidyselect language methods are supported. If not provided, everything() is used, so all columns are selected.

obligatory

If TRUE and the assertion fails, the data is marked as defective. For defective data, all subsequent rules are handled by the defect_fun function.

description

A character string with a description of the assertion. The description is then displayed in the validation report.

skip_chain_opts

When data is wrapped with the validate function, the success_fun and error_fun parameters are overwritten with success_append and error_append, respectively. To use the parameters passed to the function directly, set skip_chain_opts to TRUE.

success_fun

Function that is called when the validation passes

error_fun

Function that is called when the validation fails

defect_fun

Function that is called when the data is marked as defective

See Also

validate_if validate_rows
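
A short sketch of column validation follows, using a predicate (not_na), a predicate built by in_set, and a predicate generator (within_n_sds) from assertr. The choice of mtcars columns and thresholds is an illustrative assumption.

```r
library(magrittr)
library(assertr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars checks") %>%
  # Predicate applied to two selected columns.
  validate_cols(in_set(c(0, 1)), vs, am,
                description = "vs and am are binary") %>%
  # Predicate generator: bounds are computed from the column itself.
  validate_cols(within_n_sds(3), mpg,
                description = "mpg within 3 standard deviations") %>%
  # No column selection: the predicate is applied to all columns.
  validate_cols(not_na, description = "No missing values") %>%
  add_results(report)

print(report)
```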


Verify if expression regarding data is TRUE

Description

The function checks whether all the logical values returned by the expression are TRUE. The function is meant for handling all the cases that cannot be reached by using validate_cols and validate_rows functions.

Usage

validate_if(
  data,
  expr,
  description = NA,
  obligatory = FALSE,
  skip_chain_opts = FALSE,
  success_fun = assertr::success_append,
  error_fun = assertr::error_append,
  defect_fun = assertr::defect_append
)

Arguments

data

A data.frame or tibble to test

expr

A logical expression to test for, e.g. var_name > 0

description

A character string with a description of the assertion. The description is then displayed in the validation report.

obligatory

If TRUE and the assertion fails, the data is marked as defective. For defective data, all subsequent rules are handled by the defect_fun function.

skip_chain_opts

When data is wrapped with the validate function, the success_fun and error_fun parameters are overwritten with success_append and error_append, respectively. To use the parameters passed to the function directly, set skip_chain_opts to TRUE.

success_fun

Function that is called when the validation passes

error_fun

Function that is called when the validation fails

defect_fun

Function that is called when the data is marked as defective

See Also

validate_cols validate_rows
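
The sketch below shows two kinds of expressions that validate_if can check: one evaluated element by element and one that yields a single value for the whole column. The columns and conditions are illustrative assumptions.

```r
library(magrittr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars checks") %>%
  # Element-wise expression: one logical value per row.
  validate_if(!is.na(mpg) & mpg > 0,
              description = "mpg is present and positive") %>%
  # Whole-column expression: a single logical value.
  validate_if(is.numeric(cyl), description = "cyl is numeric") %>%
  add_results(report)

print(report)
```

Column names are referenced directly in the expression, without quoting or prefixing them with the data frame.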


Validation on rows

Description

Validation on rows

Usage

validate_rows(
  data,
  row_reduction_fn,
  predicate,
  ...,
  obligatory = FALSE,
  description = NA,
  skip_chain_opts = FALSE,
  success_fun = assertr::success_append,
  error_fun = assertr::error_append,
  defect_fun = assertr::defect_append
)

Arguments

data

A data.frame or tibble to test

row_reduction_fn

Function that should reduce rows into a single column that is passed to validation, e.g. num_row_NAs

predicate

Predicate function or predicate generator such as in_set or within_n_sds

...

Selection of columns that row_reduction_fn should be called on. All tidyselect language methods are supported. If not provided, everything() is used, so all columns are selected.

obligatory

If TRUE and the assertion fails, the data is marked as defective. For defective data, all subsequent rules are handled by the defect_fun function.

description

A character string with a description of the assertion. The description is then displayed in the validation report.

skip_chain_opts

When data is wrapped with the validate function, the success_fun and error_fun parameters are overwritten with success_append and error_append, respectively. To use the parameters passed to the function directly, set skip_chain_opts to TRUE.

success_fun

Function that is called when the validation passes

error_fun

Function that is called when the validation fails

defect_fun

Function that is called when the data is marked as defective

See Also

validate_cols validate_if
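
Row validation combines a row reduction function with a predicate, as in the sketch below. It uses num_row_NAs, col_concat, within_bounds and is_uniq from assertr; the selected mtcars columns and bounds are illustrative assumptions.

```r
library(magrittr)
library(assertr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars checks") %>%
  # Count missing values per row, then require the count to be at most 2.
  validate_rows(num_row_NAs, within_bounds(0, 2), vs, am, mpg,
                description = "At most 2 missing values per row") %>%
  # Concatenate the selected columns per row, then require unique values.
  validate_rows(col_concat, is_uniq, mpg, hp, wt,
                description = "mpg, hp and wt combination is unique") %>%
  add_results(report)

print(report)
```

In both checks the reduction function first collapses each row into one value, and the predicate is then applied to that derived column.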
