Type: | Package |
Title: | Automatic Data Validation and Reporting |
Version: | 0.2.1 |
Description: | Validate dataset by columns and rows using convenient predicates inspired by 'assertr' package. Generate good looking HTML report or print console output to display in logs of your data processing pipeline. |
URL: | https://appsilon.github.io/data.validator/, https://github.com/Appsilon/data.validator |
BugReports: | https://github.com/Appsilon/data.validator/issues |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.3 |
Language: | en-US |
Imports: | assertr (≥ 2.8), shiny.semantic (≥ 0.3.3), knitr, purrr, dplyr, tidyr, utils, R6, rlang, rmarkdown, htmltools, htmlwidgets, tibble |
Suggests: | covr, fixtuRes, fs, lintr, magrittr, rcmdcheck, readr, shiny, spelling, targets, testthat, visNetwork, withr |
Collate: | 'results_parsers.R' 'semantic_report_constructors.R' 'utils.R' 'report.R' 'assertions.R' |
NeedsCompilation: | no |
Packaged: | 2023-12-11 10:01:11 UTC; kuba |
Author: | Marcin Dubel [aut, cre], Paweł Przytuła [aut], Jakub Nowicki [aut], Krystian Igras [aut], Dominik Krzeminski [ctb], Servet Ahmet Çizmeli [ctb], Appsilon Sp. z o.o. [cph] |
Maintainer: | Marcin Dubel <opensource+marcin@appsilon.com> |
Repository: | CRAN |
Date/Publication: | 2023-12-11 10:20:06 UTC |
Add validation results to the Report object
Description
This function adds results to the validator object, aggregating a summary of success, error, and warning checks. It also parses the assertr result attributes and stores them in a usable table.
Usage
add_results(data, report)
Arguments
data |
Data that was validated. |
report |
Report object to store validation results. |
Defensive wrapper to add evaluation error to regular validation errors.
Description
Defensive wrapper to add evaluation error to regular validation errors.
Usage
check_assertr_expression(this_call, data, description, error_fun)
Arguments
this_call |
Assertion command that is checked for evaluation errors. |
data |
A data.frame or tibble to test. |
description |
A character string with description of assertion. |
error_fun |
Function that is called when the validation fails |
Value
Validation object with evaluation errors added to the list, if any occurred.
See Also
validate_if
Examples
## Not run:
library(fixtuRes)
library(magrittr)
library(assertr)
library(data.validator)

my_mock_generator <- fixtuRes::MockDataGenerator$new("fixtures_config.yml")
my_data_frame <- my_mock_generator$get_data("my_data_frame", 10)
report <- data.validator::data_validation_report()

validate(my_data_frame, name = "Verifying data uniqueness") %>%
  validate_if(has_all_names("id", "code", "test"), description = "All columns are there") %>%
  validate_if(is.character(test), description = "TEST column is string") %>%
  validate_if(is_uniq(id), description = "ID column is unique") %>%
  validate_if(!is.na(id) & id != "", description = "ID column is not empty") %>%
  validate_if(is.character(code), description = "CODE column is string") %>%
  validate_rows(col_concat, is_uniq, code, type, description = "CODE and TYPE is unique") %>%
  add_results(report)

print(report)
## End(Not run)
Convert error table column types
Description
Convert error table column types
Usage
convert_error_df(error_df)
Arguments
error_df |
Table containing assertr error details |
Create summary table row.
Description
Create summary table row.
Usage
create_summary_row(id, number, color, label)
Arguments
id |
ID. |
number |
Number to display. |
color |
Color of the label. |
label |
Label to display. |
Value
Summary table row.
Create new validator object
Description
Returns an R6 class object responsible for storing validation results.
Usage
data_validation_report()
Displays results of validations.
Description
Displays results of validations.
Usage
display_results(data, n_passes, n_fails, n_warns, df_error_head_n)
Arguments
data |
Report data. |
n_passes |
Number of successful assertions. |
n_fails |
Number of failed assertions. |
n_warns |
Number of warning assertions. |
df_error_head_n |
Number of rows to display in error table. |
Value
Validation report.
Constants
Description
Constants
Usage
error_class
Format
An object of class character of length 1.
Create a recursive function to find the first non-call object
Description
This function recursively descends into the provided list (an R expression) until it finds an object that is not a function call or a complex command. The second element ([[2]]) is followed at each step because, in the list representation of a function call in R, the first element is the function itself and the subsequent elements are its arguments. Element 2 therefore generally refers to the first argument of the function call.
Usage
find_first_noncall(object)
Arguments
object |
A list representing an R expression. |
Value
The first non-call object found in the list representation of an R expression.
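The descent through first arguments can be illustrated with a minimal sketch. This is not the package's implementation: it assumes the expression is held as a quoted R call rather than the list form used internally, and first_noncall_sketch is a hypothetical name introduced here.

```r
# Hypothetical helper (not part of data.validator): follow the first
# argument of nested calls until something that is not a call is reached.
first_noncall_sketch <- function(object) {
  if (is.call(object)) {
    # object[[1]] is the function being called, object[[2]] its first argument
    first_noncall_sketch(object[[2]])
  } else {
    object
  }
}

# A nested call whose innermost first argument is the symbol my_data
expr <- quote(head(filter(my_data, x > 1), 5))
result <- first_noncall_sketch(expr)
print(result)
```

The sketch assumes every call on the path has at least one argument; a call such as f() would need an extra length check.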
Generate a random ID.
Description
Generate a random ID.
Usage
generate_id()
Value
A character string containing a random ID.
Match proper method depending on predicate type
Description
Match proper method depending on predicate type
Usage
get_assert_method(
predicate,
method = list(direct = assertr::assert, generator = assertr::insist)
)
Arguments
predicate |
Predicate or predicate generator function. |
method |
optional list with fields direct and generator of assertions |
Get assertion type
Description
Get assertion type
Usage
get_assertion_type(assertion)
Arguments
assertion |
assertion object (check |
Value
character with id of assertion: "error", "success", "warning"
Constructs an Abstract Syntax Tree for an expression
Description
This function breaks down an R expression into a list structure, creating a tree-like representation of the code.
Usage
get_ast(exp)
Arguments
exp |
An R expression to be parsed into a list structure. |
Value
A list structure that represents the input R expression.
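One way to picture the list structure is the sketch below. It is an assumption about the general technique (recursively converting calls with as.list), not the package's code, and ast_sketch is a hypothetical name.

```r
# Hypothetical helper (not part of data.validator): turn a quoted call
# into nested lists, leaving symbols and constants untouched.
ast_sketch <- function(exp) {
  if (is.call(exp)) {
    # as.list() splits a call into the function and its arguments
    lapply(as.list(exp), ast_sketch)
  } else {
    exp
  }
}

tree <- ast_sketch(quote(f(x, g(y))))
str(tree)
```

Here the outer list has three elements (f, x, and a nested list for g(y)), which mirrors the tree shape described above.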
Extract the name of the initial data object in a magrittr pipe chain
Description
This function analyzes the call stack, identifies the first call using the magrittr pipe operator ('%>%'), and extracts the name of the initial object in that pipe chain.
Usage
get_first_name()
Value
A string representing the name of the initial data object in the pipe chain.
Get validation results
Description
The result contains information about successful, failed, and warning assertions, stored in a table that holds the key details of the validation results. Its columns are:
table_name - name of validated table
assertion.id - id used for each assertion
description - assertion description
num.violations - number of violations (assertion and column specific)
call - assertion call
message - assertion result message for specific column
type - error, warning or success
error_df - nested table storing details about error or warning result (like violated indexes and values)
Usage
get_results(report, unnest = FALSE)
Arguments
report |
Report object that stores validation results. See add_results. |
unnest |
If TRUE, the error_df table is unnested, which results in the remaining columns being duplicated across rows. |
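The effect of unnest can be seen in a short sketch. It assumes the built-in mtcars dataset and the assertr predicate in_set; the validation name and description strings are illustrative only.

```r
library(assertr)
library(magrittr)
library(data.validator)

report <- data_validation_report()

# cyl also contains the value 8, so this check is expected to fail
validate(mtcars, name = "mtcars example") %>%
  validate_cols(in_set(c(4, 6)), cyl, description = "cyl is 4 or 6") %>%
  add_results(report)

# Error details kept in the nested error_df column
nested <- get_results(report)

# Error details spread across rows, other columns repeated
unnested <- get_results(report, unnest = TRUE)
```

The unnested form is usually easier to export or filter, while the nested form keeps one row per assertion result.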
Get results number
Description
Get results number
Usage
get_results_number(results)
Arguments
results |
assertion results |
Value
table with results number
Generate HTML report.
Description
Generate HTML validation report.
Usage
get_semantic_report_ui(
n_passes,
n_fails,
n_warns,
validation_results,
df_error_head_n
)
Arguments
n_passes |
Number of passed validations |
n_fails |
Number of failed validations. |
n_warns |
Number of warnings. |
validation_results |
Data frame with validation results. |
df_error_head_n |
Number of rows to display in error table. |
Value
HTML validation report.
Check if a command is complex, i.e., contains any non-alphanumeric character
Description
Check if a command is complex, i.e., contains any non-alphanumeric character
Usage
is_complex_command(command_string)
Arguments
command_string |
A character string representing the command to be checked. |
Value
Logical value indicating whether the command_string is complex (TRUE) or not (FALSE).
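Taking the description literally, the check could be sketched with a regular expression as below. This is an assumption, not the package's code: is_complex_sketch is a hypothetical name, and the real function may treat characters such as "_" or "." differently.

```r
# Hypothetical helper (not part of data.validator): TRUE when the string
# contains at least one character that is not a letter or a digit.
is_complex_sketch <- function(command_string) {
  grepl("[^[:alnum:]]", command_string)
}

simple_result <- is_complex_sketch("mydata")         # plain name
complex_result <- is_complex_sketch("head(mydata)")  # contains parentheses
print(c(simple_result, complex_result))
```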
Create a UI accordion container.
Description
Create a UI accordion container.
Usage
make_accordion_container(...)
Arguments
... |
Additional arguments inside accordion container. |
Value
Accordion container.
Create a UI accordion element.
Description
Create a UI accordion element.
Usage
make_accordion_element(
results,
color = "green",
label,
active = FALSE,
type,
mark,
df_error_head_n
)
Arguments
results |
Results to display. |
color |
Color of the label icon. |
label |
Label. |
active |
Is active? |
type |
Result type. |
mark |
Icon to display. |
df_error_head_n |
Number of rows to display in error table. |
Value
Accordion.
Create summary table.
Description
Create summary table.
Usage
make_summary_table(n_passes, n_fails, n_warns)
Arguments
n_passes |
Number of passed validations. |
n_fails |
Number of failed validations. |
n_warns |
Number of warnings. |
Value
Summary table.
Create table row.
Description
Create table row.
Usage
make_table_row(results, type, mark, df_error_head_n)
Arguments
results |
Results to display in a row. |
type |
Result type. |
mark |
Icon to display. |
df_error_head_n |
Number of rows to display in error table. |
Value
Table row.
Parse errors to data.frame
Description
Parse errors to data.frame
Usage
parse_errors_to_df(data)
Arguments
data |
object of assertr error class (check |
Value
data.frame with errors
Parse results to data.frame
Description
Parse results to data.frame
Usage
parse_results_to_df(data)
Arguments
data |
assertr object (check |
Value
data.frame with successes and errors
Parse successes to data.frame
Description
Parse successes to data.frame
Usage
parse_successes_to_df(data)
Arguments
data |
object of assertr success class (check |
Value
data.frame with successes
Prepare modal content.
Description
Prepare modal content.
Usage
prepare_modal_content(error, df_error_head_n)
Arguments
error |
Assertr error. |
df_error_head_n |
Number of rows to display in error table. |
Value
Modal content.
Render simple version of report
Description
Renders the content of the simple report version, which prints the validation_results table.
Usage
render_raw_report_ui(
validation_results,
success = TRUE,
warning = TRUE,
error = TRUE
)
Arguments
validation_results |
Validation results table (see get_results). |
success |
Should success results be presented? |
warning |
Should warning results be presented? |
error |
Should error results be presented? |
Render semantic version of report
Description
Renders content of semantic report version.
Usage
render_semantic_report_ui(
validation_results,
success = TRUE,
warning = TRUE,
error = TRUE,
df_error_head_n = 6L
)
Arguments
validation_results |
Validation results table (see get_results). |
success |
Should success results be presented? |
warning |
Should warning results be presented? |
error |
Should error results be presented? |
df_error_head_n |
Number of rows to display in error table.
Works in the same way as |
Create table with results.
Description
Create table with results.
Usage
result_table(results, type, mark, df_error_head_n)
Arguments
results |
Result to display in table. |
type |
Result type. |
mark |
Icon to display. |
df_error_head_n |
Number of rows to display in error table. |
Value
Table row.
Saving results as an HTML report
Description
Saving results as an HTML report
Usage
save_report(
report,
output_file = "validation_report.html",
output_dir = getwd(),
ui_constructor = render_semantic_report_ui,
template = system.file("rmarkdown/templates/standard/skeleton/skeleton.Rmd", package =
"data.validator"),
...
)
Arguments
report |
Report object that stores validation results. |
output_file |
HTML file name to write the report to. |
output_dir |
Target report directory. |
ui_constructor |
Function of |
template |
Path to Rmd template in which ui_constructor is rendered. See
|
... |
Additional parameters passed to |
Saving results table to external file
Description
Saving results table to external file
Usage
save_results(report, file_name = "results.csv", method = utils::write.csv, ...)
Arguments
report |
Report object that stores validation results. See get_results. |
file_name |
Name of the resulting file (including extension). |
method |
Function that should be used to save results table (write.csv default)
The function passed to |
... |
Remaining parameters passed to |
Save simple validation summary in text file
Description
Saves print(validator) output inside a text file.
Usage
save_summary(
report,
file_name = "validation_log.txt",
success = TRUE,
warning = TRUE,
error = TRUE
)
Arguments
report |
Report object that stores validation results. |
file_name |
Name of the resulting file (including extension). |
success |
Should success results be presented? |
warning |
Should warning results be presented? |
error |
Should error results be presented? |
Create a UI segment element.
Description
Create a UI segment element.
Usage
segment(title, ...)
Arguments
title |
Title of the segment. |
... |
Additional arguments inside segment. |
Value
Segment.
Prepare data for validation chain
Description
Prepare data for validation and generating report. The function prepares data for chain validation and ensures all the validation results are gathered correctly. The function also attaches additional information to the data (name and description) that is then displayed in validation report.
Usage
validate(data, name, description = NULL)
Arguments
data |
data.frame or tibble to test |
name |
name of validation object (will be displayed in the report) |
description |
description of validation object (will be displayed in the report) |
Validation on columns
Description
Validation on columns
Usage
validate_cols(
data,
predicate,
...,
obligatory = FALSE,
description = NA,
skip_chain_opts = FALSE,
success_fun = assertr::success_append,
error_fun = assertr::error_append,
defect_fun = assertr::defect_append
)
Arguments
data |
A data.frame or tibble to test |
predicate |
Predicate function or predicate generator such as |
... |
Columns selection that |
obligatory |
If TRUE and the assertion fails, the data is marked as defective. For defective data, all subsequent rules are handled by the defect_fun function. |
description |
A character string with description of assertion. The description is then displayed in the validation report |
skip_chain_opts |
While wrapping data with validate function, |
success_fun |
Function that is called when the validation passes |
error_fun |
Function that is called when the validation fails |
defect_fun |
Function that is called when the data is marked as defective |
See Also
validate_if validate_rows
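A minimal usage sketch for column checks follows. It assumes the built-in mtcars dataset and the assertr predicates not_na and in_set plus the predicate generator within_n_sds; the names and descriptions are illustrative only.

```r
library(assertr)
library(magrittr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars columns") %>%
  # plain predicate applied to each selected column
  validate_cols(not_na, mpg, cyl, description = "mpg and cyl have no missing values") %>%
  validate_cols(in_set(c(0, 1)), vs, am, description = "vs and am are 0 or 1") %>%
  # predicate generator: bounds are computed from the column itself
  validate_cols(within_n_sds(3), mpg, description = "mpg lies within 3 standard deviations") %>%
  add_results(report)

print(report)
results <- get_results(report)
```

Because the default error_fun appends errors instead of stopping, a failing rule would be recorded in the report and the chain would continue.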
Verify if expression regarding data is TRUE
Description
The function checks whether all the logical values returned by the expression are TRUE. The function is meant for handling all the cases that cannot be reached by using validate_cols and validate_rows functions.
Usage
validate_if(
data,
expr,
description = NA,
obligatory = FALSE,
skip_chain_opts = FALSE,
success_fun = assertr::success_append,
error_fun = assertr::error_append,
defect_fun = assertr::defect_append
)
Arguments
data |
A data.frame or tibble to test |
expr |
A Logical expression to test for, e.g. |
description |
A character string with description of assertion. The description is then displayed in the validation report |
obligatory |
If TRUE and the assertion fails, the data is marked as defective. For defective data, all subsequent rules are handled by the defect_fun function. |
skip_chain_opts |
While wrapping data with validate function, |
success_fun |
Function that is called when the validation passes |
error_fun |
Function that is called when the validation fails |
defect_fun |
Function that is called when the data is marked as defective |
See Also
validate_cols validate_rows
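As a sketch of expression-based checks, the example below validates conditions that combine several tests in one logical expression. It assumes the built-in mtcars dataset; the names and descriptions are illustrative only.

```r
library(assertr)
library(magrittr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars expressions") %>%
  # the expression is evaluated with the columns of the data in scope
  validate_if(drat > 0, description = "drat is positive") %>%
  validate_if(!is.na(mpg) & mpg < 100, description = "mpg is present and below 100") %>%
  add_results(report)

results <- get_results(report)
print(results$type)
```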
Validation on rows
Description
Validation on rows
Usage
validate_rows(
data,
row_reduction_fn,
predicate,
...,
obligatory = FALSE,
description = NA,
skip_chain_opts = FALSE,
success_fun = assertr::success_append,
error_fun = assertr::error_append,
defect_fun = assertr::defect_append
)
Arguments
data |
A data.frame or tibble to test |
row_reduction_fn |
Function that should reduce rows into a single column that is passed to
validation e.g. |
predicate |
Predicate function or predicate generator such as |
... |
Columns selection that |
obligatory |
If TRUE and the assertion fails, the data is marked as defective. For defective data, all subsequent rules are handled by the defect_fun function. |
description |
A character string with description of assertion. The description is then displayed in the validation report |
skip_chain_opts |
While wrapping data with validate function, |
success_fun |
Function that is called when the validation passes |
error_fun |
Function that is called when the validation fails |
defect_fun |
Function that is called when the data is marked as defective |
See Also
validate_cols validate_if
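Row-wise validation can be sketched as follows: each row of the selected columns is first reduced to a single value, and the predicate is then applied to those values. The example assumes the built-in mtcars dataset, base rowSums, and the assertr helpers num_row_NAs and within_bounds; the names and descriptions are illustrative only.

```r
library(assertr)
library(magrittr)
library(data.validator)

report <- data_validation_report()

validate(mtcars, name = "mtcars rows") %>%
  # reduce each row to vs + am, then check the sum
  validate_rows(rowSums, within_bounds(0, 2), vs, am,
                description = "vs + am is between 0 and 2") %>%
  # reduce each row to its number of missing values
  validate_rows(num_row_NAs, within_bounds(0, 2), mpg, cyl, hp,
                description = "each row has at most 2 missing values") %>%
  add_results(report)

results <- get_results(report)
print(results$type)
```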