README

vazul

vazul is an R package for analysis blinding in research contexts. It offers two main approaches to anonymizing data while preserving analytical validity: masking (replacing values with anonymous labels) and scrambling (randomizing the order of existing values).

Analysis Blinding Approaches

Masking replaces original values with anonymous labels, completely hiding the original information:

treatment <- c("control", "treatment", "control")
mask_labels(treatment)
#> "masked_group_01" "masked_group_02" "masked_group_01"

Scrambling keeps the original values but randomizes their order:

scramble_values(treatment)
#> "treatment" "control" "control"  # Same values, different order
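
The difference between the two approaches can be illustrated in base R without the package. The sketch below only conveys the general idea and is not vazul's implementation: `mask_sketch()` and `scramble_sketch()` are hypothetical helper names, and the sketch assigns labels in order of first appearance, which may differ from how the package assigns them.

```r
# Hypothetical helpers illustrating the two ideas in base R.
# They are NOT vazul's implementation.

# Masking: map each distinct value to an anonymous label.
mask_sketch <- function(x, prefix = "masked_group_") {
  codes <- match(x, unique(x))        # 1, 2, ... in order of first appearance
  sprintf("%s%02d", prefix, codes)
}

# Scrambling: keep the values, permute their order.
scramble_sketch <- function(x) {
  x[sample.int(length(x))]
}

treatment <- c("control", "treatment", "control")

masked <- mask_sketch(treatment)
set.seed(1)
scrambled <- scramble_sketch(treatment)

# Masking keeps the grouping structure but hides the labels:
# equal original values share a label, and no original label survives.
identical(masked[1], masked[3])
any(masked %in% treatment)

# Scrambling keeps the same set of values, only their positions change.
identical(sort(scrambled), sort(treatment))
```

In short, masking hides what the groups are while keeping who belongs together, whereas scrambling keeps the values visible but breaks their link to the other variables in the data.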

Installation

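# Install from the default package repository
install.packages("vazul")

# Or install from GitHub (requires the remotes package)
remotes::install_github("nthun/vazul")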

Functions

Masking Functions

Replace categorical values with anonymous labels to completely hide original information.

mask_labels() - Mask vector values

library(vazul)

# Basic masking
treatment <- c("control", "treatment", "control", "treatment")
set.seed(123)
mask_labels(treatment)
#> "masked_group_01" "masked_group_02" "masked_group_01" "masked_group_02"

# Custom prefix
mask_labels(treatment, prefix = "group_")
#> "group_01" "group_02" "group_01" "group_02"

mask_variables() - Mask data frame columns

df <- data.frame(
  condition = c("A", "B", "A", "B"),
  treatment = c("ctrl", "test", "ctrl", "test"),
  score = c(85, 92, 78, 88)
)

# Mask multiple columns
mask_variables(df, c("condition", "treatment"))

# Use tidyselect helpers
mask_variables(df, where(is.character))

Set .across_variables = TRUE to mask consistently across multiple columns, so that the selected columns share one set of masked labels (e.g., longitudinal data in wide format).

df <- data.frame(
  wave_1 = c("A", "B", "A"),
  wave_2 = c("B", "A", "B"),
  score = c(10, 20, 30)
)

# Mask across variables consistently
mask_variables(df, starts_with("wave_"), .across_variables = TRUE)
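
To make the idea of consistent masking concrete, here is a minimal base-R sketch that builds a single lookup table from all selected columns and applies it to each of them. It is an illustration under assumptions, not the package's code: `shared_lookup` is a hypothetical name, and labels are assigned in order of first appearance.

```r
# Base-R sketch of masking several columns with one shared lookup.
# Not vazul's implementation; `shared_lookup` is a hypothetical name.
df <- data.frame(
  wave_1 = c("A", "B", "A"),
  wave_2 = c("B", "A", "B"),
  score = c(10, 20, 30),
  stringsAsFactors = FALSE
)
cols <- c("wave_1", "wave_2")

# Build one lookup from the union of values in all selected columns
values <- unique(unlist(df[cols], use.names = FALSE))
shared_lookup <- setNames(
  sprintf("masked_group_%02d", seq_along(values)),
  values
)

# Apply the same lookup to every selected column
masked <- df
masked[cols] <- lapply(df[cols], function(col) unname(shared_lookup[col]))

masked
```

Because both columns use the same lookup, a participant who moves from "A" in wave_1 to "B" in wave_2 still shows a change of label, so within-person transitions remain analyzable while the original category names stay hidden.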

Scrambling Functions

scramble_values() - Scramble vector order

# Numeric data
set.seed(123) 
scramble_values(1:5)
#> [1] 3 2 5 4 1

# Categorical data
scramble_values(c("A", "B", "C", "A", "B"))
#> [1] "B" "A" "C" "B" "A"

scramble_variables() - Scramble data frame columns

df <- data.frame(x = 1:6, group = rep(c("A", "B"), each = 3))

# Scramble across entire column
scramble_variables(df, "x")

# Scramble within groups
scramble_variables(df, "x", .groups = "group")

# Using dplyr grouping
library(dplyr)
df |> group_by(group) |> scramble_variables("x")
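
Within-group scrambling permutes values only among rows that belong to the same group, so each group keeps its own set of values. A minimal base-R sketch of that idea, using `ave()` rather than the package (the `x_scrambled` column name is chosen for illustration only):

```r
# Base-R sketch of within-group scrambling; not vazul's implementation.
df <- data.frame(x = 1:6, group = rep(c("A", "B"), each = 3))

set.seed(42)
# ave() applies the function separately to the values of x in each group
# and puts the results back in the original row positions
df$x_scrambled <- ave(
  df$x, df$group,
  FUN = function(v) v[sample.int(length(v))]
)

# Each group still holds exactly its own values
split(df$x_scrambled, df$group)
```

Group A can only ever contain the values 1 to 3 and group B the values 4 to 6, whatever the seed. This preserves group-level summaries such as means while breaking the link between individual rows and their values.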

Row-wise scrambling: Use .byrow = TRUE to shuffle values within each row across the selected columns.

df_items <- data.frame(
  item1 = c(1, 4, 7),
  item2 = c(2, 5, 8), 
  item3 = c(3, 6, 9)
)

# Shuffles values horizontally within each row
scramble_variables(df_items, item1:item3, .byrow = TRUE)
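
The row-wise behavior can be pictured with a short base-R sketch that permutes the values inside each row independently. This is an illustration of the idea only, not the package's implementation; `shuffled` is a hypothetical name.

```r
# Base-R sketch of row-wise scrambling; not vazul's implementation.
df_items <- data.frame(
  item1 = c(1, 4, 7),
  item2 = c(2, 5, 8),
  item3 = c(3, 6, 9)
)

set.seed(42)
# apply() over rows returns one column per input row, so transpose back
shuffled <- t(apply(df_items, 1, function(row) row[sample.int(length(row))]))
shuffled <- as.data.frame(shuffled)
names(shuffled) <- names(df_items)

# Each row keeps its own values, so the row sums are unchanged
rowSums(shuffled)
```

Since values never leave their row, row-level totals stay intact, while the mapping between a value and its item column is no longer informative.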

Datasets

MARP Dataset

Data from the Many Analysts Religion Project, a study of religiosity and well-being: 10,535 participants across 24 countries.

Williams Dataset

Data from an experimental study of risk-taking behavior under different wealth conditions: 112 participants.

Explanation of the package name

Vazul was a Hungarian prince in the 11th century. The king had him blinded to make him unfit for the throne. More info: https://en.wikipedia.org/wiki/Vazul

Documentation

Authors

Citation

Nagy, T., Kovács, M., & Sarafoglou, A. (2026). vazul: An R package for analysis blinding. Zenodo. https://doi.org/10.5281/zenodo.18269711

License
