Audit sampling: Get started

Koen Derks

2023-04-07

Introduction

Welcome to the ‘Audit sampling’ vignette of the jfa package. Here you can find a detailed explanation of the functions in the package that facilitate the statistical audit sampling workflow. Specifically, these functions implement standard audit sampling techniques to calculate sample sizes, select items from a population, and evaluate misstatement in a data sample. The jfa package enables users to create a prior probability distribution to perform Bayesian audit sampling using these functions. For more detailed explanations of each function, read the other vignettes on the package website.

Cheat sheet

The cheat sheet below will help you get started with jfa’s intended audit sampling workflow. You can download a PDF version of the cheat sheet here.

[Image: jfa audit sampling cheat sheet]

Functions and intended usage


Below you can find an explanation of the available functions in jfa sorted by their occurrence in the standard audit sampling workflow.

Create a prior distribution with auditPrior()

Lifecycle: stable

The auditPrior() function specifies a prior distribution for Bayesian audit sampling. Its interface allows full customization of the prior distribution as well as a formal translation of pre-existing audit information into a prior distribution. The function returns an object of class jfaPrior, which can be used with the associated summary() and plot() methods. Objects of class jfaPrior can also be passed to the prior argument of other functions. Moreover, jfaPrior objects have a corresponding predict() method that produces the predictions of the prior distribution at the data level.

Full function with default arguments:

auditPrior(method = c(
             "default", "strict", "param", "impartial", "hyp",
             "arm", "bram", "sample", "factor", "nonparam"
           ),
           likelihood = c(
             "poisson", "binomial", "hypergeometric",
             "normal", "uniform", "cauchy", "t", "chisq",
             "exponential"
           ),
           N.units = NULL,
           alpha = NULL,
           beta = NULL,
           materiality = NULL,
           expected = 0,
           ir = NULL,
           cr = NULL,
           ub = NULL,
           p.hmin = NULL,
           x = NULL,
           n = NULL,
           factor = NULL,
           samples = NULL,
           conf.level = 0.95)

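Supported options for the method argument: default, strict, param, impartial, hyp, arm, bram, sample, factor, nonparam.

Supported options for the likelihood argument: poisson, binomial, hypergeometric, normal, uniform, cauchy, t, chisq, exponential.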

Example usage:

# Default beta(1, 1) prior distribution
x <- auditPrior(method = "default", likelihood = "binomial")

# Custom gamma(1, 10) prior distribution
x <- auditPrior(method = "param", likelihood = "poisson", alpha = 1, beta = 10)

# Beta prior distribution incorporating inherent risk (70%) and control risk (50%)
x <- auditPrior(method = "arm", likelihood = "binomial", materiality = 0.05, ir = 0.7, cr = 0.5)

summary(x) # Prints information about the prior distribution
## 
##  Prior Distribution Summary
## 
## Options:
##   Likelihood:                    binomial 
##   Specifics:                     ir = 0.7; cr = 0.5; dr = 0.1428571 
## 
## Results:
##   Functional form:               beta(α = 1, β = 21) 
##   Mode:                          0 
##   Mean:                          0.045455 
##   Median:                        0.032468 
##   Variance:                      0.0018865 
##   Skewness:                      1.7442 
##   Information entropy (nat):     -2.0921 
##   95 percent upper bound:        0.13295 
##   Precision:                     0.13295
predict(x, n = 20, cumulative = TRUE) # Predictions for a sample of n = 20
##      x<=0      x<=1      x<=2      x<=3      x<=4      x<=5      x<=6      x<=7 
## 0.5121951 0.7682927 0.8930582 0.9521576 0.9793114 0.9913797 0.9965519 0.9986816 
##      x<=8      x<=9     x<=10     x<=11     x<=12     x<=13     x<=14     x<=15 
## 0.9995206 0.9998352 0.9999468 0.9999841 0.9999956 0.9999989 0.9999998 1.0000000 
##     x<=16     x<=17     x<=18     x<=19     x<=20 
## 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
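
The figures printed above can be cross-checked by hand with base R. The sketch below assumes that the detection risk follows the audit risk model (audit risk = ir × cr × dr, with audit risk equal to 1 - conf.level) and takes the beta(1, 21) parameters from the summary output as given. It does not show how jfa derives those parameters internally, and the variable names are illustrative only.

```r
# Cross-check of the printed prior summary using base R only.
# Assumption: detection risk follows the audit risk model AR = IR * CR * DR.
audit_risk <- 1 - 0.95
dr <- audit_risk / (0.7 * 0.5)

# Prior parameters copied from the summary output (not derived here).
a <- 1
b <- 21

prior_mean  <- a / (a + b)
prior_var   <- (a * b) / ((a + b)^2 * (a + b + 1))
prior_upper <- qbeta(0.95, a, b) # 95 percent upper bound

# Prior predictive (beta-binomial) probabilities of k errors in n = 20 items.
n <- 20
k <- 0:n
pred <- choose(n, k) * beta(a + k, b + n - k) / beta(a, b)
cum_pred <- cumsum(pred)

print(round(c(dr = dr, mean = prior_mean, var = prior_var, upper = prior_upper), 5))
print(round(cum_pred[1:4], 7))
```

Because the mode of this prior is 0, the precision (upper bound minus mode) equals the upper bound, which is why both lines of the summary show the same value.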

Plan a sample with planning()

Lifecycle: stable

The planning() function is used to calculate a minimum sample size for audit samples. It allows specification of statistical requirements for the sample with respect to the performance materiality or the precision. The function returns an object of class jfaPlanning which can be used with associated summary() and plot() methods. To perform Bayesian planning, the input for the prior argument can be an object of class jfaPrior as returned by the auditPrior() function, or an object of class jfaPosterior as returned by the evaluation() function.

Full function with default arguments:

planning(materiality = NULL,
         min.precision = NULL,
         expected = 0,
         likelihood = c("poisson", "binomial", "hypergeometric"),
         conf.level = 0.95,
         N.units = NULL,
         by = 1,
         max = 5000,
         prior = FALSE)

Supported options for the likelihood argument: poisson, binomial, hypergeometric.

Example usage:

# Classical planning using the Poisson likelihood
x <- planning(materiality = 0.03, likelihood = "poisson")

# Bayesian planning using a default beta(1, 1) prior and binomial likelihood
x <- planning(materiality = 0.03, likelihood = "binomial", prior = TRUE)

# Bayesian planning using a custom beta(1, 10) prior and binomial likelihood
x <- planning(
  materiality = 0.03,
  prior = auditPrior(method = "param", likelihood = "binomial", alpha = 1, beta = 10)
)

summary(x) # Prints information about the planning
## 
##  Bayesian Audit Sample Planning Summary
## 
## Options:
##   Confidence level:              0.95 
##   Materiality:                   0.03 
##   Hypotheses:                    H₀: Θ > 0.03 vs. H₁: Θ < 0.03 
##   Expected:                      0 
##   Likelihood:                    binomial 
##   Prior distribution:            beta(α = 1, β = 10) 
## 
## Results:
##   Minimum sample size:           89 
##   Tolerable errors:              0 
##   Posterior distribution:        beta(α = 1, β = 99) 
##   Expected most likely error:    0 
##   Expected upper bound:          0.029807 
##   Expected precision:            0.029807 
##   Expected BF₁₀:                 54.479
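
To see where the minimum sample size of 89 comes from, the planning result can be reproduced with a short base R search. This is a minimal sketch that assumes a beta prior, a binomial likelihood, and zero expected errors. The helper min_sample_size() is hypothetical and is not jfa's implementation.

```r
# Sketch of Bayesian planning with a beta(a, b) prior and binomial likelihood,
# assuming zero expected errors. Hypothetical helper, not part of jfa.
min_sample_size <- function(materiality, a, b, conf.level = 0.95, max = 5000) {
  for (n in 0:max) {
    # With 0 errors in n items, the posterior is beta(a, b + n).
    upper <- qbeta(conf.level, a, b + n)
    if (upper < materiality) {
      return(list(n = n, upper = upper))
    }
  }
  stop("No sample size up to 'max' meets the materiality requirement.")
}

plan <- min_sample_size(materiality = 0.03, a = 1, b = 10)

# Bayes factor in favor of H1: theta < materiality (posterior odds / prior odds).
odds <- function(p) p / (1 - p)
bf10 <- odds(pbeta(0.03, 1, 10 + plan$n)) / odds(pbeta(0.03, 1, 10))

print(plan$n)
print(round(c(upper = plan$upper, bf10 = bf10), 5))
```

The search stops at the first n for which the 95 percent posterior upper bound drops below the materiality, which matches the expected upper bound and the expected Bayes factor in the summary.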

Select sample items with selection()

Lifecycle: stable

The selection() function is used to perform statistical selection of audit samples. It offers flexible implementations of the most common audit sampling algorithms for attributes sampling and monetary unit sampling. The function returns an object of class jfaSelection which can be used with associated summary() and plot() methods. The input for the size argument can be an object of class jfaPlanning as returned by the planning() function.

Full function with default arguments:

selection(data,
          size,
          units = c("items", "values"),
          method = c("interval", "cell", "random", "sieve"),
          values = NULL,
          order = NULL,
          decreasing = FALSE,
          randomize = FALSE,
          replace = FALSE,
          start = 1)

Supported options for the units argument: items, values.

Supported options for the method argument: interval, cell, random, sieve.

Example usage:

# Selection using random record (attributes) sampling
x <- selection(data = BuildIt, size = 100, units = "items", method = "random")

# Selection using fixed interval monetary unit sampling (using column 'bookValue' in BuildIt)
x <- selection(
  data = BuildIt, size = 100, units = "values",
  method = "interval", values = "bookValue"
)

summary(x) # Prints information about the selection
## 
##  Audit Sample Selection Summary
## 
## Options:
##   Requested sample size:         100 
##   Sampling units:                monetary units 
##   Method:                        fixed interval sampling 
##   Starting point:                1 
## 
## Data:
##   Population size:               3500 
##   Population value:              1403221 
##   Selection interval:            14032 
## 
## Results:
##   Selected sampling units:       100 
##   Proportion of value:           0.037014 
##   Selected items:                100 
##   Proportion of size:            0.028571
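
The selection interval in the summary is the population value divided by the sample size (1403221 / 100 ≈ 14032). The idea behind fixed interval monetary unit sampling can be sketched in base R as follows. The population data frame below is hypothetical illustration data, not the BuildIt data set, and the code is a simplified sketch rather than jfa's implementation.

```r
# Sketch of fixed interval monetary unit sampling in base R.
# 'population' is hypothetical illustration data, not the BuildIt data set.
population <- data.frame(
  id = 1:8,
  bookValue = c(120, 40, 300, 75, 210, 55, 500, 100)
)

size <- 4
start <- 1
interval <- sum(population$bookValue) / size

# Monetary units to select: one per interval, beginning at unit 'start'.
targets <- start + (0:(size - 1)) * interval

# Map each selected monetary unit to the item whose cumulative value range
# contains it; left.open = TRUE assigns a boundary unit to the earlier item.
upper_bounds <- cumsum(population$bookValue)
rows <- findInterval(targets, upper_bounds, left.open = TRUE) + 1

selected <- population[rows, ]
print(selected)
```

Items with larger book values span more monetary units and are therefore more likely to contain a selected unit, which is the defining property of monetary unit sampling.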

Evaluate a sample with evaluation()

Lifecycle: stable

The evaluation() function takes a sample or summary statistics of the sample and performs evaluation according to the specified method and sampling objectives. The function returns an object of class jfaEvaluation which can be used with associated summary() and plot() methods. To perform Bayesian evaluation, the input for the prior argument can be an object of class jfaPrior as returned by the auditPrior() function, or an object of class jfaPosterior as returned by the evaluation() function.

Full function with default arguments:

evaluation(materiality = NULL, 
           method = c(
             "poisson", "binomial", "hypergeometric",
             "stringer", "stringer.meikle", "stringer.lta", "stringer.pvz",
             "rohrbach", "moment", "coxsnell",
             "direct", "difference", "quotient", "regression", "mpu"
           ),
           alternative = c("less", "two.sided", "greater"),
           conf.level = 0.95,
           data = NULL,
           values = NULL,
           values.audit = NULL,
           strata = NULL,
           times = NULL,
           x = NULL,
           n = NULL,
           N.units = NULL,
           N.items = NULL,
           pooling = c("none", "complete", "partial"), 
           prior = FALSE)

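Supported options for the method argument: poisson, binomial, hypergeometric, stringer, stringer.meikle, stringer.lta, stringer.pvz, rohrbach, moment, coxsnell, direct, difference, quotient, regression, mpu.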

Example usage:

# Classical evaluation using the Poisson likelihood (and summary statistics)
x <- evaluation(materiality = 0.03, x = 1, n = 100, method = "poisson")

# Bayesian evaluation using a default minimal information prior (and summary statistics)
x <- evaluation(materiality = 0.03, x = 1, n = 100, method = "poisson", prior = TRUE)

# Bayesian evaluation using a custom beta(1, 10) prior (and summary statistics)
x <- evaluation(
  materiality = 0.03, x = 1, n = 100,
  prior = auditPrior(method = "param", likelihood = "binomial", alpha = 1, beta = 10)
)

summary(x) # Prints information about the evaluation
## 
##  Bayesian Audit Sample Evaluation Summary
## 
## Options:
##   Confidence level:               0.95 
##   Materiality:                    0.03 
##   Hypotheses:                     H₀: Θ > 0.03 vs. H₁: Θ < 0.03 
##   Method:                         binomial 
##   Prior distribution:             beta(α = 1, β = 10) 
## 
## Data:
##   Sample size:                    100 
##   Number of errors:               1 
##   Sum of taints:                  1 
## 
## Results:
##   Posterior distribution:         beta(α = 2, β = 109) 
##   Most likely error:              0.0091743 
##   95 percent credible interval:   [0, 0.042399] 
##   Precision:                      0.033225 
##   BF₁₀:                            15.385
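
The results in this summary follow from the conjugate beta-binomial update and can be verified with base R. The sketch below assumes the beta(1, 10) prior and the summary statistics from the example (1 error in 100 items). The variable names are illustrative, and the code is a hand calculation rather than jfa's internal implementation.

```r
# Cross-check of the Bayesian evaluation summary using base R only.
# Prior beta(1, 10); data: 1 error in a sample of 100 items.
a0 <- 1
b0 <- 10
errors <- 1
n <- 100
materiality <- 0.03

# Conjugate update: errors add to alpha, correct items add to beta.
a1 <- a0 + errors
b1 <- b0 + n - errors

mle       <- (a1 - 1) / (a1 + b1 - 2) # posterior mode (most likely error)
upper     <- qbeta(0.95, a1, b1)      # one-sided 95 percent upper bound
precision <- upper - mle

# Bayes factor in favor of H1: theta < materiality (posterior odds / prior odds).
odds <- function(p) p / (1 - p)
bf10 <- odds(pbeta(materiality, a1, b1)) / odds(pbeta(materiality, a0, b0))

print(c(alpha = a1, beta = b1))
print(round(c(mle = mle, upper = upper, precision = precision, bf10 = bf10), 5))
```

Since the upper bound of about 0.042 exceeds the materiality of 0.03, this sample does not support the conclusion at the 95 percent level that the misstatement is below the materiality, even though the Bayes factor favors H₁.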

Create a report with report()

Lifecycle: experimental

The report() function takes an object of class jfaEvaluation as returned by the evaluation() function and automatically creates an HTML or PDF report containing the analysis results and their interpretation.

Full function with default arguments:

report(object,
       file = "report.html",
       format = c("html_document", "pdf_document"))

Example usage:

# Generate an automatic report
report(object = x, file = 'myReport.html')

For an example report, see the following link.

Benchmarks

To validate the statistical results, jfa’s automated unit tests regularly verify the main output from the package against the following benchmarks:

Statistical tables

Below you can find several informative tables that contain statistical sample sizes, upper limits, one-sided p values, and Bayes factors. These tables are created using the planning() and evaluation() functions provided in the package.

Sample sizes

Upper limits

One-sided p values

Bayes factors

References