The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Increasingly sophisticated MCMC and Monte Carlo algorithms raise the scope for errors, either in derivations of mathematical quantities needed for sampling, or implementation errors. Testing for such errors should be an integral and routine part of the workflow of any Bayesian analysis.
mcunit makes it easy to do this, by allowing for unit testing MCMC and other Monte Carlo implementations within the framework of the testthat package (Wickham 2011). The methods correspond to those proposed in Gandy and Scott (2020). They are based on statistical hypothesis testing, and are designed to achieve an arbitrarily low false rejection probability, so that the user can be confident that a correctly implemented algorithm will almost certainly not be rejected. By default, this false rejection rate is set to \(10^{-5}\).
MCMC samplers can be testes using expect_mcmc
or expect_mcmc_reversible
. These functions use different statistical hypothesis tests described in Section 2.1 and 2.2 of (Gandy and Scott 2020) respectively. expect_mcmc_reversible
requires that the sampler to be tested is (or at least, is expected to be) reversible.
This vignette demonstrates how to use these tests, using the following simple example. Consider the model \[y \sim \theta_1 + \theta_2 + \epsilon,\] where \(\theta := (\theta_1,\theta_2)\) is apriori independent, zero mean normal with standard deviation \(\sigma=10\). The white noise term \(\epsilon\) is independent from \(\theta\) and also zero mean normal but with variance \(\sigma_{\epsilon}^2\) = 0.1. While inference is easy here, we consider drawing samples from the posterior \(\pi(\theta \mid y)\) using a Gibbs sampler. The posterior conditional distributions for \(\theta_1\) and \(\theta_2\) are normal with expectations \[\mathbb{E}(\theta_i | y, \theta_j) = \frac{\sigma^2}{\sigma^2_{\epsilon} + \sigma^2}(y - \theta_j),\] and variances \[\mathbb{V}(\theta_i | y, \theta_j) = \frac{1}{\frac{1}{\sigma^2_{\epsilon}} + \frac{1}{\sigma^2}}.\]
Start with a correctly implemented random scan Gibbs sampler. First load the package and set the seed.
## Loading required package: mcunit
The following function updates one element of \(\theta\) given the other and given the observed data \(y\).
gibbsUpdate <- function(y, theta_j) {
# Samples theta_i given y and theta_j
mean <- 100 * (y - theta_j) / (100 + 0.1)
var <- 1. / (1. / 100 + 1. / 0.1)
rnorm(1, mean=mean, sd=sqrt(var))
}
The argument y
is the observed data, and theta_j
is the component to condition on. From this, we define a correctly implemented random scan sampler.
randomScan <- function(theta, y, thinning) {
# Random Scan Gibbs
for(i in 1:thinning) {
# select index to update
i <- sample.int(2,1)
theta[i] <- gibbsUpdate(y, theta[i %% 2 + 1])
}
theta
}
This function takes the current state of the chain theta
, the data y
and a thinning parameter which determines the number of individual updates to make before returning the new state.
Our task is to test the correctness of randomScan
. We will do this using both expect_mcmc
and expect_mcmc_reversible
. This latter method can be used because the sampler should, if correct, be reversible.
Both methods require a list object
which describes the MCMC sampler to be tested. The list must contain the following elements:
object$genprior
: A function with no arguments that generates a sample from the prior distribution. No default value.object$gendata
: A function that takes as input the parameter value (of the type generated by genprior) and returns the observed data as an arbitrary R object. No default value.object$stepMCMC
: A function that takes three arguments:
theta
: the current position of the chain (of the same type as produced by the prior),dat
: the observed data (of the same type as produced by gendat)thinning
: the number of steps the chain should take. 1 corresponds to one step.object$test
: Function that takes either one or two arguments, and returns a vector with components which will be used for checking the MCMC sampler. The first argument is interpreted as a parameter value, and if a second argument exists, it is interpreted as a data value. An example is the identity function: \(f(\theta) = \theta\). Alternatively, if you have access to the model’s likelihood function, you could use \(p(y \mid \theta)\).Please see the documentation for expect_mcmc
and expect_mcmc_reversible
for further details on this.
We begin constructing this list by defining the prior sampler and the data sampler.
obj <- list()
obj$genprior <- function() rnorm(n=2, mean=0, sd=10)
obj$gendata <- function(theta) sum(theta) + rnorm(n=1, mean=0, sd=sqrt(0.1))
As test functions, we use the components \(\theta_1\), \(theta_2\), the prior density \(\pi(\theta)\) and likelihood function \(p(y \mid \theta)\). The density and likelihood appear to be a good default choice because they are one-dimensional regardless of the dimension of the parameter vector and data.
priorDensity <- function(theta) prod(dnorm(theta, mean=0, sd=10))
likelihood <- function(theta, y) dnorm(y, mean=sum(theta), sd=sqrt(0.1))
testVec <- function(theta, y) c(theta[1], theta[2], priorDensity(theta), likelihood(theta, y))
obj$test <- testVec
Finally, we are left to define a single MCMC step.
## $genprior
## function() rnorm(n=2, mean=0, sd=10)
##
## $gendata
## function(theta) sum(theta) + rnorm(n=1, mean=0, sd=sqrt(0.1))
##
## $test
## function(theta, y) c(theta[1], theta[2], priorDensity(theta), likelihood(theta, y))
##
## $stepMCMC
## function(theta, dat, thinning) randomScan(theta, dat, thinning)
First consider expect_mcmc
. We use \(100\) thinning steps between samples to give the test some power to detect errors.
No errors were detected, because there are none. Recall that the false rejection rate is set to \(10^{-5}\) by default, and so repeated application of this test should very rarely flag randomScan.
This sampler is also reversible, and so we try expect_mcmc_reversible
.
Again, no error is detected (as expected).
Consider systematic scan of the vector \(\theta\) rather than random scan.
systematicScan <- function(theta, y, thinning) {
# Systematic Scan Gibbs
for(i in 1:thinning) {
theta[1] <- gibbsUpdate(y, theta[2])
theta[2] <- gibbsUpdate(y, theta[1])
}
theta
}
This sampler is correct and expect_mcmc
is unlikely to falsely detect errors.
obj$stepMCMC <- function(theta, dat, thinning) systematicScan(theta, dat, thinning)
expect_mcmc(obj, thinning = 100)
Good. What happens using expect_mcmc_reversible
with thinning of \(1\)?
The test failed. This illustrates why expect_mcmc_reversible
should not be used for a non-reversible sampler - we have no guarantee over the false rejection rate. Even though this sampler is correctly implemented, we are erroneously detecting an error.
Now we make a genuine mistake in the sampler, and investigate whether the methods detect it. We replace the variance terms \(\sigma^2\) and \(\sigma^2_{\epsilon}\) in \(\mathbb{V}(\theta_i | y, \theta_j)\) with their corresponding standard deviations (i.e. \(\sigma\) and \(\sigma_{\epsilon}\)).
gibbsUpdate <- function(y, theta_j) {
# Samples theta_i given y and theta_j
mean <- 100 * (y - theta_j) / (100 + 0.1)
var <- 1. / (1. / 10 + 1. / sqrt(0.1))
rnorm(1, mean=mean, sd=sqrt(var))
}
Also make sure to switch back to random scan…
Rerunning the tests…
## Error: Test failed for components with p-values=NULL in iteration 1
## Error: Test failed for components with p-values=NULL in iteration 1
Great! Both tests detect a problem. Even better, you can be pretty much sure there is a problem because the chance of a false rejection is upper bounded by \(10^{-5}\)!
Gandy, A., and Scott, J. (2020), “Unit testing for MCMC and other Monte Carlo methods.”
Wickham, H. (2011), “Testthat: Get started with testing,” The R Journal, 3, 5–10.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.