Type: | Package |
Title: | Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters |
Version: | 1.7.1 |
Description: | Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported. |
License: | MIT + file LICENSE |
URL: | https://epiforecasts.io/EpiNow2/, https://epiforecasts.io/EpiNow2/dev/, https://github.com/epiforecasts/EpiNow2 |
BugReports: | https://github.com/epiforecasts/EpiNow2/issues |
Depends: | R (≥ 3.5.0) |
Imports: | checkmate, cli, data.table, futile.logger (≥ 1.4), ggplot2, lifecycle, lubridate, methods, patchwork, posterior, purrr, R.utils (≥ 2.0.0), Rcpp (≥ 0.12.0), rlang (≥ 0.4.7), rstan (≥ 2.26.0), rstantools (≥ 2.2.0), runner, scales, stats, truncnorm, utils |
Suggests: | cmdstanr, covr, future, future.apply, here, knitr, precommit, progressr, rmarkdown, spelling, testthat, usethis, withr |
LinkingTo: | BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.26.0), StanHeaders (≥ 2.26.0) |
Additional_repositories: | https://stan-dev.r-universe.dev |
Biarch: | true |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
Language: | en-GB |
LazyData: | true |
RoxygenNote: | 7.3.2.9000 |
NeedsCompilation: | yes |
SystemRequirements: | GNU make C++17 |
VignetteBuilder: | knitr |
Packaged: | 2025-02-19 16:50:44 UTC; eidesfun |
Author: | Sam Abbott |
Maintainer: | Sebastian Funk <sebastian.funk@lshtm.ac.uk> |
Repository: | CRAN |
Date/Publication: | 2025-02-19 23:40:09 UTC |
EpiNow2: Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
Description
Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) doi:10.12688/wellcomeopenres.16006.1), and current best practices (Gostic et al. (2020) doi:10.1101/2020.06.18.20134858). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
Author(s)
Maintainer: Sebastian Funk sebastian.funk@lshtm.ac.uk (ORCID)
Authors:
Sam Abbott sam.abbott@lshtm.ac.uk (ORCID)
Joel Hellewell joel.hellewell@lshtm.ac.uk (ORCID)
Katharine Sherratt katharine.sherratt@lshtm.ac.uk
Katelyn Gostic kgostic@uchicago.edu
Joe Hickson joseph.hickson@metoffice.gov.uk
Hamada S. Badr badr@jhu.edu (ORCID)
Michael DeWitt me.dewitt.jr@gmail.com (ORCID)
James M. Azam james.azam@lshtm.ac.uk (ORCID)
EpiForecasts
Other contributors:
Robin Thompson robin.thompson@lshtm.ac.uk [contributor]
Sophie Meakin sophie.meaking@lshtm.ac.uk [contributor]
James Munday james.munday@lshtm.ac.uk [contributor]
Nikos Bosse [contributor]
Paul Mee paul.mee@lshtm.ac.uk [contributor]
Peter Ellis peter.ellis2013nz@gmail.com [contributor]
Pietro Monticone pietro.monticone@edu.unito.it [contributor]
Lloyd Chapman lloyd.chapman1@lshtm.ac.uk [contributor]
Andrew Johnson andrew.johnson@arjohnsonau.com [contributor]
Kaitlyn Johnson johnsonkaitlyne9@gmail.com (ORCID) [contributor]
See Also
Useful links:
Report bugs at https://github.com/epiforecasts/EpiNow2/issues
Creates a delay distribution as the sum of two other delay distributions.
Description
Creates a delay distribution as the sum of two other delay distributions.
Usage
## S3 method for class 'dist_spec'
e1 + e2
Arguments
e1 |
The first delay distribution (of type <dist_spec>) to combine. |
e2 |
The second delay distribution (of type <dist_spec>) to combine. |
Value
A delay distribution representing the sum of the two delays
Examples
# A fixed lognormal distribution with meanlog 1.6 and sdlog 1 (median about 5).
dist1 <- LogNormal(
meanlog = 1.6, sdlog = 1, max = 20
)
dist1 + dist1
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- Gamma(
shape = Normal(3, 0.5),
rate = Normal(2, 0.5),
max = 20
)
dist1 + dist2
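For fixed, discretised delays, the sum of two independent delays has a probability mass function equal to the convolution of the two individual PMFs. The base R sketch below illustrates only that idea. `convolve_pmfs` is a hypothetical helper, not an EpiNow2 function, and the sketch makes no claim about how the package represents or combines distributions internally.

```r
# Hypothetical helper (not an EpiNow2 function): convolve two zero-indexed
# PMFs, i.e. element 1 holds the probability of a delay of 0.
convolve_pmfs <- function(pmf1, pmf2) {
  out <- numeric(length(pmf1) + length(pmf2) - 1)
  for (i in seq_along(pmf1)) {
    # a delay of (i - 1) from pmf1 shifts pmf2 by (i - 1) positions
    idx <- i + seq_along(pmf2) - 1
    out[idx] <- out[idx] + pmf1[i] * pmf2
  }
  out
}

pmf_a <- c(0.2, 0.5, 0.3) # delays of 0, 1 and 2 days
pmf_b <- c(0.5, 0.5)      # delays of 0 and 1 day
pmf_sum <- convolve_pmfs(pmf_a, pmf_b)
pmf_sum
```

The result covers delays from 0 up to the sum of the two maximum delays and still sums to one.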
Compares two delay distributions
Description
Compares two delay distributions
Usage
## S3 method for class 'dist_spec'
e1 == e2
## S3 method for class 'dist_spec'
e1 != e2
Arguments
e1 |
The first delay distribution (of type <dist_spec>) to compare. |
e2 |
The second delay distribution (of type <dist_spec>) to compare. |
Value
TRUE or FALSE
Examples
Fixed(1) == Normal(1, 0.5)
Probability distributions
Description
Probability distributions
Generates a nonparametric distribution.
Usage
LogNormal(meanlog, sdlog, mean, sd, ...)
Gamma(shape, rate, scale, mean, sd, ...)
Normal(mean, sd, ...)
Fixed(value, ...)
NonParametric(pmf, ...)
Arguments
meanlog , sdlog |
mean and standard deviation of the distribution
on the log scale with default values of |
mean , sd |
mean and standard deviation of the distribution |
... |
arguments to define the limits of the distribution that will be
passed to |
shape , scale |
shape and scale parameters. Must be positive,
|
rate |
an alternative way to specify the scale. |
value |
Value of the fixed (delta) distribution |
pmf |
Probability mass of the given distribution; this is passed as a zero-indexed numeric vector (i.e. the first entry represents the probability mass of zero). If not summing to one it will be normalised to sum to one internally. |
Details
Probability distributions are ubiquitous in EpiNow2, usually representing epidemiological delays (e.g., the generation time for delays between becoming infected and infecting others, or reporting delays).
They are generated using functions that have a name corresponding to the probability distribution that is being used. They generate dist_spec objects that are then passed to the models underlying EpiNow2.
All parameters can be given either as fixed values (a numeric value) or as uncertain values (a dist_spec). If given as uncertain values, currently only normally distributed parameters (generated using Normal()) are supported.
Each distribution has a representation in terms of "natural" parameters (the ones used in stan) but can sometimes also be specified using other parameters such as the mean or standard deviation of the distribution. If not given as natural parameters then these will be calculated from the given parameters. If they have uncertainty, this will be done by random sampling from the given uncertainty and converting resulting parameters to their natural representation.
Currently available distributions are lognormal, gamma, normal, fixed (delta) and nonparametric. The nonparametric is a special case where the probability mass function is given directly as a numeric vector.
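As an illustration of converting from mean and standard deviation to natural parameters, the base R sketch below uses moment matching for the gamma distribution, then repeats the conversion on random draws of an uncertain mean. `gamma_natural` is a hypothetical helper and the Normal(4, 0.1) uncertainty is an assumed value chosen for illustration; neither is taken from the package.

```r
# Hypothetical helper (not an EpiNow2 function): moment matching for a gamma
# distribution, using mean = shape / rate and sd^2 = shape / rate^2.
gamma_natural <- function(mean, sd) {
  list(shape = mean^2 / sd^2, rate = mean / sd^2)
}

# Fixed parameters
pars <- gamma_natural(mean = 4, sd = 1)
pars

# Uncertain mean (assumed Normal(4, 0.1)): sample, then convert each draw
set.seed(1)
mean_draws <- rnorm(1000, mean = 4, sd = 0.1)
draws <- gamma_natural(mean_draws, sd = 1)
c(shape_mean = mean(draws$shape), shape_sd = sd(draws$shape))
```

With fixed inputs this reproduces the correspondence between Gamma(mean = 4, sd = 1) and Gamma(shape = 16, rate = 4) used in the examples.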
Value
A dist_spec representing a distribution of the given specification.
Examples
LogNormal(mean = 4, sd = 1)
LogNormal(mean = 4, sd = 1, max = 10)
# If specifying uncertain parameters, use the natural parameters
LogNormal(meanlog = Normal(1.5, 0.5), sdlog = 0.25, max = 10)
Gamma(mean = 4, sd = 1)
Gamma(shape = 16, rate = 4)
Gamma(shape = Normal(16, 2), rate = Normal(4, 1))
Normal(mean = 4, sd = 1)
Normal(mean = 4, sd = 1, max = 10)
Fixed(value = 3)
Fixed(value = 3.5)
NonParametric(c(0.1, 0.3, 0.2, 0.4))
NonParametric(c(0.1, 0.3, 0.2, 0.1, 0.1))
Convert Reproduction Numbers to Growth Rates
Description
See here for justification. Now handled internally by stan so may be removed in future updates if no user demand.
Usage
R_to_growth(R, gamma_mean, gamma_sd)
Arguments
R |
Numeric, Reproduction number estimates |
gamma_mean |
Numeric, mean of the gamma distribution |
gamma_sd |
Numeric, standard deviation of the gamma distribution. |
Value
Numeric vector of growth rate estimates
Examples
R_to_growth(2.18, 4, 1)
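For a gamma-distributed generation time with shape a and scale s, a standard relation between the reproduction number R and the exponential growth rate r is R = (1 + r * s)^a. The sketch below inverts that relation in base R. `growth_from_R` is a hypothetical name and this is an illustration under that stated assumption, not the package source.

```r
# Hypothetical helper (not the package source): growth rate implied by a
# reproduction number under a gamma-distributed generation time.
growth_from_R <- function(R, gamma_mean, gamma_sd) {
  shape <- (gamma_mean / gamma_sd)^2
  scale <- gamma_sd^2 / gamma_mean
  # invert R = (1 + r * scale)^shape
  (R^(1 / shape) - 1) / scale
}

r <- growth_from_R(2.18, gamma_mean = 4, gamma_sd = 1)
r

# Round trip: recover R from r with shape = 16 and scale = 0.25
R_back <- (1 + r * 0.25)^16
R_back
```

A reproduction number of 1 maps to a growth rate of 0, and values above 1 map to positive growth.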
Add breakpoints to certain dates in a data set.
Description
Add breakpoints to certain dates in a data set.
Usage
add_breakpoints(data, dates = as.Date(character(0)))
Arguments
data |
A |
dates |
A vector of dates to use as breakpoints. |
Value
A data.table with breakpoint set to 1 on each of the specified dates.
Examples
reported_cases <- add_breakpoints(example_confirmed, as.Date("2020-03-26"))
Adds a day of the week vector
Description
Adds a day of the week vector
Usage
add_day_of_week(dates, week_effect = 7)
Arguments
dates |
Vector of dates |
week_effect |
Numeric from 1 to 7; defaults to 7. |
Value
A numeric vector containing the period day of the week index
Examples
## Not run:
dates <- seq(as.Date("2020-03-15"), by = "days", length.out = 15)
# Add date based day of week
add_day_of_week(dates, 7)
# Add shorter week
add_day_of_week(dates, 4)
## End(Not run)
Add missing values for future dates
Description
Add missing values for future dates
Usage
add_horizon(data, horizon, accumulate = 1L, obs_column = "confirm", by = NULL)
Arguments
data |
Data frame with a |
horizon |
Deprecated; use |
accumulate |
The number of days to accumulate when generating posterior prediction, e.g. 7 for weekly accumulated forecasts. If this is not set an attempt will be made to detect the accumulation frequency in the data. |
obs_column |
Character (default: "confirm"). If given, only the column specified here will be used for checking missingness. This is useful if using a data set that has multiple columns of which one corresponds to observations that are to be processed here. |
by |
Character vector. Name(s) of any additional column(s) where data processing should be done separately for each value in the column. This is useful when using data representing e.g. multiple geographies. If NULL (default) no such grouping is done. |
Value
A data.table with missing values for future dates
Allocate Delays into Required Stan Format
Description
Allocate delays for stan. Used in delay_opts().
Usage
allocate_delays(delay_var, no_delays)
Arguments
delay_var |
List of numeric delays |
no_delays |
Numeric, number of delays |
Value
A numeric array
Allocate Empty Parameters to a List
Description
Allocate missing parameters to be empty two dimensional arrays. Used internally by forecast_infections().
Usage
allocate_empty(data, params, n = 0)
Arguments
data |
A list of parameters |
params |
A character vector of parameters to allocate to empty if missing. |
n |
Numeric, number of samples to assign an empty array |
Value
A list of parameters some allocated to be empty
Apply default CDF cutoff to a <dist_spec> if it is unconstrained
Description
Apply default CDF cutoff to a <dist_spec> if it is unconstrained
Usage
apply_default_cdf_cutoff(dist, default_cdf_cutoff, cdf_cutoff_set)
Arguments
dist |
A <dist_spec> |
default_cdf_cutoff |
Numeric; default CDF cutoff to be used if an
unconstrained distribution is passed as |
cdf_cutoff_set |
Logical; whether the default CDF cutoff has been set by
the user; if yes and |
Value
A <dist_spec> with the default CDF cutoff set if previously not constrained
Applies a threshold to all nonparametric distributions in a <dist_spec>
Description
This function is deprecated. Use bound_dist() instead.
Usage
apply_tolerance(x, tolerance)
Arguments
x |
A |
tolerance |
Numeric; the desired tolerance level. Any part of the cumulative distribution function beyond 1 minus this tolerance level is removed. |
Value
A <dist_spec> where probability masses below the threshold level have been removed
Convert zero case counts to NA (missing) if the 7-day average is above a threshold.
Description
This function aims to detect spurious zeroes by comparing the 7-day average of the case counts to a threshold. If the 7-day average is above the threshold, the zero case count is replaced with NA.
Usage
apply_zero_threshold(data, threshold = Inf, obs_column = "confirm")
Arguments
data |
A |
threshold |
Numeric, defaults to |
obs_column |
Character (default: "confirm"). If given, only the column specified here will be used for checking missingness. This is useful if using a data set that has multiple columns of which one corresponds to observations that are to be processed here. |
Value
A data.table with the zero threshold applied.
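The thresholding rule can be sketched in base R on a plain numeric vector. `zero_to_na` is a hypothetical helper, and the window definition is an assumption: here the "7-day average" is taken as the mean of the 7 observations preceding each day, which may differ from the window used by the package.

```r
# Hypothetical helper (not the package source). Assumption: the 7-day
# average is the mean of the 7 observations before the day being checked.
zero_to_na <- function(x, threshold = Inf) {
  out <- x
  for (i in seq_along(x)) {
    if (i > 7 && !is.na(x[i]) && x[i] == 0) {
      avg <- mean(x[(i - 7):(i - 1)], na.rm = TRUE)
      # a zero following a high recent average is treated as spurious
      if (!is.nan(avg) && avg > threshold) out[i] <- NA
    }
  }
  out
}

counts <- c(50, 60, 55, 70, 65, 80, 75, 0, 90, 0, 0)
cleaned <- zero_to_na(counts, threshold = 10)
cleaned
```

With the default threshold of Inf no value is ever replaced, so the function only has an effect when a finite threshold is supplied.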
Back Calculation Options
Description
Defines a list specifying the optional arguments for the back calculation of cases. Only used if rt = NULL.
Usage
backcalc_opts(
prior = c("reports", "none", "infections"),
prior_window = 14,
rt_window = 1
)
Arguments
prior |
A character string defaulting to "reports". Defines the prior
to use when deconvolving. Currently implemented options are to use smoothed
mean delay shifted reported cases ("reports"), to use the estimated
infections from the previous time step seeded for the first time step using
mean shifted reported cases ("infections"), or no prior ("none"). Using no
prior will result in poor real time performance. No prior and using
infections are only supported when a Gaussian process is present. If
observed data is not reliable then a sensible first step is to explore
increasing the |
prior_window |
Integer, defaults to 14 days. The mean centred smoothing window to apply to mean shifted reports (used as a prior during back calculation). 7 days is the minimum recommended setting as this smooths day of the week effects, but larger values may be appropriate depending on the quality of the data and the amount of information users wish to use as a prior (higher values equalling a less informative prior). |
rt_window |
Integer, defaults to 1. The size of the centred rolling average to use when estimating Rt. This must be odd so that the central estimate is included. |
Value
A <backcalc_opts> object of back calculation settings.
Examples
# default settings
backcalc_opts()
Fit a Subsampled Bootstrap to Integer Values and Summarise Distribution Parameters
Description
Fits an integer adjusted distribution to a subsampled bootstrap of data and
then integrates the posterior samples into a single set of summary
statistics. Can be used to generate a robust reporting delay that accounts
for the fact the underlying delay likely varies over time or that the size
of the available reporting delay sample may not be representative of the
current case load.
Usage
bootstrapped_dist_fit(
values,
dist = "lognormal",
samples = 2000,
bootstraps = 10,
bootstrap_samples = 250,
max_value,
verbose = FALSE
)
Arguments
values |
Integer vector of values. |
dist |
Character string, which distribution to fit. Defaults to
lognormal ( |
samples |
Numeric, number of samples to take overall from the bootstrapped posteriors. |
bootstraps |
Numeric, defaults to 10. The number of bootstrap samples
(with replacement) of the delay distribution to take. If |
bootstrap_samples |
Numeric, defaults to 250. The number of samples to take in each bootstrap if the sample size of the supplied delay distribution is less than its value. |
max_value |
Numeric, defaults to the maximum value in the observed data. Maximum delay to allow (added to output but does not impact fitting). |
verbose |
Logical, defaults to |
Value
A <dist_spec> object summarising the bootstrapped distribution
Examples
# lognormal
delays <- rlnorm(500, log(5), 1)
out <- bootstrapped_dist_fit(delays,
samples = 1000, bootstraps = 10,
dist = "lognormal"
)
out
Define bounds of a <dist_spec>
Description
This sets attributes for further processing
Usage
bound_dist(x, max = Inf, cdf_cutoff = 0)
Arguments
x |
A |
max |
Numeric, maximum value of the distribution. The distribution will
be truncated at this value. Default: |
cdf_cutoff |
Numeric; the desired CDF cutoff. Any part of the
cumulative distribution function beyond 1 minus the value of this argument is
removed. Default: |
Value
A <dist_spec> with relevant attributes set that define its bounds
Combines multiple delay distributions for further processing
Description
This combines the parameters so that they can be fed as multiple delay distributions to epinow() or estimate_infections().
Note that distributions that already are combinations of other distributions cannot be combined with other combinations of distributions.
Usage
## S3 method for class 'dist_spec'
c(...)
Arguments
... |
The delay distributions to combine |
Value
Combined delay distributions (with class <dist_spec>)
Examples
# A fixed lognormal distribution with meanlog 1.6 and sdlog 1 (median about 5).
dist1 <- LogNormal(
meanlog = 1.6, sdlog = 1, max = 20
)
dist1 + dist1
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- Gamma(
shape = Normal(3, 0.5),
rate = Normal(2, 0.5),
max = 20
)
c(dist1, dist2)
Calculate Credible Interval
Description
Adds a symmetric credible interval based on quantiles.
Usage
calc_CrI(samples, summarise_by = NULL, CrI = 0.9)
Arguments
samples |
A data.table containing at least a value variable |
summarise_by |
A character vector of variables to group by. |
CrI |
Numeric between 0 and 1. The credible interval for which to return values. Defaults to 0.9. |
Value
A data.table containing the upper and lower bounds for the specified credible interval.
Examples
samples <- data.frame(value = 1:10, type = "car")
# add 90% credible interval
calc_CrI(samples)
# add 90% credible interval grouped by type
calc_CrI(samples, summarise_by = "type")
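A symmetric, quantile-based credible interval leaves equal probability in each tail. The base R sketch below shows the calculation for a single numeric vector. `quantile_cri` is a hypothetical helper, and the use of the default `stats::quantile` interpolation is an assumption rather than a statement about the package internals.

```r
# Hypothetical helper (not the package source): symmetric, quantile-based
# credible interval for a numeric vector of samples.
quantile_cri <- function(x, CrI = 0.9) {
  tail_prob <- (1 - CrI) / 2
  bounds <- stats::quantile(
    x, probs = c(tail_prob, 1 - tail_prob), names = FALSE
  )
  c(lower = bounds[1], upper = bounds[2])
}

cri <- quantile_cri(1:10)
cri

# A narrower interval sits inside a wider one
cri_50 <- quantile_cri(1:10, CrI = 0.5)
cri_50
```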
Calculate Credible Intervals
Description
Adds symmetric credible intervals based on quantiles.
Usage
calc_CrIs(samples, summarise_by = NULL, CrIs = c(0.2, 0.5, 0.9))
Arguments
samples |
A data.table containing at least a value variable |
summarise_by |
A character vector of variables to group by. |
CrIs |
Numeric vector of credible intervals to calculate. |
Value
A data.table containing the summarise_by variables and the specified lower and upper credible intervals.
Examples
samples <- data.frame(value = 1:10, type = "car")
# add credible intervals
calc_CrIs(samples)
# add credible intervals grouped by type
calc_CrIs(samples, summarise_by = "type")
Calculate All Summary Measures
Description
Calculate summary statistics and credible intervals from a <data.frame> by group.
Usage
calc_summary_measures(
samples,
summarise_by = NULL,
order_by = NULL,
CrIs = c(0.2, 0.5, 0.9)
)
Arguments
samples |
A data.table containing at least a value variable |
summarise_by |
A character vector of variables to group by. |
order_by |
A character vector of parameters to order by, defaults to
all |
CrIs |
Numeric vector of credible intervals to calculate. |
Value
A data.table containing summary statistics by group.
Examples
samples <- data.frame(value = 1:10, type = "car")
# default
calc_summary_measures(samples)
# by type
calc_summary_measures(samples, summarise_by = "type")
Calculate Summary Statistics
Description
Calculate summary statistics from a <data.frame> by group. Currently supports the mean, median and standard deviation.
Usage
calc_summary_stats(samples, summarise_by = NULL)
Arguments
samples |
A data.table containing at least a value variable |
summarise_by |
A character vector of variables to group by. |
Value
A data.table containing the mean, median and standard deviation by group
Examples
samples <- data.frame(value = 1:10, type = "car")
# default
calc_summary_stats(samples)
# by type
calc_summary_stats(samples, summarise_by = "type")
Validate probability distribution for using as generation time
Description
Does all the checks in check_stan_delay() and additionally makes sure that if dist is nonparametric, its first element is zero.
Usage
check_generation_time(dist)
Arguments
dist |
A |
Value
Called for its side effects.
Validate data input
Description
check_reports_valid() checks that the supplied data is a <data.frame>, and that it has the right column names and types. In particular, it checks that the date column is in date format and does not contain NAs, and that the other columns are numeric.
Usage
check_reports_valid(
data,
model = c("estimate_infections", "estimate_secondary")
)
Arguments
data |
A data frame with either:
|
model |
The EpiNow2 model to be used. Either "estimate_infections", "estimate_truncation", or "estimate_secondary". This is used to determine which checks to perform on the data input. |
Value
Called for its side effects.
Check that PMF tail is not sparse
Description
Checks if the tail of a PMF vector has more than span consecutive values smaller than tol and throws a warning if so.
Usage
check_sparse_pmf_tail(pmf, span = 5, tol = 1e-06)
Arguments
pmf |
A probability mass function vector |
span |
The number of consecutive indices in the tail to check |
tol |
The value below which to consider the tail as sparse |
Value
Called for its side effects.
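The check described above can be sketched in base R by counting the run of small values at the end of the PMF. `trailing_run` and `has_sparse_tail` are hypothetical helpers; unlike the package function, this sketch returns a logical value instead of issuing a warning.

```r
# Hypothetical helpers (not the package source).
# Length of the run of values smaller than `tol` at the end of `pmf`.
trailing_run <- function(pmf, tol) {
  small <- rev(pmf < tol)
  if (all(small)) length(small) else which(!small)[1] - 1
}

# TRUE if the tail has more than `span` consecutive values below `tol`.
has_sparse_tail <- function(pmf, span = 5, tol = 1e-6) {
  trailing_run(pmf, tol) > span
}

dense <- c(0.1, 0.3, 0.4, 0.2)
sparse <- c(0.5, 0.5, rep(1e-9, 6))

has_sparse_tail(dense)
has_sparse_tail(sparse)
```

A long run of negligible probability mass at the end of a PMF usually means the vector could be truncated without changing results in any meaningful way.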
Validate probability distribution for passing to stan
Description
check_stan_delay() checks that the supplied data is a <dist_spec>, that it is a supported distribution, and that it has a finite maximum.
Usage
check_stan_delay(dist)
Arguments
dist |
A |
Value
Called for its side effects.
Clean Nowcasts for a Supplied Date
Description
This function removes nowcasts in the format produced by EpiNow2 from a target directory for the date supplied.
Usage
clean_nowcasts(date = Sys.Date(), nowcast_dir = ".")
Arguments
date |
Date object. Defaults to today's date |
nowcast_dir |
Character string giving the filepath to the nowcast results directory. Defaults to the current directory. |
Value
No return value, called for side effects
Clean Regions
Description
Removes regions with insufficient time points, and provides logging
information on the input.
Usage
clean_regions(data, non_zero_points)
Arguments
data |
A |
non_zero_points |
Numeric, the minimum number of time points with non-zero cases in a region required for that region to be evaluated. Defaults to 7. |
Value
A dataframe of cleaned regional data
Collapse nonparametric distributions in a <dist_spec>
Description
This convolves any consecutive nonparametric distributions contained
in the <dist_spec>.
Usage
## S3 method for class 'dist_spec'
collapse(x, ...)
Arguments
x |
A |
... |
ignored |
Value
A <dist_spec> where consecutive nonparametric distributions have been convolved
Examples
# A fixed gamma distribution with mean 5 and sd 1.
dist1 <- Gamma(mean = 5, sd = 1, max = 20)
# An uncertain lognormal distribution with meanlog and sdlog normally
# distributed as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- LogNormal(
meanlog = Normal(3, 0.5),
sdlog = Normal(2, 0.5),
max = 20
)
# Collapse the discretised sum of the two distributions
collapse(discretise(dist1 + dist2, strict = FALSE))
Construct Output
Description
Combines the output produced internally by epinow into a single list.
Usage
construct_output(
estimates,
estimated_reported_cases,
plots = NULL,
summary = NULL,
samples = TRUE
)
Arguments
estimates |
List of data frames as output by |
estimated_reported_cases |
A list of dataframes as produced by
|
plots |
A list of plots as produced by |
summary |
A list of summary output as produced by |
samples |
Logical, defaults to TRUE. Should samples be saved |
Value
A list of output as returned by epinow
Convert mean and sd to log mean for a log normal distribution
Description
Convert from mean and standard deviation to the log mean of the lognormal distribution. Useful for defining distributions supported by estimate_infections(), epinow(), and regional_epinow().
Usage
convert_to_logmean(mean, sd)
Arguments
mean |
Numeric, mean of a distribution |
sd |
Numeric, standard deviation of a distribution |
Value
The log mean of a lognormal distribution
Examples
convert_to_logmean(2, 1)
Convert mean and sd to log standard deviation for a log normal distribution
Description
Convert from mean and standard deviation to the log standard deviation of the lognormal distribution. Useful for defining distributions supported by estimate_infections(), epinow(), and regional_epinow().
Usage
convert_to_logsd(mean, sd)
Arguments
mean |
Numeric, mean of a distribution |
sd |
Numeric, standard deviation of a distribution |
Value
The log standard deviation of a lognormal distribution
Examples
convert_to_logsd(2, 1)
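Both conversions follow from the moments of the lognormal distribution: mean = exp(meanlog + sdlog^2 / 2) and sd^2 = (exp(sdlog^2) - 1) * mean^2. The base R sketch below applies these identities and checks the round trip. `lognormal_natural` is a hypothetical helper written for illustration, not the package source.

```r
# Hypothetical helper (not the package source): convert the mean and sd of a
# lognormal distribution to its log-scale parameters.
lognormal_natural <- function(mean, sd) {
  sdlog <- sqrt(log(1 + sd^2 / mean^2))
  meanlog <- log(mean) - sdlog^2 / 2
  c(meanlog = meanlog, sdlog = sdlog)
}

pars <- lognormal_natural(mean = 2, sd = 1)
pars

# Round trip: recover the mean and sd from the log-scale parameters
mean_back <- exp(pars[["meanlog"]] + pars[["sdlog"]]^2 / 2)
sd_back <- sqrt((exp(pars[["sdlog"]]^2) - 1) * mean_back^2)
c(mean = mean_back, sd = sd_back)
```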
Internal function for converting parameters to natural parameters.
Description
This is used for preprocessing before generating a dist_spec object from a given set of parameters and distribution
Usage
convert_to_natural(params, distribution)
Arguments
params |
A numerical named parameter vector |
distribution |
Character; the distribution to use. |
Value
A list with two elements, params_mean and params_sd, containing mean and sd of natural parameters.
Examples
## Not run:
convert_to_natural(
params = list(mean = 2, sd = 1),
distribution = "gamma"
)
## End(Not run)
Convolve and scale a time series
Description
This applies a lognormal convolution with given, potentially time-varying parameters representing the parameters of the lognormal distribution used for the convolution and an optional scaling factor. This is akin to the model used in estimate_secondary() and simulate_secondary().
Usage
convolve_and_scale(
data,
type = c("incidence", "prevalence"),
family = c("none", "poisson", "negbin"),
delay_max = 30,
...
)
Arguments
data |
A |
type |
A character string indicating the type of observation the secondary reports are. Options include:
|
family |
Character string defining the observation model. Options are "none" (the default), meaning the expectation is returned, Poisson ("poisson"), and negative binomial ("negbin"). |
delay_max |
Integer, defaulting to 30 days. The maximum delay used in the convolution model. |
... |
Additional parameters to pass to the observation model (i.e
|
Details
Up to version 1.4.0 this function was called simulate_secondary().
Value
A <data.frame> containing simulated data in the format required by estimate_secondary().
See Also
estimate_secondary
Examples
# load data.table for manipulation
library(data.table)
#### Incidence data example ####
# make some example secondary incidence data
cases <- example_confirmed
cases <- as.data.table(cases)[, primary := confirm]
# Assume that only 40 percent of cases are reported
cases[, scaling := 0.4]
# Parameters of the assumed log normal delay distribution
cases[, meanlog := 1.8][, sdlog := 0.5]
# Simulate secondary cases
cases <- convolve_and_scale(cases, type = "incidence")
cases
#### Prevalence data example ####
# make some example prevalence data
cases <- example_confirmed
cases <- as.data.table(cases)[, primary := confirm]
# Assume that only 30 percent of cases are reported
cases[, scaling := 0.3]
# Parameters of the assumed log normal delay distribution
cases[, meanlog := 1.6][, sdlog := 0.8]
# Simulate secondary cases
cases <- convolve_and_scale(cases, type = "prevalence")
cases
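The core of the incidence case can be sketched in base R as a scaled convolution of the primary series with a discretised lognormal delay. This is a simplified illustration, not the package implementation: `convolve_scale_sketch` is a hypothetical helper, the parameters are held constant over time, the delay is discretised by differencing the lognormal CDF at integer days (an assumed scheme), and no observation noise is added.

```r
# Hypothetical helper (not the package source): expected secondary incidence
# as a scaled convolution of primary incidence with a lognormal delay.
convolve_scale_sketch <- function(primary, meanlog, sdlog, scaling = 1,
                                  delay_max = 30) {
  # Assumed discretisation: probability of a delay of d days is
  # P(d <= X < d + 1), renormalised over 0..delay_max
  cdf <- stats::plnorm(0:(delay_max + 1), meanlog, sdlog)
  pmf <- diff(cdf) / sum(diff(cdf))
  n <- length(primary)
  secondary <- numeric(n)
  for (i in seq_len(n)) {
    d <- 0:min(delay_max, i - 1)
    secondary[i] <- scaling * sum(pmf[d + 1] * primary[i - d])
  }
  secondary
}

# A single pulse of 100 primary events on day 6
primary <- c(rep(0, 5), 100, rep(0, 44))
secondary <- convolve_scale_sketch(
  primary, meanlog = 1.8, sdlog = 0.5, scaling = 0.4
)
round(secondary[1:15], 2)
```

Because the whole delay distribution fits inside the series, the expected secondary counts sum to the scaling factor times the size of the pulse.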
Copy Results From Dated Folder to Latest
Description
Copies output from the dated folder to a latest folder. May undergo changes in later releases.
Usage
copy_results_to_latest(target_folder = NULL, latest_folder = NULL)
Arguments
target_folder |
Character string specifying where to save results (will create if not present). |
latest_folder |
Character string containing the path to the latest
target folder. As produced by |
Value
No return value, called for side effects
Create Back Calculation Data
Description
Takes the output of backcalc_opts() and converts it into a list understood by stan.
Usage
create_backcalc_data(backcalc = backcalc_opts())
Arguments
backcalc |
A list of options as generated by |
Value
A list of settings defining the back calculation
See Also
backcalc_opts
Create initial conditions for delays
Description
Create initial conditions for delays
Usage
create_delay_inits(data)
Arguments
data |
A list of data as produced by |
Value
A list of initial conditions for delays
Construct the Required Future Rt assumption
Description
Converts the future argument from rt_opts() into arguments that can be passed to stan.
Usage
create_future_rt(future = c("latest", "project", "estimate"), delay = 0)
Arguments
future |
A character string or integer. This argument indicates how to set future Rt values. Supported options are to project using the Rt model ("project"), to use the latest estimate based on partial data ("latest"), to use the latest estimate based on data that is over 50% complete ("estimate"). If an integer is supplied then the Rt estimate from this many days into the future (or past if negative) will be used forwards in time. |
delay |
Numeric mean delay |
Value
A list containing a logical called fixed and an integer called from
Create Gaussian Process Data
Description
Takes the output of gp_opts() and converts it into a list understood by stan.
Usage
create_gp_data(gp = gp_opts(), data)
Arguments
gp |
A list of options as generated by |
data |
A list containing the following numeric values:
|
Value
A list of settings defining the Gaussian process
See Also
Examples
## Not run:
# define input data required
data <- list(
t = 30,
seeding_time = 7,
horizon = 7
)
# default gaussian process data
create_gp_data(data = data)
# settings when no gaussian process is desired
create_gp_data(NULL, data)
# custom lengthscale
create_gp_data(gp_opts(ls_mean = 14), data)
## End(Not run)
Create Initial Conditions Generating Function
Description
Uses the output of create_stan_data() to create a function which can be used to sample from the prior distributions (or as close as possible) for parameters. Used in order to initialise each stan chain within a range of plausible values.
Usage
create_initial_conditions(data)
Arguments
data |
A list of data as produced by |
Value
An initial condition generating function
Create Observation Model Settings
Description
Takes the output of obs_opts() and converts it into a list understood by stan.
Usage
create_obs_model(obs = obs_opts(), dates)
Arguments
obs |
A list of options as generated by |
dates |
A vector of dates used to calculate the day of the week. |
Value
A list of settings ready to be passed to stan defining the Observation Model
Examples
## Not run:
dates <- seq(as.Date("2020-03-15"), by = "days", length.out = 15)
# default observation model data
create_obs_model(dates = dates)
# Poisson observation model
create_obs_model(obs_opts(family = "poisson"), dates = dates)
# Applying an observation scaling to the data
create_obs_model(
obs_opts(scale = Normal(mean = 0.4, sd = 0.01)),
dates = dates
)
# Apply a custom week length
create_obs_model(obs_opts(week_length = 3), dates = dates)
## End(Not run)
Create Time-varying Reproduction Number Data
Description
Takes the output from rt_opts() and converts it into a list understood by stan.
Usage
create_rt_data(rt = rt_opts(), breakpoints = NULL, delay = 0, horizon = 0)
Arguments
rt |
A list of options as generated by |
breakpoints |
An integer vector (binary) indicating the location of breakpoints. |
delay |
Numeric mean delay |
horizon |
Numeric, forecast horizon. |
Value
A list of settings defining the time-varying reproduction number
See Also
rt_settings
Examples
## Not run:
# default Rt data
create_rt_data()
# settings when no Rt is desired
create_rt_data(rt = NULL)
# using breakpoints
create_rt_data(rt_opts(use_breakpoints = TRUE), breakpoints = rep(1, 10))
# using random walk
create_rt_data(rt_opts(rw = 7), breakpoints = rep(1, 10))
## End(Not run)
Create Delay Shifted Cases
Description
This function creates a data frame of reported cases that has been smoothed using a centred partial rolling average (with a period set by smoothing_window) and shifted back in time by some delay. It is used by estimate_infections() to generate the mean shifted prior on which the back calculation method (see backcalc_opts()) is based.
Usage
create_shifted_cases(data, shift, smoothing_window, horizon)
Arguments
data |
A |
shift |
Numeric, mean delay shift to apply. |
smoothing_window |
Numeric, the rolling average smoothing window to apply. Must be odd in order to be defined as a centred average. |
horizon |
Deprecated; use |
Details
The function first shifts all the data back in time by shift
days (thus
discarding the first shift
days of data) and then applies a centred
rolling mean of length smoothing_window
to the shifted data except for
the final period. The final period (the forecast horizon plus half the
smoothing window) is instead replaced by a log-linear model fit (with 1
added to the data for fitting to avoid zeroes and later subtracted again),
projected to the end of the forecast horizon. The initial part of the data
(corresponding to the length of the smoothing window) is then removed, and
any non-integer resulting values rounded up.
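The steps described in Details can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package implementation: the helper sketch_shifted_cases is hypothetical, it takes a plain numeric vector of observed counts (without NA rows for the horizon), and it fits the log-linear model to the last smoothing_window smoothed values.

```r
# Hypothetical helper illustrating the shift / smooth / extrapolate steps
sketch_shifted_cases <- function(confirm, shift, smoothing_window, horizon) {
  stopifnot(
    smoothing_window %% 2 == 1,
    length(confirm) > shift + 2 * smoothing_window
  )
  n <- length(confirm)
  total <- n + horizon
  half <- (smoothing_window - 1) / 2

  # 1. Shift back in time, discarding the first `shift` observations
  shifted <- c(confirm[(shift + 1):n], rep(NA_real_, shift + horizon))

  # 2. Centred rolling mean (NA wherever the full window is unavailable)
  weights <- rep(1 / smoothing_window, smoothing_window)
  smoothed <- as.numeric(stats::filter(shifted, weights, sides = 2))

  # 3. Log-linear fit (with 1 added to avoid zeroes) on the last smoothed
  #    values; this choice of fitting window is an assumption of the sketch
  last_ok <- n - shift - half
  fit_idx <- (last_ok - smoothing_window + 1):last_ok
  fit_data <- data.frame(t = fit_idx, y = smoothed[fit_idx])
  fit <- stats::lm(log(y + 1) ~ t, data = fit_data)

  # 4. Replace the final period by the projection, subtracting the 1 again
  tail_idx <- (last_ok + 1):total
  projected <- stats::predict(fit, newdata = data.frame(t = tail_idx))
  smoothed[tail_idx] <- exp(projected) - 1

  # 5. Drop the initial smoothing window and round up
  ceiling(smoothed[-seq_len(smoothing_window)])
}

# Hypothetical exponentially growing case counts
confirm <- round(100 * exp(0.05 * seq_len(40)))
shifted_cases <- sketch_shifted_cases(
  confirm, shift = 7, smoothing_window = 7, horizon = 7
)
```

The sketch returns one value per retained time point, including the forecast horizon.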
Value
A <data.frame>
for shifted reported cases
Examples
## Not run:
shift <- 7
horizon <- 7
smoothing_window <- 14
## add NAs for horizon
cases <- add_horizon(example_confirmed[1:30], horizon)
## add zeroes initially
cases <- data.table::rbindlist(list(
data.table::data.table(
date = seq(
min(cases$date) - 10,
min(cases$date) - 1,
by = "days"
),
confirm = 0, breakpoint = 0
),
cases
))
create_shifted_cases(cases, shift, smoothing_window, horizon)
## End(Not run)
Create a List of Stan Arguments
Description
Generates a list of arguments as required by the stan sampling functions by
combining the required options with data, and type of initialisation.
Initialisation defaults to random but it is expected that
create_initial_conditions()
will be used.
Usage
create_stan_args(
stan = stan_opts(),
data = NULL,
init = "random",
model = "estimate_infections",
fixed_param = FALSE,
verbose = FALSE
)
Arguments
stan |
A list of stan options as generated by |
data |
A list of stan data as created by |
init |
Initial conditions passed to |
model |
Character, name of the model for which arguments are to be created. |
fixed_param |
Logical, defaults to |
verbose |
Logical, defaults to |
Value
A list of stan arguments
Examples
## Not run:
# default settings
create_stan_args()
# increasing warmup
create_stan_args(stan = stan_opts(warmup = 1000))
## End(Not run)
Create Stan Data Required for estimate_infections
Description
Takes the output of
stan_opts()
and converts it into a list understood by
stan. Internally calls the other create_
family of functions to
construct a single list for input into stan with all data required
present.
Usage
create_stan_data(
data,
seeding_time,
rt,
gp,
obs,
backcalc,
shifted_cases,
forecast
)
Arguments
data |
A |
seeding_time |
Integer; seeding time, usually obtained using
|
rt |
A list of options as generated by |
gp |
A list of options as generated by |
obs |
A list of options as generated by |
backcalc |
A list of options as generated by |
shifted_cases |
A |
forecast |
A list of options as generated by |
Value
A list of stan data
Examples
## Not run:
create_stan_data(
example_confirmed, 7, rt_opts(), gp_opts(), obs_opts(), 7,
backcalc_opts(), create_shifted_cases(example_confirmed, 7, 14, 7)
)
## End(Not run)
Create delay variables for stan
Description
Create delay variables for stan
Usage
create_stan_delays(..., time_points = 1L)
Arguments
... |
Named delay distributions. The names are assigned to IDs |
time_points |
Integer, the number of time points in the data; determines weight associated with weighted delay priors; default: 1 |
Value
A list of variables as expected by the stan model
Create parameters for stan
Description
Create parameters for stan
Usage
create_stan_params(..., lower_bounds = NULL)
Arguments
... |
Named delay distributions. The names are assigned to IDs |
lower_bounds |
Named vector of lower bounds for any delay(s). The names
have to correspond to the names given to the delay distributions passed.
If |
Value
A list of variables as expected by the stan model
Temporary function to support the transition to full support of missing data.
Description
Usage
default_fill_missing_obs(data, obs, obs_column)
Arguments
data |
A |
obs |
A list of options as generated by |
obs_column |
Character (default: "confirm"). If given, only the column specified here will be used for checking missingness. This is useful if using a data set that has multiple columns, one of which corresponds to observations that are to be processed here. |
Value
Data set with missing dates filled in as NA values
Delay Distribution Options
Description
Returns delay distributions formatted for usage by downstream
functions.
Usage
delay_opts(dist = Fixed(0), default_cdf_cutoff = 0.001, weight_prior = TRUE)
Arguments
dist |
A delay distribution or series of delay distributions. Default is a fixed distribution with all mass at 0, i.e. no delay. |
default_cdf_cutoff |
Numeric; default CDF cutoff to be used if an
unconstrained distribution is passed as |
weight_prior |
Logical; if TRUE (default), any priors given in |
Value
A <delay_opts>
object summarising the input delay distributions.
See Also
convert_to_logmean()
convert_to_logsd()
bootstrapped_dist_fit()
Distributions
Examples
# no delays
delay_opts()
# A single delay that has uncertainty
delay <- LogNormal(
meanlog = Normal(1, 0.2),
sdlog = Normal(0.5, 0.1),
max = 14
)
delay_opts(delay)
# A single delay without uncertainty
delay <- LogNormal(meanlog = 1, sdlog = 0.5, max = 14)
delay_opts(delay)
# Multiple delays (in this case twice the same)
delay_opts(delay + delay)
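The examples above parameterise the log normal delay by meanlog and sdlog, and See Also points to convert_to_logmean() and convert_to_logsd(). As a rough illustration of the relationship between the two parameterisations, the base R sketch below applies standard moment matching to assumed natural-scale values. It does not call any package function.

```r
# Assumed natural-scale summary of a delay, in days
mean_delay <- 2
sd_delay <- 1

# Moment matching for the log normal distribution
sdlog <- sqrt(log(1 + sd_delay^2 / mean_delay^2))
meanlog <- log(mean_delay) - sdlog^2 / 2

# Round trip back to the natural scale as a consistency check
mean_back <- exp(meanlog + sdlog^2 / 2)
sd_back <- mean_back * sqrt(exp(sdlog^2) - 1)
```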
Discretised probability mass function
Description
This function returns the probability mass function of a discretised and
truncated distribution defined by distribution type, maximum value and model
parameters.
Usage
discrete_pmf(
distribution = c("exp", "gamma", "lognormal", "normal", "fixed"),
params,
max_value,
cdf_cutoff,
width
)
Arguments
distribution |
A character string representing the distribution to be used (one of "exp", "gamma", "lognormal", "normal" or "fixed") |
params |
A list of parameter values (by name) required for each model. For the exponential model this is a rate parameter and for the gamma model this is alpha and beta. |
max_value |
Numeric, the maximum value to allow. Samples outside of this range are resampled. |
cdf_cutoff |
Numeric; the desired CDF cutoff. Any part of the
cumulative distribution function beyond 1 minus the value of this argument is
removed. Default: |
width |
Numeric, the width of each discrete bin. |
Value
A vector representing a probability distribution.
Methodological details
The probability mass function of the discretised probability distribution is a vector where the first entry corresponds to the integral over the (0,1] interval of the corresponding continuous distribution (probability of integer 0), the second entry corresponds to the (0,2] interval (probability mass of integer 1), the third entry corresponds to the (1, 3] interval (probability mass of integer 2), etc. This approximates the true probability mass function of a double censored distribution which arises from the difference of two censored events.
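The interval scheme described above can be sketched in base R as follows. The helper sketch_discrete_pmf is hypothetical, and the final normalisation step is an assumption of this sketch (the overlapping intervals would otherwise not sum to one). It is not the package implementation.

```r
# Hypothetical helper: entry k holds the mass assigned to integer k - 1,
# i.e. the integral of the continuous distribution over (max(k - 2, 0), k]
sketch_discrete_pmf <- function(cdf, max_value) {
  k <- seq_len(max_value + 1)
  upper <- cdf(k)
  lower <- cdf(pmax(k - 2, 0))
  pmf <- upper - lower
  # Assumed normalisation so that the entries sum to one
  pmf / sum(pmf)
}

# Example with an assumed gamma distribution
gamma_cdf <- function(q) stats::pgamma(q, shape = 4, rate = 1)
pmf <- sketch_discrete_pmf(gamma_cdf, max_value = 20)
```

The first three entries correspond to the (0,1], (0,2] and (1,3] intervals mentioned in the text.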
References
Charniga, K., et al. "Best practices for estimating and reporting epidemiological delay distributions of infectious diseases using public health surveillance and healthcare data", arXiv e-prints, 2024. doi:10.48550/arXiv.2405.08841
Park, S. W., et al. "Estimating epidemiological delay distributions for infectious diseases", medRxiv, 2024. doi:10.1101/2024.01.12.24301247
Discretise a <dist_spec>
Description
Usage
## S3 method for class 'dist_spec'
discretise(x, strict = TRUE, ...)
discretize(x, ...)
Arguments
x |
A |
strict |
Logical; If |
... |
ignored |
Value
A <dist_spec>
where all distributions with constant parameters are
nonparametric.
Methodological details
The probability mass function of the discretised probability distribution is a vector where the first entry corresponds to the integral over the (0,1] interval of the corresponding continuous distribution (probability of integer 0), the second entry corresponds to the (0,2] interval (probability mass of integer 1), the third entry corresponds to the (1, 3] interval (probability mass of integer 2), etc. This approximates the true probability mass function of a double censored distribution which arises from the difference of two censored events.
References
Charniga, K., et al. "Best practices for estimating and reporting epidemiological delay distributions of infectious diseases using public health surveillance and healthcare data", arXiv e-prints, 2024. doi:10.48550/arXiv.2405.08841
Park, S. W., et al. "Estimating epidemiological delay distributions for infectious diseases", medRxiv, 2024. doi:10.1101/2024.01.12.24301247
Examples
# A fixed gamma distribution with mean 5 and sd 1.
dist1 <- Gamma(mean = 5, sd = 1, max = 20)
# An uncertain lognormal distribution with meanlog and sdlog normally
# distributed as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- LogNormal(
meanlog = Normal(3, 0.5),
sdlog = Normal(2, 0.5),
max = 20
)
# Discretise the sum of the two distributions
discretise(dist1 + dist2, strict = FALSE)
Fit an Integer Adjusted Exponential, Gamma or Lognormal Distribution
Description
Fits an integer adjusted exponential, gamma or lognormal distribution using
stan.
Usage
dist_fit(
values = NULL,
samples = 1000,
cores = 1,
chains = 2,
dist = "exp",
verbose = FALSE,
backend = "rstan"
)
Arguments
values |
Numeric vector of values |
samples |
Numeric, number of samples to take. Must be >= 1000. Defaults to 1000. |
cores |
Numeric, defaults to 1. Number of CPU cores to use (no effect if greater than the number of chains). |
chains |
Numeric, defaults to 2. Number of MCMC chains to use. More is better with the minimum being two. |
dist |
Character string, which distribution to fit. Defaults to
exponential ( |
verbose |
Logical, defaults to FALSE. Should verbose progress messages be printed. |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
Value
A stan fit of an interval censored distribution
Examples
# integer adjusted exponential model
dist_fit(rexp(100, 2),
samples = 1000, dist = "exp",
cores = ifelse(interactive(), 4, 1), verbose = TRUE
)
# integer adjusted gamma model
dist_fit(rgamma(100, 5, 5),
samples = 1000, dist = "gamma",
cores = ifelse(interactive(), 4, 1), verbose = TRUE
)
# integer adjusted lognormal model
dist_fit(rlnorm(100, log(5), 0.2),
samples = 1000, dist = "lognormal",
cores = ifelse(interactive(), 4, 1), verbose = TRUE
)
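To illustrate what fitting an interval censored distribution to integer values involves, the base R sketch below fits an exponential rate by maximum likelihood. This swaps in maximum likelihood for the Bayesian stan fit used by dist_fit(), and it assumes that each integer value k stands for a continuous value in [k, k + 1), which may differ from the censoring scheme used by the package.

```r
set.seed(42)

# Simulated integer-valued data from an assumed exponential distribution
true_rate <- 0.5
values <- floor(stats::rexp(2000, rate = true_rate))

# Negative log-likelihood: each integer k contributes P(k <= X < k + 1),
# computed from survival functions for numerical stability
neg_log_lik <- function(log_rate) {
  rate <- exp(log_rate)
  mass <- stats::pexp(values, rate, lower.tail = FALSE) -
    stats::pexp(values + 1, rate, lower.tail = FALSE)
  -sum(log(mass))
}

# One-dimensional optimisation on the log scale keeps the rate positive
fit <- stats::optimize(neg_log_lik, interval = log(c(0.01, 10)))
rate_hat <- exp(fit$minimum)
```

Ignoring the censoring and fitting to the floored values directly would be expected to bias the rate estimate, which is the motivation for the integer adjustment.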
Distribution Skeleton
Description
This function acts as a skeleton for a truncated distribution defined by
model type, maximum value and model parameters.
Usage
dist_skel(
n,
dist = FALSE,
cum = TRUE,
model,
discrete = FALSE,
params,
max_value = 120
)
Arguments
n |
Numeric vector, number of samples to take (or days for the probability density). |
dist |
Logical, defaults to |
cum |
Logical, defaults to |
model |
Character string, defining the model to be used. Supported options are exponential ("exp"), gamma ("gamma"), and log normal ("lognormal") |
discrete |
Logical, defaults to |
params |
A list of parameter values (by name) required for each model. For the exponential model this is a rate parameter and for the gamma model this is alpha and beta. |
max_value |
Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled. |
Value
A vector of samples or a probability distribution.
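The max_value argument states that samples outside the allowed range are resampled. A minimal base R sketch of that idea follows. The helper sample_truncated and the log normal parameters are hypothetical, and the sketch is not the package implementation.

```r
# Hypothetical helper: draw n samples and redraw any that exceed max_value
sample_truncated <- function(n, rdist, max_value) {
  samples <- rdist(n)
  too_large <- samples > max_value
  while (any(too_large)) {
    samples[too_large] <- rdist(sum(too_large))
    too_large <- samples > max_value
  }
  samples
}

set.seed(1)
# Assumed log normal delay, truncated at 14 days
draws <- sample_truncated(
  1000,
  function(n) stats::rlnorm(n, meanlog = 1.6, sdlog = 0.5),
  max_value = 14
)
```

Resampling in this way yields draws from the distribution conditioned on being at most max_value.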
Real-time Rt Estimation, Forecasting and Reporting
Description
This function wraps the functionality of
estimate_infections()
in order
to estimate Rt and cases by date of infection and forecast these infections
into the future. In addition to the functionality of
estimate_infections()
it produces additional summary output useful for
reporting results and interpreting them as well as error catching and
reporting, making it particularly useful for production use e.g. running at
set intervals on a dedicated server.
Usage
epinow(
data,
generation_time = gt_opts(),
delays = delay_opts(),
truncation = trunc_opts(),
rt = rt_opts(),
backcalc = backcalc_opts(),
gp = gp_opts(),
obs = obs_opts(),
forecast = forecast_opts(),
stan = stan_opts(),
CrIs = c(0.2, 0.5, 0.9),
return_output = is.null(target_folder),
output = c("samples", "plots", "latest", "fit", "timing"),
plot_args = list(),
target_folder = NULL,
target_date,
logs = tempdir(),
id = "epinow",
verbose = interactive(),
filter_leading_zeros = TRUE,
zero_threshold = Inf,
horizon
)
Arguments
data |
A |
generation_time |
A call to |
delays |
A call to |
truncation |
A call to |
rt |
A list of options as generated by |
backcalc |
A list of options as generated by |
gp |
A list of options as generated by |
obs |
A list of options as generated by |
forecast |
A list of options as generated by |
stan |
A list of stan options as generated by |
CrIs |
Numeric vector of credible intervals to calculate. |
return_output |
Logical, defaults to FALSE. Should output be returned, this automatically updates to TRUE if no directory for saving is specified. |
output |
A character vector of optional output to return. Supported
options are samples ("samples"), plots ("plots"), the run time ("timing"),
copying the dated folder into a latest folder (if |
plot_args |
A list of optional arguments passed to |
target_folder |
Character string specifying where to save results (will create if not present). |
target_date |
Date, defaults to maximum found in the data if not specified. |
logs |
Character path indicating the target folder in which to store log
information. Defaults to the temporary directory if not specified. Default
logging can be disabled if |
id |
A character string used to assign logging information on error.
Used by |
verbose |
Logical, defaults to |
filter_leading_zeros |
Logical, defaults to TRUE. Should zeros at the start of the time series be filtered out. |
zero_threshold |
|
horizon |
Deprecated; use |
Value
A list of output from estimate_infections with additional elements summarising results and reporting errors if they have occurred.
See Also
estimate_infections()
forecast_infections()
regional_epinow()
Examples
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# set an example generation time. In practice this should use an estimate
# from the literature or be estimated from data
generation_time <- Gamma(
shape = Normal(1.3, 0.3),
rate = Normal(0.37, 0.09),
max = 14
)
# set an example incubation period. In practice this should use an estimate
# from the literature or be estimated from data
incubation_period <- LogNormal(
meanlog = Normal(1.6, 0.06),
sdlog = Normal(0.4, 0.07),
max = 14
)
# set an example reporting delay. In practice this should use an estimate
# from the literature or be estimated from data
reporting_delay <- LogNormal(mean = 2, sd = 1, max = 10)
# example case data
reported_cases <- example_confirmed[1:40]
# estimate Rt and nowcast/forecast cases by date of infection
out <- epinow(
data = reported_cases,
generation_time = gt_opts(generation_time),
rt = rt_opts(prior = LogNormal(mean = 2, sd = 0.1)),
delays = delay_opts(incubation_period + reporting_delay)
)
# summary of the latest estimates
summary(out)
# plot estimates
plot(out)
# summary of R estimates
summary(out, type = "parameters", params = "R")
options(old_opts)
Load and compile an EpiNow2 cmdstanr model
Description
The function has been adapted from a similar function in the epinowcast package (Copyright holder: epinowcast authors, under MIT License).
Usage
epinow2_cmdstan_model(
model = "estimate_infections",
dir = system.file("stan", package = "EpiNow2"),
verbose = FALSE,
...
)
Arguments
model |
A character string indicating the model to use. Needs to be
present in |
dir |
A character string specifying the path to any stan files to include in the model. If missing the package default is used. |
verbose |
Logical, defaults to |
... |
Additional arguments passed to |
Value
A cmdstanr
model.
Load an EpiNow2 rstan model.
Description
The models are pre-compiled upon package installation and the requested model is returned here.
Usage
epinow2_rstan_model(model = "estimate_infections")
Arguments
model |
A character string indicating the model to use. Needs to be
amongst the compiled models shipped with "EpiNow2" (see the |
Value
An rstan
model.
Return a stan model object for the appropriate backend
Description
Return a stan model object for the appropriate backend
Usage
epinow2_stan_model(
backend = c("rstan", "cmdstanr"),
model = c("estimate_infections", "simulate_infections", "estimate_secondary",
"simulate_secondary", "estimate_truncation", "dist_fit")
)
Arguments
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
model |
A character string indicating the model to use. One of "estimate_infections" (default), "simulate_infections", "estimate_secondary", "simulate_secondary", "estimate_truncation" or "dist_fit". |
Value
A stan model object (either rstan::stanmodel
or
cmdstanr::CmdStanModel
, depending on the backend)
Estimate a Delay Distribution
Description
Estimate a log normal delay distribution from a vector of integer delays.
Currently this function is a simple wrapper for
bootstrapped_dist_fit()
.
Usage
estimate_delay(delays, ...)
Arguments
delays |
Integer vector of delays |
... |
Arguments to pass to internal methods. |
Value
A <dist_spec>
summarising the bootstrapped distribution
See Also
Examples
delays <- rlnorm(500, log(5), 1)
estimate_delay(delays, samples = 1000, bootstraps = 10)
Estimate Infections, the Time-Varying Reproduction Number and the Rate of Growth
Description
Uses a non-parametric approach to reconstruct cases by date of infection
from reported cases. It uses either a generative Rt model or non-parametric
back calculation to estimate underlying latent infections and then maps
these infections to observed cases via uncertain reporting delays and a
flexible observation model. See the examples and function arguments for the
details of all options. The default settings may not be sufficient for your
use case so the number of warmup samples (
stan_args = list(warmup)
) may
need to be increased as may the overall number of samples. Follow the links
provided by any warning messages to diagnose issues with the MCMC fit. It
is recommended to explore several of the Rt estimation approaches supported
as not all of them may be suited to users' own use cases. See
here
for an example of using estimate_infections
within the epinow
wrapper to
estimate Rt for Covid-19 in a country from the ECDC data source.
Usage
estimate_infections(
data,
generation_time = gt_opts(),
delays = delay_opts(),
truncation = trunc_opts(),
rt = rt_opts(),
backcalc = backcalc_opts(),
gp = gp_opts(),
obs = obs_opts(),
forecast = forecast_opts(),
stan = stan_opts(),
CrIs = c(0.2, 0.5, 0.9),
weigh_delay_priors = TRUE,
id = "estimate_infections",
verbose = interactive(),
filter_leading_zeros = TRUE,
zero_threshold = Inf,
horizon
)
Arguments
data |
A |
generation_time |
A call to |
delays |
A call to |
truncation |
A call to |
rt |
A list of options as generated by |
backcalc |
A list of options as generated by |
gp |
A list of options as generated by |
obs |
A list of options as generated by |
forecast |
A list of options as generated by |
stan |
A list of stan options as generated by |
CrIs |
Numeric vector of credible intervals to calculate. |
weigh_delay_priors |
Logical. If TRUE (default), all delay distribution priors will be weighted by the number of observation data points, in doing so approximately placing an independent prior at each time step and usually preventing the posteriors from shifting. If FALSE, no weight will be applied, i.e. delay distributions will be treated as a single parameter. |
id |
A character string used to assign logging information on error.
Used by |
verbose |
Logical, defaults to |
filter_leading_zeros |
Logical, defaults to TRUE. Should zeros at the start of the time series be filtered out. |
zero_threshold |
|
horizon |
Deprecated; use |
Value
A list of output including: posterior samples, summarised posterior samples, data used to fit the model, and the fit object itself.
See Also
epinow()
regional_epinow()
forecast_infections()
estimate_truncation()
Examples
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# get example case counts
reported_cases <- example_confirmed[1:60]
# set an example generation time. In practice this should use an estimate
# from the literature or be estimated from data
generation_time <- Gamma(
shape = Normal(1.3, 0.3),
rate = Normal(0.37, 0.09),
max = 14
)
# set an example incubation period. In practice this should use an estimate
# from the literature or be estimated from data
incubation_period <- LogNormal(
meanlog = Normal(1.6, 0.06),
sdlog = Normal(0.4, 0.07),
max = 14
)
# set an example reporting delay. In practice this should use an estimate
# from the literature or be estimated from data
reporting_delay <- LogNormal(mean = 2, sd = 1, max = 10)
# for more examples, see the "estimate_infections examples" vignette
def <- estimate_infections(reported_cases,
generation_time = gt_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = rt_opts(prior = LogNormal(mean = 2, sd = 0.1))
)
# real time estimates
summary(def)
# summary plot
plot(def)
options(old_opts)
Estimate a Secondary Observation from a Primary Observation
Description
Estimates the relationship between a primary and secondary observation, for
example hospital admissions and deaths or hospital admissions and bed
occupancy. See
secondary_opts()
for model structure options. See parameter
documentation for model defaults and options. See the examples for case
studies using synthetic data and
here
for an example of forecasting Covid-19 deaths from Covid-19 cases. See
here for
a prototype function that may be used to estimate and forecast a secondary
observation from a primary across multiple regions and
here
for an application forecasting Covid-19 deaths in Germany and Poland.
Usage
estimate_secondary(
data,
secondary = secondary_opts(),
delays = delay_opts(LogNormal(meanlog = Normal(2.5, 0.5), sdlog = Normal(0.47, 0.25),
max = 30), weight_prior = FALSE),
truncation = trunc_opts(),
obs = obs_opts(),
stan = stan_opts(),
burn_in = 14,
CrIs = c(0.2, 0.5, 0.9),
priors = NULL,
model = NULL,
weigh_delay_priors = FALSE,
verbose = interactive(),
filter_leading_zeros = FALSE,
zero_threshold = Inf
)
Arguments
data |
A |
secondary |
A call to |
delays |
A call to |
truncation |
A call to |
obs |
A list of options as generated by |
stan |
A list of stan options as generated by |
burn_in |
Integer, defaults to 14 days. The number of data points to use for estimation but not to fit to at the beginning of the time series. This must be less than the number of observations. |
CrIs |
Numeric vector of credible intervals to calculate. |
priors |
A |
model |
A compiled stan model to override the default model. May be useful for package developers or those developing extensions. |
weigh_delay_priors |
Logical. If TRUE, all delay distribution priors will be weighted by the number of observation data points, in doing so approximately placing an independent prior at each time step and usually preventing the posteriors from shifting. If FALSE (default), no weight will be applied, i.e. delay distributions will be treated as a single parameter. |
verbose |
Logical, should model fitting progress be returned. Defaults
to |
filter_leading_zeros |
Logical, defaults to FALSE. Should zeros at the start of the time series be filtered out. |
zero_threshold |
|
Value
A list containing: predictions
(a <data.frame>
ordered by date
with the primary, and secondary observations, and a summary of the model
estimated secondary observations), posterior
which contains a summary of
the entire model posterior, data
(a list of data used to fit the
model), and fit
(the stanfit
object).
Examples
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# load data.table for manipulation
library(data.table)
#### Incidence data example ####
# make some example secondary incidence data
cases <- example_confirmed
cases <- as.data.table(cases)[, primary := confirm]
# Assume that only 40 percent of cases are reported
cases[, scaling := 0.4]
# Parameters of the assumed log normal delay distribution
cases[, meanlog := 1.8][, sdlog := 0.5]
# Simulate secondary cases
cases <- convolve_and_scale(cases, type = "incidence")
#
# fit model to example data specifying a weak prior for fraction reported
# with a secondary case
inc <- estimate_secondary(cases[1:60],
obs = obs_opts(scale = Normal(mean = 0.2, sd = 0.2), week_effect = FALSE)
)
plot(inc, primary = TRUE)
# forecast future secondary cases from primary
inc_preds <- forecast_secondary(
inc, cases[seq(61, .N)][, value := primary]
)
plot(inc_preds, new_obs = cases, from = "2020-05-01")
#### Prevalence data example ####
# make some example prevalence data
cases <- example_confirmed
cases <- as.data.table(cases)[, primary := confirm]
# Assume that only 30 percent of cases are reported
cases[, scaling := 0.3]
# Parameters of the assumed log normal delay distribution
cases[, meanlog := 1.6][, sdlog := 0.8]
# Simulate secondary cases
cases <- convolve_and_scale(cases, type = "prevalence")
# fit model to example prevalence data
prev <- estimate_secondary(cases[1:100],
secondary = secondary_opts(type = "prevalence"),
obs = obs_opts(
week_effect = FALSE,
scale = Normal(mean = 0.4, sd = 0.1)
)
)
plot(prev, primary = TRUE)
# forecast future secondary cases from primary
prev_preds <- forecast_secondary(
prev, cases[seq(101, .N)][, value := primary]
)
plot(prev_preds, new_obs = cases, from = "2020-06-01")
options(old_opts)
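The incidence example above simulates secondary observations from primary ones using a scaling and a log normal delay. As a rough base R illustration of that relationship, the sketch below convolves a primary series with a discretised delay and applies a scaling. The simple differencing used to discretise the delay and the truncation at max_delay are assumptions of the sketch, not the method used by convolve_and_scale().

```r
# Hypothetical constant primary observations
primary <- rep(100, 40)

# Assumed log normal delay, discretised by simple differencing of the CDF
max_delay <- 15
delay_pmf <- diff(stats::plnorm(0:(max_delay + 1), meanlog = 1.8, sdlog = 0.5))
delay_pmf <- delay_pmf / sum(delay_pmf)

# Assumed fraction of primary observations that become secondary ones
scaling <- 0.4

# secondary[t] = scaling * sum over delays d of primary[t - d] * pmf[d]
secondary <- vapply(seq_along(primary), function(t) {
  delays <- 0:min(t - 1, max_delay)
  scaling * sum(primary[t - delays] * delay_pmf[delays + 1])
}, numeric(1))
```

With a constant primary series, the secondary series settles at scaling times the primary level once the full delay distribution is covered.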
Estimate Truncation of Observed Data
Description
Estimates a truncation distribution from multiple snapshots of the same
data source over time. This distribution can then be passed to the
truncation
argument in regional_epinow()
, epinow()
, and
estimate_infections()
to adjust for truncated data and propagate the
uncertainty associated with data truncation into the estimates.
See here
for an example of using this approach on Covid-19 data in England. The
functionality offered by this function is now available in a more principled
manner in the epinowcast
R package.
The model of truncation is as follows:
The truncation distribution is assumed to be a discretised log normal with a mean and standard deviation that is informed by the data.
The data set with the latest observations is adjusted for truncation using the truncation distribution.
Earlier data sets are recreated by applying the truncation distribution to the adjusted latest observations in the time period of the earlier data set. These data sets are then compared to the earlier observations assuming a negative binomial observation model with an additive noise term to deal with zero observations.
This model is then fit using stan
with standard normal, or half normal,
prior for the mean, standard deviation, 1 over the square root of the
overdispersion and additive noise term.
This approach assumes that:
Current truncation is related to past truncation.
Truncation is a multiplicative scaling of underlying reported cases.
Truncation is log normally distributed.
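The multiplicative truncation assumption listed above can be illustrated with a short base R sketch. The counts, the log normal parameters and the indexing convention (the most recent day has had one day to be reported) are all assumptions of the sketch, and the negative binomial observation model is ignored.

```r
max_delay <- 10

# Assumed truncation CMF: share of the final count reported within d days,
# rescaled so that reporting is complete after max_delay days
cmf <- stats::plnorm(seq_len(max_delay), meanlog = 0.5, sdlog = 0.5)
cmf <- cmf / cmf[max_delay]

# Hypothetical fully reported counts
final_counts <- rep(200, 30)
n <- length(final_counts)

# Share reported so far: the most recent day is the least complete
reported_share <- rep(1, n)
reported_share[n:(n - max_delay + 1)] <- cmf

# What an early snapshot of the data would show
snapshot <- final_counts * reported_share

# Adjusting for truncation divides by the reported share
adjusted <- snapshot / reported_share
```

In the model described above the truncation distribution is unknown and estimated by comparing such snapshots, whereas here it is fixed for illustration.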
Usage
estimate_truncation(
data,
truncation = trunc_opts(LogNormal(meanlog = Normal(0, 1), sdlog = Normal(1, 1), max =
10)),
model = NULL,
stan = stan_opts(),
CrIs = c(0.2, 0.5, 0.9),
filter_leading_zeros = FALSE,
zero_threshold = Inf,
weigh_delay_priors = FALSE,
verbose = TRUE,
...,
obs
)
Arguments
data |
A list of |
truncation |
A call to |
model |
A compiled stan model to override the default model. May be useful for package developers or those developing extensions. |
stan |
A list of stan options as generated by |
CrIs |
Numeric vector of credible intervals to calculate. |
filter_leading_zeros |
Logical, defaults to FALSE. Should zeros at the start of the time series be filtered out. |
zero_threshold |
|
weigh_delay_priors |
Deprecated; use the |
verbose |
Logical, should model fitting progress be returned. |
... |
Additional parameters to pass to |
obs |
Deprecated; use |
Value
A list containing: the summary parameters of the truncation
distribution (dist
), which could be passed to the truncation
argument
of epinow()
, regional_epinow()
, and estimate_infections()
, the
estimated CMF of the truncation distribution (cmf
, can be used to
adjust new data), a <data.frame>
containing the observed truncated
data, latest observed data and the truncation-adjusted
observations (obs
), a <data.frame>
containing the last
observed data (last_obs
, useful for plotting and validation), the data
used for fitting (data
) and the fit object (fit
).
Examples
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# fit model to example data
# See [example_truncated] for more details
est <- estimate_truncation(example_truncated,
verbose = interactive(),
chains = 2, iter = 2000
)
# summary of the distribution
est$dist
# summary of the estimated truncation cmf (can be applied to new data)
print(est$cmf)
# observations linked to truncation adjusted estimates
print(est$obs)
# validation plot of observations vs estimates
plot(est)
# Pass the truncation distribution to `epinow()`.
# Note, we're using the last snapshot as the observed data as it contains
# all the previous snapshots. Also, we're using the default options for
# illustrative purposes only.
out <- epinow(
generation_time = generation_time_opts(example_generation_time),
example_truncated[[5]],
truncation = trunc_opts(est$dist)
)
plot(out)
options(old_opts)
Estimate Cases by Report Date
Description
Either extracts or converts reported cases from an input data table. For
output from
estimate_infections
this is a simple filtering step.
Usage
estimates_by_report_date(
estimates,
CrIs = c(0.2, 0.5, 0.9),
target_folder = NULL,
samples = TRUE
)
Arguments
estimates |
List of data frames as output by |
CrIs |
Numeric vector of credible intervals to calculate. |
target_folder |
Character string specifying where to save results (will create if not present). |
samples |
Logical, defaults to TRUE. Should samples be saved |
Value
A list of samples and summarised estimates of estimated cases by date of report.
Example Confirmed Case Data Set
Description
An example data frame of observed cases
Usage
example_confirmed
Format
A data frame containing cases reported on each date.
Example generation time
Description
An example of a generation time estimate. See here for details:
https://github.com/epiforecasts/EpiNow2/blob/main/data-raw/generation-time.R
Usage
example_generation_time
Format
A dist_spec
object summarising the distribution
Example incubation period
Description
An example of an incubation period estimate. See here for details:
https://github.com/epiforecasts/EpiNow2/blob/main/data-raw/incubation-period.R
Usage
example_incubation_period
Format
A dist_spec
object summarising the distribution
Example reporting delay
Description
An example of a reporting delay estimate. See here for details:
https://github.com/epiforecasts/EpiNow2/blob/main/data-raw/reporting-delay
Usage
example_reporting_delay
Format
A dist_spec
object summarising the distribution
Example Case Data Set with Truncation
Description
An example dataset of observed cases with truncation applied.
This data is generated internally for use in the example of
estimate_truncation()
. For details on how the data is generated, see
https://github.com/epiforecasts/EpiNow2/blob/main/data-raw/truncated.R
Usage
example_truncated
Format
A list of data.table
s containing cases reported on each date until
a point of truncation.
Each element of the list is a data.table
with the following columns:
- date
Date of case report.
- confirm
Number of confirmed cases.
Expose internal package stan functions in R
Description
This function exposes internal stan functions in R from a user
supplied list of target files. Allows for testing of stan functions in R and
potentially for use in user R code.
Usage
expose_stan_fns(files, target_dir, ...)
Arguments
files |
A character vector indicating the target files. |
target_dir |
A character string indicating the target directory for the file. |
... |
Additional arguments passed to |
Value
No return value, called for side effects
Extract Credible Intervals Present
Description
Helper function to extract the credible intervals present in a <data.frame>.
Usage
extract_CrIs(summarised)
Arguments
summarised |
A |
Value
A numeric vector of credible intervals detected in the <data.frame>.
Examples
samples <- data.frame(value = 1:10, type = "car")
summarised <- calc_CrIs(samples,
summarise_by = "type",
CrIs = c(seq(0.05, 0.95, 0.05))
)
extract_CrIs(summarised)
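As a rough illustration of the idea, the following base R sketch recovers credible interval levels from column names. It assumes summary columns named lower_<level> and upper_<level>, as in the make_conf() example; the toy data frame is hypothetical and this is not the package's implementation.

```r
# Hypothetical summary table; the lower_/upper_ column naming is an assumption
summarised <- data.frame(
  median = 5,
  lower_90 = 1, upper_90 = 9,
  lower_50 = 3, upper_50 = 7
)
# Find the lower bound columns and strip the prefix to obtain the levels
lower_cols <- grep("^lower_", names(summarised), value = TRUE)
cris <- sort(as.numeric(sub("^lower_", "", lower_cols)), decreasing = TRUE)
cris
```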
Generate initial conditions from a Stan fit
Description
Extracts posterior samples to use to initialise a full model fit. This may be useful for certain data sets where the sampler gets stuck or cannot easily be initialised. In estimate_infections(), epinow() and regional_epinow() this option can be engaged by setting stan_opts(init_fit = <stanfit>).
This implementation is based on the approach taken in epidemia authored by James Scott.
Usage
extract_inits(fit, current_inits, exclude_list = NULL, samples = 50)
Arguments
fit |
A |
current_inits |
A function that returns a list of initial conditions
(such as |
exclude_list |
A character vector of parameters to not initialise from
the fit object, defaulting to |
samples |
Numeric, defaults to 50. Number of posterior samples. |
Value
A function that when called returns a set of initial conditions as a named list.
Extract Samples for a Parameter from a Stan model
Description
Extracts a single parameter from a list of stan output and returns it as a <data.table>.
Usage
extract_parameter(param, samples, dates)
Arguments
param |
Character string indicating the parameter to extract |
samples |
Extracted stan model (using |
dates |
A vector identifying the dimensionality of the parameter to extract. Generally this will be a date. |
Value
A <data.frame> containing the parameter name, date, sample id and sample value.
Extract Parameter Samples from a Stan Model
Description
Extracts a custom set of parameters from a stan object and adds
stratification and dates where appropriate.
Usage
extract_parameter_samples(
stan_fit,
data,
reported_dates,
imputed_dates,
reported_inf_dates,
drop_length_1 = FALSE,
merge = FALSE
)
Arguments
stan_fit |
A |
data |
A list of the data supplied to the |
reported_dates |
A vector of dates to report estimates for. |
imputed_dates |
A vector of dates to report imputed reports for. |
reported_inf_dates |
A vector of dates to report infection estimates for. |
drop_length_1 |
Logical; whether the first dimension should be dropped if it is of length 1; this is necessary when processing simulation results. |
merge |
if TRUE, merge samples and data so that parameters can be extracted from data. |
Value
A list of <data.frame>s, each containing the posterior of a parameter.
Extract parameter names
Description
Internal function for extracting given parameter names of a distribution from the environment. Called by new_dist_spec.
Usage
extract_params(params, distribution)
Arguments
params |
Given parameters (obtained using |
distribution |
Character; the distribution to use. |
Value
A character vector of parameters and their values.
Extract all samples from a stan fit
Description
If the object argument is a <stanfit> object, it simply returns the result of rstan::extract(). If it is a <CmdStanMCMC> it returns samples in the same format as rstan::extract() does for <stanfit> objects.
Usage
extract_samples(stan_fit, pars = NULL, include = TRUE)
Arguments
stan_fit |
A |
pars |
Any selection of parameters to extract |
include |
whether the parameters specified in |
Value
List of data.tables with samples
Extract a single element of a composite <dist_spec>
Description
Usage
extract_single_dist(x, i)
Arguments
x |
A composite |
i |
The index to extract |
Value
A single dist_spec object
Examples
dist1 <- LogNormal(mean = 1.6, sd = 0.5, max = 20)
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- Gamma(
shape = Normal(3, 0.5),
rate = Normal(2, 0.5),
max = 20
)
# Multiple distributions
## Not run:
dist <- dist1 + dist2
extract_single_dist(dist, 2)
## End(Not run)
Extract a Parameter Summary from a Stan Object
Description
Extracts summarised parameter posteriors from a stanfit object using rstan::summary() in a format consistent with other summary functions in {EpiNow2}.
Usage
extract_stan_param(
fit,
params = NULL,
CrIs = c(0.2, 0.5, 0.9),
var_names = FALSE
)
Arguments
fit |
A |
params |
A character vector of parameters to extract. Defaults to all parameters. |
CrIs |
Numeric vector of credible intervals to calculate. |
var_names |
Logical defaults to |
Value
A <data.table> summarising parameter posteriors. Contains the following variables: variable, mean, mean_se, sd, median, and lower_, upper_ followed by credible interval labels indicating the credible intervals present.
Extract Samples from a Parameter with a Single Dimension
Description
Extract Samples from a Parameter with a Single Dimension
Usage
extract_static_parameter(param, samples)
Arguments
param |
Character string indicating the parameter to extract |
samples |
Extracted stan model (using |
Value
A <data.frame> containing the parameter name, sample id and sample value.
Fill missing data in a data set to prepare it for use within the package
Description
This function ensures that all days between the first and last date in the data are present. It adds an accumulate column that indicates whether modelled observations should be accumulated onto a later data point. This is useful for modelling data that is reported less frequently than daily, e.g. weekly incidence data, as well as other reporting artifacts such as delayed weekend reporting. The function can also be used to fill in missing observations with zeros.
Usage
fill_missing(
data,
missing_dates = c("ignore", "accumulate", "zero"),
missing_obs = c("ignore", "accumulate", "zero"),
initial_accumulate,
obs_column = "confirm",
by = NULL
)
Arguments
data |
Data frame with a |
missing_dates |
Character. Options are "ignore" (the default),
"accumulate" and "zero". This determines how missing dates in the data are
interpreted. If set to "ignore", any missing dates in the observation
data will be interpreted as missing and skipped in the likelihood. If set
to "accumulate", modelled observations on dates that are missing in the
data will be accumulated and added to the next non-missing data point.
This can be used to model incidence data that is reported less frequently
than daily. In that case, the first data point is not included in the
likelihood (unless |
missing_obs |
Character. How to process dates that exist in the data
but have observations with NA values. The options available are the same
ones as for the |
initial_accumulate |
Integer. The number of initial dates to accumulate
if |
obs_column |
Character (default: "confirm"). If given, only the column specified here will be used for checking missingness. This is useful if using a data set that has multiple columns of which one corresponds to observations that are to be processed here. |
by |
Character vector. Name(s) of any additional column(s) where data processing should be done separately for each value in the column. This is useful when using data representing e.g. multiple geographies. If NULL (default) no such grouping is done. |
Value
A data.table with an accumulate column that indicates whether values are accumulated (see the documentation of the data argument in estimate_infections()).
Examples
cases <- data.table::copy(example_confirmed)
## calculate weekly sum
cases[, confirm := data.table::frollsum(confirm, 7)]
## limit to dates once a week
cases <- cases[seq(7, nrow(cases), 7)]
## set the second observation to missing
cases[2, confirm := NA]
## fill missing data
fill_missing(cases, missing_dates = "accumulate", initial_accumulate = 7)
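The accumulation idea can be illustrated with a simplified base R sketch. It completes the daily date sequence for hypothetical weekly data and flags days without a report so that they would be accumulated onto the next reported day. This is only an illustration under those assumptions; it ignores initial_accumulate, missing_obs and grouping, and is not the package's implementation.

```r
# Hypothetical weekly data: one report every 7 days
weekly <- data.frame(
  date = as.Date("2020-03-01") + c(0, 7, 14),
  confirm = c(10, 25, 40)
)
# Complete the daily date sequence between the first and last date
all_dates <- data.frame(
  date = seq(min(weekly$date), max(weekly$date), by = "day")
)
filled <- merge(all_dates, weekly, by = "date", all.x = TRUE)
# Flag days without a report; these would be accumulated onto the
# next day that has an observation
filled$accumulate <- is.na(filled$confirm)
filled
```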
Filter leading zeros from a data set.
Description
Filter leading zeros from a data set.
Usage
filter_leading_zeros(data, obs_column = "confirm", by = NULL)
Arguments
data |
A |
obs_column |
Character (default: "confirm"). If given, only the column specified here will be used for checking missingness. This is useful if using a data set that has multiple columns of which one corresponds to observations that are to be processed here. |
by |
Character vector. Name(s) of any additional column(s) where data processing should be done separately for each value in the column. This is useful when using data representing e.g. multiple geographies. If NULL (default) no such grouping is done. |
Value
A data.table with leading zeros removed.
Examples
cases <- data.frame(
date = as.Date("2020-01-01") + 0:10,
confirm = c(0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
)
filter_leading_zeros(cases)
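For intuition, a minimal base R sketch of the same operation is given below. The data are hypothetical, and the sketch assumes there are no missing values and no grouping; it is not the package's implementation. Note that zeros after the first non-zero observation are kept.

```r
# Hypothetical case counts with two leading zeros and one later zero
cases <- data.frame(
  date = as.Date("2020-01-01") + 0:5,
  confirm = c(0, 0, 1, 2, 0, 3)
)
# Index of the first non-zero observation
first_nonzero <- which(cases$confirm > 0)[1]
# Keep everything from that row onwards
trimmed <- cases[first_nonzero:nrow(cases), ]
trimmed
```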
Filter Options for a Target Region
Description
A helper function that allows the selection of region specific settings if
present and otherwise applies the overarching settings.
Usage
filter_opts(opts, region)
Arguments
opts |
Either a list of calls to an |
region |
A character string indicating a region of interest. |
Value
A list of options
Fit a model using the chosen backend.
Description
Internal function for dispatch to fitting with NUTS or VB.
Usage
fit_model(args, id = "stan")
Arguments
args |
List of stan arguments. |
id |
A character string used to assign logging information on error.
Used by |
Fit a Stan Model using an approximate method
Description
Fits a stan model using variational inference.
Usage
fit_model_approximate(args, future = FALSE, id = "stan")
Arguments
args |
List of stan arguments. |
future |
Logical, defaults to |
id |
A character string used to assign logging information on error.
Used by |
Value
A stan model object
Fit a Stan Model using the NUTS sampler
Description
Fits a stan model using rstan::sampling(). Provides the optional ability to run chains using future with error catching, timeouts and merging of completed chains.
Usage
fit_model_with_nuts(
args,
future = FALSE,
max_execution_time = Inf,
id = "stan"
)
Arguments
args |
List of stan arguments. |
future |
Logical, defaults to |
max_execution_time |
Numeric, defaults to Inf. The maximum execution time per chain in seconds. Results will still be returned as long as at least 2 chains complete successfully within the time limit. |
id |
A character string used to assign logging information on error.
Used by |
Value
A stan model object
Remove uncertainty in the parameters of a <dist_spec>
Description
This function has been renamed to fix_parameters() as a more appropriate name.
Usage
fix_dist(x, strategy = c("mean", "sample"))
Arguments
x |
A |
strategy |
Character; either "mean" (use the mean estimates of the
mean and standard deviation) or "sample" (randomly sample mean and
standard deviation from uncertainty given in the |
Value
A <dist_spec>
object without uncertainty
Fix the parameters of a <dist_spec>
Description
If the given <dist_spec> has any uncertainty, it is removed and the corresponding distribution converted into a fixed one.
Usage
## S3 method for class 'dist_spec'
fix_parameters(x, strategy = c("mean", "sample"), ...)
Arguments
x |
A |
strategy |
Character; either "mean" (use the mean estimates of the
mean and standard deviation) or "sample" (randomly sample mean and
standard deviation from uncertainty given in the |
... |
ignored |
Value
A <dist_spec>
object without uncertainty
Examples
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist <- Gamma(
shape = Normal(3, 0.5),
rate = Normal(2, 0.5),
max = 20
)
fix_parameters(dist)
Forecast infections from a given fit and trajectory of the time-varying reproduction number
Description
This function simulates infections using an existing fit to observed cases but with a modified time-varying reproduction number. This can be used to explore forecast models or past counterfactuals. Simulations can be run in parallel using future::plan().
Usage
forecast_infections(
estimates,
R = NULL,
model = NULL,
samples = NULL,
batch_size = 10,
backend = "rstan",
verbose = interactive()
)
Arguments
estimates |
The |
R |
A numeric vector of reproduction numbers; these will overwrite the
reproduction numbers contained in |
model |
A compiled stan model as returned by |
samples |
Numeric, number of posterior samples to simulate from. The
default is to use all samples in the |
batch_size |
Numeric, defaults to 10. Size of batches in which to simulate. May decrease run times due to reduced IO costs but this is still being evaluated. If set to NULL then all simulations are done at once. |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
verbose |
Logical defaults to |
Value
A list of output as returned by estimate_infections() but based on results from the specified scenario rather than fitting.
See Also
generation_time_opts()
delay_opts()
rt_opts()
estimate_infections()
trunc_opts()
stan_opts()
obs_opts()
gp_opts()
Examples
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# get example case counts
reported_cases <- example_confirmed[1:50]
# fit model to data to recover Rt estimates
est <- estimate_infections(reported_cases,
generation_time = generation_time_opts(example_generation_time),
delays = delay_opts(example_incubation_period + example_reporting_delay),
rt = rt_opts(prior = LogNormal(mean = 2, sd = 0.1), rw = 7),
obs = obs_opts(scale = Normal(mean = 0.1, sd = 0.01)),
gp = NULL,
forecast = forecast_opts(horizon = 0)
)
# update Rt trajectory and simulate new infections using it
R <- c(rep(NA_real_, 26), rep(0.5, 10), rep(0.8, 14))
sims <- forecast_infections(est, R)
plot(sims)
# with a data.frame input of dates and values
R_dt <- data.frame(
date = seq(
min(summary(est, type = "parameters", param = "R")$date),
by = "day", length.out = length(R)
),
value = R
)
sims <- forecast_infections(est, R_dt)
plot(sims)
# with a data.frame input of samples
R_samples <- summary(est, type = "samples", param = "R")
R_samples <- R_samples[
,
.(date, sample, value)
][sample <= 1000][date <= "2020-04-10"]
R_samples <- R_samples[date >= "2020-04-01", value := 1.1]
sims <- forecast_infections(est, R_samples)
plot(sims)
options(old_opts)
Forecast options
Description
Defines a list specifying the forecast settings, i.e. the forecast horizon and any accumulation of forecast values. Custom settings can be supplied which override the defaults.
Usage
forecast_opts(horizon = 7, accumulate)
Arguments
horizon |
Numeric, defaults to 7. Number of days into the future to forecast. |
accumulate |
Integer, the number of days to accumulate in forecasts, if any. If not given and observations are accumulated at constant frequency in the data used for fitting then the same accumulation will be used in forecasts unless set explicitly here. |
Value
A <forecast_opts> object of forecast settings.
See Also
fill_missing
Examples
forecast_opts(horizon = 28, accumulate = 7)
Forecast Secondary Observations Given a Fit from estimate_secondary
Description
This function forecasts secondary observations using the output of estimate_secondary() and either observed primary data or a forecast of primary observations. It can also be combined with estimate_infections() to produce a forecast for a secondary observation from a forecast of a primary observation. See the examples of estimate_secondary() for example use cases on synthetic data. See here for an example of forecasting Covid-19 deaths from Covid-19 cases.
Usage
forecast_secondary(
estimate,
primary,
primary_variable = "reported_cases",
model = NULL,
backend = "rstan",
samples = NULL,
all_dates = FALSE,
CrIs = c(0.2, 0.5, 0.9)
)
Arguments
estimate |
An object of class "estimate_secondary" as produced by
|
primary |
A |
primary_variable |
A character string indicating the primary variable,
defaulting to "reported_cases". Only used when primary is of class
|
model |
A compiled stan model as returned by |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
samples |
Numeric, number of posterior samples to simulate from. The
default is to use all samples in the |
all_dates |
Logical, defaults to FALSE. Should a forecast for all dates and not just those in the forecast horizon be returned. |
CrIs |
Numeric vector of credible intervals to calculate. |
Value
A list containing: predictions (a <data.frame> ordered by date with the primary and secondary observations, and a summary of the forecast secondary observations; for primary observations in the forecast horizon, the median is used when uncertainty is present), samples (a <data.frame> of forecast secondary observation posterior samples), and forecast (a summary of the forecast secondary observation posterior).
Format Posterior Samples
Description
Summarises posterior samples and adds additional custom variables.
Usage
format_fit(posterior_samples, horizon, shift, burn_in, start_date, CrIs)
Arguments
posterior_samples |
A list of posterior samples as returned by
|
horizon |
Numeric, forecast horizon. |
shift |
Numeric, the shift to apply to estimates. |
burn_in |
Numeric, number of days to discard estimates for. |
start_date |
Date, earliest date with data. |
CrIs |
Numeric vector of credible intervals to calculate. |
Value
A list of samples and summarised posterior parameter estimates.
Get the distribution of a <dist_spec>
Description
Usage
get_distribution(x, id = NULL)
Arguments
x |
A |
id |
Integer; the id of the distribution to use (if x is a composite
distribution). If |
Value
A character string naming the distribution (or "nonparametric")
Examples
dist <- Gamma(shape = 3, rate = 2, max = 10)
get_distribution(dist)
Extracts an element of a <dist_spec>
Description
Extracts an element of a <dist_spec>
Usage
get_element(x, id = NULL, element)
Arguments
x |
A |
id |
Integer; the id of the distribution to use (if x is a composite
distribution). If |
element |
The element, i.e. "parameters", "pmf" or "distribution". |
Value
The requested element of the <dist_spec>.
Get parameters of a parametric distribution
Description
Usage
get_parameters(x, id = NULL)
Arguments
x |
A |
id |
Integer; the id of the distribution to use (if x is a composite
distribution). If |
Value
A list of parameters of the distribution.
Examples
dist <- Gamma(shape = 3, rate = 2)
get_parameters(dist)
Get the probability mass function of a nonparametric distribution
Description
Usage
get_pmf(x, id = NULL)
Arguments
x |
A |
id |
Integer; the id of the distribution to use (if x is a composite
distribution). If |
Value
The pmf of the distribution
Examples
dist <- discretise(Gamma(shape = 3, rate = 2, max = 10))
get_pmf(dist)
Get a Single Raw Result
Description
Usage
get_raw_result(file, region, date, result_dir)
Arguments
file |
Character string giving the result file's name. |
region |
Character string giving the region of interest. |
date |
Target date (in the format |
result_dir |
Character string giving the location of the target directory. |
Value
An R object read in from the targeted .rds file
Get Combined Regional Results
Description
Summarises results across regions either from input or from disk. See the
examples for details.
Usage
get_regional_results(
regional_output,
results_dir,
date,
samples = TRUE,
forecast = FALSE
)
Arguments
regional_output |
A list of output as produced by |
results_dir |
A character string indicating the folder containing the
|
date |
A Character string (in the format "yyyy-mm-dd") indicating the date to extract data for. Defaults to "latest" which finds the latest results available. |
samples |
Logical, defaults to |
forecast |
Logical, defaults to |
Value
A list of estimates, forecasts and estimated cases by date of report.
Examples
# get example multiregion estimates
regional_out <- readRDS(system.file(
package = "EpiNow2", "extdata", "example_regional_epinow.rds"
))
# from output
results <- get_regional_results(regional_out$regional, samples = FALSE)
Get Folders with Results
Description
Usage
get_regions(results_dir)
Arguments
results_dir |
A character string giving the directory in which results
are stored (as produced by |
Value
A named character vector containing the results to plot.
Get Regions with Most Reported Cases
Description
Extract a vector of regions with the most reported cases in a set time
window.
Usage
get_regions_with_most_reports(data, time_window = 7, no_regions = 6)
Arguments
data |
A |
time_window |
Numeric, number of days to include from latest date in data. Defaults to 7 days. |
no_regions |
Numeric, number of regions to return. Defaults to 6. |
Value
A character vector of regions with the highest reported cases
Estimate seeding time from delays and generation time
Description
The seeding time is set to the mean of the specified delays, constrained to be at least the maximum generation time.
Usage
get_seeding_time(delays, generation_time, rt = rt_opts())
Arguments
delays |
A call to |
generation_time |
A call to |
rt |
A list of options as generated by |
Value
An integer seeding time
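The rule in the description can be sketched in base R as follows. The input values are hypothetical, and rounding the mean delay to a whole number of days is an assumption of this sketch rather than something stated above.

```r
# Hypothetical inputs (in days); not taken from the package
mean_delay <- 6.4          # combined mean of the specified delays
max_generation_time <- 14  # maximum of the generation time distribution
# Mean delay, constrained to be at least the maximum generation time;
# rounding to whole days is an assumption of this sketch
seeding_time <- max(round(mean_delay), max_generation_time)
seeding_time
```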
Approximate Gaussian Process Settings
Description
Defines a list specifying the structure of the approximate Gaussian
process. Custom settings can be supplied which override the defaults.
Usage
gp_opts(
basis_prop = 0.2,
boundary_scale = 1.5,
ls_mean = 21,
ls_sd = 7,
ls_min = 0,
ls_max = 60,
ls = LogNormal(mean = 21, sd = 7, max = 60),
alpha = Normal(mean = 0, sd = 0.01),
kernel = c("matern", "se", "ou", "periodic"),
matern_order = 3/2,
matern_type,
w0 = 1,
alpha_mean,
alpha_sd
)
Arguments
basis_prop |
Numeric, the proportion of time points to use as basis functions. Defaults to 0.2. Decreasing this value results in a decrease in accuracy but a faster compute time (with increasing it having the opposite effect). In general smaller posterior length scales require a higher proportion of basis functions. See (Riutort-Mayol et al. 2020 https://arxiv.org/abs/2004.11408) for advice on updating this default. |
boundary_scale |
Numeric, defaults to 1.5. Boundary scale of the approximate Gaussian process. See (Riutort-Mayol et al. 2020 https://arxiv.org/abs/2004.11408) for advice on updating this default. |
ls_mean |
Deprecated; use |
ls_sd |
Deprecated; use |
ls_min |
Deprecated; use |
ls_max |
Deprecated; use |
ls |
A |
alpha |
A |
kernel |
Character string, the type of kernel required. Currently supporting the Matern kernel ("matern"), squared exponential kernel ("se"), Ornstein-Uhlenbeck kernel ("ou"), and the periodic kernel ("periodic"). |
matern_order |
Numeric, defaults to 3/2. Order of Matérn Kernel to use.
Common choices are 1/2, 3/2, and 5/2. If |
matern_type |
Deprecated; Numeric, defaults to 3/2. Order of Matérn Kernel to use. Currently, the orders 1/2, 3/2, 5/2 and Inf are supported. |
w0 |
Numeric, defaults to 1.0. Fundamental frequency for periodic
kernel. They are only used if |
alpha_mean |
Deprecated; use |
alpha_sd |
Deprecated; use |
Value
A <gp_opts>
object of settings defining the Gaussian process
Examples
# default settings
gp_opts()
# add a custom length scale
gp_opts(ls = LogNormal(mean = 4, sd = 1, max = 20))
# use periodic kernel
gp_opts(kernel = "periodic")
Convert Growth Rates to Reproduction numbers.
Description
See here for justification. Now handled internally by stan so may be removed in future updates if there is no user demand.
Usage
growth_to_R(r, gamma_mean, gamma_sd)
Arguments
r |
Numeric, rate of growth estimates. |
gamma_mean |
Numeric, mean of the gamma distribution |
gamma_sd |
Numeric, standard deviation of the gamma distribution. |
Value
Numeric vector of reproduction number estimates
Examples
growth_to_R(0.2, 4, 1)
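For reference, the standard relation between the growth rate r and the reproduction number for a gamma distributed generation time with mean m and standard deviation s is R = (1 + k r m)^(1/k), where k = (s / m)^2. The base R sketch below implements that relation; the helper name is hypothetical, and it is assumed (not confirmed by the text above) that this is the relation growth_to_R() uses.

```r
# Hypothetical helper; assumes a gamma distributed generation time
growth_to_R_sketch <- function(r, gamma_mean, gamma_sd) {
  # Squared coefficient of variation of the generation time
  k <- (gamma_sd / gamma_mean)^2
  (1 + k * r * gamma_mean)^(1 / k)
}
growth_to_R_sketch(0.2, 4, 1)
# Zero growth corresponds to a reproduction number of 1
growth_to_R_sketch(0, 4, 1)
```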
Generation Time Distribution Options
Description
Returns generation time parameters in a format for lower level model use.
Usage
gt_opts(dist = Fixed(1), default_cdf_cutoff = 0.001, weight_prior = TRUE)
generation_time_opts(
dist = Fixed(1),
default_cdf_cutoff = 0.001,
weight_prior = TRUE
)
Arguments
dist |
A delay distribution or series of delay distributions. If no distribution is given a fixed generation time of 1 will be assumed. If passing a nonparametric distribution the first element should be zero (see Details section). |
default_cdf_cutoff |
Numeric; default CDF cutoff to be used if an
unconstrained distribution is passed as |
weight_prior |
Logical; if TRUE (default), any priors given in |
Details
Because the discretised renewal equation used in the package does not support zero generation times, any distribution specified here will be left-truncated at one, i.e. the first element of the nonparametric or discretised probability distribution used for the generation time is set to zero and the resulting distribution renormalised.
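The left-truncation described above can be sketched in base R. The probability mass function below is hypothetical, with its first element corresponding to a generation time of zero; this illustrates the operation only and is not the package's internal code.

```r
# Hypothetical PMF for generation times of 0, 1, 2 and 3 days
pmf <- c(0.1, 0.4, 0.3, 0.2)
# Remove the mass at zero and renormalise so the PMF sums to one
pmf[1] <- 0
pmf <- pmf / sum(pmf)
pmf
```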
Value
A <generation_time_opts>
object summarising the input delay
distributions.
See Also
convert_to_logmean()
convert_to_logsd()
bootstrapped_dist_fit()
Gamma()
LogNormal()
Fixed()
Examples
# default settings with a fixed generation time of 1
generation_time_opts()
# A fixed gamma distributed generation time
generation_time_opts(Gamma(mean = 3, sd = 2, max = 14))
# An uncertain gamma distributed generation time
generation_time_opts(
Gamma(
shape = Normal(mean = 3, sd = 1),
rate = Normal(mean = 2, sd = 0.5),
max = 14
)
)
# An example generation time
gt_opts(example_generation_time)
Check if a <dist_spec> is constrained, i.e. has a finite maximum or nonzero CDF cutoff.
Description
Usage
## S3 method for class 'dist_spec'
is_constrained(x, ...)
Arguments
x |
A |
... |
ignored |
Value
Logical; TRUE if x
is constrained
Examples
# A fixed gamma distribution with mean 5 and sd 1.
dist1 <- Gamma(mean = 5, sd = 1, max = 20)
# An uncertain lognormal distribution with meanlog and sdlog normally
# distributed as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- LogNormal(
meanlog = Normal(3, 0.5),
sdlog = Normal(2, 0.5),
max = 20
)
# both distributions are constrained and therefore so is the sum
is_constrained(dist1 + dist2)
Choose a parallel or sequential apply function
Description
Internal function that chooses an appropriate "apply"-type function (either lapply() or future.apply::future_lapply()).
Usage
lapply_func(..., backend = "rstan", future.opts = list())
Arguments
... |
Additional parameters to pass to underlying option functions,
|
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
Value
A function that can be used to apply a function to a list
Get the lower bounds of the parameters of a distribution
Description
This is used to avoid sampling parameter values that have no support.
Usage
lower_bounds(distribution)
Arguments
distribution |
Character; the distribution to use. |
Value
A numeric vector, the lower bounds.
Examples
## Not run:
lower_bounds("lognormal")
## End(Not run)
Format Credible Intervals
Description
Combines a list of values into formatted credible intervals.
Usage
make_conf(value, CrI = 90, reverse = FALSE)
Arguments
value |
List of values to map into a string. Requires,
|
CrI |
Numeric, credible interval to report. Defaults to 90. |
reverse |
Logical, defaults to FALSE. Should the reported credible interval be switched. |
Value
A character vector formatted for reporting
Examples
value <- list(median = 2, lower_90 = 1, upper_90 = 3)
make_conf(value)
Categorise the Probability of Change for Rt
Description
Categorises a numeric variable into "Increasing" (< 0.05), "Likely increasing" (< 0.4), "Stable" (< 0.6), "Likely decreasing" (< 0.95), "Decreasing" (<= 1).
Usage
map_prob_change(var)
Arguments
var |
Numeric variable to be categorised |
Value
A character variable.
Examples
var <- seq(0.01, 1, 0.01)
var
map_prob_change(var)
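The thresholds in the description can be reproduced with base R's cut(). The helper name below is hypothetical and the sketch is only an illustration of the stated cut points, not the package's implementation.

```r
prob_labels <- c(
  "Increasing", "Likely increasing", "Stable",
  "Likely decreasing", "Decreasing"
)
# Hypothetical helper: left-closed intervals, with the last one closed at 1
categorise_sketch <- function(p) {
  as.character(cut(
    p,
    breaks = c(0, 0.05, 0.4, 0.6, 0.95, 1),
    labels = prob_labels,
    right = FALSE,
    include.lowest = TRUE
  ))
}
categorise_sketch(c(0.01, 0.05, 0.5, 0.6, 0.95, 1))
```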
Match User Supplied Arguments with Supported Options
Description
Match user supplied arguments with supported options and return a logical
list for internal usage.
Usage
match_output_arguments(
input_args = NULL,
supported_args = NULL,
logger = NULL,
level = "info"
)
Arguments
input_args |
A character vector of input arguments (can be partial). |
supported_args |
A character vector of supported output arguments. |
logger |
A character vector indicating the logger to target messages at. Defaults to no logging. |
level |
Character string defaulting to "info". Logging level see documentation of futile.logger for details. Supported options are "info" and "debug". |
Value
A logical vector of named output arguments
Returns the maximum of one or more delay distributions
Description
This works out the maximum of all the (parametric / nonparametric) delay
distributions combined in the passed <dist_spec> (ignoring any uncertainty
in parameters)
Usage
## S3 method for class 'dist_spec'
max(x, ...)
Arguments
x |
The <dist_spec> to use |
... |
Not used |
Value
A vector of maxima.
Examples
# A fixed gamma distribution with mean 5 and sd 1.
dist1 <- Gamma(mean = 5, sd = 1, max = 20)
max(dist1)
# An uncertain lognormal distribution with meanlog and sdlog normally
# distributed as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- LogNormal(
meanlog = Normal(3, 0.5),
sdlog = Normal(2, 0.5),
max = 20
)
max(dist2)
# The max of the sum of two distributions
max(dist1 + dist2)
Returns the mean of one or more delay distributions
Description
This works out the mean of all the (parametric / nonparametric) delay
distributions combined in the passed <dist_spec>.
Usage
## S3 method for class 'dist_spec'
mean(x, ..., ignore_uncertainty = FALSE)
Arguments
x |
The |
... |
Not used |
ignore_uncertainty |
Logical; whether to ignore any uncertainty in parameters. If set to FALSE (the default) then the mean of any uncertain parameters will be returned as NA. |
Examples
# A fixed lognormal distribution with mean 5 and sd 1.
dist1 <- LogNormal(mean = 5, sd = 1, max = 20)
mean(dist1)
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- Gamma(
shape = Normal(3, 0.5),
rate = Normal(2, 0.5),
max = 20
)
mean(dist2)
# The mean of the sum of two distributions
mean(dist1 + dist2)
Get the names of the natural parameters of a distribution
Description
These are the parameters used in the stan models. All other parameter
representations are converted to these using
convert_to_natural()
before
being passed to the stan models.
Usage
natural_params(distribution)
Arguments
distribution |
Character; the distribution to use. |
Value
A character vector, the natural parameters.
Examples
## Not run:
natural_params("gamma")
## End(Not run)
Calculate the number of distributions in a <dist_spec>
Description
Calculate the number of distributions in a <dist_spec>
Usage
ndist(x)
Arguments
x |
A |
Value
The number of distributions.
Internal function for generating a dist_spec given parameters and a distribution.
Description
This will convert all parameters to natural parameters before generating a dist_spec. If they have uncertainty this will be done using sampling.
Usage
new_dist_spec(params, distribution, max = Inf, cdf_cutoff = 0)
Arguments
params |
Parameters of the distribution (including |
distribution |
Character; the distribution to use. |
max |
Numeric, maximum value of the distribution. The distribution will
be truncated at this value. Default: |
cdf_cutoff |
Numeric; the desired CDF cutoff. Any part of the
cumulative distribution function beyond 1 minus the value of this argument is
removed. Default: |
Value
A dist_spec
of the given specification.
Examples
new_dist_spec(
params = list(mean = 2, sd = 1),
distribution = "normal"
)
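To illustrate what a CDF cutoff means, the base R sketch below finds the point beyond which the remaining probability mass of a distribution falls below the cutoff. The gamma parameters and the cutoff value are hypothetical, and the sketch uses stats functions only; it does not show how the package applies the cutoff internally.

```r
# Hypothetical gamma distribution and cutoff
shape <- 3
rate <- 2
cdf_cutoff <- 0.001
# Quantile beyond which the remaining probability mass equals the cutoff
upper <- qgamma(1 - cdf_cutoff, shape = shape, rate = rate)
upper
# Probability mass retained below that point
pgamma(upper, shape = shape, rate = rate)
```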
Observation Model Options
Description
Defines a list specifying the structure of the observation
model. Custom settings can be supplied which override the defaults.
Usage
obs_opts(
family = c("negbin", "poisson"),
dispersion = Normal(mean = 0, sd = 0.25),
weight = 1,
week_effect = TRUE,
week_length = 7,
scale = Fixed(1),
na = c("missing", "accumulate"),
likelihood = TRUE,
return_likelihood = FALSE,
phi
)
Arguments
family |
Character string defining the observation model. Options are negative binomial ("negbin"), the default, and Poisson ("poisson"). |
dispersion |
A |
weight |
Numeric, defaults to 1. Weight to give the observed data in the log density. |
week_effect |
Logical defaulting to |
week_length |
Numeric assumed length of the week in days, defaulting to 7 days. This can be modified if data are aggregated over a period other than a week or if data have a non-weekly periodicity. |
scale |
A |
na |
Deprecated; use the |
likelihood |
Logical, defaults to |
return_likelihood |
Logical, defaults to |
phi |
deprecated; use |
Value
An <obs_opts>
object of observation model settings.
Examples
# default settings
obs_opts()
# Turn off day of the week effect
obs_opts(week_effect = FALSE)
# Scale reported data
obs_opts(scale = Normal(mean = 0.2, sd = 0.02))
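To make the roles of family, scale and week_effect concrete, the base R sketch below simulates observations from hypothetical expected counts. It is an illustration only: the expected counts, the day-of-week multipliers and the negative binomial size value are all made-up assumptions, and the code does not reproduce the package's Stan observation model.

```r
set.seed(1)
# Hypothetical expected daily cases (not package output)
expected <- rep(100, 14)
# Scale reported data: only 20% of cases are assumed to be reported
scale <- 0.2
# Hypothetical day-of-week multipliers averaging 1 over a 7-day week
week_effect <- c(1.2, 1.1, 1.0, 1.0, 1.0, 0.9, 0.8)
mu <- expected * scale * rep(week_effect, length.out = length(expected))

# Poisson observations: variance equals the mean
obs_poisson <- rpois(length(mu), lambda = mu)
# Negative binomial observations: `size` controls overdispersion
# (variance = mu + mu^2 / size); the value 10 is an arbitrary choice
obs_negbin <- rnbinom(length(mu), mu = mu, size = 10)

data.frame(day = seq_along(mu), mu, obs_poisson, obs_negbin)
```

The negative binomial draws typically scatter more widely around mu than the Poisson draws, which is why it is the more forgiving default for noisy surveillance data.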
Forecast options
Description
Define a list of _opts() to pass to the _opts() accepting arguments of
regional_epinow(). This is useful when different settings are needed between
regions within a single regional_epinow() call. Using opts_list(), the defaults
can be applied to all regions present, with an override passed to regions as
necessary (either within opts_list() or externally).
Usage
opts_list(opts, reported_cases, ...)
Arguments
opts |
An |
reported_cases |
A data frame containing a |
... |
Optional override for region defaults. See the examples for use case. |
Value
A named list of options per region which can be passed to the _opts()
accepting arguments of regional_epinow().
See Also
Examples
# uses example case vector
cases <- example_confirmed[1:40]
cases <- data.table::rbindlist(list(
data.table::copy(cases)[, region := "testland"],
cases[, region := "realland"]
))
# default settings
opts_list(rt_opts(), cases)
# add a weekly random walk in realland
opts_list(rt_opts(), cases, realland = rt_opts(rw = 7))
# add a weekly random walk externally
rt <- opts_list(rt_opts(), cases)
rt$realland$rw <- 7
rt
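The pattern of shared defaults with named per-region overrides can be sketched in base R. The helper make_region_opts() below is a hypothetical stand-in written for illustration; it uses plain lists rather than real _opts() objects and is not how opts_list() is implemented.

```r
# Hypothetical stand-in for opts_list(): repeat default options for every
# region, then apply named per-region overrides.
make_region_opts <- function(defaults, regions, ...) {
  overrides <- list(...)
  out <- setNames(rep(list(defaults), length(regions)), regions)
  for (region in names(overrides)) {
    out[[region]] <- overrides[[region]]
  }
  out
}

defaults <- list(rw = 0, use_breakpoints = TRUE)
opts <- make_region_opts(
  defaults,
  regions = c("testland", "realland"),
  realland = list(rw = 7, use_breakpoints = TRUE)
)
opts$testland$rw # 0, the shared default
opts$realland$rw # 7, the regional override
```

Because the result is an ordinary named list, individual entries can also be changed afterwards, as the "externally" example above does.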
Plot PMF and CDF for a dist_spec object
Description
This function takes a <dist_spec> object and plots its probability mass
function (PMF) and cumulative distribution function (CDF) using {ggplot2}.
Usage
## S3 method for class 'dist_spec'
plot(x, samples = 50L, res = 1, cumulative = TRUE, ...)
Arguments
x |
A |
samples |
Integer; Number of samples to generate for distributions with uncertain parameters (default: 50). |
res |
Numeric; Resolution of the PMF and CDF (default: 1, i.e. integer discretisation). |
cumulative |
Logical; whether to plot the cumulative distribution in addition to the probability mass function |
... |
ignored |
Examples
# A fixed lognormal distribution with mean 1.6 and sd 0.5.
dist1 <- LogNormal(mean = 1.6, sd = 0.5, max = 20)
# Plot discretised distribution with 1 day discretisation window
plot(dist1)
# Plot discretised distribution with 0.01 day discretisation window
plot(dist1, res = 0.01, cumulative = FALSE)
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- Gamma(
shape = Normal(3, 0.5),
rate = Normal(2, 0.5),
max = 20
)
plot(dist2)
# Multiple distributions with 0.1 discretisation window and do not plot the
# cumulative distribution
plot(dist1 + dist2, res = 0.1, cumulative = FALSE)
Plot method for epinow
Description
plot method for class <epinow>.
Usage
## S3 method for class 'epinow'
plot(x, type = "summary", ...)
Arguments
x |
A list of output as produced by |
type |
A character vector indicating the name of the plot to return. Defaults to "summary" with supported options being "infections", "reports", "R", "growth_rate", "summary", "all". If "all" is supplied all plots are generated. |
... |
Pass additional arguments to report_plots |
Value
List of plots as produced by report_plots()
See Also
plot plot.estimate_infections report_plots estimate_infections
Plot method for estimate_infections
Description
plot method for class <estimate_infections>.
Usage
## S3 method for class 'estimate_infections'
plot(
x,
type = c("summary", "infections", "reports", "R", "growth_rate", "all"),
...
)
Arguments
x |
A list of output as produced by |
type |
A character vector indicating the name of the plot to return. Defaults to "summary" with supported options being "infections", "reports", "R", "growth_rate", "summary", "all". If "all" is supplied all plots are generated. |
... |
Pass additional arguments to report_plots |
Value
List of plots as produced by report_plots()
See Also
plot report_plots estimate_infections
Plot method for estimate_secondary
Description
plot method for class <estimate_secondary>.
Usage
## S3 method for class 'estimate_secondary'
plot(x, primary = FALSE, from = NULL, to = NULL, new_obs = NULL, ...)
Arguments
x |
A list of output as produced by |
primary |
Logical, defaults to |
from |
Date object indicating when to plot from. |
to |
Date object indicating when to plot up to. |
new_obs |
A |
... |
Pass additional arguments to plot function. Not currently in use. |
Value
A ggplot
object.
See Also
plot estimate_secondary
Plot method for estimate_truncation
Description
plot() method for class <estimate_truncation>. Returns
a plot faceted over each dataset used in fitting with the latest
observations as columns, the data observed at the time (and so truncated)
as dots and the truncation adjusted estimates as a ribbon.
Usage
## S3 method for class 'estimate_truncation'
plot(x, ...)
Arguments
x |
A list of output as produced by |
... |
Pass additional arguments to plot function. Not currently in use. |
Value
ggplot2
object
See Also
plot estimate_truncation
Plot EpiNow2 Credible Intervals
Description
Adds lineranges for user specified credible intervals
Usage
plot_CrIs(plot, CrIs, alpha, linewidth)
Arguments
plot |
A |
CrIs |
Numeric list of credible intervals present in the data. As
produced by |
alpha |
Numeric, overall alpha of the target line range |
linewidth |
Numeric, line width of the default line range. |
Value
A {ggplot2}
plot.
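For readers unfamiliar with how a CrIs vector such as c(0.2, 0.5, 0.9) maps onto interval bounds, the base R sketch below computes central (equal-tailed) intervals from hypothetical posterior samples. Treating the intervals as central is an assumption made for this illustration, and the samples are simulated rather than taken from a fitted model.

```r
set.seed(42)
# Hypothetical posterior samples for one quantity at one time point
samples <- rlnorm(1000, meanlog = 0, sdlog = 0.2)
CrIs <- c(0.2, 0.5, 0.9)

# A central credible interval of width w spans the (1 - w) / 2 and
# 1 - (1 - w) / 2 quantiles
intervals <- data.frame(
  CrI = CrIs,
  lower = quantile(samples, (1 - CrIs) / 2, names = FALSE),
  upper = quantile(samples, 1 - (1 - CrIs) / 2, names = FALSE)
)
intervals
```

Each wider interval contains the narrower ones, which is what allows them to be drawn as nested line ranges or ribbons.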
Plot Estimates
Description
Allows users to plot the output from estimate_infections() easily.
In future releases it may be deprecated in favour of increasing the
functionality of the S3 plot methods.
Usage
plot_estimates(
estimate,
reported,
ylab,
hline,
obs_as_col = TRUE,
max_plot = 10,
estimate_type = c("Estimate", "Estimate based on partial data", "Forecast")
)
Arguments
estimate |
A |
reported |
A |
ylab |
Character string. Title for the plot y axis. |
hline |
Numeric, if supplied gives the horizontal intercept for an indicator line. |
obs_as_col |
Logical, defaults to |
max_plot |
Numeric, defaults to 10. A multiplicative upper bound on the number of cases shown on the plot. Based on the maximum number of reported cases. |
estimate_type |
Character vector indicating the type of data to plot. Default to all types with supported options being: "Estimate", "Estimate based on partial data", and "Forecast". |
Value
A ggplot2
object
Examples
# get example model results
out <- readRDS(system.file(
package = "EpiNow2", "extdata", "example_estimate_infections.rds"
))
# plot infections
plot_estimates(
estimate = out$summarised[variable == "infections"],
reported = out$observations,
ylab = "Cases", max_plot = 2
) + ggplot2::facet_wrap(~type, scales = "free_y")
# plot reported cases estimated via Rt
plot_estimates(
estimate = out$summarised[variable == "reported_cases"],
reported = out$observations,
ylab = "Cases"
)
# plot Rt estimates
plot_estimates(
estimate = out$summarised[variable == "R"],
ylab = "Effective Reproduction No.",
hline = 1
)
# plot Rt estimates without forecasts
plot_estimates(
estimate = out$summarised[variable == "R"],
ylab = "Effective Reproduction No.",
hline = 1, estimate_type = "Estimate"
)
Plot a Summary of the Latest Results
Description
Used to return a summary plot across regions (using results generated by
summarise_results()).
May be deprecated in later releases in favour of enhanced S3 methods.
Usage
plot_summary(summary_results, x_lab = "Region", log_cases = FALSE, max_cases)
Arguments
summary_results |
A data.table as returned by |
x_lab |
A character string giving the label for the x axis, defaults to region. |
log_cases |
Logical, should cases be shown on a logged scale. Defaults
to |
max_cases |
Numeric, no default. The maximum number of cases to plot. |
Value
A {ggplot2}
object
Prints the parameters of one or more delay distributions
Description
This displays the parameters of the uncertain and probability mass
functions of fixed delay distributions combined in the passed <dist_spec>.
Usage
## S3 method for class 'dist_spec'
print(x, ...)
Arguments
x |
The |
... |
Not used |
Value
invisible
Examples
# A fixed lognormal distribution with mean 1.5 and sd 0.5.
dist1 <- LogNormal(mean = 1.5, sd = 0.5, max = 20)
print(dist1)
# An uncertain gamma distribution with shape and rate normally distributed
# as Normal(3, 0.5) and Normal(2, 0.5) respectively
dist2 <- Gamma(
shape = Normal(3, 0.5), rate = Normal(2, 0.5), max = 20
)
print(dist2)
Process regional estimate
Description
Internal function that removes output that is not required, and returns
logging information.
Usage
process_region(
out,
target_region,
timing,
return_output = TRUE,
return_timing = TRUE,
complete_logger = "EpiNow2.epinow"
)
Arguments
out |
List of output returned by |
target_region |
Character string indicating the region being evaluated |
timing |
Output from |
return_output |
Logical, defaults to TRUE. Should output be returned? This is automatically set to TRUE if no directory for saving is specified. |
return_timing |
Logical, should runtime be returned |
complete_logger |
Character string indicating the logger to output the completion of estimation to. |
Value
A list of processed output
See Also
Process all Region Estimates
Description
Internal function that processes the output from multiple epinow() runs
and adds summary logging information.
Usage
process_regions(regional_out, regions)
Arguments
regional_out |
A list of output from multiple runs of
|
regions |
A character vector identifying the regions that have been run |
Value
A list of all regional estimates and successful regional estimates
See Also
Real-time Rt Estimation, Forecasting and Reporting by Region
Description
Runs epinow() across multiple regions in an efficient manner
and conducts basic data checks and cleaning, such as removing regions with
fewer than non_zero_points time points with non-zero cases, as these are
unlikely to produce reasonable results whilst consuming significant
resources. See the documentation for epinow() for further information.
By default all arguments supporting input from _opts() functions are
shared across regions (including delays, truncation, Rt settings, stan
settings, and gaussian process settings). Region specific settings are
supported by passing a named list of _opts() calls (with an entry per
region) to the relevant argument. A helper function (opts_list()) is
available to facilitate building this list.
Regions can be estimated in parallel using the {future} package (see
setup_future()). The progress of producing estimates across multiple
regions can be tracked using the {progressr} package. Modify this behaviour
using progressr::handlers() and enable it in batch by setting
R_PROGRESSR_ENABLE=TRUE as an environment variable.
Usage
regional_epinow(
data,
generation_time = gt_opts(),
delays = delay_opts(),
truncation = trunc_opts(),
rt = rt_opts(),
backcalc = backcalc_opts(),
gp = gp_opts(),
obs = obs_opts(),
forecast = forecast_opts(),
stan = stan_opts(),
horizon,
CrIs = c(0.2, 0.5, 0.9),
target_folder = NULL,
target_date,
non_zero_points = 2,
output = c("regions", "summary", "samples", "plots", "latest"),
return_output = is.null(target_folder),
summary_args = list(),
verbose = FALSE,
logs = tempdir(check = TRUE),
...
)
Arguments
data |
A |
generation_time |
A call to |
delays |
A call to |
truncation |
A call to |
rt |
A list of options as generated by |
backcalc |
A list of options as generated by |
gp |
A list of options as generated by |
obs |
A list of options as generated by |
forecast |
A list of options as generated by |
stan |
A list of stan options as generated by |
horizon |
Deprecated; use |
CrIs |
Numeric vector of credible intervals to calculate. |
target_folder |
Character string specifying where to save results (will create if not present). |
target_date |
Date, defaults to maximum found in the data if not specified. |
non_zero_points |
Numeric, the minimum number of time points with non-zero cases in a region required for that region to be evaluated. Defaults to 2. |
output |
A character vector of optional output to return. Supported
options are the individual regional estimates ("regions"), samples
("samples"), plots ("plots"), copying the individual region dated folder into
a latest folder (if |
return_output |
Logical, defaults to TRUE if no target_folder is specified and FALSE otherwise. Should output be returned? |
summary_args |
A list of arguments passed to |
verbose |
Logical defaults to FALSE. Outputs verbose progress messages
to the console from |
logs |
Character path indicating the target folder in which to store log
information. Defaults to the temporary directory if not specified. Default
logging can be disabled if |
... |
Pass additional arguments to |
Value
A list of output stratified at the top level into regional output and a summary of output across regions.
See Also
epinow()
estimate_infections()
setup_future()
regional_summary()
Examples
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# uses example case vector
cases <- example_confirmed[1:60]
cases <- data.table::rbindlist(list(
data.table::copy(cases)[, region := "testland"],
cases[, region := "realland"]
))
# run epinow across multiple regions and generate summaries
# samples and warmup have been reduced for this example
# for more examples, see the "estimate_infections examples" vignette
def <- regional_epinow(
data = cases,
generation_time = gt_opts(example_generation_time),
delays = delay_opts(example_incubation_period + example_reporting_delay),
rt = rt_opts(prior = LogNormal(mean = 2, sd = 0.2)),
stan = stan_opts(
samples = 100, warmup = 200
),
verbose = interactive()
)
options(old_opts)
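The non_zero_points check described above can be sketched in base R. The data frame below is hypothetical (it is not the package's example data), and the filtering code is an illustration of the rule rather than the package's own data.table implementation.

```r
# Hypothetical regional case counts (not the package's example data)
cases <- data.frame(
  region = rep(c("testland", "realland", "quietland"), each = 5),
  confirm = c(3, 0, 5, 2, 1, 0, 4, 0, 6, 2, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)
non_zero_points <- 2

# Count the time points with non-zero cases in each region
non_zero <- tapply(cases$confirm > 0, cases$region, sum)
# Keep regions with at least `non_zero_points` such time points
keep <- names(non_zero)[non_zero >= non_zero_points]
cases_kept <- cases[cases$region %in% keep, ]
sort(unique(cases_kept$region)) # "realland" "testland"
```

Here quietland has a single non-zero time point and is dropped, while the other two regions are retained.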
Summarise Regional Runtimes
Description
Used internally by regional_epinow to summarise region run times.
Usage
regional_runtimes(
regional_output = NULL,
target_folder = NULL,
target_date = NULL,
return_output = FALSE
)
Arguments
regional_output |
A list of output as produced by |
target_folder |
Character string specifying where to save results (will create if not present). |
target_date |
A character string giving the target date for which to extract results (in the format "yyyy-mm-dd"). Defaults to latest available estimates. |
return_output |
Logical, defaults to FALSE. Should output be returned, this automatically updates to TRUE if no directory for saving is specified. |
Value
A data.table of region run times
See Also
regional_summary regional_epinow
Examples
regional_out <- readRDS(system.file(
package = "EpiNow2", "extdata", "example_regional_epinow.rds"
))
regional_runtimes(regional_output = regional_out$regional)
Regional Summary Output
Description
Used to produce summary output either internally in regional_epinow or
externally.
Usage
regional_summary(
regional_output = NULL,
data,
results_dir = NULL,
summary_dir = NULL,
target_date = NULL,
region_scale = "Region",
all_regions = TRUE,
return_output = is.null(summary_dir),
plot = TRUE,
max_plot = 10,
...
)
Arguments
regional_output |
A list of output as produced by |
data |
A |
results_dir |
An optional character string indicating the location of the results directory to extract results from. |
summary_dir |
A character string giving the directory in which to store summary of results. |
target_date |
A character string giving the target date for which to extract results (in the format "yyyy-mm-dd"). Defaults to latest available estimates. |
region_scale |
A character string indicating the name to give the regions being summarised. |
all_regions |
Logical, defaults to |
return_output |
Logical, defaults to TRUE if no summary_dir is specified and FALSE otherwise. Should output be returned? |
plot |
Logical, defaults to |
max_plot |
Numeric, defaults to 10. A multiplicative upper bound on the number of cases shown on the plot. Based on the maximum number of reported cases. |
... |
Additional arguments passed to |
Value
A list of summary measures and plots
See Also
regional_epinow
Examples
# get example output from regional_epinow model
regional_out <- readRDS(system.file(
package = "EpiNow2", "extdata", "example_regional_epinow.rds"
))
regional_summary(
regional_output = regional_out$regional,
data = regional_out$summary$reported_cases
)
Report plots
Description
Returns key summary plots for estimates. May be deprecated in later
releases as current S3 methods are enhanced.
Usage
report_plots(summarised_estimates, reported, target_folder = NULL, ...)
Arguments
summarised_estimates |
A data.table of summarised estimates containing the following variables: variable, median, bottom, and top. It should also contain the following estimates: R, infections, reported_cases_rt, and r (rate of growth). |
reported |
A |
target_folder |
Character string specifying where to save results (will create if not present). |
... |
Additional arguments passed to |
Value
A named list of ggplot2 objects, list(infections, reports, R, growth_rate, summary).
The last item is a combined summary plot; the leading items are the individual plots.
See Also
plot_estimates(), which produces the leading items from
summarised_estimates[variable == "infections"],
summarised_estimates[variable == "reported_cases"],
summarised_estimates[variable == "R"], and
summarised_estimates[variable == "growth_rate"], respectively.
Examples
# get example output from estimate_infections
out <- readRDS(system.file(
package = "EpiNow2", "extdata", "example_estimate_infections.rds"
))
# generate the summary plots
plots <- report_plots(
summarised_estimates = out$summarised,
reported = out$observations
)
plots
Provide Summary Statistics for Estimated Infections and Rt
Description
Creates a snapshot summary of estimates. May be removed in later releases as
S3 methods are enhanced.
Usage
report_summary(
summarised_estimates,
rt_samples,
target_folder = NULL,
return_numeric = FALSE
)
Arguments
summarised_estimates |
A data.table of summarised estimates containing the following variables: variable, median, bottom, and top. It should contain the following estimates: R, infections, and r (rate of growth). |
rt_samples |
A data.table containing Rt samples with the following variables: sample and value. |
target_folder |
Character string specifying where to save results (will create if not present). |
return_numeric |
Should numeric summary information be returned. |
Value
A data.table containing formatted and numeric summary measures
Time-Varying Reproduction Number Options
Description
Defines a list specifying the optional arguments for the time-varying
reproduction number. Custom settings can be supplied which override the
defaults.
Usage
rt_opts(
prior = LogNormal(mean = 1, sd = 1),
use_rt = TRUE,
rw = 0,
use_breakpoints = TRUE,
future = "latest",
gp_on = c("R_t-1", "R0"),
pop = 0
)
Arguments
prior |
A |
use_rt |
Logical, defaults to |
rw |
Numeric step size of the random walk, defaults to 0. To specify a
weekly random walk set |
use_breakpoints |
Logical, defaults to |
future |
A character string or integer. This argument indicates how to set future Rt values. Supported options are to project using the Rt model ("project"), to use the latest estimate based on partial data ("latest"), or to use the latest estimate based on data that is over 50% complete ("estimate"). If an integer is supplied then the Rt estimate from this many days into the future (or past if negative) will be used forwards in time. |
gp_on |
Character string, defaulting to "R_t-1". Indicates how the Gaussian process, if in use, should be applied to Rt. Currently supported options are applying the Gaussian process to the last estimated Rt (i.e. Rt = Rt-1 * GP), and applying the Gaussian process to a global mean (i.e. Rt = R0 * GP). Both should produce comparable results when data are not sparse, but the method relying on a global mean will revert to this mean for real time estimates, which may not be desirable. |
pop |
Integer, defaults to 0. Susceptible population initially present. Used to adjust Rt estimates when otherwise fixed based on the proportion of the population that is susceptible. When set to 0 no population adjustment is done. |
Value
An <rt_opts>
object with settings defining the time-varying
reproduction number.
Examples
# default settings
rt_opts()
# set a custom prior on the initial reproduction number
rt_opts(prior = LogNormal(mean = 2, sd = 1))
# add a weekly random walk
rt_opts(rw = 7)
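The two gp_on formulations, Rt = Rt-1 * GP and Rt = R0 * GP, differ in whether deviations accumulate. The base R sketch below contrasts them using made-up multiplicative deviations in place of a real Gaussian process; the numbers, and the choice to start the first formulation from R0, are assumptions for illustration only.

```r
# Hypothetical multiplicative deviations standing in for the exponentiated
# Gaussian process (made-up numbers, not model output)
gp <- c(1.10, 1.05, 0.95, 0.90)
R0 <- 1.5

# gp_on = "R_t-1": each deviation multiplies the previous Rt, so
# deviations accumulate over time
rt_prev <- R0 * cumprod(gp)
# gp_on = "R0": each deviation multiplies the global mean directly
rt_mean <- R0 * gp

round(rbind(rt_prev, rt_mean), 3)
```

In the second formulation Rt returns to R0 whenever the deviation returns to 1, which corresponds to the reversion to the global mean noted in the description of gp_on.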
Run epinow with Regional Processing Code
Description
Internal function that handles calling epinow(). Future work will extend
this function to better handle stan logs and allow the user to modify
settings between regions.
Usage
run_region(
target_region,
generation_time,
delays,
truncation,
rt,
backcalc,
gp,
obs,
stan,
horizon,
CrIs,
data,
target_folder,
target_date,
return_output,
output,
complete_logger,
verbose,
progress_fn = NULL,
...
)
Arguments
target_region |
Character string indicating the region being evaluated |
generation_time |
A call to |
delays |
A call to |
truncation |
A call to |
rt |
A list of options as generated by |
backcalc |
A list of options as generated by |
gp |
A list of options as generated by |
obs |
A list of options as generated by |
stan |
A list of stan options as generated by |
horizon |
Deprecated; use |
CrIs |
Numeric vector of credible intervals to calculate. |
data |
A |
target_folder |
Character string specifying where to save results (will create if not present). |
target_date |
Date, defaults to maximum found in the data if not specified. |
return_output |
Logical, defaults to FALSE. Should output be returned, this automatically updates to TRUE if no directory for saving is specified. |
output |
A character vector of optional output to return. Supported
options are the individual regional estimates ("regions"), samples
("samples"), plots ("plots"), copying the individual region dated folder into
a latest folder (if |
complete_logger |
Character string indicating the logger to output the completion of estimation to. |
verbose |
Logical defaults to FALSE. Outputs verbose progress messages
to the console from |
progress_fn |
Function as returned by |
... |
Pass additional arguments to |
Value
A list of processed output as produced by process_region()
See Also
Save Estimated Infections
Description
Saves output from estimate_infections to a target directory.
Usage
save_estimate_infections(
estimates,
target_folder = NULL,
samples = TRUE,
return_fit = TRUE
)
Arguments
estimates |
List of data frames as output by |
target_folder |
Character string specifying where to save results (will create if not present). |
samples |
Logical, defaults to TRUE. Should samples be saved |
return_fit |
Logical, defaults to TRUE. Should the fit stan object be returned. |
Value
No return value, called for side effects
See Also
estimate_infections
Save Observed Data
Description
Saves observed data to a target location if given.
Usage
save_input(data, target_folder)
Arguments
data |
A |
target_folder |
Character string specifying where to save results (will create if not present). |
Value
No return value, called for side effects
Returns the standard deviation of one or more delay distributions
Description
This works out the standard deviation of all the (parametric /
nonparametric) delay distributions combined in the passed <dist_spec>.
If any of the parameters are themselves uncertain then NA is returned.
Usage
## S3 method for class 'dist_spec'
sd(x, ...)
Arguments
x |
The <dist_spec> to use |
Value
A vector of standard deviations.
Examples
## Not run:
# A fixed lognormal distribution with mean 5 and sd 1.
dist1 <- LogNormal(mean = 5, sd = 1, max = 20)
sd(dist1)
# A gamma distribution with mean 3 and sd 2
dist2 <- Gamma(mean = 3, sd = 2)
sd(dist2)
# The sd of the sum of two distributions
sd(dist1 + dist2)
## End(Not run)
Secondary Reports Options
Description
Returns a list of options defining the secondary model used in
estimate_secondary(). This model is a combination of a convolution of
previously observed primary reports combined with current primary reports
(either additive or subtractive). It can optionally be cumulative. See the
documentation of type for sensible options to cover most use cases and the
returned values of secondary_opts() for all currently supported options.
Usage
secondary_opts(type = c("incidence", "prevalence"), ...)
Arguments
type |
A character string indicating the type of observation the secondary reports are. Options include:
|
... |
Overwrite options defined by type. See the returned values for all options that can be passed. |
Value
A <secondary_opts> object of binary options summarising the secondary
model used in estimate_secondary(). Options returned are cumulative
(should the secondary report be cumulative), historic (should a
convolution of primary reported cases be used to predict secondary reported
cases), primary_hist_additive (should the historic convolution of primary
reported cases be additive or subtractive), current (should currently
observed primary reported cases contribute to current secondary reported
cases), and primary_current_additive (should current primary reported cases be
additive or subtractive).
See Also
Examples
# incidence model
secondary_opts("incidence")
# prevalence model
secondary_opts("prevalence")
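How the binary options combine can be sketched in base R. The helper sketch_secondary() below is hypothetical and omits scaling and observation noise; it is not the package's Stan model. The option values chosen for the incidence-type and prevalence-type calls are assumptions made for this illustration and should be checked against the values returned by secondary_opts().

```r
# Hypothetical helper (not the package's Stan model): apply the binary
# options described above to a primary time series.
sketch_secondary <- function(primary, delay_pmf, cumulative, historic,
                             primary_hist_additive, current,
                             primary_current_additive) {
  n <- length(primary)
  # convolution of past primary reports with the delay distribution
  conv <- vapply(seq_len(n), function(t) {
    lags <- seq_len(min(t, length(delay_pmf))) - 1
    sum(primary[t - lags] * delay_pmf[lags + 1])
  }, numeric(1))
  secondary <- numeric(n)
  for (t in seq_len(n)) {
    # carry the previous value forward only for cumulative targets
    value <- if (cumulative && t > 1) secondary[t - 1] else 0
    if (historic) {
      value <- value + if (primary_hist_additive) conv[t] else -conv[t]
    }
    if (current) {
      value <- value +
        if (primary_current_additive) primary[t] else -primary[t]
    }
    secondary[t] <- max(value, 0) # keep the series non-negative
  }
  secondary
}

primary <- c(10, 20, 30, 20, 10)
delay_pmf <- c(0, 0.5, 0.5) # delays of 0, 1 and 2 days

# assumed settings for an incidence-type target
incidence <- sketch_secondary(primary, delay_pmf,
  cumulative = FALSE, historic = TRUE, primary_hist_additive = TRUE,
  current = FALSE, primary_current_additive = FALSE
)
# assumed settings for a prevalence-type target
prevalence <- sketch_secondary(primary, delay_pmf,
  cumulative = TRUE, historic = TRUE, primary_hist_additive = FALSE,
  current = TRUE, primary_current_additive = TRUE
)
rbind(incidence, prevalence)
```

Under these assumed settings the incidence-type series is a delayed copy of the primary reports, whereas the prevalence-type series adds current primary reports and subtracts delayed ones from a running total.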
Set to Single Threading
Description
This function sets the threads used by {data.table} to 1 in the parent
function and then restores the initial {data.table} threads when the
function exits. This is primarily used as an internal function inside of
other functions and will generally not be used on its own.
Usage
set_dt_single_thread()
Value
An environment in the parent frame named "dt_settings"
Examples
data.table::setDTthreads(2)
test_function <- function() {
set_dt_single_thread()
print(data.table::getDTthreads())
}
test_function()
data.table::getDTthreads()
Setup Default Logging
Description
Sets up default logging. Usage of logging is currently being explored as the
current setup cannot log stan errors or progress.
Usage
setup_default_logging(
logs = tempdir(check = TRUE),
mirror_epinow = FALSE,
target_date = NULL
)
Arguments
logs |
Character path indicating the target folder in which to store log
information. Defaults to the temporary directory if not specified. Default
logging can be disabled if |
mirror_epinow |
Logical, defaults to FALSE. Should internal logging be
returned from |
target_date |
Date, defaults to maximum found in the data if not specified. |
Value
No return value, called for side effects
Examples
setup_default_logging()
Convert to Data Table
Description
Convenience function that sets the number of {data.table} cores to 1 and
maps input to be a {data.table}.
Usage
setup_dt(data)
Arguments
data |
A |
Value
A data table
Set up Future Backend
Description
A utility function that aims to streamline the set up
of the required future backend with sensible defaults for most users of
regional_epinow(). More advanced users are recommended to set up their own
{future} backend based on their available resources. Running this requires
the {future} package to be installed.
Usage
setup_future(
data,
strategies = c("multisession", "multisession"),
min_cores_per_worker = 4
)
Arguments
data |
A |
strategies |
A vector of length 1 to 2 of strategies to pass to
|
min_cores_per_worker |
Numeric, the minimum number of cores per worker. Defaults to 4 which assumes 4 MCMC chains are in use per region. |
Value
Numeric number of cores to use per worker. If greater than 1 pass to
stan_args = list(cores = "output from setup future") or use
future = TRUE. If only a single strategy is used then nothing is returned.
Setup Logging
Description
Sets up {futile.logger} logging, which is integrated into {EpiNow2}.
See the documentation for {futile.logger} for full details. By default
{EpiNow2} prints all logs at the "INFO" level and returns them to the
console. Usage of logging is currently being explored as the current
setup cannot log stan errors or progress.
Usage
setup_logging(
threshold = "INFO",
file = NULL,
mirror_to_console = FALSE,
name = "EpiNow2"
)
Arguments
threshold |
Character string indicating the logging level (see ?futile.logger for details of the available options). Defaults to "INFO". |
file |
Character string indicating the path to save logs to. By default logs will be written to the console. |
mirror_to_console |
Logical, defaults to |
name |
Character string defaulting to EpiNow2. This indicates the name
of the logger to setup. The default logger for EpiNow2 is called EpiNow2.
Nested options include: EpiNow2.epinow which controls all logging for
|
Value
Nothing
Setup Target Folder for Saving
Description
Sets up folders for saving results
Usage
setup_target_folder(target_folder = NULL, target_date)
Arguments
target_folder |
Character string specifying where to save results (will create if not present). |
target_date |
Date, defaults to maximum found in the data if not specified. |
Value
A list containing the path to the dated folder and the latest folder
Simulate infections using the renewal equation
Description
Simulations are done from given initial infections and (potentially
time-varying) reproduction numbers. Delays and parameters of the observation
model can be specified using the same options as in estimate_infections().
Usage
simulate_infections(
R,
initial_infections,
day_of_week_effect = NULL,
generation_time = generation_time_opts(),
delays = delay_opts(),
truncation = trunc_opts(),
obs = obs_opts(),
CrIs = c(0.2, 0.5, 0.9),
backend = "rstan",
seeding_time = NULL,
pop = 0
)
Arguments
R |
a data frame of reproduction numbers (column |
initial_infections |
numeric; the initial number of infections (i.e.
before |
day_of_week_effect |
either |
generation_time |
A call to |
delays |
A call to |
truncation |
A call to |
obs |
A list of options as generated by |
CrIs |
Numeric vector of credible intervals to calculate. |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
seeding_time |
Integer; the number of days before the first time point
of |
pop |
Integer, defaults to 0. Susceptible population initially present. Used to adjust Rt estimates when otherwise fixed based on the proportion of the population that is susceptible. When set to 0 no population adjustment is done. |
Details
In order to simulate, all parameters that are specified such as the mean and standard deviation of delays or observation scaling, must be fixed. Uncertain parameters are not allowed.
Value
A data.table of simulated infections (variable infections) and
reported cases (variable reported_cases) by date.
Examples
R <- data.frame(
date = seq.Date(as.Date("2023-01-01"), length.out = 14, by = "day"),
R = c(rep(1.2, 7), rep(0.8, 7))
)
sim <- simulate_infections(
R = R,
initial_infections = 100,
generation_time = generation_time_opts(
fix_parameters(example_generation_time)
),
delays = delay_opts(fix_parameters(example_reporting_delay)),
obs = obs_opts(family = "poisson")
)
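The renewal equation itself, I_t = R_t * sum_s g_s * I_{t-s}, can be iterated in a few lines of base R. The sketch below is a deterministic illustration and not the package's Stan implementation: the helper renewal_sketch(), the generation time probabilities, and the flat seeding of initial infections are all assumptions introduced here, and delays and observation noise are left out.

```r
# Hypothetical helper (not the package's Stan implementation): iterate the
# renewal equation I_t = R_t * sum_s g_s * I_{t - s}.
renewal_sketch <- function(R, initial_infections, gt_pmf) {
  seed_len <- length(gt_pmf)
  # seed the process with a flat history of initial infections
  infections <- c(rep(initial_infections, seed_len), numeric(length(R)))
  for (t in seq_along(R)) {
    idx <- seed_len + t
    # infections 1, 2, ..., length(gt_pmf) days before the current day
    past <- infections[idx - seq_along(gt_pmf)]
    infections[idx] <- R[t] * sum(gt_pmf * past)
  }
  # drop the seeding period
  infections[-seq_len(seed_len)]
}

gt_pmf <- c(0.2, 0.5, 0.3) # generation intervals of 1, 2 and 3 days
R <- c(rep(1.2, 7), rep(0.8, 7))
infections <- renewal_sketch(R, initial_infections = 100, gt_pmf = gt_pmf)
round(infections, 1)
```

Because the seeded history is flat, the first simulated value is simply 1.2 * 100 = 120; later values depend on the weighted mix of recent infections.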
Simulate secondary observations from primary observations
Description
Simulations are done from a given trajectory of primary observations by applying any given delays and observation parameters.
Usage
simulate_secondary(
primary,
day_of_week_effect = NULL,
secondary = secondary_opts(),
delays = delay_opts(),
truncation = trunc_opts(),
obs = obs_opts(),
CrIs = c(0.2, 0.5, 0.9),
backend = "rstan"
)
Arguments
primary |
a data frame of primary reports (column |
day_of_week_effect |
either |
secondary |
A call to |
delays |
A call to |
truncation |
A call to |
obs |
A list of options as generated by |
CrIs |
Numeric vector of credible intervals to calculate. |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
Details
In order to simulate, all parameters that are specified, such as the mean and standard deviation of delays or observation scaling, must be fixed. Uncertain parameters are not allowed.
A function of the same name that was previously based on a reimplementation of that model in R, with potentially time-varying scalings and delays, is available as convolve_and_scale().
Value
A data.table of simulated secondary observations (column secondary) by date.
Examples
## load data.table to manipulate `example_confirmed` below
library(data.table)
cases <- as.data.table(example_confirmed)[, primary := confirm]
sim <- simulate_secondary(
cases,
delays = delay_opts(fix_parameters(example_reporting_delay)),
obs = obs_opts(family = "poisson")
)
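To make "applying any given delays and observation parameters" concrete, here is a minimal base R sketch. It assumes the expected secondary observations are a scaled convolution of the primary series with a fixed delay distribution, followed by Poisson observation noise. The helper `convolve_sketch()` and its arguments are hypothetical, and the package's Stan model may differ in detail.

```r
# Hypothetical helper: convolve primary counts with a fixed delay pmf and scale.
convolve_sketch <- function(primary, delay_pmf, scaling = 1) {
  n <- length(primary)
  expected <- numeric(n)
  for (t in seq_len(n)) {
    # delay_pmf[1] is the probability of a delay of 0 days
    lags <- 0:min(t - 1, length(delay_pmf) - 1)
    expected[t] <- scaling * sum(primary[t - lags] * delay_pmf[lags + 1])
  }
  expected
}

primary <- c(10, 20, 30, 40)
delay_pmf <- c(0.5, 0.3, 0.2) # assumed fixed delay distribution
expected <- convolve_sketch(primary, delay_pmf, scaling = 0.5)

set.seed(1)
secondary <- rpois(length(expected), expected) # Poisson observation noise
```

Because every input is a fixed number, the only randomness comes from the observation model, which is why uncertain parameters cannot be used for simulation.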
Stan Laplace algorithm Options
Description
Defines a list specifying the arguments passed to cmdstanr::laplace().
Usage
stan_laplace_opts(backend = "cmdstanr", trials = 10, ...)
Arguments
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" or "cmdstanr"; for this function the default is "cmdstanr" (see Usage). |
trials |
Numeric, defaults to 10. Number of attempts to use rstan::vb() before failing. |
... |
Additional parameters to pass to |
Value
A list of arguments to pass to cmdstanr::laplace().
Examples
stan_laplace_opts()
Stan Options
Description
Defines a list specifying the arguments passed to underlying stan backend functions via stan_sampling_opts() and stan_vb_opts(). Custom settings can be supplied which override the defaults.
Usage
stan_opts(
object = NULL,
samples = 2000,
method = c("sampling", "vb", "laplace", "pathfinder"),
backend = c("rstan", "cmdstanr"),
return_fit = TRUE,
...
)
Arguments
object |
Stan model object. By default uses the compiled package
default if using the "rstan" backend, and the default model obtained using
|
samples |
Numeric, defaults to 2000. Number of posterior samples. |
method |
A character string, defaulting to sampling. Currently supports MCMC sampling ("sampling") or approximate posterior sampling via variational inference ("vb") and, as experimental features if the "cmdstanr" backend is used, approximate posterior sampling with the laplace algorithm ("laplace") or pathfinder ("pathfinder"). |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
return_fit |
Logical, defaults to TRUE. Should the fit stan model be returned. |
... |
Additional parameters to pass to underlying option functions,
|
Value
A <stan_opts> object of arguments to pass to the appropriate rstan functions.
See Also
stan_sampling_opts()
stan_vb_opts()
Examples
# using default of [rstan::sampling()]
stan_opts(samples = 1000)
# using vb
stan_opts(method = "vb")
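The `method` and `backend` arguments are declared as character vectors of choices, with the first element acting as the default. A common R idiom for this is `match.arg()`. The sketch below assumes that idiom, and also assumes that "laplace" and "pathfinder" are rejected without the "cmdstanr" backend, as the argument description suggests. `opts_sketch()` is a hypothetical stand-in, not the package's code.

```r
# Hypothetical function mirroring the shape of the stan_opts() signature.
opts_sketch <- function(method = c("sampling", "vb", "laplace", "pathfinder"),
                        backend = c("rstan", "cmdstanr")) {
  method <- match.arg(method)   # no value supplied: first choice is used
  backend <- match.arg(backend)
  # assumption based on the argument description: these methods need cmdstanr
  if (method %in% c("laplace", "pathfinder") && backend != "cmdstanr") {
    stop("method '", method, "' requires backend = \"cmdstanr\"")
  }
  list(method = method, backend = backend)
}

default_opts <- opts_sketch()
pathfinder_opts <- opts_sketch(method = "pathfinder", backend = "cmdstanr")
```

Calling `opts_sketch(method = "laplace")` with the default backend raises an error in this sketch. Whether the package itself errors or falls back is not stated in this manual.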
Stan pathfinder algorithm Options
Description
Defines a list specifying the arguments passed to cmdstanr::pathfinder().
Usage
stan_pathfinder_opts(backend = "cmdstanr", samples = 2000, trials = 10, ...)
Arguments
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" or "cmdstanr"; for this function the default is "cmdstanr" (see Usage). |
samples |
Numeric, defaults to 2000. Number of posterior samples. |
trials |
Numeric, defaults to 10. Number of attempts to use rstan::vb() before failing. |
... |
Additional parameters to pass to |
Value
A list of arguments to pass to cmdstanr::pathfinder().
Examples
stan_pathfinder_opts()
Stan Sampling Options
Description
Defines a list specifying the arguments passed to either rstan::sampling() or cmdstanr::sample(). Custom settings can be supplied which override the defaults.
Usage
stan_sampling_opts(
cores = getOption("mc.cores", 1L),
warmup = 250,
samples = 2000,
chains = 4,
control = list(),
save_warmup = FALSE,
seed = as.integer(runif(1, 1, 1e+08)),
future = FALSE,
max_execution_time = Inf,
backend = c("rstan", "cmdstanr"),
...
)
Arguments
cores |
Number of cores to use when executing the chains in parallel. Defaults to 1, but it is recommended to set the mc.cores option to as many processors as the hardware and RAM allow (up to the number of chains). |
warmup |
Numeric, defaults to 250. Number of warmup samples per chain. |
samples |
Numeric, defaults to 2000. Overall number of posterior samples. When using multiple chains, the number of iterations per chain is samples / chains. |
chains |
Numeric, defaults to 4. Number of MCMC chains to use. |
control |
List, defaults to empty. control parameters to pass to
underlying |
save_warmup |
Logical, defaults to FALSE. Should warmup progress be saved. |
seed |
Numeric, defaults to a uniform random number between 1 and 1e8. Seed of the sampling process. |
future |
Logical, defaults to |
max_execution_time |
Numeric, defaults to Inf (seconds). If set, will kill off processing of each chain that has not finished within the specified timeout. When more than 2 chains finish successfully, estimates will still be returned. If fewer than 2 chains return within the allowed time, estimation will fail with an informative error. |
backend |
Character string indicating the backend to use for fitting stan models. Supported arguments are "rstan" (default) or "cmdstanr". |
... |
Additional parameters to pass to |
Value
A list of arguments to pass to rstan::sampling() or cmdstanr::sample().
Examples
stan_sampling_opts(samples = 2000)
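Since `samples` is the overall number of posterior samples and `warmup` is counted per chain, the per-chain workload follows from simple arithmetic. The sketch below uses the documented defaults. Rounding up with `ceiling()` and adding warmup to get the total iterations per chain are assumptions about how the numbers combine, not a statement of the package's internals.

```r
samples <- 2000 # overall posterior samples (documented default)
chains <- 4     # documented default
warmup <- 250   # warmup samples per chain (documented default)

# posterior samples drawn by each chain; rounding up is an assumption
samples_per_chain <- ceiling(samples / chains)

# assumed total iterations per chain once warmup is included
iter_per_chain <- warmup + samples_per_chain
```

Halving `chains` while keeping `samples` fixed doubles the post-warmup iterations each chain has to run.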
Stan Variational Bayes Options
Description
Defines a list specifying the arguments passed to rstan::vb() or cmdstanr::variational(). Custom settings can be supplied which override the defaults.
Usage
stan_vb_opts(samples = 2000, trials = 10, iter = 10000, ...)
Arguments
samples |
Numeric, default 2000. Overall number of approximate posterior samples. |
trials |
Numeric, defaults to 10. Number of attempts to use rstan::vb() before failing. |
iter |
Numeric, defaulting to 10000. Number of iterations to use in
|
... |
Additional parameters to pass to |
Value
A list of arguments to pass to rstan::vb() or cmdstanr::variational(), depending on the chosen backend.
Examples
stan_vb_opts(samples = 1000)
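The `trials` argument describes a retry budget: fitting is attempted repeatedly and only fails once every attempt has failed. A minimal base R sketch of that pattern follows. `retry_sketch()` and `flaky_fit()` are hypothetical names, and the toy fit stands in for a real call to a variational inference routine.

```r
# Hypothetical helper: call `fit_fun` up to `trials` times and return the
# first successful result together with the number of attempts used.
retry_sketch <- function(fit_fun, trials = 10) {
  for (attempt in seq_len(trials)) {
    fit <- tryCatch(fit_fun(), error = function(e) NULL)
    if (!is.null(fit)) {
      return(list(fit = fit, attempts = attempt))
    }
  }
  stop("fitting failed after ", trials, " attempts")
}

# Toy stand-in for a fit that fails twice before succeeding
calls <- 0
flaky_fit <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("did not converge")
  "fit"
}

res <- retry_sketch(flaky_fit, trials = 10)
```

In this sketch a fit that never succeeds ends with an error after `trials` attempts.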
Summarise Rt and cases
Description
Produces summarised <data.frame>s of output across regions. Used internally by regional_summary.
Usage
summarise_key_measures(
regional_results = NULL,
results_dir = NULL,
summary_dir = NULL,
type = "region",
date = "latest"
)
Arguments
regional_results |
A list of dataframes as produced by
|
results_dir |
Character string indicating the directory from which to extract results. |
summary_dir |
Character string indicating the directory into which to save results as a csv. |
type |
Character string, the region identifier to apply (defaults to region). |
date |
A character string (in the format "yyyy-mm-dd") indicating the date to extract data for. Defaults to "latest", which finds the latest results available. |
Value
A list of summarised Rt, cases by date of infection, and cases by date of report.
See Also
regional_summary
Summarise Real-time Results
Description
Used internally by regional_summary to produce a summary table of results. May be streamlined in later releases.
Usage
summarise_results(
regions,
summaries = NULL,
results_dir = NULL,
target_date = "latest",
region_scale = "Region"
)
Arguments
regions |
A character string containing the list of regions to extract results for (must all have results for the same target date). |
summaries |
A list of summary |
results_dir |
An optional character string indicating the location of the results directory to extract results from. |
target_date |
A character string indicating the target date to extract results for. All regions must have results for this date. |
region_scale |
A character string indicating the name to give the regions being summarised. |
Value
A list of summary data
Summary output from epinow
Description
summary method for class "epinow".
Usage
## S3 method for class 'epinow'
summary(
object,
output = c("estimates", "forecast", "estimated_reported_cases"),
date = NULL,
params = NULL,
...
)
Arguments
object |
A list of output as produced by "epinow". |
output |
A character string of output to summarise. Defaults to "estimates" but also supports "forecast", and "estimated_reported_cases". |
date |
A date in the form "yyyy-mm-dd" to inspect estimates for. |
params |
A character vector of parameters to filter for. |
... |
Pass additional summary arguments to lower level methods |
Value
Returns a <data.frame> of summary output.
See Also
summary.estimate_infections epinow
Summary output from estimate_infections
Description
summary method for class "estimate_infections".
Usage
## S3 method for class 'estimate_infections'
summary(
object,
type = c("snapshot", "parameters", "samples"),
date = NULL,
params = NULL,
...
)
Arguments
object |
A list of output as produced by "estimate_infections". |
type |
A character vector of data types to return. Defaults to
"snapshot" but also supports "parameters" and "samples". "snapshot" returns
a summary at a given date (by default the latest date informed by data).
"parameters" returns summarised parameter estimates that can be further
filtered using |
date |
A date in the form "yyyy-mm-dd" to inspect estimates for. |
params |
A character vector of parameters to filter for. |
... |
Pass additional arguments to |
Value
Returns a <data.frame> of summary output.
See Also
summary estimate_infections report_summary
Truncation Distribution Options
Description
Returns a truncation distribution formatted for usage by downstream functions. See estimate_truncation() for an approach to estimate these distributions.
Usage
trunc_opts(dist = Fixed(0), default_cdf_cutoff = 0.001, weight_prior = FALSE)
Arguments
dist |
A delay distribution or series of delay distributions reflecting
the truncation. It can be specified using the probability distributions
interface in |
default_cdf_cutoff |
Numeric; default CDF cutoff to be used if an
unconstrained distribution is passed as |
weight_prior |
Logical; if TRUE, the truncation prior will be weighted by the number of observation data points, in doing so approximately placing an independent prior at each time step and usually preventing the posteriors from shifting. If FALSE (default), no weight will be applied, i.e. the truncation distribution will be treated as a single parameter. |
Value
A <trunc_opts> object summarising the input truncation distribution.
See Also
convert_to_logmean()
convert_to_logsd()
bootstrapped_dist_fit()
Distributions
Examples
# no truncation
trunc_opts()
# truncation dist
trunc_opts(dist = LogNormal(mean = 3, sd = 2, max = 10))
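The second example sets an explicit `max`, so the distribution is already constrained. When no maximum is given, a CDF cutoff such as `default_cdf_cutoff = 0.001` can be read as "ignore the tail beyond the point where at most 0.1% of the probability mass remains". The base R sketch below illustrates that reading for a lognormal parameterised on the log scale. `max_from_cutoff()` is a hypothetical helper, and the package may derive its maximum differently.

```r
# Hypothetical helper: smallest whole-day maximum that leaves at most
# `cdf_cutoff` of the probability mass in the discarded tail.
max_from_cutoff <- function(meanlog, sdlog, cdf_cutoff = 0.001) {
  ceiling(qlnorm(1 - cdf_cutoff, meanlog = meanlog, sdlog = sdlog))
}

# assumed log-scale parameters, chosen only for illustration
max_delay <- max_from_cutoff(meanlog = 1, sdlog = 0.5)

# mass beyond the chosen maximum, which should not exceed the cutoff
tail_mass <- plnorm(max_delay, meanlog = 1, sdlog = 0.5, lower.tail = FALSE)
```

A smaller cutoff keeps more of the tail and therefore gives a longer maximum delay.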
Updates Forecast Horizon Based on Input Data and Target
Description
Makes sure that a forecast is returned for the user-specified time period beyond the target date.
Usage
update_horizon(horizon, target_date, data)
Arguments
horizon |
Numeric, defaults to 7. Number of days into the future to forecast. |
target_date |
Date, defaults to maximum found in the data if not specified. |
data |
A |
Value
Numeric forecast horizon adjusted for the user's intention.
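One way to read this adjustment: if the observed data stop before the target date, the horizon has to grow by the gap so that the forecast still covers the requested number of days beyond the target date. The sketch below shows that arithmetic in base R. `horizon_sketch()` and its `dates` argument are hypothetical and are not the package's implementation.

```r
# Hypothetical helper illustrating the adjustment.
horizon_sketch <- function(horizon, target_date, dates) {
  # days between the last observed date and the target date
  gap <- as.numeric(as.Date(target_date) - max(dates))
  horizon + gap
}

dates <- seq(as.Date("2023-01-01"), by = "day", length.out = 10) # ends 2023-01-10
adjusted <- horizon_sketch(horizon = 7, target_date = "2023-01-13", dates = dates)
adjusted # 10: the requested 7 days plus the 3-day gap
```

When the target date equals the last date in the data, the gap is zero and the horizon is returned unchanged.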
Update estimate_secondary default priors
Description
This function allows the user to more easily specify data-driven or model-based priors for estimate_secondary(), for example from previous model fits, using a <data.frame> to overwrite other default settings. Note that default settings are still required.
Usage
update_secondary_args(data, priors, verbose = TRUE)
Arguments
data |
A list of data and arguments as returned by |
priors |
A |
verbose |
Logical, defaults to |
Value
A list as produced by create_stan_data().
Examples
priors <- data.frame(variable = "frac_obs", mean = 3, sd = 1)
data <- list(obs_scale_mean = 4, obs_scale_sd = 3)
update_secondary_args(data, priors)