The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Title: Quantify and Monetize the Burden of Disease Attributable to Exposure
Version: 0.2.1
Description: This R package has been developed with a focus on air pollution and noise but can applied to other exposures. The initial development has been funded by the European Union project BEST-COST. Disclaimer: It is work in progress and the developers are not liable for any calculation errors or inaccuracies resulting from the use of this package. References (in chronological order): WHO (2003a) "Assessing the environmental burden of disease at national and local levels" https://www.who.int/publications/i/item/9241546204 (accessed October 2025); WHO (2003b) "Comparative quantification of health risks: Conceptual framework and methodological issues" <doi:10.1186/1478-7954-1-1> (accessed October 2025); Miller & Hurley (2003) "Life table methods for quantitative impact assessments in chronic mortality" <doi:10.1136/jech.57.3.200> (accessed October 2025); Steenland & Armstrong (2006) "An Overview of Methods for Calculating the Burden of Disease Due to Specific Risk Factors" <doi:10.1097/01.ede.0000229155.05644.43> (accessed October 2025); Miller (2010) "Report on estimation of mortality impacts of particulate air pollution in London" https://cleanair.london/app/uploads/CAL-098-Mayors-health-study-report-June-2010-1.pdf (accessed October 2025); WHO (2011) "Burden of disease from environmental noise" https://iris.who.int/items/723ab97c-5c33-4e3b-8df1-744aa5bc1c27 (accessed October 2025); Jerrett et al. (2013) "Spatial Analysis of Air Pollution and Mortality in California" <doi:10.1164/rccm.201303-0609OC> (accessed October 2025); GBD 2019 Risk Factors Collaborators (2020) "Global burden of 87 risk factors in 204 countries and territories, 1990–2019" <doi:10.1016/S0140-6736(20)30752-2> (accessed October 2025); VanderWeele (2019) "Optimal Approximate Conversions of Odds Ratios and Hazard Ratios to Risk Ratios" <doi:10.1111/biom.13197> (accessed October 2025); WHO (2020) "Health impact assessment of air pollution: AirQ+ life table manual" https://iris.who.int/bitstream/handle/10665/337683/WHO-EURO-2020-1559-41310-56212-eng.pdf?sequence=1 (accessed October 2025); ETC HE (2022) "Health risk assessment of air pollution and the impact of the new WHO guidelines" https://www.eionet.europa.eu/etcs/all-etc-reports (accessed October 2025); Kim et al. (2022) "DALY Estimation Approaches: Understanding and Using the Incidence-based Approach and the Prevalence-based Approach" <doi:10.3961/jpmph.21.597> (accessed October 2025); Pozzer et al. (2022) "Mortality Attributable to Ambient Air Pollution: A Review of Global Estimates" <doi:10.1029/2022GH000711> (accessed October 2025); Teaching group in EBM (2022) "Evidence-based medicine research helper" https://ebm-helper.cn/en/Conv/HR_RR.html (accessed October 2025).
License: GPL (≥ 3)
Encoding: UTF-8
RoxygenNote: 7.3.3
Imports: dplyr, purrr, tibble, tidyr
Depends: R (≥ 4.2.0)
LazyData: true
Config/testthat/edition: 3
VignetteBuilder: knitr
Suggests: sf, terra, testthat, devtools, knitr, rmarkdown
NeedsCompilation: no
Packaged: 2025-11-06 12:44:41 UTC; castal
Author: Alberto Castro ORCID iD [cre, aut], Axel Luyten ORCID iD [aut], Arno Pauwels ORCID iD [ctb], Liliana Vazquez Fernandez ORCID iD [ctb], Vanessa Gorasso ORCID iD [ctb], Carl Michael Baravelli ORCID iD [ctb], Susanne Breitner ORCID iD [ctb], Maria Lepnurm ORCID iD [ctb], Maria Jose Rueda Lopez ORCID iD [ctb], Iracy Pimenta ORCID iD [ctb], Andreia Novais ORCID iD [ctb], Ana Barbosa ORCID iD [ctb], Joao Vasco Santos ORCID iD [ctb], Anette Kocbach Bolling ORCID iD [ctb]
Maintainer: Alberto Castro <alberto.castrofernandez@swisstph.ch>
Repository: CRAN
Date/Publication: 2025-11-11 10:00:09 UTC

Add meta-information to the data frame containing the input data

Description

This function adds meta-information of the input data within the data frame containing the input data.

Usage

add_info(df, info)

Arguments

df

Data frame containing the input data

info

String or Data frame with one row or Vector of length 1 showing additional information or id for the pollutant.

Value

This function returns a data frame with binding the input data with the info columns (info_ is added to the column names)

Author(s)

Alberto Castro & Axel Luyten


Attribute health impacts to an environmental stressor

Description

This function calculates the attributable health impacts (mortality or morbidity) due to exposure to an environmental stressor (air pollution or noise), using either relative risk (RR) or absolute risk (AR).

Arguments for both RR & AR pathways

Arguments only for RR pathways

Argument for AR pathways

Optional arguments for both RR & AR pathways

Usage

attribute_health(
  approach_risk = "relative_risk",
  exp_central,
  exp_lower = NULL,
  exp_upper = NULL,
  cutoff_central = 0,
  cutoff_lower = NULL,
  cutoff_upper = NULL,
  pop_exp = NULL,
  erf_eq_central = NULL,
  erf_eq_lower = NULL,
  erf_eq_upper = NULL,
  rr_central = NULL,
  rr_lower = NULL,
  rr_upper = NULL,
  rr_increment = NULL,
  erf_shape = NULL,
  bhd_central = NULL,
  bhd_lower = NULL,
  bhd_upper = NULL,
  prop_pop_exp = 1,
  geo_id_micro = "a",
  geo_id_macro = NULL,
  age_group = "all",
  sex = "all",
  dw_central = NULL,
  dw_lower = NULL,
  dw_upper = NULL,
  duration_central = NULL,
  duration_lower = NULL,
  duration_upper = NULL,
  info = NULL,
  population = NULL
)

Arguments

approach_risk

String value specifying the risk method. Options: "relative_risk" (default) or "absolute_risk".

exp_central, exp_lower, exp_upper

Numeric value or numeric vector specifying the exposure level(s) to the environmental stressor and (optionally) the corresponding lower and upper bound of the 95% confidence interval. See Details for more info.

cutoff_central, cutoff_lower, cutoff_upper

Numeric value specifying the exposure cut-off value and (optionally) the corresponding lower and upper 95% confidence interval bounds. Default: 0. See Details for more info.

pop_exp

Numeric vector specifying the absolute size of the population(s) exposed to each exposure category. See Details for more info. Only applicable in AR pathways; always required.

erf_eq_central, erf_eq_lower, erf_eq_upper

String or function specifying the exposure-response function and (optionally) the corresponding lower and upper 95% confidence interval functions. See Details for more info. Required in AR pathways; in RR pathways required only if rr_... argument(s) not specified.

rr_central, rr_lower, rr_upper

Numeric value specifying the central relative risk estimate and (optionally) the corresponding lower and upper 95% confidence interval bounds. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

rr_increment

Numeric value specifying the exposure increment for which the provided relative risk is valid. See Details for more info. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

erf_shape

String value specifying the exposure-response function shape to be assumed. Options (no default): "linear", log_linear", "linear_log", "log_log". Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

bhd_central, bhd_lower, bhd_upper

Numeric value or numeric vector providing the baseline health data of the health outcome of interest in the study population and (optionally) the corresponding lower bound and the upper 95% confidence interval bounds. See Details for more info. Only applicable in RR pathways; always required.

prop_pop_exp

Numeric value or numeric vector specifying the population fraction(s) exposed for each exposure (category). Default: 1. See Details for more info. Only applicable in RR pathways.

geo_id_micro, geo_id_macro

Numeric vector or string vector providing unique IDs of the geographic area considered in the assessment (geo_id_micro) and (optionally) providing higher-level IDs (geo_id_macro) to aggregate the geographic areas at. See Details for more info. Only applicable in assessments with multiple geographic units.

age_group

Numeric vector or string vector providing the age groups considered in the assessment. In case of use in attribute_lifetable)(), it must be a numeric and contain single year age groups. See Details for more info. Optional argument for attribute_health(); needed for attribute_lifetable().

sex

Numeric vector or string vector specifying the sex of the groups considered in the assessment.Optional argument.

dw_central, dw_lower, dw_upper

Numeric value or numeric vector providing the disability weight associated with the morbidity health outcome of interest and (optionally) the corresponding lower bound and the upper 95% confidence interval bounds. Only applicable in assessments of YLD (years lived with disability).

duration_central, duration_lower, duration_upper

Numeric value or numeric vector providing the duration associated with the morbidity health outcome of interest in years and (optionally) the corresponding lower and upper bounds of the 95% confidence interval. Default: 1. See Details for more info. Only applicable in assessments of YLD (years lived with disability).

info

String, data frame or tibble providing information about the assessment. See Details for more info. Optional argument.

population

Numeric vector For attribute_lifetable(), it is an obligatory argument specifying the mid-year populations per age (i.e. age group size = 1 year) for the (first) year of analysis. For attribute_health() it is an optional argument which specifies the population used to calculate attributable impacts rate per 100 000 population. See Details for more info.

Details

What you put in is what you get out

The health metric you put in (e.g. absolute disease cases, deaths per 100 000 population, DALYs, prevalence, incidence, ...) is the one you get out.

Exception: if you enter a disability weight (via the dw_... arguments) the attributable impact will be in YLD.

Function arguments

exp_central, exp_lower, exp_upper

In case of exposure bands enter only one exposure value per band (e.g. the means of the lower and upper bounds of the exposure bands).

cutoff_central, cutoff_lower, cutoff_upper

The cutoff level refers to the exposure level below which no health effects occur in the same unit as the exposure. If exposure categories are used, the length of this input must be the same as in the exp_... argument(s).

pop_exp

Only applicable in AR pathways; always required. In AR pathways the population exposed per exposure category is multiplied with the corresonding category-specific risk to obtain the absolute number of people affected by the health outcome.

erf_eq_central, erf_eq_lower, erf_eq_upper

Required in AR pathways; in RR pathways required only if rr_... arguments not specified. Enter the exposure-response function as a function, e.g. output from stats::splinefun() or stats::approxfun(), or as a string formula, e.g. "3+c+c^2" (with the c representing the concentration/exposure).

If you have x-axis (exposure) and y-axis (relative risk) value pairs of multiple points lying on the the exposure-response function, you could use e.g. stats::splinefun(x, y, method="natural") to derive the exposure-response function (in this example using a cubic spline natural interpolation).

rr_increment

Only applicable in RR pathways. Relative risks from the literature are valid for a specific increment in the exposure, in case of air pollution often 10 or 5 \mu g/m^3).

bhd_central, bhd_lower, bhd_upper

Only applicable in RR pathways. Baseline health data for each exposure category must be entered.

prop_pop_exp

Only applicable in RR pathways. In RR pathways indicates the fraction(s) (value(s) from 0 until and including 1) of the total population exposed to the exposure categories. See equation of the population attributable fraction for categorical exposure below.

geo_id_macro, geo_id_micro

Only applicable in assessments with multiple geographic units. For example, if you provide the names of the municipalities under analysis to geo_id_micro, you might provide to geo_id_macro the corresponding region / canton / province names. Consequently, the vectors fed to geo_id_micro and geo_id_macro must be of the same length.

age_group

Can be either numeric or character. If it is numeric, it refers to the first age of the age group. E.g. c(0, 40, 80) means age groups [0, 40), [40, 80), >=80].

info

Optional argument. Information entered to this argument will be added as column(s) names info_1, info_2, info_... to the results table. These additional columns can be used to further stratify the analysis in a secondary step (see example below).

population

Optional argument. The population entered here is used to determine impact rate per 100 000 population. Note the requirement for the vector length in the paragraph Assessment of multiple geographic units below.

duration_central, duration_lower, duration_upper

Only applicable in assessments of YLD (years lived with disability). If the value of duration_central is 1 year, it refers to the prevalence-based approach, while a value above 1 year to the incidence-based approach (Kim et al. 2022, https://doi.org/10.3961/jpmph.21.597).

Assessment of multiple geographic units

To assess the attributable health impact/burden across multiple geographic units with attribute_health(), you must specify the argument geo_id_micro and (optionally) geo_id_macro, in addition to the other required function arguments.

The length of the input vectors to the function arguments must be:

\text{length input vectors} = \text{number of geo units} \times \text{number of exposure categories}

( \times \text{number of age groups (if entered)} \times \text{number of sex groups (if entered) )}

I.e. there must be one line / observation for each specific combination of geo unit, exposure category, age and sex group.

Alternatively, for those arguments that are independent of location (e.g. approach_risk, rr_..., erf_shape, ...), you can enter a single value, which will be recycled to match the length of the other geo unit-specific input data. Additional categories can be passed on via the info argument.

Equations (relative risk)

The most general equation describing the population attributable fraction (PAF) mathematically is an integral form (GBD 2019 Risk Factors Collaborators 2020, https://doi.org/10.1016/S0140-6736(20)30752-2):

PAF = \frac{\int RR(x)PE(x)dx - 1}{\int RR(x)PE(x)dx}

Where:

x = exposure level

PE(x) = population distribution of exposure

RR(x) = relative risk at exposure level compared to the reference level

If the population exposure is described as a categorical rather than continuous exposure, the integrals in this equation may be converted to sums, resulting in the following equation for the PAF (WHO 2003a, https://www.who.int/publications/i/item/9241546204; WHO 2011, https://iris.who.int/items/723ab97c-5c33-4e3b-8df1-744aa5bc1c27):

PAF = \frac{\sum RR_i \times PE_i - 1}{\sum RR_i \times PE_i}

Where:

i = is the exposure category (e.g. in bins of 1 \mu g/m^3 PM2.5 or 5 dB noise exposure)

PE_i = fraction of population in exposure category i

RR_i = relative risk associated with the mean exposure level in exposure category i compared to the reference level

There is one alternative for the PAF for categorical exposure distribution that is commonly used, which is mathematically equivalent to the equation right above, meaning that numerical estimates based on these equations are identical (WHO 2003b, https://doi.org/10.1186/1478-7954-1-1; WHO 2011, https://iris.who.int/items/723ab97c-5c33-4e3b-8df1-744aa5bc1c27):

PAF = \frac{\sum PE_i(RR_i - 1)}{\sum PE_i(RR_i - 1) + 1}

Where:

i = is the exposure category (e.g. in bins of 1 \mu g/m^3 PM2.5 or 5 dB noise exposure)

PE_i = fraction of population in exposure category i

RR_i = relative risk associated with the mean exposure level in exposure category i compared to the reference level

Finally, if the exposure is provided as the population weighted mean concentration (PWC), the equation for the PAF is reduced to (ETC HE 2022, https://www.eionet.europa.eu/etcs/all-etc-reports:

PAF = \frac{RR_{PWC} - 1}{RR_{PWC}}

Where RR_{PWC} is the relative risk associated with the population weighted mean exposure.

Equation (absolute risk)

N = \sum AR_i\times PE_i

Where:

N = the number of cases of the exposure-specific health outcome that are attributed to the exposure

AR_i = absolute risk associated with the mean exposure level of exposure category i

PE_i = population exposed (absolute number) to exposure levels of exposure category i

Conversion of alternative risk measures to relative risks For conversion of hazard ratios and/or odds ratios to relative risks refer to VanderWeele 2019 (https://doi.org/10.1111/biom.13197) and/or use the conversion tools developed by the Teaching group in EBM in 2022 for hazard ratios (https://ebm-helper.cn/en/Conv/HR_RR.html) and/or odds ratios (https://ebm-helper.cn/en/Conv/OR_RR.html).

Value

This function returns a list containing:

1) health_main (tibble) containing the main results;

2) health_detailed (list) containing detailed (and interim) results.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: attribute lung cancer cases to population-weighted PM2.5 exposure
# using relative risk

results <- attribute_health(
  erf_shape = "log_linear",
  rr_central = 1.369,            # Central relative risk estimate
  rr_increment = 10,             # per \mu g / m^3 increase in PM2.5 exposure
  exp_central = 8.85,            # Central exposure estimate in \mu g / m^3
  cutoff_central = 5,            # \mu g / m^3
  bhd_central = 30747            # Baseline health data: lung cancer incidence
 )

results$health_main$impact_rounded # Attributable cases

# Goal: attribute cases of high annoyance to (road traffic) noise exposure
# using absolute risk

results <- attribute_health(
  approach_risk = "absolute_risk",
  exp_central = c(57.5, 62.5, 67.5, 72.5, 77.5),
  pop_exp = c(387500, 286000, 191800, 72200, 7700),
  erf_eq_central = "78.9270-3.1162*c+0.0342*c^2"
)

results$health_main$impact_rounded # Attributable high annoyance cases

# Goal: attribute disease cases to PM2.5 exposure in multiple geographic
# units, such as municipalities, provinces, countries, ...

results <- attribute_health(
  geo_id_micro = c("Zurich", "Basel", "Geneva", "Ticino"),
  geo_id_macro = c("Ger","Ger","Fra","Ita"),
  rr_central = 1.369,
  rr_increment = 10,
  cutoff_central = 5,
  erf_shape = "log_linear",
  exp_central = c(11, 11, 10, 8),
  bhd_central = c(4000, 2500, 3000, 1500)
)

# Attributable cases (aggregated)
results$health_main$impact_rounded

# Attributable cases (disaggregated)
results$health_detailed$results_raw |> dplyr::select(
  geo_id_micro,
  geo_id_macro,
  impact_rounded
)

# Goal: determine attributable YLD (years lived with disability)
results  <- attribute_health(
  exp_central = 8.85,
  prop_pop_exp = 1,
  cutoff_central = 5,
  bhd_central = 1000,
  rr_central = 1.1,
  rr_increment = 10,
  erf_shape = "log_linear",
  duration_central = 100,
  dw_central = 1,
  info = "pm2.5_yld"
)

results$health_main$impact

# Goal: determine mean attributable health impacts by education level
info <- data.frame(
  education = rep(c("secondary", "bachelor", "master"), each = 5) # education level
)
output_attribute <- attribute_health(
  rr_central = 1.063,
  rr_increment = 10,
  erf_shape = "log_linear",
  cutoff_central =  0,
  exp_central = sample(6:10, 15, replace = TRUE),
  bhd_central = sample(100:500, 15, replace = TRUE),
  geo_id_micro = c(1:nrow(info)), # a vector of (random) unique IDs must be entered
  info = info
)
output_stratified <- output_attribute$health_detailed$results_raw |>
 dplyr::group_by(info_column_1) |>
 dplyr::summarize(mean_impact = mean(impact)) |>
 print()

Attribute premature deaths or YLL to an environmental stressor using a life table approach

Description

This function assesses premature deaths or years of life lost (YLL) attributable to exposure to an environmental stressor using a life table approach.

Usage

attribute_lifetable(
  age_group,
  sex,
  bhd_central,
  bhd_lower = NULL,
  bhd_upper = NULL,
  population,
  health_outcome = NULL,
  min_age = NULL,
  max_age = NULL,
  approach_exposure = "single_year",
  approach_newborns = "without_newborns",
  year_of_analysis,
  time_horizon = NULL,
  exp_central = NULL,
  exp_lower = NULL,
  exp_upper = NULL,
  cutoff_central = 0,
  cutoff_lower = NULL,
  cutoff_upper = NULL,
  erf_eq_central = NULL,
  erf_eq_lower = NULL,
  erf_eq_upper = NULL,
  rr_central = NULL,
  rr_lower = NULL,
  rr_upper = NULL,
  rr_increment = NULL,
  erf_shape = NULL,
  prop_pop_exp = 1,
  geo_id_micro = "a",
  geo_id_macro = NULL,
  info = NULL
)

Arguments

age_group

Numeric vector or string vector providing the age groups considered in the assessment. In case of use in attribute_lifetable)(), it must be a numeric and contain single year age groups. See Details for more info. Optional argument for attribute_health(); needed for attribute_lifetable().

sex

Numeric vector or string vector specifying the sex of the groups considered in the assessment.Optional argument.

bhd_central, bhd_lower, bhd_upper

Numeric value or numeric vector providing the baseline health data of the health outcome of interest in the study population and (optionally) the corresponding lower bound and the upper 95% confidence interval bounds. See Details for more info. Only applicable in RR pathways; always required.

population

Numeric vector For attribute_lifetable(), it is an obligatory argument specifying the mid-year populations per age (i.e. age group size = 1 year) for the (first) year of analysis. For attribute_health() it is an optional argument which specifies the population used to calculate attributable impacts rate per 100 000 population. See Details for more info.

health_outcome

String specifying the desired result of the life table assessment. Options: "deaths" (premature deaths), "yll" (years of life lost).

min_age, max_age

Numberic value specifying the minimum and maximum age for which the exposure will affect the exposed population, respectively. Default min_age: 30. Default max_age: none. See Details for more info.

approach_exposure

String specifying whether exposure is constant or only in one year. Options: "single_year" (default), "constant".

approach_newborns

String specifying whether newborns are to be considered in the years after the year of analysis or not. Options: "without_newborns" (default), "with_newborns". See Details for more info.

year_of_analysis

Numeric value providing the first with exposure to the environmental stressor.

time_horizon

Numeric value specifying the time horizon (number of years) for which the attributable YLL or premature deaths are to be considered. See Details for more info. Optional argument.

exp_central, exp_lower, exp_upper

Numeric value or numeric vector specifying the exposure level(s) to the environmental stressor and (optionally) the corresponding lower and upper bound of the 95% confidence interval. See Details for more info.

cutoff_central, cutoff_lower, cutoff_upper

Numeric value specifying the exposure cut-off value and (optionally) the corresponding lower and upper 95% confidence interval bounds. Default: 0. See Details for more info.

erf_eq_central, erf_eq_lower, erf_eq_upper

String or function specifying the exposure-response function and (optionally) the corresponding lower and upper 95% confidence interval functions. See Details for more info. Required in AR pathways; in RR pathways required only if rr_... argument(s) not specified.

rr_central, rr_lower, rr_upper

Numeric value specifying the central relative risk estimate and (optionally) the corresponding lower and upper 95% confidence interval bounds. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

rr_increment

Numeric value specifying the exposure increment for which the provided relative risk is valid. See Details for more info. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

erf_shape

String value specifying the exposure-response function shape to be assumed. Options (no default): "linear", log_linear", "linear_log", "log_log". Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

prop_pop_exp

Numeric value or numeric vector specifying the population fraction(s) exposed for each exposure (category). Default: 1. See Details for more info. Only applicable in RR pathways.

geo_id_micro, geo_id_macro

Numeric vector or string vector providing unique IDs of the geographic area considered in the assessment (geo_id_micro) and (optionally) providing higher-level IDs (geo_id_macro) to aggregate the geographic areas at. See Details for more info. Only applicable in assessments with multiple geographic units.

info

String, data frame or tibble providing information about the assessment. See Details for more info. Optional argument.

Details

Function arguments

age_group

The numeric values must refer to 1 year age groups, e.g. c(0:99). To convert multi-year/larger age groups to 1 year age groups use the function prepare_lifetable() (see its function documentation for more info).

bhd_central,bhd_lower,bhd_upper

Deaths per age must be inputted with 1 value per age (i.e. age group size = 1 year). There must be greater than or equal to 1 deaths per age to avoid issues during the calculation of survival probabilities.

population

The population data must be inputted with 1 value per age (i.e. age group size = 1 year). The values must be greater than or equal to 1 per age to avoid issues during the calculation of survival probabilities.

Mid-year population of year x can be approximated as the mean of either end-year populations of years x-1 and x or start-of-year populations of years x and x+1. For each age, the inputted values must be greater than or equal to 1 to avoid issues during the calculation of survival probabilities.

approach_newborns

If "with_newborns" is selected, it is assumed that for each year after the year of analysis n babies (population aged 0) are born.

time_horizon

Applicable for the following cases: #'

For example, if 10 is entered one is interested in the impacts of exposure during the year of analysis and the next 9 years (= 10 years in total). Default value: length of the numeric vector specified in the age_group argument.

min_age, max_age The min_age default value 30 implies that all adults aged 30 or older will be affected by the exposure; max_age analogeously specifies the age above which no health effects of the exposure are considered.

Conversion of multi-year to single year age groups

To convert multi-year/larger age groups to 1 year age groups use the function prepare_lifetable() and see its function documentation for more info.

Life table methodology

The life table methodology of attribute_lifetable() follows that of the WHO tool AirQ+, and is described in more detail by Miller & Hurley (2003, https://doi.org/10.1136/jech.57.3.200).

In short, two scenarios are compared: 1) a scenario with the exposure level specified in the function ("exposed scenario") and 2) a scenario with no exposure ("unexposed scenario"). First, the entry and mid-year populations of the (first) year of analysis in the unexposed scenario is determined using modified survival probabilities. Second, age-specific population projections using scenario-specific survival probabilities are done for both scenarios. Third, by subtracting the populations in the unexposed scenario from the populations in the exposed scenario the premature deaths/years of life lost attributable to the exposure are determined.

An expansive life table case study by Miller (2010) is available here: https://cleanair.london/app/uploads/CAL-098-Mayors-health-study-report-June-2010-1.pdf.

Determination of populations in the (first) year of analysis

The entry (i.e. start of year) populations in both scenarios is determined as follows:

entry\_population_{year_1} = midyear\_population_{year_1} + \frac{deaths_{year_1}}{2}

Exposed scenario The survival probabilities in the exposed scenario from start of year i to start of year i+1 are calculated as follows:

prob\_survival = \frac{midyear\_population_i - \frac{deaths_i}{2}}{midyear\_population_i + \frac{deaths_i}{2}}

Analogously, the probability of survival from start of year i to mid-year i:

prob\_survival\_until\_midyear = 1 - \frac{1 - prob\_survival}{2}

Unexposed scenario The survival probabilities in the unexposed scenario are calculated as follows:

First, the age-group specific hazard rate in the exposed scenario is calculated using the inputted age-specific mid-year populations and deaths.

hazard\_rate = \frac{deaths}{mid\_year\_population}

Second, the hazard rate is multiplied with the modification factor (= 1 - PAF) to obtain the age-specific hazard rate in the unexposed scenario.

hazard\_rate\_mod = hazard\_rate \times modification\_factor

Third, the the age-specific survival probabilities (from the start until the end in a given age group) in the unexposed scenario are calculated as follows (cf. Miller & Hurley 2003):

prob\_survival\_mod = \frac{2-hazard\_rate\_mod}{2+hazard\_rate\_mod}

Then the mid-year populations of the (first) year of analysis (year_1) in the unexposed scenario are determined as follows:

First, the survival probabilities from start of year i to mid-year i in the unexposed scenario is calculated as:

prob\_survival\_until\_midyear\_{mod} = 1 - \frac{1 - prob\_survival\_mod}{2}

Second, the mid-year populations of the (first) year of analysis (year_1) in the unexposed scenario is calculated:

midyear\_population\_unexposed_{year_1} = entry\_population_{year_1} \times prob\_survival\_until\_midyear_{mod}

Population projection

Using the age group-specific and scenario-specific survival probabilities calculated above, future populations of each age-group under each scenario are calculated.

Unexposed scenario The entry and mid-year population projections of in the exposed scenario is done as follows:

First, the entry population of year i+1 is calculated (which is the same as the end of year population of year i) by multiplying the entry population of year i and the modified survival probabilities.

entry\_population_{i+1} = entry\_population_i \times prob\_survival\_mod

Second, the mid-year population of year i+1 is calculated.

midyear\_population_{i+1} = entry\_population_{i+1} \times prob\_survival\_until\_midyear

Exposed scenario The population projections for the two possible options of approach_exposure ("single_year" and "constant") for the unexposed scenario are different. In the case of "single_year" exposure, the population projection for the years after the year of exposure is the same as in the unexposed scenario.

In the case of "constant" the population projection is done as follows:

First, the entry population of year i+1 is calculated (which is the same as the end of year population of year i) using the entry population of year i.

entry\_population_{i+1} = entry\_population_i \times prob\_survival

Second, the mid-year population of year i+1 is calculated.

midyear\_population_{i+1} = entry\_population_{i+1} \times prob\_survival\_until\_midyear

Conversion of alternative risk measures to relative risks

For conversion of hazard ratios and/or odds ratios to relative risks refer to https://doi.org/10.1111/biom.13197 and/or use the conversion tool for hazard ratios (https://ebm-helper.cn/en/Conv/HR_RR.html) and/or odds ratios (https://ebm-helper.cn/en/Conv/OR_RR.html).

Value

This function returns a list containing:

1) health_main (tibble) containing the main results;

2) health_detailed (list) containing detailed (and interim) results.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: determine YLL attributable to air pollution exposure during one year
# using the life table approach
results <- attribute_lifetable(
  health_outcome = "yll",
  approach_exposure = "single_year",
  approach_newborns = "without_newborns",
  exp_central = 8.85,
  prop_pop_exp = 1,
  cutoff_central = 5,
  rr_central =  1.118,
  rr_increment = 10,
  erf_shape = "log_linear",
  age_group = exdat_lifetable$age_group,
  sex = exdat_lifetable$sex,
  bhd_central = exdat_lifetable$deaths,
  population = exdat_lifetable$midyear_population,
  year_of_analysis = 2019,
  min_age = 20
)
results$health_main$impact # Attributable YLL

# Goal: determine attributable premature deaths due to air pollution exposure
# during one year using the life table approach
results_pm_deaths <- attribute_lifetable(
  health_outcome = "deaths",
  approach_exposure = "single_year",
  exp_central = 8.85,
  prop_pop_exp = 1,
  cutoff_central = 5,
  rr_central =  1.118,
  rr_increment = 10,
  erf_shape = "log_linear",
  age_group = exdat_lifetable$age_group,
  sex = exdat_lifetable$sex,
  bhd_central = exdat_lifetable$deaths,
  population = exdat_lifetable$midyear_population,
  year_of_analysis = 2019,
  min_age = 20
)
results_pm_deaths$health_main$impact # Attributable premature deaths

# Goal: determine YLL attributable to air pollution exposure (exposure distribution)
# during one year using the life table approach
results <- attribute_lifetable(
  health_outcome = "yll",
  exp_central = rep(c(8, 9, 10), each = 100*2), # each = length of sex or age_group vector
  prop_pop_exp = rep(c(0.2, 0.3, 0.5), each = 100*2), # each = length of sex or age_group vector
  cutoff_central = 5,
  rr_central = 1.118,
  rr_lower = 1.06,
  rr_upper = 1.179,
  rr_increment = 10,
  erf_shape = "log_linear",
  age_group = rep(
    exdat_lifetable$age_group,
    times = 3), # times = number of exposure categories
  sex = rep(
    exdat_lifetable$sex,
    times = 3), # times = number of exposure categories
  population = rep(
    exdat_lifetable$midyear_population,
    times = 3), # times = number of exposure categories
  bhd_central = rep(
    exdat_lifetable$deaths,
    times = 3), # times = number of exposure categories
  year_of_analysis = 2019,
  min_age = 20
)
results$health_main$impact_rounded # Attributable YLL

Attributabe health impact to an environmental stressor

Description

This INTERNAL function calculates the health impacts, mortality or morbidity, of an environmental stressor using a single value for baseline heath data, i.e. without life table.

Usage

attribute_master(
  approach_risk = NULL,
  exp_central,
  exp_lower = NULL,
  exp_upper = NULL,
  cutoff_central = NULL,
  cutoff_lower = NULL,
  cutoff_upper = NULL,
  pop_exp = NULL,
  erf_eq_central = NULL,
  erf_eq_lower = NULL,
  erf_eq_upper = NULL,
  rr_central = NULL,
  rr_lower = NULL,
  rr_upper = NULL,
  rr_increment = NULL,
  erf_shape = NULL,
  bhd_central = NULL,
  bhd_lower = NULL,
  bhd_upper = NULL,
  prop_pop_exp = NULL,
  geo_id_micro = NULL,
  geo_id_macro = NULL,
  age_group = "all",
  sex = "all",
  population = NULL,
  info = NULL,
  dw_central = NULL,
  dw_lower = NULL,
  dw_upper = NULL,
  duration_central = NULL,
  duration_lower = NULL,
  duration_upper = NULL,
  is_lifetable = NULL,
  health_outcome = NULL,
  min_age = NULL,
  max_age = NULL,
  approach_newborns = NULL,
  approach_exposure = NULL,
  year_of_analysis = NULL,
  time_horizon = NULL,
  input_args = NULL
)

Arguments

approach_risk

String value specifying the risk method. Options: "relative_risk" (default) or "absolute_risk".

exp_central, exp_lower, exp_upper

Numeric value or numeric vector specifying the exposure level(s) to the environmental stressor and (optionally) the corresponding lower and upper bound of the 95% confidence interval. See Details for more info.

cutoff_central, cutoff_lower, cutoff_upper

Numeric value specifying the exposure cut-off value and (optionally) the corresponding lower and upper 95% confidence interval bounds. Default: 0. See Details for more info.

pop_exp

Numeric vector specifying the absolute size of the population(s) exposed to each exposure category. See Details for more info. Only applicable in AR pathways; always required.

erf_eq_central, erf_eq_lower, erf_eq_upper

String or function specifying the exposure-response function and (optionally) the corresponding lower and upper 95% confidence interval functions. See Details for more info. Required in AR pathways; in RR pathways required only if rr_... argument(s) not specified.

rr_central, rr_lower, rr_upper

Numeric value specifying the central relative risk estimate and (optionally) the corresponding lower and upper 95% confidence interval bounds. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

rr_increment

Numeric value specifying the exposure increment for which the provided relative risk is valid. See Details for more info. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

erf_shape

String value specifying the exposure-response function shape to be assumed. Options (no default): "linear", log_linear", "linear_log", "log_log". Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

bhd_central, bhd_lower, bhd_upper

Numeric value or numeric vector providing the baseline health data of the health outcome of interest in the study population and (optionally) the corresponding lower bound and the upper 95% confidence interval bounds. See Details for more info. Only applicable in RR pathways; always required.

prop_pop_exp

Numeric value or numeric vector specifying the population fraction(s) exposed for each exposure (category). Default: 1. See Details for more info. Only applicable in RR pathways.

geo_id_micro, geo_id_macro

Numeric vector or string vector providing unique IDs of the geographic area considered in the assessment (geo_id_micro) and (optionally) providing higher-level IDs (geo_id_macro) to aggregate the geographic areas at. See Details for more info. Only applicable in assessments with multiple geographic units.

age_group

Numeric vector or string vector providing the age groups considered in the assessment. In case of use in attribute_lifetable)(), it must be a numeric and contain single year age groups. See Details for more info. Optional argument for attribute_health(); needed for attribute_lifetable().

sex

Numeric vector or string vector specifying the sex of the groups considered in the assessment.Optional argument.

population

Numeric vector For attribute_lifetable(), it is an obligatory argument specifying the mid-year populations per age (i.e. age group size = 1 year) for the (first) year of analysis. For attribute_health() it is an optional argument which specifies the population used to calculate attributable impacts rate per 100 000 population. See Details for more info.

info

String, data frame or tibble providing information about the assessment. See Details for more info. Optional argument.

dw_central, dw_lower, dw_upper

Numeric value or numeric vector providing the disability weight associated with the morbidity health outcome of interest and (optionally) the corresponding lower bound and the upper 95% confidence interval bounds. Only applicable in assessments of YLD (years lived with disability).

duration_central, duration_lower, duration_upper

Numeric value or numeric vector providing the duration associated with the morbidity health outcome of interest in years and (optionally) the corresponding lower and upper bounds of the 95% confidence interval. Default: 1. See Details for more info. Only applicable in assessments of YLD (years lived with disability).

is_lifetable

Boolean INTERNAL argument specifying if the life table approach is applied (TRUE) or not (FALSE)

health_outcome

String specifying the desired result of the life table assessment. Options: "deaths" (premature deaths), "yll" (years of life lost).

min_age, max_age

Numberic value specifying the minimum and maximum age for which the exposure will affect the exposed population, respectively. Default min_age: 30. Default max_age: none. See Details for more info.

approach_newborns

String specifying whether newborns are to be considered in the years after the year of analysis or not. Options: "without_newborns" (default), "with_newborns". See Details for more info.

approach_exposure

String specifying whether exposure is constant or only in one year. Options: "single_year" (default), "constant".

year_of_analysis

Numeric value providing the first with exposure to the environmental stressor.

time_horizon

Numeric value specifying the time horizon (number of years) for which the attributable YLL or premature deaths are to be considered. See Details for more info. Optional argument.

Value

This function returns a list containing:

1) health_main (tibble) containing the main results;

2) health_detailed (list) containing detailed (and interim) results.

Author(s)

Alberto Castro & Axel Luyten


Create a scenario 2 by modifying an existing scenario 1 and determine attributable health impacts in it

Description

This function assesses the attributable health impacts in a new scenario 2 which is obtained by modifying an existing scenario 1. Supply an existing attribute output and specify how scenario 1 should be modified to create scenario 2.

Usage

attribute_mod(
  output_attribute,
  erf_shape = NULL,
  rr_central = NULL,
  rr_lower = NULL,
  rr_upper = NULL,
  rr_increment = NULL,
  erf_eq_central = NULL,
  erf_eq_lower = NULL,
  erf_eq_upper = NULL,
  exp_central = NULL,
  exp_lower = NULL,
  exp_upper = NULL,
  prop_pop_exp = NULL,
  pop_exp = NULL,
  cutoff_central = NULL,
  cutoff_lower = NULL,
  cutoff_upper = NULL,
  bhd_central = NULL,
  bhd_lower = NULL,
  bhd_upper = NULL,
  geo_id_micro = NULL,
  geo_id_macro = NULL,
  age_group = NULL,
  sex = NULL,
  population = NULL,
  info = NULL,
  min_age = NULL,
  max_age = NULL,
  approach_exposure = NULL,
  approach_newborns = NULL,
  year_of_analysis = NULL
)

Arguments

output_attribute

List containing the output of the function attribute() for scenario 1.

erf_shape

String value specifying the exposure-response function shape to be assumed. Options (no default): "linear", log_linear", "linear_log", "log_log". Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

rr_central, rr_lower, rr_upper

Numeric value specifying the central relative risk estimate and (optionally) the corresponding lower and upper 95% confidence interval bounds. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

rr_increment

Numeric value specifying the exposure increment for which the provided relative risk is valid. See Details for more info. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

erf_eq_central, erf_eq_lower, erf_eq_upper

String or function specifying the exposure-response function and (optionally) the corresponding lower and upper 95% confidence interval functions. See Details for more info. Required in AR pathways; in RR pathways required only if rr_... argument(s) not specified.

exp_central, exp_lower, exp_upper

Numeric value or numeric vector specifying the exposure level(s) to the environmental stressor and (optionally) the corresponding lower and upper bound of the 95% confidence interval. See Details for more info.

prop_pop_exp

Numeric value or numeric vector specifying the population fraction(s) exposed for each exposure (category). Default: 1. See Details for more info. Only applicable in RR pathways.

pop_exp

Numeric vector specifying the absolute size of the population(s) exposed to each exposure category. See Details for more info. Only applicable in AR pathways; always required.

cutoff_central, cutoff_lower, cutoff_upper

Numeric value specifying the exposure cut-off value and (optionally) the corresponding lower and upper 95% confidence interval bounds. Default: 0. See Details for more info.

bhd_central, bhd_lower, bhd_upper

Numeric value or numeric vector providing the baseline health data of the health outcome of interest in the study population and (optionally) the corresponding lower bound and the upper 95% confidence interval bounds. See Details for more info. Only applicable in RR pathways; always required.

geo_id_micro, geo_id_macro

Numeric vector or string vector providing unique IDs of the geographic area considered in the assessment (geo_id_micro) and (optionally) providing higher-level IDs (geo_id_macro) to aggregate the geographic areas at. See Details for more info. Only applicable in assessments with multiple geographic units.

age_group

Numeric vector or string vector providing the age groups considered in the assessment. In case of use in attribute_lifetable)(), it must be a numeric and contain single year age groups. See Details for more info. Optional argument for attribute_health(); needed for attribute_lifetable().

sex

Numeric vector or string vector specifying the sex of the groups considered in the assessment.Optional argument.

population

Numeric vector For attribute_lifetable(), it is an obligatory argument specifying the mid-year populations per age (i.e. age group size = 1 year) for the (first) year of analysis. For attribute_health() it is an optional argument which specifies the population used to calculate attributable impacts rate per 100 000 population. See Details for more info.

info

String, data frame or tibble providing information about the assessment. See Details for more info. Optional argument.

min_age, max_age

Numberic value specifying the minimum and maximum age for which the exposure will affect the exposed population, respectively. Default min_age: 30. Default max_age: none. See Details for more info.

approach_exposure

String specifying whether exposure is constant or only in one year. Options: "single_year" (default), "constant".

approach_newborns

String specifying whether newborns are to be considered in the years after the year of analysis or not. Options: "without_newborns" (default), "with_newborns". See Details for more info.

year_of_analysis

Numeric value providing the first with exposure to the environmental stressor.

Details

Please see the function documentation of attribute_health for the methods used.

Value

This function returns a list containing:

1) health_main (tibble) containing the main results;

2) health_detailed (list) containing detailed (and interim) results.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: adjust an existing healthiar scenario and determine the health
# impacts in the modified scenario

## First create a scenario to be modified
scenario_A <- attribute_health(
  exp_central = 8.85,   # EXPOSURE 1
  cutoff_central = 5,
  bhd_central = 25000,
  approach_risk = "relative_risk",
  erf_shape = "log_linear",
  rr_central = 1.118,
  rr_increment = 10
)

scenario_A$health_main$impact # Attributable impact in scenario A

## Modify scenario (adjust exposure value)
scenario_B <- attribute_mod(
  output_attribute = scenario_A,
  exp_central = 6       # EXPOSURE 2
)

scenario_B$health_main$impact # Attributable impact in scenario B

Cost-benefit analysis

Description

This function performs a cost-benefit analysis

Usage

cba(
  output_attribute = NULL,
  impact_benefit = NULL,
  valuation,
  cost,
  discount_rate_benefit = NULL,
  discount_rate_cost = NULL,
  inflation_rate = NULL,
  discount_shape = "exponential",
  n_years_benefit = 1,
  n_years_cost = 1
)

Arguments

output_attribute

List produced by healthiar::attribute() or healthiar::compare() as results.

impact_benefit

Numeric value referring to the positive health impact as result of a reduction of harmful exposure.

valuation

Numberic value referring to unit value of a health impact.

cost

Numeric value referring to the investment cost to achieve the reduction of exposure.

discount_rate_benefit, discount_rate_cost

Numeric value referring to the the discount rate used in the benefit and the cost side (respectively). Their values determine the approach of cost-benefit analysis: direct approach (if the same discount_rate is used for cost and benefit) and indirect approach (different discount rates).

inflation_rate

Numeric value between 0 and 1 referring to the annual inflation (increase of prices). Only to be entered if nominal (not real) discount rate is entered in the function. Default value = NULL (assuming no nominal discount rate).

discount_shape

String referring to the assumed equation for the discount factor. By default: "exponential". Otherwise: "hyperbolic_harvey_1986" or "hyperbolic_mazur_1987".

n_years_benefit, n_years_cost

Numeric value referring to number of years in the future to be considered in the benefit and cost side (respectively). Years for discounting and/or inflation. Be aware that the year 0 (without discounting/inflation, i.e. the present) is not be counted here. If a vector is entered in the argument impact, n_years does not need to be entered (length of impact = n_years + 1)

Details

Equation cost-benefit analysis

net\_benefit = benefit - cost

cost\_benefit\_ratio = \frac{benefit}{cost}

return\_on\_investment = \frac{benefit - cost}{cost } \times 100

For the equations regarding the monetization of the cost and the benefit please see the function documentation of monetize().

Value

This function returns a list containing:

1) cba_main (tibble) containing the main CBA results;

2) cba_detailed (list) containing detailed (and interim) results.

If the argument output_attribute was specified, then the two results elements are added to the existing output.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: performs a cost-benefit analysis using an existing output
# of a attribute_... function

output_attribute <- attribute_health(
  erf_shape = "log_linear",
  rr_central = 1.369,
  rr_increment = 10,
  exp_central = 8.85,
  cutoff_central = 5,
  bhd_central = 30747
)

results <- cba(
  output_attribute = output_attribute,
  valuation = 50000,
  cost = 100000000,
  discount_shape = "exponential",
  discount_rate_benefit = 0.03,
  discount_rate_cost = 0.03,
  n_years_benefit = 5,
  n_years_cost = 5
)

results$cba_main |>
  dplyr::select(benefit, cost, net_benefit)

Check if arguments are identical before stop

Description

This function checks if two different sets of arguments are identical

Usage

check_if_args_identical(args_a, args_b, names_to_check)

Arguments

args_a

List first list of arguments

args_b

List second list of arguments

names_to_check

Vector with the names of arguments to be checked

Author(s)

Alberto Castro & Axel Luyten


Collapse rows by grouping columns

Description

This function paste rows with different values and keep only unique rows following grouping columns

Usage

collapse_df_by_group(
  df,
  group_col_names,
  multi_value_col_names = NULL,
  ci_col_names = NULL,
  only_unique_rows = TRUE
)

Arguments

df

Data frame or tibble containing the data

group_col_names

String vector containing the column names in df that serve as grouping columns.

multi_value_col_names

String vector containing the columns names in df that do not have a unique value (but different values).

ci_col_names

String vector containing the column names in df that refer to confidence interval bounds (suffix _ci).

only_unique_rows

Boolean that determines if the final data frame or tibble must only keep unique rows

Value

This function returns a data frame or tibble with lower number rows after collapsing

Author(s)

Alberto Castro & Axel Luyten


Compare the attributable health impacts between two scenarios

Description

This function calculates the health impacts between two scenarios (e.g. before and after a intervention in a health impact assessments) using either the delta or pif approach.

Usage

compare(
  output_attribute_scen_1,
  output_attribute_scen_2,
  approach_comparison = "delta"
)

Arguments

output_attribute_scen_1

Scenario 1 as in the output of attribute()

output_attribute_scen_2

Scenario 2 as in the output of attribute()

approach_comparison

String showing the method of comparison. Options: "delta" or "pif".

Details

Note that the PIF comparison approach assumes same baseline health data for scenario 1 and 2 (e.g. comparison of two scenarios at the same time).

Equations population impact fraction (PIF)

The Population Impact Fraction (PIF) is defined as the proportional change in disease or mortality when exposure to a risk factor is changed (for instance due to an intervention). The most general equation describing this mathematically is an integral form (WHO 2003a, https://www.who.int/publications/i/item/9241546204; WHO 2003b, https://doi.org/10.1186/1478-7954-1-1):

PIF = \frac{\int RR(x)PE(x)dx - \int RR(x)PE'(x)dx}{\int RR(x)PE(x)dx}

Where:

x = exposure level

PE(x) = population distribution of exposure

PE'(x) = alternative population distribution of exposure

RR(x) = relative risk at exposure level compared to the reference level

If the population exposure is described as a categorical rather than continuous exposure, the integrals in equation (5) may be converted to sums, resulting in the following equations for the PIF (WHO 2003a, https://www.who.int/publications/i/item/9241546204; WHO 2003b, https://doi.org/10.1186/1478-7954-1-1):

PIF = \frac{\sum RR_{i} \times PE_{i} - \sum RR_{i}PE'_{i}}{\sum RR_{i}PE_{i}}

Where:

i = is the exposure category (e.g. in bins of 1 \mu g/m^3 PM2.5 or 5 dB noise exposure)

PE_i = fraction of population in exposure category i

PE'_i = fraction of population in category i for alternative (ideal) exposure scenario

RR_i = relative risk for exposure category level i compared to the reference level

Finally, if the exposure is provided as the population weighted mean concentration (PWC), the equation for the PIF is reduced to:

PIF = \frac{RR_{PWC} - RR_{alt PWC}}{RR_{PWC}}

Where:

RR_{PWC} = relative risk associated with the population weighted mean exposure

RR_{PWC} = relative risk associated with the population weighted mean for the alternative exposure scenario

Delta comparison approach

With the delta comparison the difference between two scenarios is obtained by subtraction. The delta approach is suited for all comparison cases, and specifically for comparison of a situation now with a situation in the future.

Value

This function returns a list containing:

1) health_main (tibble) containing the main results from the comparison;

2) health_detailed (list) containing detailed (and interim) results from the comparison.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: comparison of two scenarios with delta approach
scenario_A <- attribute_health(
  exp_central = 8.85,   # EXPOSURE 1
  cutoff_central = 5,
  bhd_central = 25000,
  approach_risk = "relative_risk",
  erf_shape = "log_linear",
  rr_central = 1.118,
  rr_increment = 10
)
scenario_B <- attribute_health(
  exp_central = 6,     # EXPOSURE 2
  cutoff_central = 5,
  bhd_central = 25000,
  approach_risk = "relative_risk",
  erf_shape = "log_linear",
  rr_central = 1.118,
  rr_increment = 10
)
results <- compare(
approach_comparison = "delta",
output_attribute_scen_1 = scenario_A,
output_attribute_scen_2 = scenario_B
)
# Inspect the difference, stored in the \code{impact} column
results$health_main |>
  dplyr::select(impact, impact_scen_1, impact_scen_2) |>
  print()

# Goal: comparison of two scenarios with population impact fraction (pif) approach
output_attribute_scen_1 <- attribute_health(
  exp_central = 8.85,   # EXPOSURE 1
  cutoff_central = 5,
  bhd_central = 25000,
  approach_risk = "relative_risk",
  erf_shape = "log_linear",
  rr_central = 1.118, rr_lower = 1.060, rr_upper = 1.179,
  rr_increment = 10
)
output_attribute_scen_2 <- attribute_health(
  exp_central = 6,      # EXPOSURE 2
  cutoff_central = 5,
  bhd_central = 25000,
  approach_risk = "relative_risk",
  erf_shape = "log_linear",
  rr_central = 1.118, rr_lower = 1.060, rr_upper = 1.179,
  rr_increment = 10
)
results <- compare(
  output_attribute_scen_1 = output_attribute_scen_1,
  output_attribute_scen_2 = output_attribute_scen_2,
  approach_comparison = "pif"
)
# Inspect the difference, stored in the impact column
results$health_main$impact

Compile input

Description

This function compiles the input data of the main function and calculates the population attributable fraction based on the input data (all in one data frame)

Usage

compile_input(input_args, is_lifetable)

Arguments

input_args

List with all input data by argument

is_lifetable

Boolean INTERNAL argument specifying if the life table approach is applied (TRUE) or not (FALSE)

Value

This function returns a data.frame with all input data together Moreover, the data frame includes columns such as:

Author(s)

Alberto Castro & Axel Luyten


Attributable disability-adjusted life years

Description

This function calculates the disability-adjusted life years (DALY) attributable to the exposure to an environmental stressor by adding the two DALY components YLL and YLD.

Usage

daly(output_attribute_yll, output_attribute_yld)

Arguments

output_attribute_yll, output_attribute_yld

variable containing YLL or YLD results of a attribute_...() function call, respectively.

Value

This function returns a list containing:

1) health_main (tibble) containing the main results;

2) health_detailed (list) containing detailed (and interim) results.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: obtain DALY (disability-adjusted life years) from two existing \code{attribute_...} outputs
# Step 1: Create YLL (years of life lost) assessment
results_yll <- attribute_lifetable(
  health_outcome = "yll",
  approach_exposure = "single_year",
  approach_newborns = "without_newborns",
  exp_central = 8.85,
  prop_pop_exp = 1,
  cutoff_central = 5,
  rr_central =  1.118,
  rr_increment = 10,
  erf_shape = "log_linear",
  age_group = exdat_lifetable$age_group,
  sex = exdat_lifetable$sex,
  bhd_central = exdat_lifetable$deaths,
  population = exdat_lifetable$midyear_population,
  year_of_analysis = 2019,
  min_age = 20
)
# Step 2: Create YLD (years lived with disability) assessment
results_yld  <- attribute_health(
  exp_central = 8.85,
  prop_pop_exp = 1,
  cutoff_central = 5,
  bhd_central = 1000,
  rr_central = 1.1,
  rr_increment = 10,
  erf_shape = "log_linear",
  duration_central = 100,
  dw_central = 0.5,
  info = "pm2.5_yld"
)
# Step 3: obtain DALY
results <- daly(
  output_attribute_yll = results_yll,
  output_attribute_yld = results_yld
)
# Attributable impact in DALY
results$health_main |>
  dplyr::select(impact, impact_yll, impact_yld)

Discount health impacts

Description

This function calculates discounted health impacts (without valuation).

Usage

discount(
  output_attribute = NULL,
  impact = NULL,
  discount_rate = NULL,
  n_years = 1,
  discount_shape = NULL,
  inflation_rate = NULL
)

Arguments

output_attribute

List produced by healthiar::attribute() or healthiar::compare() as results.

impact

Numberic value referring to the health impacts to be monetized (without attribute function). If a Numberic vector is entered multiple assessments (by year) will be carried out. Be aware that the value for year 0 (current) must be entered, while n_years does not include the year 0. Thus, length of impact = n_years + 1.

discount_rate

Numeric value showing the discount rate for future years. If it is a nominal discount rate, no inflation is to be entered. If it is a real discount rate, the result can be adjusted by entering inflation in this function.

n_years

Numeric value referring to number of years in the future to be considered in the discounting and/or inflation. Be aware that the year 0 (without discounting/inflation, i.e. the present) is not be counted here. If a vector is entered in the argument impact, n_years does not need to be entered (length of impact = n_years + 1).

discount_shape

String referring to the assumed equation for the discount factor. By default: "exponential". Otherwise: "hyperbolic_harvey_1986" or "hyperbolic_mazur_1987".

inflation_rate

Numeric value between 0 and 1 referring to the annual inflation (increase of prices). Only to be entered if nominal (not real) discount rate is entered in the function. Default value = NULL (assuming no nominal discount rate).

Value

This function returns a list containing:

1) monetization_main (tibble) containing the main monetized results;

2) monetization_detailed (list) containing detailed (and interim) results.

If the argument output_attribute was specified, then the two results elements are added to the existing output.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: discount attributable health impacts
results <- discount(
  impact = 20000,
  discount_shape = "exponential",
  discount_rate = 0.03,
  n_years = 20
)
results$monetization_main$monetized_impact

PM2.5 exposure and COPD incidence in Switzerland

Description

This tibble contains PM2.5 exposure and COPD incidence data from Switzerland.

Usage

data(exdat_cantons)

Format

exdat_cantons

year

year

canton

abbreviation of Swiss cantons

lung_cancer_incidence

lung cancer incidence

exposure

mean country-wide population-weighted exposure level

pollutant

PM2.5

exposure_type

exposure type

population

number of inhabitants per canton

rr

central relative risk estimate

rr_l

lower 95% confidence interval bound of the relative risk estimate

rr_u

upper 95% confidence interval bound of the relative risk estimate

increment

exposure increment in \mu g/m^3 for which the relative risk estimates are valid

function_shape

shape of the exposure-response function

cutoff

cutoff level below which no health effects are attributable to the exposure

language_main

language spoken by the majority of inhabitants in the canton

canton_long

full (English) name of the canton

Author(s)

Alberto Castro & Axel Luyten

Source

Real-world data


Population data per age and sex in Switzerland

Description

This tibble contains population per age and sex for Switzerland.

Usage

data(exdat_lifetable)

Format

exdat_lifetable

age_group

single year age groups

sex

female or male

midyear_population

mid-year populations

deaths

annual deaths

Author(s)

Alberto Castro & Axel Luyten

Source

Real-world data


Noise exposure in urban and rural regions in Norway

Description

This tibble contains noise exposure data from urban and rural regions in Norway.

Usage

data(exdat_noise)

Format

exdat_noise

exposure_category

noise exposure range of the exposure category

exposure_mean

mean noise exposure in the exposure category

region

region for which exposure is valid

exposed

number of exposed persons

Author(s)

Anette Kocbach Bolling & Vázquez Fernández

Source

Real-world data


PM2.5 exposure and COPD incidence in Switzerland

Description

This tibble contains modelled ozone (O_3) exposure and chronic obstructive pulmonary disease (COPD) incidence data from the Germany in 2016.

Usage

data(exdat_ozone)

Format

exdat_ozone

pollutant

O_3

exposure

mean exposure level in the exposure category

exp_unit

unit of the exposure

proportion_population_exposed

proportion of the total population exposed to each exposure category

mortality_copd_tota_yearl

mortality due to chronic obstructive pulmonary disease (ICD-10 J40-44)

rr_central

central relative risk estimate

rr_lower

lower 95% confidence interval bound of the relative risk estimate

rr_upper

upper 95% confidence interval bound of the relative risk estimate

rr_increment

exposure increment in \mu g/m^3 for which the relative risk estimates are valid

cutoff

cutoff level below which no health effects are attributable to the exposure

erf_shape

shape of the exposure-response function

exposure_type

exposure type

rr_source

source of the relative risk estimates

country

country

year

year of the data

Author(s)

Alberto Castro & Axel Luyten

Source

Real-world data


PM2.5 exposure and COPD incidence in Switzerland

Description

This tibble contains PM2.5 exposure and COPD incidence data from Switzerland.

Usage

data(exdat_pm)

Format

exdat_pm

year_of_analysis

year that the exposure and incidence data is from

mean_concentration

population-weighted annual mean concentration

relative_risk

central relative risk estimate

relative_risk_lower

lower 95% confidence interval bound of the relative risk estimate

relative_risk_upper

upper 95% confidence interval bound of the relative risk estimate

erf_shape

shape of the exposure-response function

incidence

COPD incidence in the year of analysis

cutoff_value

cut-off value

rr_increment

exposure increment in \mu g/m^3 for which the relative risk estimates are valid

rr_source

source of the relative risk

rr_doi

DOI linking to the publication from which the relative risk was taken

Author(s)

Alberto Castro & Axel Luyten

Source

Real-world data


Social indicators of the BEST-COST Multidimensional Deprivation Index (MDI)

Description

This tibble contains social indicators of the BEST-COST Multidimensional Deprivation Index (MDI) of geo units in Belgium.

Usage

data(exdat_prepare_mdi)

Format

exdat_prepare_mdi

id

id of the geographic unit

geo_name

name of the geographic unit

edu, unemployed, single_parent, no_heating, pop_change

single social indicators that make up the MDI

norm_...

normalized single social indicators of the MDI

MDI

BEST-COST Multidimensional Deprivation Index (MDI)

MDI_decile

decile of the MDI rankig

MDI_quartile

quartile of the MDI ranking

Author(s)

Arno Pauwels & Vanessa Gorasso

Source

Real-world data


Municipalities in Belgium ranked by BEST-COST Multidimensional Deprivation Index (MDI)

Description

This tibble contains data for municipalities in Belgium ranked by BEST-COST Multidimensional Deprivation Index (MDI).

Usage

data(exdat_socialize)

Format

exdat_socialize

CS01012020

unique identifier of the geographic unit

NUTS1

NUTS1 region tag

PM25_MEAN

mean PM2.5 exposure

RR

relative risk estimate from the literature

score

BEST-COST Multidimensional Deprivation Index (MDI)

rank

rank of the observation based on column score; note that the rank is not continuous, as some observations are missing

deciles

deciles of the geo units based on the MDI

POPULATION_below_40

(fake) populations up until and including 39 years of age

POPULATION_40_plus

(fake) populations from 40 years of age onwards

MORTALITY_below_40

(fake) mortality up until and including 39 years of age

MORTALITY_40_plus

(fake) mortality from 40 years of age onwards

Author(s)

Arno Pauwels & Vanessa Gorasso

Source

Real-world data combined with fake populatoin and mortality data


Find joining columns

Description

This function finds columns in two data frames that have same name and identical values (excluding exceptions).- The resulting joining columns are used to dplyr::x_join data frames and add the vector to by=.

Usage

find_joining_columns(df_1, df_2, except = NULL)

Arguments

df_1

First Data frame

df_2

Second Data frame

except

Vector of strings showing columns that have to be excluded although they fulfill the inclusion criteria (same name and value)

Value

This function returns a vector of strings

Author(s)

Alberto Castro & Axel Luyten


Find columns with multiple values

Description

This function find data frame or tibble column names with different values in their rows (i.e. not a unique value)

Usage

find_multi_value_col_names(df, group_col_names = NULL)

Arguments

df

Data frame or tibble containing the data

group_col_names

String vector that refers to the column names in df that serve as grouping columns.

Value

This function returns a string vector with the names of the columns with multiple values

Author(s)

Alberto Castro & Axel Luyten


Get discount factor

Description

This function calculates the discount factor based on discount rate. If the argument inflation_rate is NULL (default), it is assumed that the discount rate is already corrected for inflation). Otherwise (if a value for inflation_rate is entered), the resulted discount factor is adjusted for inflation.

Usage

get_discount_factor(
  discount_rate,
  n_years,
  discount_shape = "exponential",
  inflation_rate = NULL
)

Arguments

discount_rate

Numeric value showing the discount rate for future years. If it is a nominal discount rate, no inflation is to be entered. If it is a real discount rate, the result can be adjusted by entering inflation in this function.

n_years

Numeric value referring to number of years in the future to be considered in the discounting and/or inflation. Be aware that the year 0 (without discounting/inflation, i.e. the present) is not be counted here. If a vector is entered in the argument impact, n_years does not need to be entered (length of impact = n_years + 1).

discount_shape

String referring to the assumed equation for the discount factor. By default: "exponential". Otherwise: "hyperbolic_harvey_1986" or "hyperbolic_mazur_1987".

inflation_rate

Numeric value between 0 and 1 referring to the annual inflation (increase of prices). Only to be entered if nominal (not real) discount rate is entered in the function. Default value = NULL (assuming no nominal discount rate).

Details

Equations discount factors (without inflation)

Exponential discounting (no inflation)

discount\_factor = \frac{1}{(1 + discount\_rate) ^{n\_years}}

Hyperbolic discounting Harvey (no inflation)

discount\_factor = \frac{1}{(1 + n\_years)^{discount\_rate}}

Hyperbolic discounting Mazure (no inflation)

discount\_factor = \frac{1}{(1 + (discount\_rate \times n\_years)}

Equations discount factors with inflation

Exponential discounting (with inflation)

discount\_and\_inflation\_factor = \frac{1}{((1 + discount\_rate) \times (1 + inflation\_rate)) ^{n\_years}}

Hyperbolic discounting Harvey (with inflation)

discount\_and\_inflation\_factor = \frac{1}{(1 + n\_years)^{discount\_rate} \times (1 + inflation\_rate)^{n\_years}}

Hyperbolic discounting Mazure (with inflation)

discount\_and\_inflation\_factor = \frac{1}{(1 + (discount\_rate \times n\_years) \times (1 + inflation\_rate)^{n\_years}}

Value

This function returns the numeric discount factor.

Author(s)

Alberto Castro & Axel Luyten

Examples

get_discount_factor(
  discount_rate = 0.07,
  n_years = 5
 )

Attributable health cases based on relative risk

Description

This function calculates the health impacts for each uncertainty and geo area.

Usage

get_impact(input_table, pop_fraction_type)

Arguments

input_table

Data frame containing all input data.

Value

TBD. E.g. This function returns a data.frame with one row for each value of the concentration-response function (i.e. central, lower and upper bound confidence interval. Moreover, the data frame includes columns such as:

Author(s)

Alberto Castro & Axel Luyten


Get population impact over time

Description

Get population impact over time

Usage

get_impact_with_lifetable(input_with_risk_and_pop_fraction)

Arguments

input_with_risk_and_pop_fraction

Data frame with the input data (including risk and population fraction)

Value

This function returns a data.frame with one row for each value of the concentration-response function (i.e. central estimate, lower and upper bound confidence interval). Moreover, the data frame include columns such as:

Author(s)

Alberto Castro & Axel Luyten


Get inflation factor

Description

This function calculates the inflation factor based on inflation rate.

Usage

get_inflation_factor(n_years, inflation_rate = NULL)

Arguments

n_years

Numeric value referring to number of years in the future to be considered in the discounting and/or inflation. Be aware that the year 0 (without discounting/inflation, i.e. the present) is not be counted here. If a vector is entered in the argument impact, n_years does not need to be entered (length of impact = n_years + 1).

inflation_rate

Numeric value between 0 and 1 referring to the annual inflation (increase of prices). Only to be entered if nominal (not real) discount rate is entered in the function. Default value = NULL (assuming no nominal discount rate).

Details

Equation inflation factor (without discounting)

inflation\_factor = (1 + inflation\_rate)^{n\_years}

Value

This function returns the numeric inflation factor.

Author(s)

Alberto Castro & Axel Luyten

Examples

get_inflation_factor(
  inflation_rate = 0.02,
  n_years = 5
)

get_input_args

Description

This function compiles the input data of the main function and calculates the population attributable fraction based on the input data (all in one data frame)

Usage

get_input_args(environment, call)

Value

This function returns a list with all input data together by argument

Author(s)

Alberto Castro & Axel Luyten


Obtain and store output

Description

This function distributes and store outputs by level of detail by aggregating or filtering impacts.

Usage

get_output(
  input_args = NULL,
  input_table = NULL,
  intermediate_calculations = NULL,
  results_raw
)

Arguments

input_args

List containingall arguments and values entered in attribute().

input_table

Tibble containing the input_table data compiled and packed in a data frame.

intermediate_calculations

Tibble containing intermediate calculations (e.g. from life table pathway).

results_raw

Tibble containing all the calculation of health impacts.

Value

TBD. E.g. This function returns a data.frame with one row for each value of the concentration-response function (i.e. central, lower and upper bound confidence interval. Moreover, the data frame includes columns such as:

Author(s)

Alberto Castro & Axel Luyten


Get population attributable fraction

Description

This function calculates the population attributable fraction (PAF) of a health outcome due to exposure to an environmental stressor

Usage

get_paf(rr_at_exp, prop_pop_exp)

Arguments

rr_at_exp

Numerical value Risk estimate of the concentration response function for a specific concentration. The population attributable fraction is normally calculated using the risk estimate that refers to the concentration that reflects the population exposure and the cut-off. This risk estimate is obtained after re-scaling from the epidemiological study with a particular increment (e.g. for PM2.5 10 or 5 ug/m3) to the aimed concentration.

prop_pop_exp

Numeric value or numeric vector specifying the population fraction(s) exposed for each exposure (category). Default: 1. See Details for more info. Only applicable in RR pathways.

Details

For more information about the equations used by get_paf please see the function documentation of attribute_health.

Value

This function returns the population attributable fraction as a numeric value.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: calculate PAF based on RR and the proportion of population exposed
get_paf(rr = 1.062, prop_pop_exp = 1)

Get population impact fraction

Description

This function calculates the population impact fraction of a health outcome due to exposure to an environmental stressor

Usage

get_pif(rr_at_exp_1, rr_at_exp_2, prop_pop_exp_1, prop_pop_exp_2)

Arguments

rr_at_exp_1

Numerical value showing the risk estimate of the concentration response function for a specific concentration in the scenario 1. The population attributable fraction is normally calculated using the risk estimate that refers to the concentration that reflects the population exposure and the cut-off. This risk estimate is obtained after re-scaling from the epidemiological study with a particular increment (e.g. for PM2.5 10 or 5 ug/m3) to the aimed concentration.

rr_at_exp_2

Numerical value showing the risk estimate of the concentration response function for a specific concentration in the scenario 2. The population attributable fraction is normally calculated using the risk estimate that refers to the concentration that reflects the population exposure and the cut-off. This risk estimate is obtained after re-scaling from the epidemiological study with a particular increment (e.g. for PM2.5 10 or 5 ug/m3) to the aimed concentration.

prop_pop_exp_1

Numerical value showing the fraction ([0,1]) of population exposed to the environmental stressor in the scenario 1. Per default = 1 (i.e. 100% of population is exposed).

prop_pop_exp_2

Numerical value showing the fraction ([0,1]) of population exposed to the environmental stressor in the scenario 1. Per default = 1 (i.e. 100% of population is exposed).

Details

For more information about the equations used by get_pif please see the function documentation of compare.

Value

This function returns the population impact fraction as a numeric value.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: calculate the population impact fraction (PIF)
results <- get_pif(
  rr_at_exp_1 = 1.043879,
  rr_at_exp_2 = 1.011217,
  prop_pop_exp_1 = 1,
  prop_pop_exp_2 = 1
)
print(results)

Get population attributable or impact fraction

Description

This function calculates the population impact fraction of a health outcome due to exposure to an environmental stressor

Usage

get_pop_fraction(rr_at_exp_1, rr_at_exp_2, prop_pop_exp_1, prop_pop_exp_2)

Arguments

rr_at_exp_1

Numerical value showing the risk estimate of the concentration response function for a specific concentration in the scenario 1. The population attributable fraction is normally calculated using the risk estimate that refers to the concentration that reflects the population exposure and the cut-off. This risk estimate is obtained after re-scaling from the epidemiological study with a particular increment (e.g. for PM2.5 10 or 5 ug/m3) to the aimed concentration.

rr_at_exp_2

Numerical value showing the risk estimate of the concentration response function for a specific concentration in the scenario 2. The population attributable fraction is normally calculated using the risk estimate that refers to the concentration that reflects the population exposure and the cut-off. This risk estimate is obtained after re-scaling from the epidemiological study with a particular increment (e.g. for PM2.5 10 or 5 ug/m3) to the aimed concentration.

prop_pop_exp_1

Numerical value showing the fraction ([0,1]) of population exposed to the environmental stressor in the scenario 1. Per default = 1 (i.e. 100% of population is exposed).

prop_pop_exp_2

Numerical value showing the fraction ([0,1]) of population exposed to the environmental stressor in the scenario 1. Per default = 1 (i.e. 100% of population is exposed).

Details

For more information about the equations used please see the function documentation of attribute_health.

Value

This function returns a value corresponding to the population attributable fraction

Author(s)

Alberto Castro & Axel Luyten


Calculates reference proportion of population

Description

This function calculates reference proportion of population. To be used in socialize() and standardize() in case that ref_prop_pop is not provided.

Usage

get_ref_prop_pop(df)

Arguments

df

Data frame or tibble with the data by geo_id_micro and age_group including a column for population

Value

A tibble with the columns

Author(s)

Alberto Castro & Axel Luyten


Get the relative risk of an exposure level

Description

This function re-scales the relative risk from the increment value in the epidemiological study (e.g. for PM2.5 10 or 5 ug/m3) to the actual population exposure

Usage

get_risk(
  erf_shape = NULL,
  rr = NULL,
  rr_increment = NULL,
  erf_eq = NULL,
  cutoff = 0,
  exp
)

Arguments

erf_shape

String value specifying the exposure-response function shape to be assumed. Options (no default): "linear", log_linear", "linear_log", "log_log". Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

rr

Numeric vector containing the relative risk. The data frame must contain the central estimate as well as the lower and upper bound of the exposure-response function.

rr_increment

Numeric value specifying the exposure increment for which the provided relative risk is valid. See Details for more info. Only applicable in RR pathways; not required if erf_eq_... argument(s) already specified.

erf_eq

Equation of the user-defined exposure-response function that puts the relative risk (y) in relation with exposure (x). If the function is provided as string, it can only contains one variable: x (exposure). E.g. "3+x+x^2". If the function is provided as a function, the object should have a function class. If only the values of the x-axis (exposure) and y axis (relative risk) of the dots in the exposure-response function are available, a cubic spline natural interpolation can be assumed to get the function using, e.g., stats::splinefun(x, y, method="natural")

cutoff

Numeric value showing the cut-off exposure level in ug/m3 (i.e. the exposure level below which no health effects occur).

exp

Population exposure to the stressor (e.g. annual population-weighted mean).

Details

Equations for scaling of relative risk

linear ERF

rr\_at\_exp = 1 + \frac{(rr - 1)}{rr\_increment} \cdot (exp - cutoff)

log-linear ERF

rr\_at\_exp = e^{\frac{\log(\mathrm{rr})}{\mathrm{rr\_increment}} \cdot (\mathrm{exp} - \mathrm{cutoff})}

log-log ERF

rr\_at\_exp = (\frac{exp + 1}{cutoff + 1})^{\frac{\log(\mathrm{rr})}{\log(\mathrm{rr\_increment + cutoff + 1}) - \log(cutoff + 1)}}

linear-log ERF

rr\_at\_exp = 1 + \frac{\log(\mathrm{rr - 1})}{\log(\mathrm{rr\_increment + cutoff + 1}) - \log(cutoff + 1)} \cdot \frac{\log(exp + 1)}{\log(cutoff + 1)}

Sources

For the log-linear, log-log and linear-log exposure-response function equations see Pozzer et al. 2022 (https://doi.org/10.1029/2022GH000711).

Value

This function returns the numeric relative risk(s) at the specified exposure level(s), referred to as rr_at_exp in the equations above.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: scale relative risk to observed exposure level
get_risk(
  rr = 1.05,
  rr_increment = 10,
  erf_shape = "linear",
  exp = 10,
  cutoff = 5
)

Get input data and PAF

Description

This function calculates the population attributable fraction (PAF) based on the input data and puts the results in additional columns joined to the input data frame.

Usage

get_risk_and_pop_fraction(input_table, pop_fraction_type)

Arguments

input_table

Data frame with the input data

pop_fraction_type

String indicating the type of the population fraction. Options: "paf" or "pif"

Value

This function returns a data.frame with the input data adding a column for the population attributable fraction Moreover, the data frame includes columns such as:

Author(s)

Alberto Castro & Axel Luyten


Monetize health impacts

Description

This function monetizes health impacts

Usage

monetize(
  output_attribute = NULL,
  impact = NULL,
  valuation,
  discount_rate = NULL,
  discount_shape = "exponential",
  n_years = 0,
  inflation_rate = NULL,
  info = NULL
)

Arguments

output_attribute

List produced by healthiar::attribute() or healthiar::compare() as results.

impact

Numberic value referring to the health impacts to be monetized (without attribute function). If a Numberic vector is entered multiple assessments (by year) will be carried out. Be aware that the value for year 0 (current) must be entered, while n_years does not include the year 0. Thus, length of impact = n_years + 1.

valuation

Numberic value referring to unit value of a health impact.

discount_rate

Numeric value showing the discount rate for future years. If it is a nominal discount rate, no inflation is to be entered. If it is a real discount rate, the result can be adjusted by entering inflation in this function.

discount_shape

String referring to the assumed equation for the discount factor. By default: "exponential". Otherwise: "hyperbolic_harvey_1986" or "hyperbolic_mazur_1987".

n_years

Numeric value referring to number of years in the future to be considered in the discounting and/or inflation. Be aware that the year 0 (without discounting/inflation, i.e. the present) is not be counted here. If a vector is entered in the argument impact, n_years does not need to be entered (length of impact = n_years + 1).

inflation_rate

Numeric value between 0 and 1 referring to the annual inflation (increase of prices). Only to be entered if nominal (not real) discount rate is entered in the function. Default value = NULL (assuming no nominal discount rate).

info

String, data frame or tibble providing information about the assessment. Only attached if impact is entered by the users. If output_attribute is entered, use info in that function or add the column manually. Optional argument.

Details

Equation inflation factor (without discounting)

inflation\_factor = (1 + inflation\_rate)^{n\_years}

Equations discount factors (without inflation)

Exponential discounting (no inflation)

discount\_factor = \frac{1}{(1 + discount\_rate) ^{n\_years}}

Hyperbolic discounting Harvey (no inflation)

discount\_factor = \frac{1}{(1 + n\_years)^{discount\_rate}}

Hyperbolic discounting Mazure (no inflation)

discount\_factor = \frac{1}{(1 + (discount\_rate \times n\_years)}

Equations discount factors with inflation

Exponential discounting (with inflation)

discount\_and\_inflation\_factor = \frac{1}{((1 + discount\_rate) \times (1 + inflation\_rate)) ^{n\_years}}

Hyperbolic discounting Harvey (with inflation)

discount\_and\_inflation\_factor = \frac{1}{(1 + n\_years)^{discount\_rate} \times (1 + inflation\_rate)^{n\_years}}

Hyperbolic discounting Mazure (with inflation)

discount\_and\_inflation\_factor = \frac{1}{(1 + (discount\_rate \times n\_years) \times (1 + inflation\_rate)^{n\_years}}

Value

This function returns a list containing:

1) monetization_main (tibble) containing the main monetized results;

2) monetization_detailed (list) containing detailed (and interim) results.

If the argument output_attribute was specified, then the two results elements are added to the existing output.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: monetize the attributable impacts of an existing healthiar
# assessment
output_attribute <- attribute_health(
erf_shape = "log_linear",
rr_central = exdat_pm$relative_risk,
rr_increment = 10,
exp_central = exdat_pm$mean_concentration,
cutoff_central = exdat_pm$cut_off_value,
bhd_central = exdat_pm$incidence
)

results <- monetize(
  output_attribute = output_attribute,
  discount_shape = "exponential",
  discount_rate = 0.03,
  n_years = 5,
  valuation = 50000 # E.g. EURO
)

# Attributable COPD cases its monetized impact
results$monetization_main |>
  dplyr::select(impact, monetized_impact)

Aggregate health impacts from multiple exposures

Description

This function aggregates health impacts from multiple exposures to environmental stressors.

Usage

multiexpose(
  output_attribute_exp_1,
  output_attribute_exp_2,
  exp_name_1,
  exp_name_2,
  approach_multiexposure = "additive"
)

Arguments

output_attribute_exp_1, output_attribute_exp_2

Output of attribute() for exposure 1 and 2, respectively. Baseline health data and population must be identical in outputs 1 and 2.

exp_name_1, exp_name_2

String referring to the name of the environmental exposures 1 and 2

approach_multiexposure

String specifying the multiple exposures approach to be used in the assessment. Options: "additive" (default), "multiplicative" or "combined".

Details

Sources

For more information on the additive and combined approaches see Steenland & Armstrong 2006 (https://doi.org/10.1097/01.ede.0000229155.05644.43).

For more information on the multiplicative approach see Jerrett et al. 2013 (https://doi.org/10.1164/rccm.201303-0609OC).

Value

This function returns a list containing:

1) health_main (tibble) containing the main results;

2) health_detailed (list) containing detailed (and interim) results.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: determine aggregated health impacts from multiple exposures
# Step 1: create assessment with exposure 1
output_attribute_exp_1 <- attribute_health(
  erf_shape = "log_linear",
  rr_central = 1.369,
  rr_increment = 10,
  exp_central = 8.85,
  cutoff_central = 5,
  bhd_central = 30747
)
output_attribute_exp_1$health_main$impact
# Step 2: create assessment with exposure 2
output_attribute_exp_2 <- attribute_mod(
  output_attribute = output_attribute_exp_1,
  exp_central = 10.9,
  rr_central = 1.031
)
output_attribute_exp_2$health_main$impact
# Step 3: aggregate impacts of the two assessments
results <- multiexpose(
  output_attribute_exp_1 = output_attribute_exp_1,
  output_attribute_exp_2 = output_attribute_exp_2,
  exp_name_1 = "pm2.5",
  exp_name_2 = "no2",
  approach_multiexposure = "multiplicative"
)
results$health_main$impact

Prepare exposure data

Description

This function prepares tabular population exposure data compatible with the attribute() and compare() functions, based on gridded pollution concentration data and vector data representing geographic units. The function calculates an average concentration value in each geographic unit, weighted by the fraction of the population in each sub-unit.

Usage

prepare_exposure(poll_grid, geo_units, population, geo_id_macro)

Arguments

poll_grid

SpatRaster of the pollution concentration data.

geo_units

sf of the geographic sub-units.

population

Numeric vector containing the total population number in each geographic sub-unit.

geo_id_macro

Numeric or string vector containing the higher-level IDs of the geographic units the sub-unit belong to and will be aggregated at.

Value

This function returns a list containing:

1) main (tibble) containing the main results as vectors;

2) detailed (list) containing detailed (and interim) results.

Author(s)

Arno Pauwels

Examples

# Goal: determine population-weighted mean PM2.5 exposure for several
# neighborhoods of Brussels (Belgium)

exdat_pwm_1 <- terra::rast(system.file("extdata", "exdat_pwm_1.tif", package = "healthiar"))
exdat_pwm_2 <- sf::st_read(
    system.file("extdata", "exdat_pwm_2.gpkg", package = "healthiar"),
    quiet = TRUE
)

pwm <- prepare_exposure(
  poll_grid = exdat_pwm_1, # Formal class SpatRaster
  geo_units = exdat_pwm_2, # sf of the geographic sub-units
  population = sf::st_drop_geometry(exdat_pwm_2$population), # population per geographic sub-unit
  geo_id_macro = sf::st_drop_geometry(exdat_pwm_2$region) # higher-level IDs to aggregate at
)

pwm$main # population-weighted mean exposures for the (higher-level) geographic units

Convert multi-year life table to single year life table

Description

This function determines populations and deaths by one year age groups.

Usage

prepare_lifetable(age_group, population, bhd)

Arguments

age_group

Numeric vector referring to the first years of the age groups. E.g. c(0, 20, 40, 60) means [0, 20), [20, 40), [40, 60), [60, )

population

Numeric vector referring to mid-year populations by age group.

bhd

Numeric vector referring to the baseline health data (deaths) by age group.

Details

The conversion follows the methodology of the WHO tool which is outlined in WHO 2020 (https://iris.who.int/bitstream/handle/10665/337683/WHO-EURO-2020-1559-41310-56212-eng.pdf?sequence=1).

See the AirQ+ manual "Health impact assessment of air pollution: AirQ+ life table manual" for guidance on how to convert larger age groups to 1 year age groups (section "Estimation of yearly values"): https://iris.who.int/bitstream/handle/10665/337683/WHO-EURO-2020-1559-41310-56212-eng.pdf (accessed April 2025)

Value

This function returns a tibble containing the columns:

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: Convert 5-year population and death data into single year lifetable
results <- prepare_lifetable(
  age_group = c(0, 5, 10, 15),
  population = c(3387900, 3401300, 3212300, 3026100),
  bhd = c(4727, 472, 557, 1323)
)

Create the BEST-COST Multidimensional Deprivation Index (MDI)

Description

This function creates the BEST-COST Multidimensional Deprivation Index (MDI) and checks internal consistency of the single deprivation indicators using Cronbach's coefficient \alpha and other internal consistency checks

Usage

prepare_mdi(
  geo_id_micro,
  edu,
  unemployed,
  single_parent,
  pop_change,
  no_heating,
  n_quantile,
  verbose = TRUE
)

Arguments

geo_id_micro

Numeric vector or string vector specifying the unique ID codes of each geographic area considered in the assessment (geo_id_micro) Argument must be entered for iterations. See Details for more info.

edu

Numeric vector indicating educational attainment as % of individuals (at the age 18 or older) without a high school diploma (ISCED 0-2) per geo unit

unemployed

Numeric vector containing % of unemployed individuals in the active population (18-65) per geo unit

single_parent

Numeric vector containing single-parent households as % of total households headed by a single parent per geo unit

pop_change

Numeric vector containing population change as % change in population over the previous 5 years (e.g., 2017-2021) per geo unit

no_heating

Numeric vector containing % of households without central heating per geo unit

n_quantile

Integer value specifying the number of quantiles in the analysis.

verbose

Boolean indicating whether function output is printed to console. Default: TRUE.

Details

The function outputs Cronbach's \alpha.

\alpha \geq 0.9

Excellent reliability

0.8 \leq \alpha < 0.9

Good reliability

0.7 \leq \alpha < 0.8

Acceptable reliability

0.6 \leq \alpha < 0.7

Questionable reliability

\alpha < 0.6

Poor reliability

Data completeness and imputation: ensure the dataset is as complete as possible. You can try to impute missing data:

Imputation models should have an R^2 greater than or equal to 0.7. If R^2 lower than 0.7, consider alternative data sources or methods.

See the example below for how to reproduce the boxplots and the histogram after the 'prepare_mdi' function call.

Value

This function returns a list containing 1) mdi_main (tibble) with the columns (selection);

2) mdi_detailed (list) with several elements for the internal consistency check of the BEST-COST Multidimensional Deprivation Index.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: create the BEST-COST Multidimensional Deprivation Index for
# a selection of geographic units

results <- prepare_mdi(
  geo_id_micro = exdat_prepare_mdi$id,
  edu = exdat_prepare_mdi$edu,
  unemployed = exdat_prepare_mdi$unemployed,
  single_parent = exdat_prepare_mdi$single_parent,
  pop_change = exdat_prepare_mdi$pop_change,
  no_heating = exdat_prepare_mdi$no_heating,
  n_quantile = 10,
  verbose = TRUE
)

results$mdi_main |>
  dplyr::select(geo_id_micro, MDI, MDI_index) |>
  dplyr::slice(1:15)

# Reproduce plots after the function call
eval(results$mdi_detailed$boxplot)
eval(results$mdi_detailed$histogram)

Consider socio-economic aspects in healthiar assessments

Description

This function considers socio-economic aspects (e.g. multiple deprivation index) in the attributable health impacts. If nothing is entered in the argument output_attribute, it is assumed that all data come from a table and the argument refer to the columns of that table.

Usage

socialize(
  output_attribute = NULL,
  age_group,
  geo_id_micro,
  social_indicator = NULL,
  increasing_deprivation = TRUE,
  n_quantile = NULL,
  social_quantile = NULL,
  population = NULL,
  ref_prop_pop = NULL,
  impact = NULL,
  exp = NULL,
  bhd = NULL,
  pop_fraction = NULL
)

Arguments

output_attribute

List containing the outputs of the healthiar::attribute_health() assessments for each age group (each list element should be an age group-specific assessment).

age_group

String vector with the age groups included in the age standardization. The vector refers to age-dependent data in this function and to output_attribute (if provided).

geo_id_micro

Numeric vector or string vector specifying the unique ID codes of each geographic area considered in the assessment (geo_id_micro) Argument must be entered for iterations. See Details for more info.

social_indicator

Numeric vector showing the social indicator used for the analysis, e.g. a deprivation score (indicator of economic wealth) for each geographic unit. Based on this and n_quantile, social_quantile will be calculated.

increasing_deprivation

Boolean variable (TRUE/FALSE) specifying whether an increase in social_indicator corresponds to an increase (TRUE) or decrease FALSE in deprivation. Default: TRUE.

n_quantile

Integer value specifying the number of quantiles in the analysis.

social_quantile

Integer vector showing the values from 1 to the number of quantiles assigned to each geographic unit. Either enter social_indicator and n_quantile or social_quantile

population

Numeric vector specifying the population by age group and geographic unit.

ref_prop_pop

Numeric vector specifying with the reference proportion of population for each age group. If this argument is empty, the proportion of population by age group in the provided data will be used.

impact

(only if output_attribute not specified) Numeric vector containing the attributable health impacts by both age group and geo id.

exp

(only if output_attribute not specified) Numeric vector specifying the exposure level(s) to the environmental stressor.

bhd

(only if output_attribute not specified) Numeric vector specifying the baseline health data of the health outcome of interest per age group. See Details for more info.

pop_fraction

(only if output_attribute not specified) Numeric vector specifying the population attributable fraction by age group and geographic unit.

Value

This function returns a list containing the impact (absolute and relative) theoretically attributable to the difference in the social indicator (e.g. degree of deprivation) between the quantiles:

1) social_main (tibble) containing the main results;

2) social_detailed (list) containing detailed (and interim) results.

If the argument output_attribute was specified, then the two lists are added next to the existing attribute output.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: determine fraction of attributable health impact that can
# be attributed to differences in deprivation between the geographic
# units under analysis

## Create assessments for multiple geographic units for the age group
## 40 years and younger
results_age_groups <-
  healthiar::attribute_health(
    age_group = rep(c("below_40", "40_plus"), each = 9037),
    exp_central = c(exdat_socialize$PM25_MEAN, exdat_socialize$PM25_MEAN-0.1),
    cutoff_central = 0,
    rr_central = 1.08,
    erf_shape = "log_linear",
    rr_increment = 10,
    bhd_central =  c(exdat_socialize$MORTALITY_below_40, exdat_socialize$MORTALITY_40_plus),
    population = c(exdat_socialize$POPULATION_below_40, exdat_socialize$POPULATION_40_plus),
    geo_id_micro = rep(exdat_socialize$CS01012020, 2))

## Difference in attributable impacts between geographic units
## that is attributable to differences in deprivation
results <- socialize(
  age_group = c("below_40", "40_plus"),
  ref_prop_pop = c(0.5, 0.5),
  output_attribute = results_age_groups,
  geo_id_micro = exdat_socialize$CS01012020,
  social_indicator = exdat_socialize$score,
  n_quantile = 10,
  increasing_deprivation = TRUE)

results$social_main |>
  dplyr::filter(difference_type == "relative") |>
  dplyr::filter(difference_compared_with == "overall") |>
  dplyr::select(first, last, difference_type, difference_value, comment)

Obtain age-standardized health impacts

Description

This function obtains age-standardized health impacts based on multiple age-group specific assessments

Usage

standardize(output_attribute, age_group, ref_prop_pop = NULL)

Arguments

output_attribute

List containing the outputs of the healthiar::attribute_health() assessments for each age group (each list element should be an age group-specific assessment).

age_group

String vector with the age groups included in the age standardization. The vector refers to age-dependent data in this function and to output_attribute (if provided).

ref_prop_pop

Numeric vector specifying with the reference proportion of population for each age group. If this argument is empty, the proportion of population by age group in the provided data will be used.

Value

This function returns a list containing:

1) health_main (tibble) containing the age-standardized main results;

2) health_detailed (tibble) containing the results per age group.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: age-standardize two age group-specific impacts
output_attribute <- attribute_health(
  rr_central = 1.063,
  rr_increment = 10,
  erf_shape = "log_linear",
  cutoff_central =  0,
  age_group = c("below_40", "above_40"),
  exp_central = c(8.1, 10.9),
  bhd_central = c(1000, 4000),
  population = c(100000, 500000)
)
results <- standardize(
  output_attribute = output_attribute,
  age_group = c("below_40", "above_40"),
  ref_prop_pop = c(0.5, 0.5)
)
results$health_detailed$impact_per_100k_inhab # age group-specific impact rate
results$health_main$impact_per_100k_inhab # age-standardized impact rate

Get Monte Carlo confidence intervals

Description

This function determines summary uncertainty (based on central, lower and upper estimates of at least one input variable) using attribute() or compare() function output by Monte Carlo simulation.

Input variables that will be processed are:

Usage

summarize_uncertainty(output_attribute, n_sim, seed = NULL)

Arguments

output_attribute

variable in which the output of a healthiar::attribute_...() function call are stored.

n_sim

numeric value indicating the number of simulations to be performed.

seed

numeric value for fixing the randomization. Based on it, each geographic unit is assigned a different. If empty, 123 is used as the base seed per default. The function preserves and restores the user's original random seed (if set prior to calling the function) upon function completion.

Details

Method

For each processed input variable with a provided 95% confidence interval value, a distribution is fitted (see below). From these, n_sim input value sets are sampled to compute n_sim attributable impacts. The median value of these attributable impacts is reported as the central estimate, and the 2.5th and 97.5th percentiles define the lower and upper bounds of the 95% summary uncertainty confidence interval, respectively. Aggregated central, lower and upper estimates are obtained by summing the corresponding values of each lower level unit.

Distributions used for simulation

Relative risk values are simulated based on an optimized gamma distribution, which fits well as relative risks are positive and its distributions usually right-skewed. The gamma distribution best fitting the inputted central relative risk estimate and corresponding lower and upper 95% confidence interval values is fitted using stats::qgamma() (with rate = shape / rr_central) and then stats::optimize is used to optimize the distribution parameters. Finally, n_sim relative risk values are simulated using stats::rgamma().

Exposure values are simulated based on a normal distribution using stats::rnorm() with mean = exp_central and a standard deviation based on corresponding lower and upper 95% exposure confidence interval values.

Cutoff values are simulated based on a normal distribution using stats::rnorm() with mean = cutoff_central and a standard deviation based on corresponding lower and upper 95% cutoff confidence interval values.

Baseline health data values are simulated based on a normal distribution using stats::rnorm() with mean = bhd_central and a standard deviation based on corresponding lower and upper 95% exposure confidence interval values.

Disability weights values of the morbidity health outcome of interest are simulated based on a beta distribution, as both the disability weights and the beta distribution are bounded by 0 and 1. The beta distribution best fitting the inputted central disability weight estimate and corresponding lower and upper 95% confidence interval values is fitted using stats::qgamma() (the best fitting distribution parameters shape1 and shape2 are determined using stats::optimize()). Finally, n_sim disability weight values are simulated using stats::rbeta().

Duration values of the morbidity health outcome of interest are simulated based on a normal distribution using stats::rnorm() with mean = duration_central and a standard deviation based on corresponding lower and upper 95% exposure confidence interval values.

Function arguments

seed

If the seed argument is specified then the parallel package is used to generate independent L’Ecuyer random number streams. One stream is allocated per variable (or per variable–geography combination, as needed), ensuring reproducible and independent random draws across variables and scenarios.

Value

This function returns a list containing:

1) uncertainty_main (tibble) containing the numeric summary uncertainty central estimate and corresponding lower and upper confidence intervals for the attributable health impacts obtained through Monte Carlo simulation;

2) uncertainty_detailed (list) containing detailed (and interim) results.

The two results elements are added to the existing output.

Author(s)

Alberto Castro & Axel Luyten

Examples

# Goal: obtain summary uncertainty for an existing attribute_health() output
# First create an assessment
attribute_health_output <- attribute_health(
  erf_shape = "log_linear",
  rr_central = 1.369,
  rr_lower = 1.124,
  rr_upper = 1.664,
  rr_increment = 10,
  exp_central = 8.85,
  exp_lower = 8,
  exp_upper = 10,
  cutoff_central = 5,
  bhd_central = 30747,
  bhd_lower = 28000,
  bhd_upper = 32000
)
# Then run Monte Carlo simulation
results <- summarize_uncertainty(
  output_attribute = attribute_health_output,
  n_sim = 100
)
results$uncertainty_main$impact # Central, lower and upper estimates

Check the input_args data of attribute_master()

Description

Check the input_args data in attribute_master() and provides specific warnings or errors if needed.

Usage

validate_input_attribute(input_args, is_lifetable)

Arguments

input_args

List with the argument names and values entered in the function.

is_lifetable

Boolean INTERNAL argument specifying if the life table approach is applied (TRUE) or not (FALSE)

Value

This function returns warning or error messages if needed.

Author(s)

Alberto Castro & Axel Luyten

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.