This vignette explores the CTX Exposure API.
Data provided by the Exposure API are broadly organized into four areas: Functional Use Information, Product Data, List Presence Data, and Exposure estimates. Data from the Functional Use, Product Data, and List Presence resources (aside from the Functional Use Probability endpoint) are developed from publicly available documents and are also accessible through the Chemical Exposure Knowledgebase (ChemExpo), an interactive web application developed by the United States Environmental Protection Agency. The underlying database for the Functional Use, Product Data, and List Presence endpoints of the Exposure API and for ChemExpo is the Chemicals and Products Database (CPDat). CPDat provides reported information on how chemicals are used in commerce and (where possible) at what quantities they occur in consumer and industrial products; see (Dionisio et al. 2018) for more information on CPDat. The data provided by the Functional Use Probability endpoint are predictions from EPA’s Quantitative Structure Use Relationship (QSUR) models (Phillips et al. 2017). Exposure data consist of predictions from the httk R package, introduced in (Pearce, R. et al. 2017), and from several exposure models, including the SEEM models. Information on the SEEM2 model can be found in (Wambaugh, J. et al. 2014) and on the SEEM3 model in (Ring, C. et al. 2018).
Product Data are organized by harmonized Product Use Categories (PUCs). The PUCs are assigned to products (which are associated with Composition Documents) and indicate the type of product associated with each data record. They are organized hierarchically, with General Category containing Product Family, which in turn contains Product Type. The Exposure API also provides information on how the PUC was assigned. Note that a natural language processing model is used to assign PUCs whose “classificationmethod” is “Automatic”. As such, these assignments are less certain and may contain inaccuracies. More information on PUC categories can be found in (Isaacs et al. 2020).
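Because automatically assigned PUCs are less certain, it can help to flag them before any downstream analysis. The following is a minimal sketch in base R. It does not call the API: the `products` data frame is a hypothetical stand-in whose rows are invented for illustration, and only the column names (`productname`, `gencat`, `classificationmethod`) and the values “Manual” and “Automatic” are taken from this vignette.

```r
# Hypothetical stand-in for product data; rows are invented for illustration.
products <- data.frame(
  productname = c("product a", "product b", "product c"),
  gencat = c("Raw materials", "Cleaning products and household care", NA),
  classificationmethod = c("Manual", "Automatic", NA),
  stringsAsFactors = FALSE
)

# Flag PUC assignments made by the natural language processing model.
# %in% treats a missing method as "not Automatic"; == would return NA instead.
products$puc_is_automatic <- products$classificationmethod %in% "Automatic"

# Count records by assignment method, keeping unassigned (NA) records visible.
method_counts <- table(products$classificationmethod, useNA = "ifany")

print(products[products$puc_is_automatic, ])
print(method_counts)
```

Keeping the flag as a column, rather than dropping rows, leaves the choice of how to treat automatic assignments to the later analysis.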
List Presence Data reflect the occurrence of chemicals on lists present in publicly available documents (sourced from a variety of federal and state agencies and trade associations). These lists are tagged with List Presence Keywords (LPKs) that together describe information contained in the document relevant to how the chemical was used. LPKs are an updated version of the cassettes provided in the Chemical and Product Categories (CPCat) database; see (Dionisio et al. 2015). For the most up to date information on the current LPKs and to see how the CPCat cassettes were updated, see (Koval et al. 2022).
Both reported and predicted Functional Use Information are available. Reported functional use information is organized by harmonized Function Categories (FCs) that describe the role a chemical serves in a product or industrial process. The harmonized technical function categories and definitions were developed by the Organisation for Economic Co-operation and Development (OECD) (with the exception of a few categories unique to consumer products, which are noted as being developed by EPA). These categories have been augmented with additional categories needed to describe chemicals in personal care, pharmaceutical, or other commercial sectors. The reported function data form the basis for ORD’s QSUR models (Phillips et al. 2016). These models provide the structure-based predictions of chemical function available in the Functional Use Probability endpoint. Note that these models were developed prior to the OECD function categories, so their function categories are not yet aligned with the harmonized categories used in the reported data. Updated models for the harmonized categories are under development.
The R package httk provides users with a variety of tools to incorporate toxicokinetics and in vitro-in vivo extrapolation into bioinformatics and comes with pre-made models that can be used with specific chemical data. The SEEM models were developed to predict potential human exposure to chemicals with little or no exposure data. For SEEM2, Bayesian methods were used to infer ranges of exposure consistent with data from the National Health and Nutrition Examination Survey, and predictions were made for different demographic groups. For SEEM3, chemical exposures through four different pathways were predicted, and the models for these pathways were then weighted to produce consensus predictions.
Information for ChemExpo is sourced from: Sakshi Handa, Katherine A. Phillips, Kenta Baron-Furuyama, and Kristin K. Isaacs. 2023. “ChemExpo Knowledgebase User Guide”. https://comptox.epa.gov/chemexpo/static/user_guide/index.html.
NOTE: Please see the introductory vignette for an overview of the ctxR package and initial setup instructions, including API key storage.
Several ctxR functions can be used to access the CTX Exposure API data, as described in the following sections. Tables output in each example have been filtered to only display the first few rows of data.
Functional uses for chemicals may be searched.
get_exposure_functional_use() retrieves FCs and associated metadata for a specific chemical (by DTXSID).
id | dtxsid | datatype | docid | doctitle | docdate | reportedfunction | functioncategory |
---|---|---|---|---|---|---|---|
22724 | DTXSID7020182 | Chemical presence list | 1371471 | The 25 Chemicals Found in All Nine of the Biosolids Studied | | fire retardant | Flame retardant |
22722 | DTXSID7020182 | Chemical presence list | 1497376 | A regional assessment of chemicals of concern in surface waters of four Midwestern United States national parks- Table 1 | 6 December 2016 | plastic component | NA |
22728 | DTXSID7020182 | Composition | 1389481 | halcyon radiant barrier shades | 2015-08-25 | monomer, polycarbonate | Monomers |
22726 | DTXSID7020182 | Composition | 1550827 | Thin-Set_Epoxy_Terrazzo_Flooring-Master_Terrazzo_Technologies-2017-02-02 | february 2, 2017 | curing agent | Hardener |
22727 | DTXSID7020182 | Composition | 1389695 | kerapoxy 410- part b | 2015-01-19 | resin amide | NA |
22732 | DTXSID7020182 | Composition | 1390773 | universal litter and recycling receptacle - 30 gallon with | 2016-03-15 | monomer, polycarbonate | Monomers |
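A common first step with this output is to see which reported functions have been mapped to a harmonized FC and which have not. The sketch below rebuilds four columns of the six rows shown above as a local data frame, so it runs without an API key. It assumes that unmapped categories are returned as `NA`, as the rendered table suggests.

```r
# Local copy of selected columns from the six rows shown in the table above.
fu <- data.frame(
  id = c(22724, 22722, 22728, 22726, 22727, 22732),
  datatype = c("Chemical presence list", "Chemical presence list",
               "Composition", "Composition", "Composition", "Composition"),
  reportedfunction = c("fire retardant", "plastic component",
                       "monomer, polycarbonate", "curing agent",
                       "resin amide", "monomer, polycarbonate"),
  functioncategory = c("Flame retardant", NA, "Monomers",
                       "Hardener", NA, "Monomers"),
  stringsAsFactors = FALSE
)

# Records whose reported function has no harmonized category (assumed NA).
unmapped <- fu[is.na(fu$functioncategory), c("id", "reportedfunction")]

# Tally of harmonized categories; table() drops NA by default.
fc_counts <- sort(table(fu$functioncategory), decreasing = TRUE)

print(unmapped)
print(fc_counts)
```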
get_exposure_functional_use_probability() retrieves the probability of functional use within different FCs for a given chemical (by DTXSID). Each value represents the probability of the chemical being classified as having this function, as predicted by the QSUR models.
harmonizedFunctionalUse | probability |
---|---|
antimicrobial | 0.3722 |
antioxidant | 0.8941 |
catalyst | 0.2031 |
colorant | 0.1560 |
crosslinker | 0.7743 |
flame_retardant | 0.2208 |
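To focus on the more likely functions, the probabilities can be filtered and ranked. This is a minimal sketch using the six rows shown above, entered by hand. The cutoff of 0.5 is an arbitrary choice made for illustration, not a threshold recommended by the QSUR models or by ctxR.

```r
# Local copy of the six rows shown in the table above.
probs <- data.frame(
  harmonizedFunctionalUse = c("antimicrobial", "antioxidant", "catalyst",
                              "colorant", "crosslinker", "flame_retardant"),
  probability = c(0.3722, 0.8941, 0.2031, 0.1560, 0.7743, 0.2208),
  stringsAsFactors = FALSE
)

# Arbitrary illustrative cutoff (an assumption, not a recommended value).
cutoff <- 0.5

# Keep functions at or above the cutoff, most probable first.
likely <- probs[probs$probability >= cutoff, ]
likely <- likely[order(likely$probability, decreasing = TRUE), ]

print(likely)
```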
get_exposure_functional_use_categories() retrieves definitions of all the available FCs. This is not specific to a chemical, but rather a list of all FCs.
id | title | description |
---|---|---|
28 | Coalescing agent | Chemical substance used in polymer emulsions that lower the glass-transition temperature (Tg) which results in the decrease in the minimum film-forming temperature (MFT) and upon evaporation, yields a hard film. Used in polishes; e.g., glycol; ether; pyrrolidines; and benzoates. Also referred to as a minimum film-forming temperature (MFT) modifier. |
29 | Conductive agent | Chemical substance used to conduct electrical current. Also referred to as an electrolyte; or electrode material. |
30 | Corrosion inhibitor | Chemical substance used to prevent or retard corrosion on metallic materials. Used in many products packaged in metal containers (such as aerosol products). Used in lubricants and other metal treatment products to provide protection to the substrates or surfaces on which the lubricants are used. Also referred to as a corrosion‑inhibiting additive; rust preventative; anticorrosion agent; or antirust agent. |
16 | Anti-static agent | Chemical substance that prevents or reduces the tendency of a material to accumulate a static charge or alters the electrical properties of materials by reducing their tendency to acquire an electrical charge. Used in diesel fuel to prevent the build-up of static electricity. Also referred to as a charge stabilizer. |
17 | Anti-streaking agent | Chemical substance which serves to enhance evaporation or reduce film formation in order to prevent the formation of streaks on a surface during cleaning. Also referred to as a film reducer. |
18 | Binder | Chemical substances that are either synthetic/polymeric resins that further polymerize, provide structure and cohesiveness or are substances added to compounded dry powders to provide adhesive qualities during and after compression to make tablets or cakes. Also referred to as a binding agent or resin. |
There are a few resources for retrieving product use data associated with chemical identifiers (DTXSID) or general use.
get_exposure_product_data() retrieves the product data (PUCs and related data) for products that use the specified chemical (by DTXSID).
id | dtxsid | docid | doctitle | docdate | productname | gencat | prodfam | prodtype | classificationmethod | rawmincomp | rawmaxcomp | rawcentralcomp | unittype | lowerweightfraction | upperweightfraction | centralweightfraction | weightfractiontype | component |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
589086 | DTXSID7020182 | 1297201 | EPOCAST 87005 B-60, FPC2172 | 08/07/1991 | epocast 87005 b-60_ fpc2172 | Raw materials | adhesives | Manual | 5 | percent | NA | NA | 0.05 | reported | ||||
657348 | DTXSID7020182 | 1314861 | EPOCAST 87005 B-80, FPC2248 | 08/04/1992 | epocast 87005 b-80_ fpc2248 | Raw materials | adhesives | Manual | 2 | percent | NA | NA | 0.02 | reported | ||||
192133 | DTXSID7020182 | 1178630 | EPOCAST HARDENER 946, FPC 5000 | 01/11/1990 | epocast hardener 946_ fpc 5000 | NA | NA | NA | NA | 45 | percent | NA | NA | 0.45 | reported | |||
655734 | DTXSID7020182 | 1314342 | EPOCAST HARDNER 946 | 10/04/1994 | epocast hardner 946 | NA | NA | NA | NA | NA | NA | NA | NA | reported | ||||
85935 | DTXSID7020182 | 1135022 | EPOLITE 1301 HARDENER(FORM. 916 AD HARDENER | 01/01/1985 | epolite 1301 hardener(form. 916 ad hardener | NA | NA | NA | NA | NA | NA | NA | NA | reported | ||||
25284 | DTXSID7020182 | 1106842 | EPOLITE 1350 HANDENER | 08/09/1993 | epolite 1350 handener | NA | NA | NA | NA | 35 | 50 | percent | 0.35 | 0.5 | NA | reported |
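The raw composition columns and the weight fraction columns appear to be related by a simple unit conversion. The sketch below checks this for the first three rows above. It rests on two assumptions: the rendered table has lost its empty cells, so reading 5, 2, and 45 as `rawcentralcomp` with `unittype` "percent" is an interpretation of those rows, and a `unittype` of "percent" is taken to mean that the raw value is divided by 100.

```r
# Interpretation of the first three rows of the table above (see assumptions).
comp <- data.frame(
  id = c(589086, 657348, 192133),
  rawcentralcomp = c(5, 2, 45),
  unittype = c("percent", "percent", "percent"),
  centralweightfraction = c(0.05, 0.02, 0.45),
  stringsAsFactors = FALSE
)

# Assumed conversion: percent -> fraction. Other unit types are left as NA
# because the vignette does not describe how they are handled.
derived <- ifelse(comp$unittype == "percent",
                  comp$rawcentralcomp / 100,
                  NA_real_)

# Compare with a numeric tolerance rather than exact equality.
all_match <- isTRUE(all.equal(derived, comp$centralweightfraction))
print(all_match)
```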
get_exposure_product_data_puc() retrieves the definitions of all the PUCs. This is not specific to a chemical, but rather a list of all PUCs.
id | kindName | genCat | prodfam | prodtype | definition |
---|---|---|---|---|---|
45 | Formulation | Cleaning products and household care | oven | | cleaning or other products used in or on ovens that do not fit in a more refined category |
44 | Formulation | Cleaning products and household care | metal specific | | cleaning or care products specific to metals, which do not fit into a more refined category |
43 | Formulation | Cleaning products and household care | laundry and fabric treatment | | Cleaning or care products for laundry and fabric treatment which do not fit into a more refined category |
42 | Formulation | Cleaning products and household care | laundry and fabric treatment | anti-static spray | anti-static sprays for fabrics (spray formulation assumed) |
291 | Formulation | Cleaning products and household care | bathroom | urinal cakes and deodorizers | Includes urinal cakes and screens for use in urinals, also includes deodorizers used in portable toilets |
292 | Formulation | Cleaning products and household care | jewelry | jewelry cleaner | Cleaning solutions specifically for jewelry products |
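Since PUCs are hierarchical (General Category, then Product Family, then Product Type), it can be convenient to collapse the three levels into one label. The sketch below does this for four of the rows above, entered by hand. The helper `puc_label()` is a hypothetical name introduced here, not a ctxR function, and the empty `prodtype` for id 45 is an interpretation of the rendered table.

```r
# Local copy of four rows from the table above.
pucs <- data.frame(
  id = c(45, 42, 291, 292),
  genCat = rep("Cleaning products and household care", 4),
  prodfam = c("oven", "laundry and fabric treatment", "bathroom", "jewelry"),
  prodtype = c("", "anti-static spray", "urinal cakes and deodorizers",
               "jewelry cleaner"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the non-empty levels of one PUC into a path.
puc_label <- function(gencat, prodfam, prodtype) {
  parts <- c(gencat, prodfam, prodtype)
  parts <- parts[!is.na(parts) & nzchar(parts)]
  paste(parts, collapse = " > ")
}

# Apply the helper row by row.
pucs$label <- mapply(puc_label, pucs$genCat, pucs$prodfam, pucs$prodtype,
                     USE.NAMES = FALSE)

print(pucs[, c("id", "label")])
```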
httk data
There is a single resource that returns httk model data when available.
bpa_httk <- get_httk_data(DTXSID = 'DTXSID7020182')
head(bpa_httk)
#> id dtxsid parameter measuredText measured predictedText
#> 1 101171 DTXSID7020182 Css 0.0083 0.0083 1.114
#> 2 101172 DTXSID7020182 Css 0.0083 0.0083 0.5297
#> 3 101173 DTXSID7020182 Css 0.0083 0.0083 1.076
#> 4 101174 DTXSID7020182 Css 0.0083 0.0083 0.5116
#> 5 101175 DTXSID7020182 TK.Half.Life 0.19 0.1900 139.5
#> 6 101176 DTXSID7020182 Days.Css NA NA 112
#> predicted units model reference percentile species
#> 1 1.1140 mg/L PBTK Wambaugh et al. (2018) 95% Rat
#> 2 0.5297 mg/L PBTK Wambaugh et al. (2018) 50% Rat
#> 3 1.0760 mg/L 3compartmentss Wambaugh et al. (2018) 95% Rat
#> 4 0.5116 mg/L 3compartmentss Wambaugh et al. (2018) 50% Rat
#> 5 139.5000 hours 1compartment Wambaugh et al. (2018) NA Rat
#> 6 112.0000 Days PBTK NA NA Rat
#> dataSourceSpecies dataVersion importDate
#> 1 Rat NA 2024-06-13T16:53:14.622350Z
#> 2 Rat NA 2024-06-13T16:53:14.622350Z
#> 3 Rat NA 2024-06-13T16:53:14.622350Z
#> 4 Rat NA 2024-06-13T16:53:14.622350Z
#> 5 Rat NA 2024-06-13T16:53:14.622350Z
#> 6 Rat NA 2024-06-13T16:53:14.622350Z
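The httk output mixes several parameters, models, and percentiles in long format, so a small cross-tabulation can make it easier to compare models. The sketch below rebuilds selected columns of the six rows printed above as a local data frame and tabulates the predicted Css by model and percentile. Using `mean` as the aggregation function is an assumption made only for convenience: each model and percentile pair occurs once here, so the mean returns the value itself.

```r
# Local copy of selected columns from the six rows printed above.
httk <- data.frame(
  parameter = c("Css", "Css", "Css", "Css", "TK.Half.Life", "Days.Css"),
  predicted = c(1.1140, 0.5297, 1.0760, 0.5116, 139.5, 112),
  units = c("mg/L", "mg/L", "mg/L", "mg/L", "hours", "Days"),
  model = c("PBTK", "PBTK", "3compartmentss", "3compartmentss",
            "1compartment", "PBTK"),
  percentile = c("95%", "50%", "95%", "50%", NA, NA),
  stringsAsFactors = FALSE
)

# Keep only the steady-state concentration predictions.
css <- httk[httk$parameter == "Css", ]

# Model-by-percentile matrix of predicted Css (one value per cell).
css_wide <- tapply(css$predicted, list(css$model, css$percentile), FUN = mean)

# Ratio of the 95th to the 50th percentile for each model.
css_ratio <- css_wide[, "95%"] / css_wide[, "50%"]

print(css_wide)
print(css_ratio)
```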
There are a few resources for retrieving list data for specific chemicals (by DTXSID) or general list presence information.
get_exposure_list_presence_tags_by_dtxsid() retrieves LPKs and associated data for a specific chemical (by DTXSID).
id | dtxsid | docid | doctitle | docsubtitle | docdate | organization | reportedfunction | functioncategory | component | keywordset |
---|---|---|---|---|---|---|---|---|---|---|
127967 | DTXSID7020182 | 1557970 | Experimental Small Molecule Drugs | | | DrugBank | NA | NA | | Canada; pharmaceutical |
9997 | DTXSID7020182 | 1359540 | Actively Registered AI’s by Common Name | | | California Department of Pesticide Regulation | NA | NA | | active_ingredient; Pesticides |
40538 | DTXSID7020182 | 1372213 | Indirect Additives used in Food Contact Substances | FDA authorizes Indirect Food Additives by identity, intended use, and conditions of use; the presence of a substance in this list indicates that only certain intended uses and use conditions are authorized by FDA regulations | 10/4/2018 | FDA | NA | NA | | Indirect additives food contact (10/2018) |
135524 | DTXSID7020182 | 1558005 | Chemicals of High Concern to Children | | 9/1/2020 | Vermont Department of Health | NA | NA | | children |
113048 | DTXSID7020182 | 1551584 | Chemicals of high concern to children reporting list | | | State of Washington Department of Ecology | NA | NA | | children; WA Children’s Safe Product Act (4/2020) |
76175 | DTXSID7020182 | 1373540 | Exposure of children and unborn children to selected chemical substances - Table 4.2.3 | Table 4.2.3 Regulation in the food area | Apr-17 | Danish Environmental Protection Agency | NA | NA | | Europe; Food contact items |
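Each `keywordset` value bundles several LPKs into one string, so splitting it gives one row per keyword, which is easier to filter or count. The sketch below uses three of the rows above, entered by hand. It assumes that keywords are separated by semicolons, as they appear in the rendered table; the actual API response may differ.

```r
# Local copy of three rows from the table above (id and keywordset only).
lp <- data.frame(
  id = c(127967, 9997, 135524),
  keywordset = c("Canada; pharmaceutical",
                 "active_ingredient; Pesticides",
                 "children"),
  stringsAsFactors = FALSE
)

# Split on semicolons (assumed separator) and trim surrounding whitespace.
keywords <- lapply(strsplit(lp$keywordset, ";", fixed = TRUE), trimws)

# One row per (document record, keyword) pair.
lp_long <- data.frame(
  id = rep(lp$id, lengths(keywords)),
  keyword = unlist(keywords),
  stringsAsFactors = FALSE
)

print(lp_long)
```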
There are two functions that provide access to exposure prediction data. The first provides general information on exposure pathways, while the second provides exposure predictions from a variety of exposure models. The general information corresponds to SEEM3 predictions of exposure pathways, while the exposure predictions feature SEEM2 predictions broken down by demographic group, general consensus predictions from SEEM3, and in some cases additional exposure predictions from other models.
get_general_exposure_prediction() returns general exposure information for a given chemical.
bpa_general_exposure <- get_general_exposure_prediction(DTXSID = 'DTXSID7020182')
head(bpa_general_exposure)
#> dtxsid productionVolume units stockholmConvention probabilityDietary
#> <char> <int> <char> <int> <num>
#> 1: DTXSID7020182 2780000 kg/day 0 1
#> 5 variable(s) not shown: [probabilityResidential <num>, probabilityPesticde <num>, probabilityIndustrial <num>, dataVersion <lgcl>, importDate <char>]
get_demographic_exposure_prediction() returns exposure prediction information split across different demographics for a given chemical.
bpa_demographic_exposure <- get_demographic_exposure_prediction(DTXSID = 'DTXSID7020182')
bpa_demographic_exposure
#> id dtxsid demographic predictor median
#> 1 768361 DTXSID7020182 Total Food.Contact 1.766000e-02
#> 2 769393 DTXSID7020182 Total FINE 9.460000e-06
#> 3 772655 DTXSID7020182 Total RAIDAR 3.770000e+00
#> 4 784083 DTXSID7020182 Total USETox.Pest 5.624000e-02
#> 5 785935 DTXSID7020182 Total USETox.Indust 1.372000e-04
#> 6 749502 DTXSID7020182 Age 66+ SEEM2 Heuristic 6.608350e-05
#> 7 751534 DTXSID7020182 BMI > 30 SEEM2 Heuristic 7.073042e-05
#> 8 760855 DTXSID7020182 Total SHEDS.Indirect 7.150000e-05
#> 9 761591 DTXSID7020182 Total SHEDS.Direct 0.000000e+00
#> 10 763267 DTXSID7020182 BMI <= 30 SEEM2 Heuristic 6.245051e-05
#> 11 488214 DTXSID7020182 Total SEEM3 Consensus 5.497000e-05
#> 12 797784 DTXSID7020182 Total USETox.Res 4.395000e-02
#> 13 807431 DTXSID7020182 Total USETox.Diet 1.498000e-04
#> 14 709226 DTXSID7020182 Males SEEM2 Heuristic 3.867956e-05
#> 15 735410 DTXSID7020182 Age 12-19 SEEM2 Heuristic 5.871957e-05
#> 16 697139 DTXSID7020182 Repro. Age Females SEEM2 Heuristic 1.364275e-05
#> 17 711258 DTXSID7020182 Females SEEM2 Heuristic 1.244431e-05
#> 18 737451 DTXSID7020182 Age 20-65 SEEM2 Heuristic 5.675943e-05
#> 19 723306 DTXSID7020182 Age 6-11 SEEM2 Heuristic 6.296203e-05
#> medianText l95 l95Text u95
#> 1 0.01766 NA NA NA
#> 2 9.46e-06 NA NA NA
#> 3 3.77 NA NA NA
#> 4 0.05624 NA NA NA
#> 5 0.0001372 NA NA NA
#> 6 6.60834995383669e-05 2.798634e-07 2.7986341540408e-07 0.019477870
#> 7 7.07304192271297e-05 3.136219e-07 3.13621919723853e-07 0.018576052
#> 8 7.15e-05 NA NA NA
#> 9 0 NA NA NA
#> 10 6.2450508333388e-05 2.591822e-07 2.59182177179327e-07 0.013621125
#> 11 5.497e-05 1.923000e-07 1.923e-07 0.020440000
#> 12 0.04395 NA NA NA
#> 13 0.0001498 NA NA NA
#> 14 3.86795578537834e-05 2.846711e-07 2.84671057884619e-07 0.006306170
#> 15 5.87195691748974e-05 2.809632e-07 2.80963221822448e-07 0.017185596
#> 16 1.36427543462443e-05 5.637240e-08 5.63723993835891e-08 0.004176617
#> 17 1.24443070751952e-05 4.901108e-08 4.90110833197268e-08 0.002897798
#> 18 5.67594250809775e-05 2.080289e-07 2.08028872989558e-07 0.011509267
#> 19 6.29620332442998e-05 3.049913e-07 3.04991342892185e-07 0.010537090
#> u95Text units ad reference dataVersion
#> 1 NA mg/kg/day 1 Biryol 2017 NA
#> 2 NA mg/day 1 Shin 2012 NA
#> 3 NA mg/kg/day 1 Arnot 2008 NA
#> 4 NA intake fraction 1 Fantke 2013 NA
#> 5 NA intake fraction 1 Rosenbaum 2008 NA
#> 6 0.0194778699251516 mg/kg/day 1 Wambaugh 2014 NA
#> 7 0.0185760522525412 mg/kg/day 1 Wambaugh 2014 NA
#> 8 NA mg/kg/day 1 Isaacs 2017 NA
#> 9 NA mg/kg/day 1 Isaacs 2017 NA
#> 10 0.0136211249503816 mg/kg/day 1 Wambaugh 2014 NA
#> 11 0.02044 mg/kg/day 1 Ring 2018 NA
#> 12 NA intake fraction 1 Huang 2016 NA
#> 13 NA intake fraction 1 Ernstoff 2016 NA
#> 14 0.00630617035849566 mg/kg/day 1 Wambaugh 2014 NA
#> 15 0.0171855959252902 mg/kg/day 1 Wambaugh 2014 NA
#> 16 0.00417661734132225 mg/kg/day 1 Wambaugh 2014 NA
#> 17 0.00289779809405841 mg/kg/day 1 Wambaugh 2014 NA
#> 18 0.0115092672875229 mg/kg/day 1 Wambaugh 2014 NA
#> 19 0.0105370896882791 mg/kg/day 1 Wambaugh 2014 NA
#> importDate
#> 1 2024-06-13T19:25:16.277317Z
#> 2 2024-06-13T19:25:16.277317Z
#> 3 2024-06-13T19:25:16.277317Z
#> 4 2024-06-13T19:25:16.277317Z
#> 5 2024-06-13T19:25:16.277317Z
#> 6 2024-06-13T19:25:16.277317Z
#> 7 2024-06-13T19:25:16.277317Z
#> 8 2024-06-13T19:25:16.277317Z
#> 9 2024-06-13T19:25:16.277317Z
#> 10 2024-06-13T19:25:16.277317Z
#> 11 2024-06-13T19:25:16.277317Z
#> 12 2024-06-13T19:25:16.277317Z
#> 13 2024-06-13T19:25:16.277317Z
#> 14 2024-06-13T19:25:16.277317Z
#> 15 2024-06-13T19:25:16.277317Z
#> 16 2024-06-13T19:25:16.277317Z
#> 17 2024-06-13T19:25:16.277317Z
#> 18 2024-06-13T19:25:16.277317Z
#> 19 2024-06-13T19:25:16.277317Z
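The SEEM2 Heuristic rows are the only ones broken down by demographic group, so they can be ranked and their uncertainty compared directly. The sketch below copies the `median`, `l95`, and `u95` values for those nine rows from the output above. Reading `l95` and `u95` as the lower and upper bounds of a 95% interval is an assumption based on the column names.

```r
# SEEM2 Heuristic rows copied from the output above (units: mg/kg/day).
demo <- data.frame(
  demographic = c("Age 66+", "BMI > 30", "BMI <= 30", "Males", "Age 12-19",
                  "Repro. Age Females", "Females", "Age 20-65", "Age 6-11"),
  median = c(6.608350e-05, 7.073042e-05, 6.245051e-05, 3.867956e-05,
             5.871957e-05, 1.364275e-05, 1.244431e-05, 5.675943e-05,
             6.296203e-05),
  l95 = c(2.798634e-07, 3.136219e-07, 2.591822e-07, 2.846711e-07,
          2.809632e-07, 5.637240e-08, 4.901108e-08, 2.080289e-07,
          3.049913e-07),
  u95 = c(0.019477870, 0.018576052, 0.013621125, 0.006306170,
          0.017185596, 0.004176617, 0.002897798, 0.011509267,
          0.010537090),
  stringsAsFactors = FALSE
)

# Width of the assumed 95% interval in orders of magnitude.
demo$log10_span <- log10(demo$u95 / demo$l95)

# Rank demographic groups by median predicted exposure, highest first.
ranked <- demo[order(demo$median, decreasing = TRUE), ]

print(ranked[, c("demographic", "median", "log10_span")])
```

Working on the log scale keeps the comparison readable, since the bounds span several orders of magnitude.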
There are batch search versions for several endpoints that gather data specific to a chemical, namely get_exposure_functional_use_batch(), get_exposure_functional_use_probability_batch(), get_exposure_product_data_batch(), get_exposure_list_presence_tags_by_dtxsid_batch(), get_general_exposure_prediction_batch(), and get_demographic_exposure_prediction_batch(). The function get_exposure_functional_use_probability_batch() returns a data.table in which each row corresponds to a unique chemical and each column represents a functional use category associated with at least one input chemical. The other batch functions return a named list of data.frames or data.tables, where the names are the unique input chemicals and each element holds the information for that chemical.
We demonstrate how the individual results differ from the batch results when retrieving functional use probabilities.
bpa_prob <- get_exposure_functional_use_probability(DTXSID = 'DTXSID7020182')
caf_prob <- get_exposure_functional_use_probability(DTXSID = 'DTXSID0020232')
bpa_caf_prob <- get_exposure_functional_use_probability_batch(DTXSID = c('DTXSID7020182', 'DTXSID0020232'))
bpa_prob
caf_prob
bpa_caf_prob
#> harmonizedFunctionalUse probability
#> 1 antimicrobial 0.3722
#> 2 antioxidant 0.8941
#> 3 catalyst 0.2031
#> 4 colorant 0.1560
#> 5 crosslinker 0.7743
#> 6 flame_retardant 0.2208
#> 7 flavorant 0.0314
#> 8 fragrance 0.2071
#> 9 heat_stabilizer 0.5119
#> 10 skin_conditioner 0.1168
#> 11 skin_protectant 0.3306
#> 12 uv_absorber 0.8046
#> harmonizedFunctionalUse probability
#> 1 antimicrobial 0.4808
#> 2 buffer 0.6370
#> 3 colorant 0.3962
#> 4 skin_conditioner 0.9821
#> DTXSID antimicrobial antioxidant catalyst colorant crosslinker
#> <char> <num> <num> <num> <num> <num>
#> 1: DTXSID7020182 0.3722 0.8941 0.2031 0.1560 0.7743
#> 2: DTXSID0020232 0.4808 NA NA 0.3962 NA
#> 8 variable(s) not shown: [flame_retardant <num>, flavorant <num>, fragrance <num>, heat_stabilizer <num>, skin_conditioner <num>, skin_protectant <num>, uv_absorber <num>, buffer <num>]
Observe that caffeine has probabilities assigned to only four functional use categories, while bisphenol A has probabilities assigned to twelve. In a single-chemical search, each row corresponds to a functional use category. In the batch search, every category reported for at least one input chemical becomes a column, and each row corresponds to a chemical. If a chemical has no probability associated with a functional use, the corresponding entry is NA.
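The relationship between the two layouts can be reproduced locally. The sketch below takes a subset of the single-chemical results shown above, stacks them, and reshapes them into one row per chemical with NA for missing categories. It is a base R illustration of the layout only and is not how ctxR builds the batch result internally; the object names are introduced here for illustration.

```r
# Subsets of the single-chemical results shown above.
bpa <- data.frame(
  harmonizedFunctionalUse = c("antimicrobial", "antioxidant", "colorant"),
  probability = c(0.3722, 0.8941, 0.1560),
  stringsAsFactors = FALSE
)
caf <- data.frame(
  harmonizedFunctionalUse = c("antimicrobial", "buffer", "colorant"),
  probability = c(0.4808, 0.6370, 0.3962),
  stringsAsFactors = FALSE
)
single <- list(DTXSID7020182 = bpa, DTXSID0020232 = caf)

# Stack the per-chemical tables, tagging each row with its DTXSID.
long <- do.call(rbind, lapply(names(single), function(id) {
  data.frame(DTXSID = id, single[[id]], stringsAsFactors = FALSE)
}))

# One row per chemical, one column per category; absent pairs become NA.
wide <- tapply(long$probability,
               list(long$DTXSID, long$harmonizedFunctionalUse),
               FUN = mean)

print(wide)
```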
The CTX Exposure API has several endpoints, and ctxR contains a function for each, with batch versions for some of them. These allow users to access various types of exposure data associated with a given chemical. In this vignette, we explored all of the non-batch versions and discussed the batch versions. We encourage the user to experiment with the different endpoints to better understand what sorts of data are available.