The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Please send questions to wambaugh.john@epa.gov
This vignette provides the code used to generate the figures in Wambaugh et al. (2019) “Assessing Toxicokinetic Uncertainty and Variability in Risk Prioritization.”
John F Wambaugh, Barbara A Wetmore, Caroline L Ring, Chantel I Nicolas, Robert G Pearce, Gregory S Honda, Roger Dinallo, Derek Angus, Jon Gilbert, Teresa Sierra, Akshay Badrinarayanan, Bradley Snodgrass, Adam Brockman, Chris Strock, R Woodrow Setzer, Russell S Thomas
Toxicological Sciences, Volume 172, Issue 2, December 2019, Pages 235-251
https://doi.org/10.1093/toxsci/kfz205
High(er) throughput toxicokinetics (HTTK) encompasses in vitro measures of key determinants of chemical toxicokinetics and reverse dosimetry approaches for in vitro-in vivo extrapolation (IVIVE). With HTTK, the bioactivity identified by any in vitro assay can be converted to human equivalent doses and compared with chemical intake estimates. Biological variability in HTTK has been previously considered, but the relative impact of measurement uncertainty has not. Bayesian methods were developed to provide chemical-specific uncertainty estimates for 2 in vitro toxicokinetic parameters: unbound fraction in plasma (\(f_{up}\)) and intrinsic hepatic clearance (\(Cl_{int}\)). New experimental measurements of \(f_{up}\) and \(Cl_{int}\) are reported for 418 and 467 chemicals, respectively. These data raise the HTTK chemical coverage of the ToxCast Phase I and II libraries to 57%. Although the standard protocol for \(Cl_{int}\) was followed, a revised protocol for \(f_{up}\) measured unbound chemical at 10%, 30%, and 100% of physiologic plasma protein concentrations, allowing estimation of protein binding affinity. This protocol reduced the occurrence of chemicals with fup too low to measure from 44% to 9.1%. Uncertainty in \(f_{up}\) was also reduced, with the median coefficient of variation dropping from 0.4 to 0.1. Monte Carlo simulation was used to propagate both measurement uncertainty and biological variability into IVIVE. The uncertainty propagation techniques used here also allow incorporation of other sources of uncertainty such as in silico predictors of HTTK parameters. These methods have the potential to inform risk-based prioritization based on the relationship between in vitro bioactivities and exposures.
This vignette was created with httk v1.10. Although we attempt to maintain backward compatibility, if you encounter issues with the latest release of httk and cannot easily address the changes, historical versions of httk are available from: https://cran.r-project.org/src/contrib/Archive/httk/
R package knitr generates html and PDF documents from this RMarkdown file, Each bit of code that follows is known as a “chunk”. We start by telling knitr how we want our chunks to look.
::opts_chunk$set(collapse = TRUE, comment = '#>', fig.width=5, fig.height=3) knitr
It is a bad idea to let variables and other information from previous R sessions float around, so we first remove everything in the R memory.
rm(list=ls())
If you are using the RMarkdown version of this vignette (extension, .RMD) you will be able to see that several chunks of code in this vignette have the statement “eval = execute.vignette”. The next chunk of code, by default, sets execute.vignette = FALSE. This means that the code is included (and necessary) but was not run when the vignette was built. We do this because some steps require extensive computing time and the checks on CRAN limit how long we can spend building the package. If you want this vignette to work, you must run all code, either by cutting and pasting it into R. Or, if viewing the .RMD file, you can either change execute.vignette to TRUE or press “play” (the green arrow) on each chunk in RStudio.
# Set whether or not the following chunks will be executed (run):
<- FALSE execute.vignette
We use the command ‘library()’ to load various R packages for our analysis. If you get the message “Error in library(X) : there is no package called ‘X’” then you will need to install that package:
From the R command prompt:
install.packages(“X”)
Or, if using RStudio, look for ‘Install Packages’ under ‘Tools’
library(data.table)
library(httk)
library(ggplot2)
library(stringr)
library(scales)
library(grid)
library(reshape)
This returns an expression with root mean squared error (RMSE) and coefficient of determination \(R^2\) for plotting:
<- function(m,prefix=NULL){
lm_R2 <- substitute("RMSE = "~mse*","~~italic(R)^2~"="~r2,
out list(mse = signif(mean(m$residuals^2)^(1/2),3),
r2 = format(summary(m)$adj.r.squared, digits = 2)))
paste(prefix,as.character(as.expression(out)))
}
From http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/ ggplot objects can be passed in …, or to plotlist (as a list of ggplot objects) - cols: Number of columns in layout - layout: A matrix specifying the layout. If present, ‘cols’ is ignored.
If the layout is something like matrix(c(1,2,3,3), nrow=2, byrow=TRUE), then plot 1 will go in the upper left, 2 will go in the upper right, and 3 will go all the way across the bottom.
<- function(..., plotlist=NULL, file, cols=1, layout=NULL, heights=NULL, widths=unit(rep_len(1, cols), "null")) {
multiplot
# Make a list from the ... arguments and plotlist
<- c(list(...), plotlist)
plots
= length(plots)
numPlots
# If layout is NULL, then use 'cols' to determine layout
if (is.null(layout)) {
# Make the panel
# ncol: Number of columns of plots
# nrow: Number of rows needed, calculated from # of cols
<- matrix(seq(1, cols * ceiling(numPlots/cols)),
layout ncol = cols, nrow = ceiling(numPlots/cols))
}
if (numPlots==1) {
print(plots[[1]])
else {
} # Set up the page
grid.newpage()
if (!is.null(heights))
{pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout),heights=heights,widths=widths)))
else {
} pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout),widths=widths)))
}# Make each plot, in the correct location
for (i in 1:numPlots) {
# Get the i,j matrix positions of the regions that contain this subplot
<- as.data.frame(which(layout == i, arr.ind = TRUE))
matchidx
print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
layout.pos.col = matchidx$col))
}
} }
<- function(x) {
scientific_10 <- gsub("1e", "10^", scientific_format()(x))
out <- gsub("\\+","",out)
out <- gsub("10\\^01","10",out)
out <- parse(text=gsub("10\\^00","1",out))
out }
This one leaves every other tickmark blank:
<- function(x) {
scientific_10_skip <- gsub("1e", "10^", scientific_format()(x))
out <- gsub("\\+","",out)
out <- gsub("10\\^01","10",out)
out <- gsub("10\\^00","1",out)
out <- gsub("10\\^-01",NA,out)
out return(parse(text=out))
}
\(C_{ss}\) is the steady-state plasma concentration.
<- function(chemcas,
cssfun_u
MW,
hepatic.bioavailability, Fgutabs=1,
probs=0.95,
minimum.Funbound.plasma=0.0001)
{# Create a data table with uncertainty only:
<- as.data.table(parameterize_steadystate(chem.cas=chemcas,
pop_u_httk clint.pvalue.threshold=1,
minimum.Funbound.plasma=0))
# Get the physico-chemical properties:
<-parameterize_schmitt(chem.cas=chemcas)
params.schmitt <- params.schmitt$pKa_Donor
psd <- params.schmitt$pKa_Accept
psa <- params.schmitt[regexpr("pKa",names(params.schmitt))==-1]
params.schmitt names(params.schmitt):=params.schmitt]
pop_u_httk[,:= paste(psd, collapse=",")]
pop_u_httk[,pKa_Donor := paste(psa, collapse=",")]
pop_u_httk[,pKa_Accept <- pop_u_httk[rep(1,1000)]
pop_u_httk
# Calculate the distribution coefficient:
<- calc_ionization(pH=7.4,parameters=pop_u_httk)
ions := Pow * (ions$fraction_neutral +
pop_u_httk[,Dow74 0.001 * ions$fraction_charged + ions$fraction_zwitter)]
# Parse the Funbound.plasma.dist into quantiles:
<- pop_u_httk[1,Funbound.plasma.dist]
Funbound.plasma.dist if (nchar(Funbound.plasma.dist) -
nchar(gsub(",","",Funbound.plasma.dist))!=2)
{stop("Funbound.plasma distribution should be three values (median,low95th,high95th) separated by commas.")
}<- strsplit(Funbound.plasma.dist,",")
temp <- as.numeric(temp[[1]][1])
ppb.median <- as.numeric(temp[[1]][2])
ppb.low95 <- as.numeric(temp[[1]][3])
ppb.high95
if (ppb.median>minimum.Funbound.plasma)
{# Use optim to estimate alpha and beta for fup such that the median and 95%
# credible interval approximate the estimate from MCMC:
<- optim(c(2,(1-ppb.median)/ppb.median*2),
ppb.fit function(x) (0.95-pbeta(ppb.high95,x[1],x[2])+
pbeta(ppb.low95,x[1],x[2]))^2+
-qbeta(0.5,x[1],x[2]))^2,
(ppb.medianmethod="BFGS")
# We are drawing new values for the unadjusted Fup:
:=rbeta(1000,
pop_u_httk[, unadjusted.Funbound.plasma$par[1],
ppb.fit$par[2])]
ppb.fitelse if (ppb.high95>minimum.Funbound.plasma)
}
{# Assume that since the median is zero but the u95 is not, that there is
# an exponential distribution:
# 97.5% of clearance values will be below Funbound.plasma.u95:
:=runif(1000,
pop_u_httk[,unadjusted.Funbound.plasma
minimum.Funbound.plasma,min(1,minimum.Funbound.plasma+
2*(ppb.high95-minimum.Funbound.plasma)))]
as.logical(rbinom(1000,1,.95)),
pop_u_httk[:=minimum.Funbound.plasma]
unadjusted.Funbound.plasmaelse {
} :=minimum.Funbound.plasma]
pop_u_httk[, unadjusted.Funbound.plasma
}
# then we need to adjust for in vitro binding:
:=subset(physiology.data,
pop_u_httk[,Flipid=='Plasma Effective Neutral Lipid Volume Fraction')[,
Parameterwhich(colnames(physiology.data) == 'Human')]]
:=1 / (Dow74 * Flipid +
pop_u_httk[,Funbound.plasma.adjustment1 / unadjusted.Funbound.plasma)/unadjusted.Funbound.plasma]
pop_u_httk[, :=unadjusted.Funbound.plasma*Funbound.plasma.adjustment]
Funbound.plasma
# Enforce a minimum ppb:
<minimum.Funbound.plasma,
pop_u_httk[Funbound.plasma:=minimum.Funbound.plasma]
Funbound.plasma
# If Fup varies, then Rb2p and hepatic bioavailability also vary:
<- get_rblood2plasma(chem.cas=chemcas)
Rblood2plasma if (is.na(Rblood2plasma))
{:=calc_rblood2plasma(params=pop_u_httk)]
pop_u_httk[, Rblood2plasma# Right now we don't similate variability on a known blood:plasma ratio
else pop_u_httk[, Rblood2plasma:=Rblood2plasma]
}
# Parse the clint.dist to retrieve the quantiles:
<- pop_u_httk[1,Clint.dist]
Clint.dist if (nchar(Clint.dist) -
nchar(gsub(",","",Clint.dist))!=3)
{stop("Clint distribution should be four values (median,low95th,high95th,pValue) separated by commas.")
}<- strsplit(Clint.dist,",")
temp <- as.numeric(temp[[1]][1])
Clint.median <- as.numeric(temp[[1]][2])
Clint.low95 <- as.numeric(temp[[1]][3])
Clint.high95 <- as.numeric(temp[[1]][4])
Clint.pvalue
# Use optim to estimate mean and standard deviation of Clint such that the
# median and 95% credible interval approximate the estimate from MCMC:
if (Clint.high95>0)
{if (Clint.median>0)
{<- optim(c(log(Clint.median),0.3), function(x)
clint.fit 0.95-plnorm(Clint.high95,x[1],x[2])+plnorm(Clint.low95,x[1],x[2]))^2+
(-qlnorm(0.5,x[1],x[2]))^2)
(Clint.median:=rlnorm(1000,clint.fit$par[1],clint.fit$par[2])]
pop_u_httk[,Clintelse {
} :=exp(runif(1000,log(1),
pop_u_httk[,Clintlog(Clint.high95)-log(1))/0.975))]
(
}else pop_u_httk[,Clint:=0]
} as.logical(rbinom(1000,1,Clint.pvalue)),Clint:=0] # the Bayesian "p-value" here reflects how often there is no clearance
pop_u_httk[
:=calc_hepatic_clearance(chem.cas=chemcas,
pop_u_httk[, CLhparameters=pop_u_httk,hepatic.model='unscaled',suppress.messages=TRUE,clint.pvalue.threshold=1)]
:=Qtotal.liverc*BW^0.75]
pop_u_httk[, Qliver:= Qliver / (Qliver +
pop_u_httk[, hepatic.bioavailability* Clint / Rblood2plasma)]
Funbound.plasma setdiff(names(pop_u_httk),
pop_u_httk[, (names(httk::parameterize_steadystate(chem.cas=chemcas)))):=NULL]
#compute Css for each individual
# calc_analytic_css is vectorized so that this will work.
<- httk::calc_analytic_css(parameters=pop_u_httk,
css clint.pvalue.threshold=1,
daily.dose=1,
clint.pop.cv=NULL,
fup.pop.cv=NULL,
output.units="uM",
model="3compartmentss",
species="Human",
suppress.messages=TRUE)#,
# recalc.blood2plasma=TRUE)
#get Css95
<- quantile(css, probs=probs)
cssquants return(cssquants)
}
<- function(chemcas,
hlfun_u
MW, probs=0.95,
minimum.Funbound.plasma=0.0001)
{# Create a data table with uncertainty only:
print(chemcas)
<- as.data.table(parameterize_steadystate(chem.cas=chemcas,
pop_u_httk clint.pvalue.threshold=1,
minimum.Funbound.plasma=0))
<-parameterize_schmitt(chem.cas=chemcas)
params.schmitt <- params.schmitt$pKa_Donor
psd <- params.schmitt$pKa_Accept
psa <- params.schmitt[regexpr("pKa",names(params.schmitt))==-1]
params.schmitt names(params.schmitt):=params.schmitt]
pop_u_httk[,:= paste(psd, collapse=",")]
pop_u_httk[,pKa_Donor := paste(psa, collapse=",")]
pop_u_httk[,pKa_Accept <- pop_u_httk[rep(1,1000)]
pop_u_httk
<- calc_ionization(pH=7.4,parameters=pop_u_httk)
ions := Pow * (ions$fraction_neutral + 0.001 * ions$fraction_charged + ions$fraction_zwitter)]
pop_u_httk[,Dow74
# Parse the Funbound.plasma.dist into quantiles:
<- pop_u_httk[1,Funbound.plasma.dist]
Funbound.plasma.dist print(paste(chemcas,Funbound.plasma.dist))
if (nchar(Funbound.plasma.dist) -
nchar(gsub(",","",Funbound.plasma.dist))!=2)
{stop("Funbound.plasma distribution should be three values (median,low95th,high95th) separated by commas.")
}<- strsplit(Funbound.plasma.dist,",")
temp <- as.numeric(temp[[1]][1])
ppb.median <- as.numeric(temp[[1]][2])
ppb.low95 <- as.numeric(temp[[1]][3])
ppb.high95
if (ppb.median>minimum.Funbound.plasma)
{# Use optim to estimate alpha and beta for fup such that the median and 95%
# credible interval approximate the estimate from MCMC:
<- optim(c(2,(1-ppb.median)/ppb.median*2),
ppb.fit function(x) (0.95-pbeta(ppb.high95,x[1],x[2])+
pbeta(ppb.low95,x[1],x[2]))^2+
-qbeta(0.5,x[1],x[2]))^2,
(ppb.medianmethod="BFGS")
# We are drawing new values for the unadjusted Fup:
:=rbeta(1000,
pop_u_httk[, unadjusted.Funbound.plasma$par[1],
ppb.fit$par[2])]
ppb.fitelse if (ppb.high95>minimum.Funbound.plasma)
}
{# Assume that since the median is zero but the u95 is not, that there is
# an exponential distribution:
# 97.5% of clearance values will be below Funbound.plasma.u95:
:=runif(1000,
pop_u_httk[, unadjusted.Funbound.plasma
minimum.Funbound.plasma,min(1,(ppb.high95-minimum.Funbound.plasma)/0.975))]
as.logical(rbinom(1000,1,.975)),
pop_u_httk[:=minimum.Funbound.plasma]
unadjusted.Funbound.plasmaelse {
} :=minimum.Funbound.plasma]
pop_u_httk[, unadjusted.Funbound.plasma
}
# then we need to adjust for in vitro binding:
:=subset(physiology.data,
pop_u_httk[,Flipid=='Plasma Effective Neutral Lipid Volume Fraction')[,
Parameterwhich(colnames(physiology.data) == 'Human')]]
:=1 / (Dow74 * Flipid +
pop_u_httk[,Funbound.plasma.adjustment1 / unadjusted.Funbound.plasma)/unadjusted.Funbound.plasma]
pop_u_httk[, :=unadjusted.Funbound.plasma*Funbound.plasma.adjustment]
Funbound.plasma
# Enforce a minimum ppb:
<minimum.Funbound.plasma,
pop_u_httk[Funbound.plasma:=minimum.Funbound.plasma]
Funbound.plasma
<- get_rblood2plasma(chem.cas=chemcas)
Rblood2plasma if (is.na(Rblood2plasma))
{:=calc_rblood2plasma(params=pop_u_httk)]
pop_u_httk[, Rblood2plasmaelse pop_u_httk[, Rblood2plasma:=Rblood2plasma] # Right now we don't similate variability on a known blood:plasma ratio
} # Parse the clint.dist to retrieve the quantiles:
<- pop_u_httk[1,Clint.dist]
Clint.dist if (nchar(Clint.dist) -
nchar(gsub(",","",Clint.dist))!=3)
{stop("Clint distribution should be four values (median,low95th,high95th,pValue) separated by commas.")
}<- strsplit(Clint.dist,",")
temp <- as.numeric(temp[[1]][1])
Clint.median <- as.numeric(temp[[1]][2])
Clint.low95 <- as.numeric(temp[[1]][3])
Clint.high95 <- as.numeric(temp[[1]][4])
Clint.pvalue
# Use optim to estimate mean and standard deviation of Clint such that the
# median and 95% credible interval approximate the estimate from MCMC:
if (Clint.high95>0)
{if (Clint.median>0)
{<- optim(c(log(Clint.median),0.3), function(x)
clint.fit 0.95-plnorm(Clint.high95,x[1],x[2])+plnorm(Clint.low95,x[1],x[2]))^2+
(-qlnorm(0.5,x[1],x[2]))^2)
(Clint.median:=rlnorm(1000,clint.fit$par[1],clint.fit$par[2])]
pop_u_httk[,Clintelse {
} :=exp(runif(1000,log(1),
pop_u_httk[,Clintlog(Clint.high95)-log(1))/0.975))]
(
}else pop_u_httk[,Clint:=0]
} as.logical(rbinom(1000,1,Clint.pvalue)),Clint:=0] # the Bayesian "p-value" here reflects how often there is no clearance
pop_u_httk[
:=calc_hepatic_clearance(chem.cas=chemcas,parameters=pop_u_httk,hepatic.model='unscaled',suppress.messages=TRUE,clint.pvalue.threshold=1)]
pop_u_httk[, CLh:=Qtotal.liverc*BW^0.75]
pop_u_httk[, Qliver:= Qliver / (Qliver + Funbound.plasma * Clint / Rblood2plasma)]
pop_u_httk[, hepatic.bioavailability
<- names(predict_partitioning_schmitt(chem.cas="80-05-7"))
PC.names <-cbind(pop_u_httk,as.data.table(predict_partitioning_schmitt(parameters=pop_u_httk)))
pop_u_httk:=calc_elimination_rate(parameters=pop_u_httk)]
pop_u_httk[,hl
return(log(2)/quantile(unlist(pop_u_httk[,"hl",with=FALSE]),probs=probs))
}
Histograms depicting the median measured values (a, b) and 95% credible intervals (c, d) for fraction unbound in plasma (\(f_{up}\), a, c) and intrinsic hepatic clearance (\(Cl_{int}\), b, d). In panel a, the distribution of estimated \(f_{up}\) for the single protein concentration assay (solid line) is compared with the protein titration (three concentrations) method (dotted) line. In panel b, the median estimated \(Cl_{int}\) is shown. In panel c, the distribution with larger credible intervals corresponds to the single protein concentration assay, with a median credible interval is 0.1, corresponding to a precision of roughly +-0.05. When data from the protein titration are jointly analyzed (smaller credible intervals), the certainty is increased to a median credible interval of 0.029, or roughly +-0.015. The width of the credible interval is a measure of the certainty in an estimate, that is, a smaller credible interval indicates greater certainty. For the \(f_{up}\) values, a credible interval approaching 1 indicates that all possible values are consistent with the data, that is, the data do not help identify fup since fup must be between 0 and 1. In panel d the distribution of credible intervals for \(Cl_{int}\) is shown.
<- httk::wambaugh2019.raw
Fig1.table $Error1 <- Fig1.table$Base.Fup.High - Fig1.table$Base.Fup.Low
Fig1.table$Error2 <- Fig1.table$Affinity.Fup.High - Fig1.table$Affinity.Fup.Low
Fig1.table$Sigma1 <- Fig1.table$Error1/(2*qnorm(0.975))
Fig1.table$Mean1 <- (Fig1.table$Base.Fup.High + Fig1.table$Base.Fup.Low)/(qnorm(0.975))
Fig1.table$CV1 <- Fig1.table$Sigma1/Fig1.table$Mean1
Fig1.table$Sigma2 <- Fig1.table$Error2/(2*qnorm(0.975))
Fig1.table$Mean2 <- (Fig1.table$Base.Fup.High + Fig1.table$Base.Fup.Low)/(qnorm(0.975))
Fig1.table$CV2 <- Fig1.table$Sigma2/Fig1.table$Mean2
Fig1.table$ErrorClint <- Fig1.table$CLint.1uM.High95th - Fig1.table$CLint.1uM.Low95th
Fig1.table$SigmaClint <- Fig1.table$ErrorClint/(2*qnorm(0.975))
Fig1.table$MeanClint <- (Fig1.table$CLint.1uM.High95th + Fig1.table$CLint.1uM.Low95th)/(qnorm(0.975))
Fig1.table$CVClint <- Fig1.table$SigmaClint/Fig1.table$MeanClint
Fig1.table
<- subset(Fig1.table,!is.na(Affinity.Fup.Med))
Fupmeasured <- subset(Fupmeasured,!duplicated(CAS))
Fupmeasured <- subset(Fig1.table,!is.na(CLint.1uM.Median))
CLintmeasured <- subset(CLintmeasured,!duplicated(CAS))
CLintmeasured
print(paste("New HTTK experimental measurements were made for",length(unique(Fig1.table$CAS)),"chemicals from the ToxCast HTS library."))
print(paste("Fup was successfully measured for",length(unique(subset(Fig1.table,!is.na(Affinity.Fup.Med))$CAS)),"chemicals from the ToxCast HTS library."))
print(paste("CLint was successfully measured for",length(unique(subset(Fig1.table,!is.na(CLint.1uM.Median))$CAS)),"chemicals from the ToxCast HTS library."))
<- ggplot(Fig1.table)+
Fig1a geom_freqpoly(binwidth = 0.5,lwd=1.5,color="Red",aes(Base.Fup.Med))+
geom_freqpoly(binwidth = 0.5,lwd=1.5,lty=3,color="Blue",aes(Affinity.Fup.Med))+
xlab(expression(paste("Measured ",F[up]))) +
ylab("Number of chemicals")+
scale_x_log10(label=scientific_10,limits=c(10^-6,1.05))+
labs(title=paste(length(unique(subset(Fig1.table,!is.na(Affinity.Fup.Med))$CAS)),"Chemicals Measured"))+
theme_bw()+
theme( text = element_text(size=10)) +
annotate("text", x=10^-6,y=80,size=8,label="a")
print(paste("Median 100%:",signif(median(Fig1.table$Base.Fup.Med,na.rm=TRUE),2)))
print(paste("Median Titration:",signif(median(Fig1.table$Affinity.Fup.Med,na.rm=TRUE),2)))
<- ggplot(Fig1.table)+
Fig1b geom_histogram(binwidth = 0.5,alpha=0.6,fill="green",aes(CLint.1uM.Median))+
xlab(expression(paste("Measured ",Cl[int]," (uL/min/",10^6," hepatocytes)"))) +
ylab("Number of chemicals")+
scale_x_log10(label=scientific_10,limits=c(10^-1,5*10^3))+
labs(title=paste(length(unique(subset(Fig1.table,!is.na(CLint.1uM.Median))$CAS)),"Chemicals Measured"))+
theme_bw()+
theme( text = element_text(size=10)) +
annotate("text", x=2*10^-1,y=80,size=8,label="b")
print(paste(sum(Fig1.table$CLint.1uM.Median==0,na.rm=TRUE),"Zero Measurments"))
print(paste("Median Non-Zero:",signif(median(Fig1.table$CLint.1uM.Median,na.rm=TRUE),3)))
<- ggplot(Fig1.table)+
Fig1c geom_histogram(binwidth = 0.1,alpha=0.2,fill="Red",aes(Error1))+
geom_histogram(binwidth = 0.1,alpha=0.2,fill="Blue",aes(Error2))+
xlab("Width of Credible Interval") +
ylab("Number of chemicals")+
scale_x_log10(label=scientific_10,limits=c(5*10^-4,1.05))+
labs(title=paste("CI:",signif(median(Fig1.table$Error1,na.rm=TRUE),2),"CV:",signif(median(Fig1.table$CV1,na.rm=TRUE),1),"/ CI:",signif(median(Fig1.table$Error2,na.rm=TRUE),1),"CV:",signif(median(Fig1.table$CV2,na.rm=TRUE),1)))+
theme_bw()+
theme( text = element_text(size=10))+
annotate("text", x=5*10^-4,y=42,size=8,label="c")
print(paste("Median 100% CI:",signif(median(Fig1.table$Error1,na.rm=TRUE),2),"CV:",signif(median(Fig1.table$CV1,na.rm=TRUE),2)))
print(paste("Median Titr. CI:",signif(median(Fig1.table$Error2,na.rm=TRUE),2),"CV:",signif(median(Fig1.table$CV2,na.rm=TRUE),2)))
<- ggplot(Fig1.table)+
Fig1d geom_histogram(binwidth = 0.1,alpha=0.6,fill="green",aes(ErrorClint))+
xlab("Width of Credible Interval") +
ylab("Number of chemicals")+
scale_x_log10(label=scientific_10,limits=c(10^0,10^3))+
labs(title=paste("Median CI:",signif(median(Fig1.table$ErrorClint,na.rm=TRUE),2),"CV:",signif(median(Fig1.table$CVClint,na.rm=TRUE),2)))+
theme_bw()+
theme( text = element_text(size=10))+
annotate("text", x=1.5,y=25,size=8,label="d")
print(paste("Median CI:",signif(median(Fig1.table$ErrorClint,na.rm=TRUE),2),"CV:",signif(median(Fig1.table$CVClint,na.rm=TRUE),2)))
multiplot(Fig1a,Fig1c,Fig1b,Fig1d,cols=2,widths=c(1.75,1.75))
Distribution of Binding Affinities Employed in Bayesian \(f_{up}\) Estimations. To jointly analyze the data from the RED assay conducted at three protein concentrations, a model for binding to protein with an dissociation constant (that is, binding affinity) must be assumed. Although the specific binding protein (for example, hemoglobin) and the stoichiometry are unknown, this number provides a rough estimate for each chemical of how much that chemical would be expected to be impacted by the presence of plasma binding.
<- httk::wambaugh2019.raw
Fig2.table print(paste(sum(!is.na(Fig2.table$Fup.point)),"total successful Fup estimates using point estimation."))
print(paste(sum(!is.na(Fig2.table$Base.Fup.Med)),"total successful Fup estimates using BAyesian analysis of top protein concentration."))
$Error1 <- Fig2.table$Base.Fup.High - Fig2.table$Base.Fup.Low
Fig2.table$Error2 <- Fig2.table$Affinity.Fup.High - Fig2.table$Affinity.Fup.Low
Fig2.table$Sigma1 <- Fig2.table$Error1/(2*qnorm(0.975))
Fig2.table$Mean1 <- (Fig2.table$Base.Fup.High + Fig2.table$Base.Fup.Low)/(qnorm(0.975))
Fig2.table$CV1 <- Fig2.table$Sigma1/Fig2.table$Mean1
Fig2.table$Sigma2 <- Fig2.table$Error2/(2*qnorm(0.975))
Fig2.table$Mean2 <- (Fig2.table$Base.Fup.High + Fig2.table$Base.Fup.Low)/(qnorm(0.975))
Fig2.table$CV2 <- Fig2.table$Sigma2/Fig2.table$Mean2
Fig2.table$Point.Estimate <- "Good"
Fig2.table$Fup.point<0&!is.na(Fig2.table$Fup.point),"Point.Estimate"]<-"<0"
Fig2.table[Fig2.table$Fup.point>1&!is.na(Fig2.table$Fup.point),"Point.Estimate"]<-">1"
Fig2.table[Fig2.tableis.na(Fig2.table$Fup.point),"Point.Estimate"]<-"No Value"
Fig2.table[$Point.Estimate%in%c("<0","No Value"),"Fup.point"] <- Fig2.table[Fig2.table$Point.Estimate%in%c("<0","No Value"),"Base.Fup.Med"]
Fig2.table[Fig2.table$Uncertain<-"Yes"
Fig2.tableis.na(Fig2.table$CV1),"CV1"] <- -999
Fig2.table[$CV1<0.49,"Uncertain"] <- "No"
Fig2.table[Fig2.table$CV1==-999,"CV1"] <- NA
Fig2.table[Fig2.table
<- ggplot(subset(Fig2.table,!is.na(Base.Fup.Med)),aes(x=Fup.point,y=Base.Fup.Med,shape=Point.Estimate,alpha=Uncertain,colour=Point.Estimate)) +
Fig2a geom_point(size=3)+
# geom_errorbar(aes(ymax = Base.Fup.High, ymin=Base.Fup.Low))+
scale_y_log10(label=scientific_10,limits=c(10^-6,2)) +
scale_x_log10(label=scientific_10,limits=c(10^-6,2)) +
ylab(expression(paste(F[up]," Bayesian Analysis (Single Conc.)"))) +
xlab(expression(paste(F[up]," Point Estimate (Traditional Analysis)"))) +
geom_abline(intercept = 0, slope = 1,linetype="dashed", colour="Blue") +
# geom_vline(xintercept = 1, size=2,linetype="dashed", colour="red")+
geom_vline(xintercept = 0.01, size=2,linetype="dotted", colour="red")+
geom_hline(yintercept = 0.01, size=2,linetype="dotted", colour="red")+
scale_colour_discrete(name="Point Estimate")+
scale_shape_discrete(name="Point Estimate") +
scale_alpha_manual(values=c(1,0.3))+
theme_bw() +
theme(legend.position="bottom", text = element_text(size=12))+
annotate("text", size=8,x = 10^-6, y = 2, label = "a")+
guides(alpha=guide_legend(nrow=2,byrow=TRUE),colour=guide_legend(nrow=2,byrow=TRUE),shape=guide_legend(nrow=2,byrow=TRUE))
#print(Fig2a)
print(paste(sum(Fig2.table$Uncertain=="Yes"),"uncertain estimates using single conc."))
print(paste(sum(Fig2.table$Point.Estimate=="<0"),"chemicals with negative point estimates."))
print(paste(sum(Fig2.table$Point.Estimate==">1"),"chemicals with Fup > 1."))
summary(lm(data=subset(Fig2.table,!is.na(Fup.point)&Fup.point>0&Base.Fup.Med>0),log(Fup.point)~log(Base.Fup.Med)))
$Uncertain2<-"Yes"
Fig2.tableis.na(Fig2.table$CV2),"CV2"] <- -999
Fig2.table[$CV2<0.49,"Uncertain2"] <- "No"
Fig2.table[Fig2.table$CV2==-999,"CV2"] <- NA
Fig2.table[Fig2.table$TopFail <- ""
Fig2.table!is.na(Fig2.table$Affinity.Fup.Med)&Fig2.table$Uncertain2=="No","TopFail"]<-"No"
Fig2.table[$TopFail=="No"&!is.na(Fig2.table$Base.Fup.Med)&Fig2.table$Uncertain=="Yes","TopFail"]<-"Single Conc. Only"
Fig2.table[Fig2.table!is.na(Fig2.table$Affinity.Fup.Med)&Fig2.table$Uncertain2=="Yes","TopFail"]<-"Yes"
Fig2.table[!is.na(Fig2.table$Affinity.Fup.Med)&is.na(Fig2.table$Base.Fup.Med),"TopFail"] <- "Single Conc. Only"
Fig2.table[is.na(Fig2.table$Base.Fup.Med),"Base.Fup.Med"] <- Fig2.table[is.na(Fig2.table$Base.Fup.Med),"Affinity.Fup.Med"]
Fig2.table[
<- ggplot(subset(Fig2.table,!is.na(Affinity.Fup.Med)),aes(x=Base.Fup.Med,y=Affinity.Fup.Med,shape=TopFail,alpha=TopFail,colour=TopFail)) +
Fig2b geom_point(size=3)+
geom_point(data=subset(Fig2.table,TopFail=="No"),size=3)+
# geom_errorbar(aes(ymax = Fup.High.x, ymin=Fup.Low.x))+
# geom_errorbarh(aes(xmin=Fup.Low.y,xmax=Fup.High.y))+
scale_y_log10(label=scientific_10,limits=c(10^-8,2)) +
scale_x_log10(label=scientific_10,limits=c(10^-8,2)) +
xlab(expression(paste(F[up]," Bayesian Analysis (Single Conc.)"))) +
ylab(expression(paste(F[up]," Bayesian Analysis (Three Conc.)"))) +
geom_abline(intercept = 0, slope = 1,linetype="dashed", colour="Blue")+
scale_colour_discrete(name="Uncertain")+
scale_shape_discrete(name="Uncertain") +
# annotate("text",size=8, x = 10^-4, y = 1, label = "B")+
scale_alpha_manual(name="Uncertain",values=c(1,0.6,0.3))+
theme_bw() +
theme(legend.position="bottom", text = element_text(size=12))+
annotate("text", size=8,x = 10^-8, y = 2, label = "b")+
guides(alpha=guide_legend(nrow=2,byrow=TRUE),colour=guide_legend(nrow=2,byrow=TRUE),shape=guide_legend(nrow=2,byrow=TRUE))
#print(Fig2b)
# Need to reduce to unique chemicals.
<- subset(Fig2.table,!is.na(Base.Fup.Med)&Uncertain=="No")
Fupbaseworked <- subset(Fig2.table,!is.na(Base.Fup.Med)&Uncertain=="Yes")
Fupbasenot <- subset(Fupbaseworked,!duplicated(CAS))
Fupbaseworked <- subset(Fupbasenot,!duplicated(CAS))
Fupbasenot # Also need to count a chemical as a success if it worked for either duplicate:
<- subset(Fupbasenot,!(CAS%in%Fupbaseworked$CAS))
Fupbasenot
<- subset(Fig2.table,!is.na(Affinity.Fup.Med)&Uncertain2=="No")
Fupaffinityworked <- subset(Fig2.table,!is.na(Affinity.Fup.Med)&Uncertain2=="Yes")
Fupaffinitynot <- subset(Fupaffinityworked,!duplicated(CAS))
Fupaffinityworked <- subset(Fupaffinitynot,!duplicated(CAS))
Fupaffinitynot <- subset(Fupaffinitynot,!(CAS%in%Fupaffinityworked$CAS))
Fupaffinitynot
print(paste(dim(Fupbaseworked)[1],"total successful Fup estimates using single conc."))
print(paste(signif(dim(Fupbasenot)[1]/(dim(Fupbaseworked)[1]+dim(Fupbasenot)[1])*100,3),"percent failure rate using single conc."))
print(paste(dim(Fupaffinityworked)[1],"total successful Fup estimates using protein titration."))
print(paste(signif(dim(Fupaffinitynot)[1]/(dim(Fupaffinityworked)[1]+dim(Fupaffinitynot)[1])*100,3),"percent failure rate using protein titration."))
print(paste(dim(Fupbasenot)[1],"uncertain estimates using original protocol."))
print(paste(dim(Fupaffinitynot)[1],"uncertain estimates using protein titration."))
print(paste(dim(Fupaffinityworked)[1]-dim(Fupbaseworked)[1],"chemicals that could not be measured at 100% plasma protein."))
<- httk:::Wetmore.data
Wetmore.data <- subset(Wetmore.data,Reference=="Wetmore 2015")
W2015
<- subset(Wetmore.data,Reference=="Wetmore 2012")
W2012 print(paste("The Fup assay failed for ",signif(sum(W2012$Fub==0.005)/length(W2012$Fub)*100,3),"% of chemicals in {Wetmore, 2012} and ",signif(sum(W2015$Fub==0)/length(W2015$Fub)*100,3),"% of chemicals in {Wetmore, 2015}."))
<- subset(Fig2.table,TopFail=="Single Conc. Only")
Fuplod print(paste("Among the chemicals that were uncertain using the original protocol but could be measured using the new three concentration protocol, the median estimated Fup was ",signif(median(Fuplod$Affinity.Fup.Med),2),"with values as low as ",signif(min(Fuplod$Affinity.Fup.Med),2),"and as high as",signif(max(Fuplod$Affinity.Fup.Med),2),". Previously, a default of 0.005 has been assumed when Fup could not be measured {Rotroff, 2010}.}"))
summary(lm(data=subset(Fig2.table,!is.na(Base.Fup.Med)&Affinity.Fup.Med>0&Base.Fup.Med>0),log(Base.Fup.Med)~log(Affinity.Fup.Med)))
#dev.new(width=10,height=6)
multiplot(Fig2a,Fig2b,cols=2,widths=c(1.75,1.75))
Estimates. Two separate Bayesian analyses were performed: one
including only those data collected at 100% PPP, and a second that used
a binding model to relate data collected at 10%, 30%, and 100% PPP. In
(A), \(f_{up}\) point
estimates <0 are plotted on the y-axis. These values are considered
“good” if they lie between 0 and 1. The Bayesian median fup values
(x-axis) derived from the 100% PPP data and the associated uncertainty
estimates (vertical lines) were correlated with point estimates.
Bayesian model estimates were constrained to be greater than zero and
less than 1;. The red dotted line (at 1%) represents a previously
assumed generic limit of quantification (Wetmore, 2015; Wetmore, 2012).
In (B), the medians from the two Bayesian analyses are
compared. In both plots, the diagonal dashed line indicates the identity
line (“perfect predictions”). In both panels the Bayesian estimates are
“uncertain” (semi-solid) if the CV was larger than 0.5 and certain
(solid) otherwise.
$Class <- "Other"
Fig1.table$DTXSID %in% pharma$DTXSID,"Class"]<-"Pharmaceutical"
Fig1.table[Fig1.table$Class <- as.factor(Fig1.table$Class)
Fig1.table
<- ggplot(Fig1.table,aes(Affinity.Kd.Med,fill=Class))+
Fig3 geom_histogram(binwidth = 0.5)+
xlab(expression(paste("Binding Affinity (", mu,"M)",sep=""))) +
ylab("Number of chemicals")+
scale_x_log10(label=scientific_10)+
scale_fill_discrete(name="Chemical Class")+
theme_bw() +
theme(legend.position="bottom", text = element_text(size=18))
# dev.new()
print(Fig3)
\(Cl_{int}\) estimate for 1 uM chemical concentration displayed, though 1 and 10 uM were fit jointly. The size of the credible interval varies significantly from chemical to chemical, especially for the lower clearance rates. Plots of individual fits are provided as supplemental material .
<- subset(httk::wambaugh2019.raw,!is.na(CLint.1uM.Median))
clearance.table
print(paste(sum(clearance.table$CLint.1uM.Median==0),"zero valued median Bayesian posteriors"))
print(paste(sum(clearance.table$CLint.1uM.Point==0&clearance.table$CLint.1uM.Median>0,na.rm=TRUE),"zero valued point estimates where median posterior>0"))
print(paste(sum(clearance.table$CLint.1uM.Point>0&clearance.table$CLint.1uM.Median==0,na.rm=TRUE),"zero valued median Bayesian posteriors where point estimate was non-zero"))
print(paste(sum(is.na(clearance.table$CLint.1uM.Point)&clearance.table$CLint.1uM.Median==0,na.rm=TRUE),"zero valued median Bayesian posteriors where point estimate was NA"))
print(paste(sum(is.na(clearance.table$CLint.1uM.Point)&clearance.table$CLint.1uM.Median>0,na.rm=TRUE),"non-zero valued median Bayesian posteriors where point estimate was NA"))
$Fit <- "Decreasing"
clearance.table$CLint.1uM.Median == 0,"Fit"] <- "Median Zero"
clearance.table[clearance.table#clearance.table[is.na(clearance.table$CLint.1uM.Point == 0),"Fit"] <- "Point Est. Missing"
for (i in 1:dim(clearance.table)[1])
if (!is.na(clearance.table[i,"CLint.1uM.Point"]))
{if (clearance.table[i,"CLint.1uM.Point"] == 0) clearance.table[i,"Fit"] <- "Point Est. Zero"
else clearance.table[i,"Fit"] <- "Point Est. Failed"
}
set.seed(123456)
is.na(clearance.table$CLint.1uM.Point),"CLint.1uM.Point"]<-runif(sum(is.na(clearance.table$CLint.1uM.Point)),0.1,0.2)
clearance.table[$CLint.1uM.Point==0,"CLint.1uM.Point"]<-runif(sum(clearance.table$CLint.1uM.Point==0),0.3,0.8)
clearance.table[clearance.table$CLint.1uM.Low95th==0,"CLint.1uM.Low95th"]<-0.1
clearance.table[clearance.table$CLint.1uM.Median==0,"CLint.1uM.Median"]<-0.1
clearance.table[clearance.table$CLint.1uM.High95th==0,"CLint.1uM.High95th"]<-0.1
clearance.table[clearance.table$CLint.1uM.High95th>1000,"CLint.1uM.High95th"]<-1000
clearance.table[clearance.table$CLint.1uM.Low95th<0.1,"CLint.1uM.Low95th"]<-0.1
clearance.table[clearance.table
<- lm(log10(CLint.1uM.Median)~log10(CLint.1uM.Point),data=subset(clearance.table,!is.na(CLint.1uM.Point)&CLint.1uM.Point>1))
clearance.fit <- ggplot(clearance.table, aes(x=CLint.1uM.Point,y=CLint.1uM.Median,colour=Fit)) +
Fig4 geom_segment(aes(x=CLint.1uM.Point,xend=CLint.1uM.Point,y=CLint.1uM.Low95th,yend=CLint.1uM.High95th),size=1,alpha=0.5)+
geom_point(size=3) +
scale_y_log10(label=scientific_10,limits=c(10^-1,1000)) +
scale_x_log10(label=scientific_10_skip,limits=c(10^-1,1000)) +
xlab(expression(paste(CL[int]," (",mu,"L/min/",10^{6}," Hep.) Point Estimate",sep="")))+
ylab(expression(paste(CL[int]," (",mu,"L/min/",10^{6}," Hep.) Bayesian",sep="")))+
geom_vline(xintercept = 0.25, size=2,colour="black")+
geom_vline(xintercept = 0.9, size=2,colour="black")+
geom_abline(intercept = 0, slope = 1,linetype="dashed")+
scale_colour_discrete(name="Assay Result")+
theme_bw() +
theme(legend.position="bottom", text = element_text(size=14))+
guides(colour=guide_legend(nrow=2,byrow=TRUE))+
annotate("text",x = 10^2, y = 10^0,size=6, label = lm_R2(clearance.fit,prefix=""),parse=TRUE)
#dev.new()
print(Fig4)
summary(clearance.fit)
The impact of uncertainty on estimated values for \(f_{up}\) and \(Cl_{int}\) could in turn affect additional TK quantities, for example elimination half-life (\(t_{1/2}\), shown in panel A) and steady-state serum concentration (\(C_{ss}\), shown in panel B). In both plots, the x-axis indicates the value that would have been estimated with previous methods, while the y-axis indicates the median and 95% credible interval for values calculated with our Bayesian method. Chemicals plotted with triangles are those that could be estimated with the previous, point estimate methods, while chemicals plotted with circles could not. The dashed line indicates the identity (perfect predictor) line. Chemicals that had no point estimate are plotted on the identity line.
# This section takes a long time because calc_vdist is not yet data.table compatible:
<- as.data.table(subset(httk::wambaugh2019,!is.na(Human.Funbound.plasma)&!is.na(Human.Clint)&!is.na(logP)))
ceetox <- ceetox$CAS[ceetox$CAS %in% get_cheminfo(model="3compartmentss")]
httk_cas
# back up chem.physical_and_invtro.data:
<- chem.physical_and_invitro.data
HTTK.data
# Use the point estimates:
<- add_chemtable(httk::wambaugh2019.raw,current.table=chem.physical_and_invitro.data,data.list=list(Compound="Name",CAS="CAS",Clint="CLint.1uM.Point",Funbound.plasma="Fup.point"),reference="Wambaugh 2019",species="Human",overwrite=TRUE)
chem.physical_and_invitro.data
<- httk_cas[httk_cas %in% get_cheminfo(model="schmitt")]
schmitt_cas %in%schmitt_cas,
ceetox[CAS:=log(2)/httk::calc_elimination_rate(chem.cas=CAS),
hlpoint=.(CAS)]
by
<- HTTK.data
chem.physical_and_invitro.data
%in%schmitt_cas,
ceetox[CAS:=hlfun_u(chemcas=CAS,
hl_975MW=MW,
probs=0.975),
=.(CAS)]
by
%in%schmitt_cas,
ceetox[CAS:=hlfun_u(chemcas=CAS,
hl_medMW=MW,
probs=0.5),
=.(CAS)]
by
%in%schmitt_cas,
ceetox[CAS:=hlfun_u(chemcas=CAS,
hl_25MW=MW,
probs=0.025),
=.(CAS)]
by
:=TRUE]
ceetox[,hlpointEstimatedis.na(hlpoint),hlpointEstimated:=FALSE]
ceetox[is.na(hlpoint),hlpoint:=hl_med]
ceetox[
<- lm(data=subset(ceetox,hlpointEstimated),log10(hlpoint)~log10(hl_med))
hlfit <- ggplot(data=subset(ceetox,hlpointEstimated),aes(x=hlpoint,y=hl_med)) +
Fig5a geom_point(size=3,alpha=0.5) +
geom_errorbar(aes(ymax = hl_975, ymin=hl_25),alpha=0.3)+
scale_y_log10(label=scientific_10,limits=c(1,10^6)) +
scale_x_log10(label=scientific_10,limits=c(1,10^6)) +
xlab("half-life (h) Point Estimate")+
ylab("Bayesian half-life (h)")+
geom_abline(intercept = 0, slope = 1,linetype="dashed",color="blue")+
annotate("text", size=8,x = 1, y = 10^6, label = "A")+
theme_bw() +
theme(text = element_text(size=18))+
annotate("text",x = 3*10^3, y = 10^0,size=5, label = lm_R2(hlfit,prefix=""),parse=TRUE)
<- lm(data=ceetox,log10(cssmed)~log10(cssmed_u))
cssfit <- ggplot(data=ceetox,aes(x=cssmed,y=cssmed_u)) +
Fig5b geom_point(size=3,alpha=0.5) +
geom_errorbar(aes(ymin=css25_u,ymax=css975_u),alpha=0.3)+
scale_y_log10(label=scientific_10,limits=c(10^-3,10^5)) +
scale_x_log10(label=scientific_10,limits=c(10^-3,10^5)) +
xlab(expression(paste(C[ss]," Point Estimate (uM)")))+
ylab(expression(paste("Bayesian ",C[ss]," (uM)")))+
annotate("text", size=8,x = 10^-3, y = 10^5, label = "B")+
theme_bw() +
theme(text = element_text(size=18)) +
geom_abline(intercept = 0, slope = 1,linetype="dashed",color="blue")+
annotate("text",x = 10^2, y = 10^-3,size=5, label = lm_R2(cssfit,prefix=""),parse=TRUE)
#dev.new(width=8,height=5)
multiplot(Fig5a,Fig5b,cols=2,widths=c(1.75,1.75))
Here we examine the telative contributions of uncertainty and variability to differences between the 95th percentile and median \(C_{ss}\). Monte Carlo analysis of both variability (triangles) and uncertainty (circles) give distributions that can be characterized by a 95th percentile \(C_{ss}\) for which individuals achieve a higher \(C_{ss}\) for the same fixed dose rate – these individuals can be considered more “sensitive” to chemical exposure. Uncertainty and variability can be combined (squares) to estimate a 95th percentile reflecting both factors. For most chemicals, variability contributes more than uncertainty to the difference between the \(C_{ss}\) predicted for the median \(Cl_{int}\) and \(f_{up}\) values.
set.seed(42)
%in%httk_cas,
ceetox[CAS:=calc_mc_css(chem.cas=CAS,
css95_uvclint.pvalue.threshold=1, #Don't need to use p-values here (Bayesian)
output.units="uM",
model="3compartmentss",
species="Human"),
=.(CAS)]
by
set.seed(42)
%in%httk_cas,
ceetox[CAS:=calc_mc_css(chem.cas=CAS,
css95_vclint.pvalue.threshold=1, #Don't need to use p-values here (Bayesian)
output.units="uM",
clint.meas.cv=NULL,
fup.meas.cv=NULL,
clint.pop.cv=0.3,
fup.pop.cv=0.3,
model="3compartmentss",
species="Human"),
=.(CAS)]
by
set.seed(42)
%in%httk_cas,
ceetox[CAS:=cssfun_u(chemcas=CAS,
css95_uMW=MW),
=.(CAS)]
by
set.seed(42)
%in%httk_cas,
ceetox[CAS:=cssfun_u(chemcas=CAS,
cssmed_uMW=MW,
probs=0.5),
=.(CAS)]
by
set.seed(42)
%in%httk_cas,
ceetox[CAS:=cssfun_u(chemcas=CAS,
css25_uMW=MW,
probs=0.025),
=.(CAS)]
by
set.seed(42)
%in%httk_cas,
ceetox[CAS:=cssfun_u(chemcas=CAS,
css975_uMW=MW,
probs=0.975),
=.(CAS)]
by
%in%httk_cas,
ceetox[CAS:=httk::calc_analytic_css(chem.cas=CAS,
cssmedclint.pvalue.threshold=1,
output.units="uM",
model="3compartmentss",
species="Human",
suppress.messages=TRUE),
=.(CAS)]
by
# Go back to Bayes estimates:
<- HTTK.data
chem.physical_and_invitro.data
<- data.table::melt(ceetox[, .(Compound, CAS, css95_u, css95_v, css95_uv, cssmed)],
dt_melt id.vars=c("Compound","CAS", "cssmed"),
variable.name="type",
value.name="css95")
:=gsub(x=type, pattern="css95_uv", replacement="Both", fixed=TRUE)]
dt_melt[, type:=gsub(x=type, pattern="css95_u", replacement="Uncertainty", fixed=TRUE)]
dt_melt[, type:=gsub(x=type, pattern="css95_v", replacement="Variability", fixed=TRUE)]
dt_melt[, type
#now set the factor levels explicitly
:=factor(type,
dt_melt[, typelevels=c("Uncertainty", "Variability", "Both"))]
:=css95/cssmed]
dt_melt[, Norm
#now set the factor levels explicitly
:=factor(Compound,
dt_melt[, Compoundlevels=dt_melt[type=="Both",Compound][order(dt_melt[type=="Both",Norm])])]
<- ggplot(data=dt_melt) +
Fig6 geom_point(size=2,alpha=0.7,aes(x=Compound,
y=Norm,
color=type,
shape=type)) +
scale_y_log10(label=scientific_10,limits=c(0.9,500)) +
ylab(expression(paste("Ratio of ",C[ss]," ",95^th," Percentile to Median Estimate"))) +
xlab("Chemicals")+
scale_colour_discrete(name=expression(paste(C[ss]," Varied to Reflect")))+
scale_shape_discrete(name=expression(paste(C[ss]," Varied to Reflect"))) +
theme_bw() +
theme(legend.position="bottom",axis.text.x = element_blank())+
annotate("text", size=8,x = levels(dt_melt$Compound)[length(unique(dt_melt$Compound))/2], y = 400, label = paste("Median Ratio for Uncertainty:",signif(median(subset(dt_melt,type=="Uncertainty")$Norm),3)))+
annotate("text", size=8,x = levels(dt_melt$Compound)[length(unique(dt_melt$Compound))/2], y = 200, label = paste("Median Ratio for Variability:",signif(median(subset(dt_melt,type=="Variability")$Norm),3)))+
annotate("text", size=8,x = levels(dt_melt$Compound)[length(unique(dt_melt$Compound))/2], y = 100, label = paste("Median Ratio for Both:",signif(median(subset(dt_melt,type=="Both")$Norm),3)))
#dev.new(width=10,height=6)
print(Fig6)
High throughput risk-based prioritization can be conducted by using doses predicted to cause bioactivity identified by high-throughput screening. High throughput exposure estimates have large uncertainty, indicated by the triangles (median of distribution) and vertical bar (upper 95th percentile range). There are many in vitro activities identified for each chemical, the 80% interval of equivalent dose rates is indicated by the vertical red bar, with a horizontal slash indicating the median and the circle plot points indicating the 10th percentile of the bioactive concentration. Anywhere the entire exposure bar is below the bioactivity circle there is 95% probability that the median exposure will not cause a dose sufficient to cause the 10th percentile in vitro bioactivity. Chemicals are sorted from left-to-right by the margin between the 10th percentile bioactivity (small circle) and the upper 95th percentile limit on estimated median intake rate. In panel A, all chemicals for which \(C_{ss}\) could be calculated are shown using only variability. In panel B, the shaded region of panel A is shown, using both variability and uncertainty to calculate the bioactive dose. Arrows in panel B identify six chemicals where exposure and 10th percentile bioactivity potentially overlap that would have been missed as priority chemicals if uncertainty analysis was not performed. Two overlapping bars are shown for each chemical’’s bioactive doses – the upper one is for variability alone (as in Panel A), the lower is for uncertainty and variability together.
# Head to ftp://newftp.epa.gov/COMPTOX/High_Throughput_Screening_Data/Previous_Data/ToxCast_Data_Release_Oct_2015/ to get the ToxCast/Tox21 data:
# These are the concentrations that caused activity in exess of the background (ACC) "hits".
#
# # Load the ACC table into an object Tox21.acc:
# Tox21.acc <- read.csv("INVITRODB_V2_LEVEL5/EXPORT_LVL5&6_ASID7_TOX21_151020.csv",stringsAsFactors=FALSE)
# # Subset to the chemicals of interest:
# Tox21.acc <- subset(Tox21.acc,casn%in%Fig1.table$CAS)
# # We only need hits:
# Tox21.acc <- subset(Tox21.acc,hitc==1)
#
# # Pare this down to just the data we need:
# Tox21.acc <- Tox21.acc[,c("code","chid","chnm","casn","aenm","modl_acc","flags")]
#
# # In this vignette we use the precalculated quantiles of the ACC's
# # to help keep the HTTK package smaller:
#
# wambaugh2019.tox21 <- NULL
# for (this.chem in sort(unique(Tox21.acc$casn)))
# {
# this.subset <- subset(Tox21.acc,casn==this.chem)
# this.row <- data.frame(casn=this.chem,
# med.conc =
# median(10^(this.subset$modl_acc)),
# q10.conc =
# quantile(10^(this.subset$modl_acc),0.1),
# low.conc =
# min(10^(this.subset$modl_acc)),
# q90.conc =
# quantile(10^(this.subset$modl_acc),0.9),
# high.conc =
# max(10^(this.subset$modl_acc)),
# stringsAsFactors = F)
# wambaugh2019.tox21 <- rbind(wambaugh2019.tox21, this.row)
# }
<- httk::wambaugh2019.seem3
chem.preds <- httk::wambaugh2019.nhanes
directnhanes.preds <- httk::wambaugh2019.tox21
Tox21.acc.numeric
#Subset to those chemicals with HTS hits:
<- subset(ceetox,CAS%in%Tox21.acc.numeric$casn)
human.hits
# Now calculate the oral equivalent dose for each chemical:
# Must make sure that CSS is in units of uM!!
for (this.chem in human.hits$CAS)
{<-
med.conc $casn==this.chem,"med.conc"]
Tox21.acc.numeric[Tox21.acc.numeric<-
q10.conc $casn==this.chem,"q10.conc"]
Tox21.acc.numeric[Tox21.acc.numeric<-
low.conc $casn==this.chem,"low.conc"]
Tox21.acc.numeric[Tox21.acc.numeric<-
q90.conc $casn==this.chem,"q90.conc"]
Tox21.acc.numeric[Tox21.acc.numeric<-
high.conc $casn==this.chem,"high.conc"]
Tox21.acc.numeric[Tox21.acc.numeric<- human.hits[human.hits$CAS==this.chem][["css95_v"]]
css95.v <- human.hits[human.hits$CAS==this.chem][["css95_uv"]]
css95.uv $CAS==this.chem,"HTS.Median.ACC"] <- med.conc
human.hits[human.hits$CAS==this.chem,"HTS.Q10.ACC"] <- q10.conc
human.hits[human.hits$CAS==this.chem,"HTS.Q90.ACC"] <- q90.conc
human.hits[human.hits$CAS==this.chem,"HTS.Low.ACC"] <- low.conc
human.hits[human.hits$CAS==this.chem,"HTS.High.ACC"] <- high.conc
human.hits[human.hits$CAS==this.chem,"HTS.Median.equivdose.v"] <- med.conc/css95.v
human.hits[human.hits$CAS==this.chem,"HTS.Q10.equivdose.v"] <- q10.conc/css95.v
human.hits[human.hits$CAS==this.chem,"HTS.Q90.equivdose.v"] <- q90.conc/css95.v
human.hits[human.hits$CAS==this.chem,"HTS.Low.equivdose.v"] <- low.conc/css95.v
human.hits[human.hits$CAS==this.chem,"HTS.High.equivdose.v"] <- high.conc/css95.v
human.hits[human.hits$CAS==this.chem,"HTS.Median.equivdose.uv"] <- med.conc/css95.uv
human.hits[human.hits$CAS==this.chem,"HTS.Q10.equivdose.uv"] <- q10.conc/css95.uv
human.hits[human.hits$CAS==this.chem,"HTS.Q90.equivdose.uv"] <- q90.conc/css95.uv
human.hits[human.hits$CAS==this.chem,"HTS.Low.equivdose.uv"] <- low.conc/css95.uv
human.hits[human.hits$CAS==this.chem,"HTS.High.equivdose.uv"] <- high.conc/css95.uv
human.hits[human.hits
}
#Add the exposure predictions:
for (this.chem in human.hits$CAS)
{print(this.chem)
# If we have direct inferrences from NHANES, use those exposures:
if (this.chem %in% directnhanes.preds$CASRN)
{$CAS==this.chem,"Exposure.median"] <- directnhanes.preds[directnhanes.preds$CASRN==this.chem,"lP"]
human.hits[human.hits$CAS==this.chem,"Exposure.high"] <- directnhanes.preds[directnhanes.preds$CASRN==this.chem,"lP.max"]
human.hits[human.hits# Otherwise use the heuristics model (Wambaugh 2014):
else if (this.chem %in% chem.preds$CAS) {
} $CAS==this.chem,"Exposure.median"] <- chem.preds[chem.preds$CAS==this.chem,"seem3"]
human.hits[human.hits$CAS==this.chem,"Exposure.high"] <- chem.preds[chem.preds$CAS==this.chem,"seem3.u95"]
human.hits[human.hits
}
}
# Drop chemicals without exposure predictions:
<- subset(human.hits,!is.na(Exposure.median))
human.hits
# This code puts the chemicals into order by margin between exposure and activity:
<- unique(human.hits$Compound)
chem.names <- rep(Inf,times=length(chem.names))
potency names(potency) <- chem.names
<- potency
potency.u for (this.chem in chem.names)
{<- subset(human.hits,Compound==this.chem)
this.subset <- this.subset$HTS.Q10.equivdose.v/this.subset$Exposure.high
potency[this.chem] <- this.subset$HTS.Q10.equivdose.uv/this.subset$Exposure.high
potency.u[this.chem]
}<- sort(potency)
potency $Compound <- factor(human.hits$Compound, levels=names(potency))
human.hits
<- potency.u[names(potency.u)[potency.u<1]]
potency.u <- potency.u[names(potency)[potency>1]]
potency.u
<- names(potency.u[!is.na(potency.u)])
new.overlaps <- new.overlaps[1]
first.change <- new.overlaps[length(new.overlaps)]
last.change <- names(potency)[which(names(potency)==first.change)-5]
window.start <- names(potency)[which(names(potency)==last.change)+5]
window.end <- names(potency)[(which(names(potency)==first.change)-5):(which(names(potency)==last.change)+5)]
window.names # Initialize a new plot:
<- ggplot(data=human.hits) +
Fig7a geom_segment(aes(x=Compound,xend=Compound,y=HTS.Q10.equivdose.v,yend=HTS.Q90.equivdose.v),size=1,color="white",alpha=0.5) +
geom_rect(mapping=aes(xmin=window.start,xmax=window.end,ymin=10^-9,ymax=10^3),color="grey",alpha=0.5)+
geom_segment(aes(x=Compound,xend=Compound,y=HTS.Q10.equivdose.v,yend=HTS.Q90.equivdose.v),size=1,color="red",alpha=0.5) +
geom_point(aes(x=Compound,y=HTS.Median.equivdose.v),shape=3) +
geom_point(aes(x=Compound,y=HTS.Q10.equivdose.v)) +
theme_bw() +
theme(axis.text.x = element_blank()) +
theme(axis.title.x = element_text(size=16)) +
theme(axis.title.y = element_text(size=16)) +
scale_y_log10(label=scientific_10,limits = c(10^-9,10^3)) +
geom_segment(aes(x=Compound,xend=Compound,y=Exposure.high,yend=Exposure.median),size=1,color="blue",alpha=0.5) +
geom_point(aes(x=Compound,y=Exposure.median),shape=2) +
ylab("Bioactive Dose & Exposure\nmg/kg BW/day\n(Variability Only)") +
xlab("Chemicals Ranked By Ratio Between Bioactive Dose and Exposure")+
annotate("text", size=8,x = names(potency)[10], y = 100, label = "a")
# Initialize a new plot:
<- ggplot() +
Fig7c geom_segment(data=human.hits,aes(x=Compound,xend=Compound,y=HTS.Q10.equivdose.v,yend=HTS.Q90.equivdose.v),size=3,color="grey") +
geom_segment(data=human.hits,aes(x=Compound,xend=Compound,y=HTS.Q10.equivdose.uv,yend=HTS.Q90.equivdose.uv),size=3,color="red",alpha=0.5) +
geom_point(data=human.hits,aes(x=Compound,y=HTS.Median.equivdose.uv),shape=3,size=2) +
geom_point(data=human.hits,aes(x=Compound,y=HTS.Q10.equivdose.uv),size=2) +
theme_bw() +
theme(axis.text.x = element_text(angle = 60, hjust = 1,size=6)) +
theme(axis.text.x = element_blank()) +
theme(axis.title.x = element_text(size=16)) +
theme(axis.title.y = element_text(size=16)) +
xlim(window.names) +
scale_y_log10(label=scientific_10,limits = c(10^-8,2*10^2)) +
geom_segment(data=human.hits,aes(x=Compound,xend=Compound,y=Exposure.high,yend=Exposure.median),size=3,color="blue",alpha=0.5) +
geom_segment(data=subset(human.hits,Compound%in%names(potency.u)),aes(x=Compound,xend=Compound,y=10^-8,yend=Exposure.median/1.5),size=2,arrow=arrow(length=unit(0.5,"cm")))+
geom_point(data=human.hits,aes(x=Compound,y=Exposure.median),shape=2,size=2) +
xlab("") +
ylab("Bioactive Dose & Exposure\nmg/kg BW/day\n(Uncertainty and Variability)") +
annotate("text", size=8,x = window.names[1], y = 100, label = "b")
#print(Fig10b)
#dev.new(width=10,height=6)
multiplot(Fig7a,Fig7c,cols=1,heights=c(0.5,0.5))
write.csv(subset(human.hits,Exposure.high>HTS.Q10.equivdose.uv)[,c("Compound","DSSTox_Substance_Id","Human.Clint","Human.Funbound.plasma","HTS.Q10.ACC","HTS.Q10.equivdose.v","HTS.Q10.equivdose.uv","Exposure.high")],file=paste("SupTable5-",Sys.Date(),".txt",sep=""),row.names=FALSE)
<- subset(chem.physical_and_invitro.data,regexpr("Wetmore",Human.Clint.Reference)!=-1)[,c("Compound","CAS","DTXSID","Formula")]
Wetmore.chems <- subset(Wetmore.chems,CAS%in%get_cheminfo())
Wetmore.chems
for (this.cas in Wetmore.chems$CAS)
{$CAS==this.cas,"Css.Lit"] <- get_lit_css(
Wetmore.chems[Wetmore.chemschem.cas=this.cas,
output.units="uM")
set.seed(42)
$CAS==this.cas,"Css.v192"] <- calc_mc_css(
Wetmore.chems[Wetmore.chemschem.cas=this.cas,
fup.meas.cv=NULL,
clint.meas.cv=NULL,
fup.censored.dist=TRUE,
output.units="uM")
set.seed(42)
$CAS==this.cas,"Css.v110"] <- calc_mc_css(
Wetmore.chems[Wetmore.chemschem.cas=this.cas,
output.units="uM")
}
median(Wetmore.chems$Css.v110/Wetmore.chems$Css.Lit,na.rm=TRUE)
$Human.Clint2 <- sapply(ceetox$Human.Clint,function(x) as.numeric(strsplit(x,",")[[1]][1]))
ceetox$Human.Fup2 <- sapply(ceetox$Human.Funbound.plasma,function(x) as.numeric(strsplit(x,",")[[1]][1]))
ceetoxwrite.csv(ceetox[,c(1,2,8,9,16,17,6,7,12,30,10,11,15,31,13,14,18:28)],row.names=FALSE,file=paste("Supplemental-Table4-",Sys.Date(),".txt",sep=""))
print(paste("Complete HTTK data on",sum(get_cheminfo()%in%ceetox$CAS),"new chemicals."))
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.