Authors: Glen Satten (GSatten@emory.edu), Mo Li (mo.li@louisiana.edu), Ni Zhao (nzhao10@jhu.edu)
We introduce the Compositional Accelerated Failure Time (CAFT) model, a statistical framework for differential abundance analysis of microbiome data. CAFT handles zero read counts by treating them as observations censored below the detection limit, analogous to the censoring mechanisms used in survival analysis. This approach is inherently resistant to multiplicative bias, eliminates the need for pseudocounts, and addresses compositional bias through appropriate score test procedures. For false discovery rate (FDR) control, we build on and extend Efron’s empirical null distribution.
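To make the censoring idea concrete, here is a minimal base-R sketch of how zero counts can be recorded as left-censored at a detection limit instead of being shifted by a pseudocount. It is only an illustration of the data encoding, not the internal implementation of `caft()`. The counts, the variable names, and the detection limit of one read are all assumptions made for this example.

```r
# Made-up read counts for a single taxon across 8 samples.
counts <- c(0, 12, 0, 45, 3, 0, 150, 8)

# Assumed detection limit of one read: a zero is read as "fewer than 1 read".
detection_limit <- 1

toy <- data.frame(
  count     = counts,
  # Log-scale value; for zeros this is the censoring point (an upper bound),
  # not an imputed abundance.
  log_value = log(pmax(counts, detection_limit)),
  # TRUE when the observation is left-censored (true value below the limit).
  censored  = counts == 0
)

print(toy)
```

A pseudocount approach would instead compute `log(counts + 1)` and treat every resulting value as exactly observed. The censored encoding keeps the information that a zero only bounds the abundance from above.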
You can install CAFT from GitHub:
# install.packages("remotes")
remotes::install_github("mli171/CAFT", build_vignettes = TRUE, dependencies = TRUE)
browseVignettes("CAFT")
The main function in the CAFT package is:
caft()
Apply ‘caft’ to a gut microbiome dataset from a study of adult colorectal cancer based on stool samples (Pasolli et al., 2017).
library(CAFT)
library(phyloseq)
data(Colon)
count.tab = t(as.data.frame(as.matrix(otu_table(Colon))))
sample.tab = as.data.frame(as.matrix(sample_data(Colon)))
tax.tab = as.data.frame(as.matrix(tax_table(Colon)))
dim(count.tab)
pNA = which(is.na(sample.tab$age))
if(length(pNA) > 0){
count.tab = count.tab[-pNA, ]
sample.tab = sample.tab[-pNA,]
}
# No missing values from gender
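## OTU presence filtering: keep taxa observed in more than one sample
p_otu = which(colSums(count.tab > 0) > 1)
count.tab = count.tab[,p_otu]
tax.tab = tax.tab[p_otu,]
dim(count.tab)
## Proportion of zero counts for each taxon, then the average across taxa
cens.prop = colMeans(count.tab == 0, na.rm = TRUE)
mean(cens.prop)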
Disease1 = Disease2 = rep(0, NROW(sample.tab)) # healthy
Disease1[sample.tab$disease == "CRC"] = 1
Disease2[sample.tab$disease == "adenoma"] = 1
Age = as.numeric(sample.tab$age)
Gender = as.numeric(factor(sample.tab$gender)) - 1
x.test = cbind(Disease1, Disease2)
x.adj = cbind(Age, Gender)
# Call with default settings
res.CAFT = caft(otu.table=count.tab, x.test=x.test, x.adj=x.adj)
# Same call with n.cores = 4
res.CAFT = caft(otu.table=count.tab, x.test=x.test, x.adj=x.adj, n.cores=4)
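The preprocessing steps in the example above can be rehearsed on a small made-up matrix before running them on the Colon data. The sketch below uses only base R. `toy.count`, `toy.disease`, and every number in them are hypothetical and chosen so that the effect of each step is easy to check by eye. It stops before the call to `caft()`.

```r
# Hypothetical toy data: 6 samples (rows) by 4 taxa (columns).
toy.count <- matrix(
  c(5, 0, 0, 2,
    0, 0, 0, 7,
    3, 0, 1, 0,
    0, 0, 0, 4,
    9, 0, 0, 0,
    1, 0, 0, 6),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("s", 1:6), paste0("taxon", 1:4))
)
toy.disease <- c("healthy", "CRC", "adenoma", "healthy", "CRC", "adenoma")

# Presence filter: keep taxa with a nonzero count in more than one sample.
keep <- which(colSums(toy.count > 0) > 1)
toy.count <- toy.count[, keep, drop = FALSE]

# Proportion of zero (censored) counts for each retained taxon.
toy.cens <- colMeans(toy.count == 0)

# Two 0/1 indicators with "healthy" as the reference level,
# mirroring Disease1 and Disease2 in the example above.
toy.x <- cbind(
  Disease1 = as.numeric(toy.disease == "CRC"),
  Disease2 = as.numeric(toy.disease == "adenoma")
)

print(dim(toy.count))
print(toy.cens)
print(toy.x)
```

Here `taxon2` (never observed) and `taxon3` (observed in a single sample) are dropped by the filter. The `drop = FALSE` argument keeps the result a matrix even if only one taxon survives.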
If you use CAFT in your work, please cite:
Satten, G. A., Li, M., & Zhao, N. (2025). CAFT: A Compositional Log-Linear Model for Microbiome Data with Zero Cells. bioRxiv, 2025.11.26.690468. https://doi.org/10.1101/2025.11.26.690468
BibTeX:
@article{satten2025caft,
title = {CAFT: A Compositional Log-Linear Model for Microbiome Data with Zero Cells},
author = {Satten, Glen A. and Li, Mo and Zhao, Ni},
journal = {bioRxiv},
year = {2025},
doi = {10.1101/2025.11.26.690468},
note = {Preprint}
}
Pasolli E, Schiffer L, Manghi P, Renson A, Obenchain V, Truong D, Beghini F, Malik F, Ramos M, Dowd J, Huttenhower C, Morgan M, Segata N, Waldron L (2017). “Accessible, curated metagenomic data through ExperimentHub.” Nat. Methods, 14(11), 1023–1024. ISSN 1548-7091, 1548-7105, doi:10.1038/nmeth.4468.