This vignette shows how UPS1 spike-in experiments may be analyzed using the packages wrProteo, wrMisc and wrGraph, all of which are available on CRAN.
Furthermore, the Bioconductor package limma will be used internally for its robust statistical testing.
# If not already installed, you'll have to install this package and wrMisc first.
install.packages("wrMisc")
install.packages("wrProteo")
# The package wrGraph is recommended for better graphics
install.packages("wrGraph")
# You can start the vignettes for this package by typing :
browseVignettes("wrProteo") # ... and then select the html output
Now let’s load the packages needed :
library(wrMisc)
library(wrProteo)
library(wrGraph)
# Version number for wrProteo :
packageVersion("wrProteo")
#> [1] '1.2.0'
The main aim of the experimental setup in UPS1 spike-in experiments is to provide a framework to test identification and quantitation procedures in proteomics. By mixing known amounts of a collection of human proteins (UPS1) in various concentrations into a yeast protein extract, one expects to find only human proteins varying between samples. In terms of ROC curves the human proteins are expected to show up as true positives (TP). In contrast, all yeast proteins were always added in the same quantity and should thus be observed constant, ie as true negatives (TN).
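The expected classification can be sketched in a few lines of base R. The toy vectors `species` and `isSignif` below are invented for illustration and do not come from the dataset; the only point is how UPS1 and yeast proteins map to TP, FN, FP and TN.

```r
## Toy example (hypothetical values, not taken from the Ramus et al data)
species  <- c("UPS1", "UPS1", "UPS1", "Yeast", "Yeast", "Yeast", "Yeast")
isSignif <- c(TRUE,   TRUE,   FALSE,  FALSE,   TRUE,    FALSE,   FALSE)

## UPS1 proteins are expected to vary, yeast proteins are expected to stay constant
confusion <- c(
  TP = sum(species == "UPS1"  &  isSignif),   # spike-in found as changing
  FN = sum(species == "UPS1"  & !isSignif),   # spike-in missed
  FP = sum(species == "Yeast" &  isSignif),   # constant matrix called changing
  TN = sum(species == "Yeast" & !isSignif))   # constant matrix called constant
confusion

## sensitivity and specificity, the two quantities underlying a ROC curve
sensitivity <- confusion[["TP"]] / (confusion[["TP"]] + confusion[["FN"]])
specificity <- confusion[["TN"]] / (confusion[["TN"]] + confusion[["FP"]])
```

With real results, `isSignif` would come from a chosen threshold on the test results and `species` from the protein annotation.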
The data were published with the article : Ramus et al 2016 Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset. J Proteomics 2016 Jan 30;132:51-62. PMID: 26585461 doi: 10.1016/j.jprot.2015.11.011
This dataset is available on PRIDE as PXD001819 (and/or on ProteomeXchange).
Briefly, this experiment aims to compare quantification of the heterologous spike-in UPS1 in yeast protein extracts as constant matrix.
## Two small functions we'll need later on
replSpecType <- function(x, annCol="SpecType", replBy=cbind(old=c("mainSpe","species2"), new=c("Yeast","UPS1"))) {
## rename $annot[,"SpecType"] to more specific names
chCol <- annCol[1] %in% colnames(x$annot)
if(chCol) { chCol <- which(colnames(x$annot)==annCol[1])
chIt <- replBy[,1] %in% unique(x$annot[,chCol]) # check items to replace if present
if(any(chIt)) for(i in which(chIt)) {useLi <- which(x$annot[,chCol] %in% replBy[i,1]); cat("useLi",head(useLi),"\n"); x$annot[useLi,chCol] <- replBy[i,2]}
} else message(" replSpecType: 'annCol' not found in x$annot !")
x }
replNAProtNames <- function(x,annCol=c("ProteinName","Accession","SpecType")) {
## replace in $annot missing ProteinNames by concatenating Accession + SpecType (ie 2nd & 3rd of annCol)
chCol <- annCol %in% colnames(x$annot)
if(all(chCol)) {
chNA <- is.na(x$annot[,annCol[1]])
if(any(chNA)) x$annot[which(chNA),annCol[1]] <- paste(x$annot[which(chNA),annCol[2]],x$annot[which(chNA),annCol[3]],sep="_")
} else message(" replNAProtNames: not all of the column names in 'annCol' found in x$annot !")
x }
MaxQuant is free software provided by the Max-Planck-Institute, see Tyanova et al 2016. By default MaxQuant typically exports quantitation data on the level of consensus-proteins to a folder called txt, in a file called proteinGroups.txt . So in a standard case one only needs to provide the path to this file.
path1 <- system.file("extdata", package="wrProteo")
fiNaMa <- "proteinGroups.txt.gz"
specPrefMQ <- c(conta="CON_|LYSC_CHICK", mainSpecies="OS=Saccharomyces cerevisiae", spike="HUMAN_UPS")
dataMQ <- readMaxQuantFile(path1, file=fiNaMa, specPref=specPrefMQ, refLi="mainSpe")
#> by species : conta: 9 mainSpe: 1040 species2: 48
#> -> readMaxQuantFile : normalize using subset of 1040
The data were imported and median-normalized, the protein annotation was parsed to automatically extract IDs, protein-names and species information.
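As a rough illustration of what median normalization to a reference subset means, the sketch below aligns the medians of the reference rows on a toy log2 matrix. This reflects the general principle only and is an assumption, not the code used inside readMaxQuantFile(); the object names (`toy`, `refRows`) are made up.

```r
## Toy log2 abundance matrix: 6 proteins x 3 samples (hypothetical values)
toy <- matrix(c(20, 21, 22, 23, 24, 30,
                21, 22, 23, 24, 25, 29,
                19, 20, 21, 22, 23, 31), ncol=3,
              dimnames=list(paste0("prot", 1:6), paste0("samp", 1:3)))
refRows <- 1:5          # rows of the constant 'host' proteins; row 6 mimics a spike-in

## median of the reference proteins in each sample
refMed <- apply(toy[refRows, ], 2, median, na.rm=TRUE)
## shift every column so that all reference medians coincide with their overall mean
toyNorm <- sweep(toy, 2, refMed - mean(refMed), FUN="-")

apply(toyNorm[refRows, ], 2, median)    # identical medians after normalization
```

Since the data are on log2 scale, the correction is a simple subtraction per column; the spike-in row is shifted together with its sample but does not influence the correction factor.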
## a summary of the quantitation data
dim(dataMQ$quant)
#> [1] 1097 27
summary(dataMQ$quant[,1:8]) # the first 8 cols
#> 12500amol_R1 12500amol_R2 12500amol_R3 125amol_R1
#> Min. :17.52 Min. :15.85 Min. :15.09 Min. :15.22
#> 1st Qu.:22.49 1st Qu.:22.39 1st Qu.:22.40 1st Qu.:22.33
#> Median :23.47 Median :23.47 Median :23.47 Median :23.43
#> Mean :23.69 Mean :23.63 Mean :23.64 Mean :23.58
#> 3rd Qu.:24.86 3rd Qu.:24.81 3rd Qu.:24.80 3rd Qu.:24.86
#> Max. :30.36 Max. :30.32 Max. :30.34 Max. :30.21
#> NA's :79 NA's :77 NA's :92 NA's :118
#> 125amol_R2 125amol_R3 25000amol_R1 25000amol_R2
#> Min. :15.60 Min. :14.99 Min. :15.98 Min. :15.43
#> 1st Qu.:22.36 1st Qu.:22.34 1st Qu.:22.46 1st Qu.:22.49
#> Median :23.44 Median :23.45 Median :23.52 Median :23.54
#> Mean :23.59 Mean :23.58 Mean :23.72 Mean :23.72
#> 3rd Qu.:24.88 3rd Qu.:24.87 3rd Qu.:24.97 3rd Qu.:24.96
#> Max. :30.22 Max. :30.25 Max. :30.32 Max. :30.20
#> NA's :113 NA's :122 NA's :94 NA's :98
colnames(dataMQ$annot)[1:12]
#> [1] "Accession"
#> [2] "ProteinName"
#> [3] "Species"
#> [4] "Contam"
#> [5] "SpecType"
#> [6] "Majority.protein.IDs"
#> [7] "Fasta.headers"
#> [8] "Number.of.proteins"
#> [9] "Potential.contaminant"
#> [10] "Razor...unique.peptides"
#> [11] "Razor...unique.peptides.12500amol_R1"
#> [12] "Razor...unique.peptides.12500amol_R2"
table(dataMQ$annot[,"Species"])
#>
#> Gallus gallus HUMAN_UPS Homo sapiens
#> 1 2 47
#> Mus musculus Saccharomyces cerevisiae
#> 1 1040
table(dataMQ$annot[,"SpecType"])
#>
#> conta mainSpe species2
#> 9 1040 48
ProteomeDiscoverer is commercial software from ThermoFisher (www.thermofisher.com). In this case, the identification was performed using the XCalibur module of ProteomeDiscoverer. In ProteomeDiscoverer quantitation data on the level of consensus-proteins should be exported to tabulated text files, which can be treated by this function. The resultant data were exported as ‘Proteins’ in tabulated format (the option R-headers was checked, however data can also be read when this option was not chosen).
path1 <- system.file("extdata", package="wrProteo")
fiNaPd <- "pxd001819_PD2.4_Proteins.txt.gz"
file.exists(file.path(path1,fiNaPd))
#> [1] TRUE
## Note: data exported from ProteomeDiscoverer does not have proper column-names
sampNa <- paste(rep(c(50,125,250,500,2500,5000,12500,25000,50000), each=3),"amol_R",rep(1:3,9),sep="")
specPrefPD <- c(conta="Bos tauris|Gallus", mainSpecies="OS=Saccharomyces cerevisiae", spike="OS=Homo sapiens")
dataPD <- readPDExport(file=fiNaPd, path=path1, sampleNames=sampNa, refLi="mainSpe", specPref=specPrefPD)
#> -> readPDExport : Trouble ahead, expecting tabulated text file (this file might not be right format) !!
#> -> readPDExport : correcting 'annotCol' to export of R-friendly colnames
#> -> readPDExport : Note: 6 (out of 919) unrecognized species
#> -> readPDExport : by species : Gallus gallus: 1 ; Homo sapiens: 46 ; Saccharomyces cerevisiae: 866 ;
#> -> readPDExport : normalize using subset of 866
The data were imported and median-normalized, the protein annotation was parsed to automatically extract IDs, protein-names and species information.
## a summary of the quantitation data
summary(dataPD$quant[,1:8]) # the first 8 cols
#> 50amol_R1 50amol_R2 50amol_R3 125amol_R1
#> Min. :13.72 Min. :11.67 Min. :11.28 Min. :12.52
#> 1st Qu.:18.73 1st Qu.:18.78 1st Qu.:18.80 1st Qu.:18.81
#> Median :19.94 Median :19.94 Median :19.94 Median :19.94
#> Mean :20.00 Mean :20.02 Mean :20.06 Mean :20.03
#> 3rd Qu.:21.30 3rd Qu.:21.35 3rd Qu.:21.35 3rd Qu.:21.37
#> Max. :26.41 Max. :26.47 Max. :26.50 Max. :26.52
#> NA's :32 NA's :28 NA's :28 NA's :25
#> 125amol_R2 125amol_R3 250amol_R1 250amol_R2
#> Min. :10.66 Min. :13.65 Min. :10.26 Min. :10.39
#> 1st Qu.:18.83 1st Qu.:18.80 1st Qu.:18.85 1st Qu.:18.79
#> Median :19.94 Median :19.93 Median :19.94 Median :19.93
#> Mean :20.08 Mean :20.09 Mean :20.07 Mean :20.02
#> 3rd Qu.:21.41 3rd Qu.:21.39 3rd Qu.:21.40 3rd Qu.:21.34
#> Max. :26.52 Max. :26.57 Max. :26.62 Max. :26.55
#> NA's :31 NA's :30 NA's :27 NA's :24
dim(dataPD$quant)
#> [1] 918 27
colnames(dataPD$annot)[]
#> [1] "Accession" "ProteinName"
#> [3] "Species" "Contam"
#> [5] "SpecType" "Description"
#> [7] "Contaminant" "Number.of.Peptides"
#> [9] "Number.of.PSMs" "Number.of.Unique.Peptides"
#> [11] "Number.of.AAs" "Coverage.in.Percent"
#> [13] "rowNo"
#head(dataPD$annot)
table(dataPD$annot[,"Species"])
#>
#> Gallus gallus Homo sapiens Saccharomyces cerevisiae
#> 1 46 866
table(dataPD$annot[,"SpecType"])
#>
#> conta mainSpe species2
#> 1 866 46
Proline is free software provided by the Profi-consortium,
see Ramus et al 2016 and Bouyssié et al 2020 (Proline: an efficient and user-friendly software suite for large-scale proteomics. Bioinformatics. 2020, PMID: 32096818, DOI: 10.1093/bioinformatics/btaa118 )
In Proline quantitation data on level of consensus-proteins can be exported to csv or tabulated text files, which can be treated by this function.
In order for easy and proper comparisons we need to make sure all columns are in the same order.
# get all results (MaxQuant,ProteomeDiscoverer, ...) in same order
sampNa <- paste0(rep(c(50,125,250,500,2500,5000,12500,25000,50000),each=3),"amol_R",rep(1:3,9))
grp9 <- paste0(rep(c(50,125,250,500,2500,5000,12500,25000,50000),each=3),"amol")
## it is more convenient to re-order columns this way in each project
dataPD <- corColumnOrder(dataPD,sampNames=sampNa) # already in good order
#> -> corColumnOrder : order already correct !
dataMQ <- corColumnOrder(dataMQ,sampNames=sampNa)
#dataPL <- corColumnOrder(dataPL,sampNames=sampNa)
From the protein annotation the membership to 3 groups was extracted : yeast (matrix) as ‘mainSpe’, UPS1 (spike) as ‘species2’ and other contaminants (‘conta’). The first two terms will be replaced by more specific ones (‘Yeast’ and ‘UPS1’) :
#>
#> conta mainSpe species2
#> 1 866 46
#> useLi 1 2 3 4 5 6
#> useLi 122 801 802 804 809 814
#> useLi 57 58 59 60 61 62
#> useLi 3 10 11 12 13 14
No additional normalization is needed, all data were already median normalized to the host proteins (ie Saccharomyces cerevisiae) after importing the initial quantification-output using ‘readMaxQuantFile()’ and ‘readPDExport()’.
As mentioned in the (general) vignette ‘wrProteoVignette1’ it is important to investigate the nature of NA-values, in particular the hypothesis that NA-values originate from very low abundance instances (eg none of its peptides identified during the MS1 run).
## Let's inspect NA values as graphic
matrixNAinspect(dataPD$quant, gr=grp9, tit="ProteomeDiscoverer") # gl(9,3)
NA values represent a challenge for statistical testing. In the sections above we provided evidence that NA-values typically represent proteins with very low abundance that finally ended up as non-detectable (NA). The number of NAs varies between samples : Very low concentrations of UPS1 tend not to get detected and thus contribute largely to the NAs. Since the amount of yeast proteins stays constant, they should always get detected the same way in all samples.
## Let's look at the number of NAs. Is there an accumulated number in lower UPS1 samples ?
sumNAperGroup(dataPD$raw, grp9)
#> 50amol 125amol 250amol 500amol 2500amol 5000amol 12500amol 25000amol
#> 88 86 80 51 14 8 8 12
#> 50000amol
#> 10
sumNAperGroup(dataMQ$raw, grp9)
#> 50amol 125amol 250amol 500amol 2500amol 5000amol 12500amol 25000amol
#> 354 353 353 355 278 277 248 281
#> 50000amol
#> 284
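Counting NAs per group of replicates needs nothing beyond base R. The following sketch mimics the idea on a toy matrix; it is not the implementation of sumNAperGroup() and the object names are invented.

```r
## Toy matrix: 4 proteins x 6 samples in 2 groups of 3 replicates (hypothetical values)
toyRaw <- matrix(c(NA,  5, 6,  7,
                   NA, NA, 6,  7,
                    4, NA, 6,  7,
                    4,  5, 6,  7,
                    4,  5, 6, NA,
                    4,  5, 6,  7), nrow=4)     # filled column by column
toyGrp <- rep(c("low", "high"), each=3)

## number of NAs per sample (column), then summed over the columns of each group
naPerSample <- colSums(is.na(toyRaw))
naPerGroup <- tapply(naPerSample, factor(toyGrp, levels=unique(toyGrp)), sum)
naPerGroup
```

Using `factor(..., levels=unique(toyGrp))` keeps the groups in their original order instead of sorting them alphabetically.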
The function testRobustToNAimputation() from the package wrProteo performs NA-imputation and statistical testing (after repeated imputation) between all groups of samples at the same time (as it would be inefficient to separate these two tasks). The tests underneath apply shrinkage from the empirical Bayes procedure of the Bioconductor package limma. In addition, various types of multiple testing correction can be directly added to the results : Benjamini-Hochberg FDR, local false discovery rate (lfdr, using the package fdrtool, see Strimmer 2008 doi: 10.1093/bioinformatics/btn209), or modified testing by ROTS, etc … One of the advantages of this method is that multiple rounds of imputation are run, so that final results (including pair-wise testing) get stabilized against (rare) stochastic effects without bias due to low variances.
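To give an intuition for why repeating the imputation stabilizes results, here is a deliberately simplified toy sketch. It is not the algorithm of testRobustToNAimputation() : it uses a plain t-test instead of the moderated limma test, the imputation parameters are arbitrary, and summarizing by the median is our assumption.

```r
set.seed(2021)                      # only to make this toy example reproducible
## Toy data: 1 protein, 2 groups of 3 replicates, one NA in the first group
x   <- c(NA, 15.2, 14.8, 18.1, 18.4, 17.9)
grp <- rep(c("A", "B"), each=3)

nRounds <- 50
pVals <- replicate(nRounds, {
  xImp <- x
  ## replace NAs by random draws from a low-abundance normal distribution (assumed parameters)
  xImp[is.na(x)] <- rnorm(sum(is.na(x)), mean=14, sd=0.9)
  t.test(xImp[grp == "A"], xImp[grp == "B"])$p.value
})
## summarize over all rounds so that a single unlucky draw has little influence
median(pVals)
range(pVals)
```

The spread shown by `range(pVals)` illustrates how much a single round of random imputation may vary, which is the motivation for combining multiple rounds.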
We are ready to launch the testing :
## Let's run pairwise-testing for ProteomeDiscoverer
testPD <- testRobustToNAimputation(dataPD$quant, gr=grp9, lfdrInclude=TRUE, annot=dataPD$annot) # gl(9,3)
#> -> testRobustToNAimputation -> matrixNAneighbourImpute : n.woNA= 24429 n.NA = 357
#> model 10 %-tile of (min 1 NA/grp) 154 NA-neighbour values
#> imputation: mean= 14 sd= 0.93
#> -> testRobustToNAimputation -> combineMultFilterNAimput : at presenceFilt: 917 917 907 917 897 905 916 917 917 917 917 895 907 898 917 917 917 917 917 917 917 916 917 917 917 916 917 917 917 917 917 917 917 917 917 917 out of 918
#> -> testRobustToNAimputation -> combineMultFilterNAimput : at abundanceFilt: 877 876 873 877 875 870 903 904 902 903 913 890 898 890 909 911 910 909 909 910 911 910 910 908 909 907 908 910 913 912 908 909 910 910 912 910
#> -> testRobustToNAimputation -> combineMultFilterNAimput : at NA> mean: 874, 871, 869, 874, 874, 869, 881, 882, 883, 894, 883, 884, 882, 885, 909, 883, 884, 884, 895, 909, 910, 881, 883, 882, 894, 907, 908, 909, 884, 885, 883, 895, 909, 909, 912 and 909
## Let's run pairwise-testing for MaxQuant
testMQ <- testRobustToNAimputation(dataMQ$quant, gr=grp9, lfdrInclude=TRUE, annot=dataMQ$annot)
#> -> testRobustToNAimputation -> matrixNAneighbourImpute : n.woNA= 26836 n.NA = 2783
#> model 10 %-tile of (min 1 NA/grp) 802 NA-neighbour values
#> imputation: mean= 20.2 sd= 0.76
#> -> testRobustToNAimputation -> combineMultFilterNAimput : at presenceFilt: 1041 1038 1005 1038 1005 1007 1033 1042 1039 1038 1051 1013 1017 1016 1053 1037 1044 1042 1040 1039 1054 1030 1034 1033 1035 1033 1048 1043 1034 1040 1042 1038 1035 1055 1040 1035 out of 1097
#> -> testRobustToNAimputation -> combineMultFilterNAimput : at abundanceFilt: 1003 1004 982 995 982 979 1009 1014 1009 1006 1040 996 997 1000 1026 1025 1031 1025 1028 1018 1032 1020 1023 1020 1025 1022 1029 1025 1023 1029 1026 1024 1021 1031 1028 1022
#> -> testRobustToNAimputation -> combineMultFilterNAimput : at NA> mean: 943, 938, 948, 934, 944, 949, 935, 946, 945, 945, 934, 939, 939, 939, 973, 937, 955, 957, 952, 982, 981, 932, 944, 949, 948, 981, 972, 990, 923, 936, 943, 937, 963, 965, 980 and 968
From these results we’ll use i) the NA-imputed version of our datasets for plotting principal components and ii) the (stabilized) testing results for counting TP, FP, etc.
To have the statistical testing results in our main object, we’ll copy the imputed data to our initial list :
Principal component analysis (PCA) cannot handle NA-values. Either all lines with any NAs have to be excluded, or data after NA-imputation have to be used. Here, we chose the second option. Plots will be made using the package wrGraph.
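The restriction and the first option (excluding incomplete lines) can be illustrated with base R on random toy data. The sketch uses prcomp() from the stats package rather than plotPCAw(), and all object names are hypothetical.

```r
set.seed(1)
toy <- matrix(rnorm(60, mean=20), nrow=10,
              dimnames=list(paste0("prot", 1:10), paste0("samp", 1:6)))
toy[2, 3] <- NA
toy[7, 5] <- NA

## PCA on data containing NAs fails
failsWithNA <- inherits(try(prcomp(t(toy)), silent=TRUE), "try-error")
failsWithNA

## option 1: exclude all lines (proteins) with any NA
toyComplete <- toy[rowSums(is.na(toy)) == 0, ]
pca <- prcomp(t(toyComplete), center=TRUE, scale.=FALSE)   # samples as observations
dim(toyComplete)
dim(pca$x)
```

Excluding lines is simple, but in real data it may remove a considerable part of the low-abundance proteins, which is why NA-imputed data are used here.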
plotPCAw(testPD$datImp, sampleGrp=grp9, tit="PCA on ProteomeDiscoverer (NAs imputed)", rowTyName="proteins", useSymb2=0)
plotPCAw(testMQ$datImp ,sampleGrp=grp9, tit="PCA on MaxQuant (NAs imputed)", rowTyName="proteins", useSymb2=1:9)
Again, since the sample consists predominantly of yeast proteins that are kept constant, one would not expect many sample-related characteristics. In this case we might be rather interested in the (global) characteristics and similarity of the UPS1 proteins :
# limit to UPS1
plotPCAw(testPD$datImp[which(testPD$annot[,"SpecType"]=="UPS1"),], sampleGrp=grp9, tit="PCA on ProteomeDiscoverer, UPS1 only (NAs imputed)",rowTyName="proteins", useSymb2=0)
plotPCAw(testMQ$datImp[which(testMQ$annot[,"SpecType"]=="UPS1"),], sampleGrp=grp9, tit="PCA on MaxQuant, UPS1 only (NAs imputed)",rowTyName="proteins", useSymb2=1:9)
Based on the PCA one can see that comparisons involving concentrations >= 250 amol may be better suited to actually detect differences, as also confirmed by the ROC part later.
A very universal and simple way to analyze data is by checking on several pairwise comparisons, in particular if the experimental setup does not include complete multifactorial plans.
This UPS1 spike-in experiment has 27 samples organized (according to meta-information) as 9 groups. Thus one obtains in total 36 pairwise comparisons, which makes graphical comparisons very crowded. The publication by Ramus et al focussed on 3 pairwise comparisons only. Here we’ll extend this to 5 pairwise comparisons.
Thus, the graphical comparisons were restricted to the three comparisons presented in the original publication plus two additional ones. The distribution of intra-group CV-values showed (without major surprise) that the highest UPS1 concentrations replicated best. In consequence, comparisons using this group are expected to have a decent chance to rather specifically reveal a high number of UPS1 proteins.
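The number of pairwise comparisons follows directly from the number of groups, as this small base R sketch shows. Note that the names produced here are only illustrative; their order and orientation differ from the column names used in the test results.

```r
conc <- c(50, 125, 250, 500, 2500, 5000, 12500, 25000, 50000)
grpNames <- paste0(conc, "amol")

## number of possible pairs among 9 groups
nPairs <- choose(length(grpNames), 2)
nPairs

## enumerate all pairs
allPairs <- combn(grpNames, 2, FUN=paste, collapse="-")
head(allPairs, 3)
```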
## The names of all the pair-wise comparisons possible
colnames(testPD$BH)
#> [1] "12500amol-125amol" "12500amol-25000amol" "125amol-25000amol"
#> [4] "12500amol-2500amol" "125amol-2500amol" "25000amol-2500amol"
#> [7] "12500amol-250amol" "125amol-250amol" "25000amol-250amol"
#> [10] "2500amol-250amol" "12500amol-50000amol" "125amol-50000amol"
#> [13] "25000amol-50000amol" "2500amol-50000amol" "250amol-50000amol"
#> [16] "12500amol-5000amol" "125amol-5000amol" "25000amol-5000amol"
#> [19] "2500amol-5000amol" "250amol-5000amol" "50000amol-5000amol"
#> [22] "12500amol-500amol" "125amol-500amol" "25000amol-500amol"
#> [25] "2500amol-500amol" "250amol-500amol" "50000amol-500amol"
#> [28] "5000amol-500amol" "12500amol-50amol" "125amol-50amol"
#> [31] "25000amol-50amol" "2500amol-50amol" "250amol-50amol"
#> [34] "50000amol-50amol" "5000amol-50amol" "500amol-50amol"
Now, we’ll construct a table showing all possible pairwise-comparisons. Using the function numPairDeColNames() we can easily extract the UPS1 concentrations as numeric content and show the (log-)ratio of the pairwise comparisons (column ‘log2rat’), the concentrations (in amol) and the number of differentially abundant proteins passing 5% FDR (using classical Benjamini-Hochberg FDR or lfdr, Strimmer 2008).
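The principle of turning comparison names into numeric concentrations and log2 ratios can be sketched in base R as follows. This is not the code of numPairDeColNames(), just a minimal illustration on three of the comparison names.

```r
compNames <- c("50000amol-50amol", "12500amol-25000amol", "125amol-50000amol")

## remove the unit and split each name at the hyphen
parts <- strsplit(gsub("amol", "", compNames), "-", fixed=TRUE)
## lower concentration first, higher concentration second
conc <- t(sapply(parts, function(p) sort(as.numeric(p))))
colnames(conc) <- c("conc1", "conc2")

tab <- cbind(conc, log2rat=round(log2(conc[, "conc2"] / conc[, "conc1"]), 3))
rownames(tab) <- compNames
tab[order(tab[, "log2rat"], decreasing=TRUE), ]
```

Sorting by the absolute log2 ratio puts the comparisons with the largest expected difference for UPS1 proteins on top.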
## The number of differentially abundant proteins passing 5% FDR (ProteomeDiscoverer and MaxQuant)
signCount <- cbind( sig.PD.BH=colSums(testPD$BH < 0.05, na.rm=TRUE), sig.PD.lfdr=if("lfdr" %in% names(testPD)) colSums(testPD$lfdr < 0.05, na.rm=TRUE),
sig.MQ.BH=colSums(testMQ$BH < 0.05, na.rm=TRUE), sig.MQ.lfdr=if("lfdr" %in% names(testMQ)) colSums(testMQ$lfdr < 0.05, na.rm=TRUE) )
table1 <- numPairDeColNames(testPD$BH, stripTxt="amol", sortByAbsRatio=TRUE)
table1 <- cbind(table1, signCount[table1[,1],])
knitr::kable(table1, caption="All pairwise comparisons (extended from Ramus et al)", align="c")
comparison | index | log2rat | conc1 | conc2 | sig.PD.BH | sig.PD.lfdr | sig.MQ.BH | sig.MQ.lfdr |
---|---|---|---|---|---|---|---|---|
50000amol-50amol | 34 | 9.966 | 50 | 50000 | 448 | 401 | 372 | 307 |
25000amol-50amol | 31 | 8.966 | 50 | 25000 | 464 | 404 | 357 | 305 |
125amol-50000amol | 12 | 8.644 | 125 | 50000 | 296 | 234 | 206 | 158 |
12500amol-50amol | 29 | 7.966 | 50 | 12500 | 405 | 353 | 289 | 241 |
125amol-25000amol | 3 | 7.644 | 125 | 25000 | 291 | 258 | 127 | 94 |
250amol-50000amol | 15 | 7.644 | 250 | 50000 | 312 | 255 | 261 | 199 |
12500amol-125amol | 1 | 6.644 | 125 | 12500 | 186 | 145 | 83 | 64 |
25000amol-250amol | 9 | 6.644 | 250 | 25000 | 252 | 204 | 117 | 94 |
50000amol-500amol | 27 | 6.644 | 500 | 50000 | 311 | 242 | 229 | 164 |
5000amol-50amol | 35 | 6.644 | 50 | 5000 | 514 | 479 | 362 | 299 |
12500amol-250amol | 7 | 5.644 | 250 | 12500 | 131 | 120 | 40 | 24 |
25000amol-500amol | 24 | 5.644 | 500 | 25000 | 306 | 264 | 77 | 62 |
2500amol-50amol | 32 | 5.644 | 50 | 2500 | 472 | 429 | 320 | 261 |
125amol-5000amol | 17 | 5.322 | 125 | 5000 | 258 | 202 | 100 | 66 |
12500amol-500amol | 22 | 4.644 | 500 | 12500 | 170 | 121 | 56 | 41 |
125amol-2500amol | 5 | 4.322 | 125 | 2500 | 163 | 140 | 77 | 64 |
2500amol-50000amol | 14 | 4.322 | 2500 | 50000 | 252 | 195 | 162 | 122 |
250amol-5000amol | 20 | 4.322 | 250 | 5000 | 182 | 122 | 100 | 81 |
25000amol-2500amol | 6 | 3.322 | 2500 | 25000 | 138 | 87 | 6 | 4 |
2500amol-250amol | 10 | 3.322 | 250 | 2500 | 95 | 67 | 34 | 26 |
50000amol-5000amol | 21 | 3.322 | 5000 | 50000 | 314 | 271 | 179 | 137 |
5000amol-500amol | 28 | 3.322 | 500 | 5000 | 159 | 128 | 69 | 53 |
500amol-50amol | 36 | 3.322 | 50 | 500 | 387 | 331 | 257 | 233 |
12500amol-2500amol | 4 | 2.322 | 2500 | 12500 | 16 | 9 | 1 | 0 |
25000amol-5000amol | 18 | 2.322 | 5000 | 25000 | 254 | 187 | 27 | 21 |
2500amol-500amol | 25 | 2.322 | 500 | 2500 | 110 | 85 | 41 | 25 |
250amol-50amol | 33 | 2.322 | 50 | 250 | 326 | 281 | 210 | 173 |
12500amol-50000amol | 11 | 2.000 | 12500 | 50000 | 174 | 115 | 101 | 66 |
125amol-500amol | 23 | 2.000 | 125 | 500 | 19 | 17 | 4 | 3 |
12500amol-5000amol | 16 | 1.322 | 5000 | 12500 | 49 | 71 | 4 | 3 |
125amol-50amol | 30 | 1.322 | 50 | 125 | 266 | 244 | 164 | 108 |
12500amol-25000amol | 2 | 1.000 | 12500 | 25000 | 18 | 17 | 0 | 0 |
125amol-250amol | 8 | 1.000 | 125 | 250 | 11 | 6 | 2 | 1 |
25000amol-50000amol | 13 | 1.000 | 25000 | 50000 | 124 | 86 | 69 | 47 |
2500amol-5000amol | 19 | 1.000 | 2500 | 5000 | 8 | 4 | 2 | 1 |
250amol-500amol | 26 | 1.000 | 250 | 500 | 6 | 2 | 3 | 0 |
You can see that in numerous cases many more than the 48 UPS1 proteins showed up as significant.
In the Ramus et al paper only 3 pairwise comparisons were further analyzed :
## In Ramus paper selection
colnames(testPD$BH)[c(2,21,27)]
#> [1] "12500amol-25000amol" "50000amol-5000amol" "50000amol-500amol"
Finally, let’s use a slightly extended selection of concentrations :
## extended selection
useCompNo <- c(2,21,27, 14,15)
colnames(testPD$BH)[useCompNo]
#> [1] "12500amol-25000amol" "50000amol-5000amol" "50000amol-500amol"
#> [4] "2500amol-50000amol" "250amol-50000amol"
## Let's extract the concentration part to numeric
numNamePart <- numPairDeColNames(testPD$BH, selComp=useCompNo, stripTxt="amol", sortByAbsRatio=TRUE)
head(numNamePart)
#> index log2rat conc1 conc2
#> [1,] 15 7.644 250 50000
#> [2,] 27 6.644 500 50000
#> [3,] 14 4.322 2500 50000
#> [4,] 21 3.322 5000 50000
#> [5,] 2 1.000 12500 25000
## table with concentrations in selected comparisons
table2 <- cbind(numNamePart, signCount[numNamePart[,1],])
knitr::kable(table2, caption="Selected pairwise comparisons (extended from Ramus et al)", align="c")
comparison | index | log2rat | conc1 | conc2 | sig.PD.BH | sig.PD.lfdr | sig.MQ.BH | sig.MQ.lfdr |
---|---|---|---|---|---|---|---|---|
250amol-50000amol | 15 | 7.644 | 250 | 50000 | 312 | 255 | 261 | 199 |
50000amol-500amol | 27 | 6.644 | 500 | 50000 | 311 | 242 | 229 | 164 |
2500amol-50000amol | 14 | 4.322 | 2500 | 50000 | 252 | 195 | 162 | 122 |
50000amol-5000amol | 21 | 3.322 | 5000 | 50000 | 314 | 271 | 179 | 137 |
12500amol-25000amol | 2 | 1.000 | 12500 | 25000 | 18 | 17 | 0 | 0 |
Volcano-plots offer more insight in how statistical test results vary in respect to p-values. In addition we can mark the different protein-groups (or species), see also vignette to the package wrGraph.
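The two thresholds typically drawn in a volcano plot can be applied without any plotting, as in this toy sketch. All values are invented, and we assume here that the fold-change threshold is given on linear scale (ie 2 corresponds to 1 on log2 scale).

```r
## Toy results for 6 proteins (hypothetical values)
log2FC <- c(3.1, -2.4, 0.3, 1.6, -0.2, 0.9)
pVal   <- c(1e-6, 2e-4, 0.6, 0.03, 0.8, 0.001)
fdr    <- p.adjust(pVal, method="BH")     # Benjamini-Hochberg correction

FCthrs  <- 2        # fold-change threshold, assumed to be on linear scale
FdrThrs <- 0.05
passes <- abs(log2FC) >= log2(FCthrs) & fdr <= FdrThrs
data.frame(log2FC, fdr=signif(fdr, 3), passes)
```

The last protein has a convincing FDR but a fold-change below the threshold, so it is not retained; such cases are easy to spot in a volcano plot.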
The PCA plots already told us graphically how strong the differences appear in the various (pairwise) comparisons. Counting the number of proteins passing a classical threshold for differential expression is a good way to start.
The dataset contains 9 different levels of UPS1 concentrations, in consequence 36 pair-wise comparisons are possible in the data-set from Ramus et al 2016. Plotting all these pair-wise comparisons would make way too crowded plots.
## the selected comparisons to check
cbind(no=useCompNo, name=colnames(testPD$t)[useCompNo])
#> no name
#> [1,] "2" "12500amol-25000amol"
#> [2,] "21" "50000amol-5000amol"
#> [3,] "27" "50000amol-500amol"
#> [4,] "14" "2500amol-50000amol"
#> [5,] "15" "250amol-50000amol"
## check presence and good version of package wrGraph
doVolc <- requireNamespace("wrGraph", quietly=TRUE)
if(doVolc) doVolc <- packageVersion("wrGraph") >= "1.0.6"
## ProteomeDiscoverer
layout(matrix(1:6, ncol=2))
if(doVolc) {
for(i in useCompNo) VolcanoPlotW(testPD, useComp=i, FCthrs=2, FdrThrs=0.05, annColor=c(4,2,3),silent=TRUE)}
## MaxQuant
layout(matrix(1:6, ncol=2))
if(doVolc) {
for(i in useCompNo) VolcanoPlotW(testMQ, useComp=i, FCthrs=2, FdrThrs=0.05, annColor=c(4,2,3),silent=TRUE)}
Typically a classical proteomics analysis would go from this step into further investigating proteins with significant abundances. However, the UPS1 setup is special since we know in advance which proteins should be differential. In the following section we’ll focus on these UPS1 proteins and we’ll take advantage of the fact that multiple concentrations thereof have been measured.
We know from the experimental setup that there were 48 UPS1 proteins present in the commercial mix. The lowest concentrations are extremely challenging and it is no surprise that many of them were not detected at the lowest concentrations. In order to choose among the various concentrations of UPS1, let’s look at how many NAs are in each group of replicates, and in particular, the number of NAs among the UPS1 proteins.
## The number of NAs, just the UPS1 proteins (in ProteomeDiscoverer):
sumNAperGroup(dataPD$raw[which(dataPD$annot[,"SpecType"]=="species2"),], grp9)
#> 50amol 125amol 250amol 500amol 2500amol 5000amol 12500amol 25000amol
#> 0 0 0 0 0 0 0 0
#> 50000amol
#> 0
sumNAperGroup(dataMQ$raw[which(dataMQ$annot[,"SpecType"]=="species2"),], grp9)
#> 50amol 125amol 250amol 500amol 2500amol 5000amol 12500amol 25000amol
#> 0 0 0 0 0 0 0 0
#> 50000amol
#> 0
As general indicator for data-quality and -usability let’s look at the intra-replicate variability. Here we plot all intra-group (ie UPS1-concentration) CVs.
In the figure below the complete series (including yeast) is shown on the left side, the human UPS1 proteins only on the right side. Briefly, vioplots show a kernel-estimate for the distribution, in addition, a box-plot is also integrated (see vignette to package wrGraph).
## combined plot : all data (left), Ups1 (right)
layout(1:2)
sumNAinPD <- vector("list", 18) # empty list with 18 elements
sumNAinPD[2*(1:length(unique(grp9))) -1] <- as.list(as.data.frame(log2(rowGrpCV(testPD$datImp, grp9))))
sumNAinPD[2*(1:length(unique(grp9))) ] <- as.list(as.data.frame(log2(rowGrpCV(testPD$datImp[which(testPD$annot[,"SpecType"]=="UPS1"),], grp9))))
names(sumNAinPD)[2*(1:length(unique(grp9))) -1] <- sub("amol","",unique(grp9))
names(sumNAinPD)[2*(1:length(unique(grp9))) ] <- paste(sub("amol","",unique(grp9)),"Ups",sep=".")
vioplotW(sumNAinPD, halfViolin="pairwise", tit="CV Intra Replicate, ProteomeDiscoverer", cexNameSer=0.6)
mtext("left part : all data\nright part: UPS1",adj=0,cex=0.8)
sumNAinMQ <- vector("list", 18) # empty list with 18 elements
sumNAinMQ[2*(1:length(unique(grp9))) -1] <- as.list(as.data.frame(log2(rowGrpCV(testMQ$datImp, grp9))))
sumNAinMQ[2*(1:length(unique(grp9))) ] <- as.list(as.data.frame(log2(rowGrpCV(testMQ$datImp[which(testMQ$annot[,"SpecType"]=="UPS1"),], grp9))))
names(sumNAinMQ)[2*(1:length(unique(grp9))) -1] <- sub("amol","",unique(grp9)) # paste(unique(grp9),"all",sep=".")
names(sumNAinMQ)[2*(1:length(unique(grp9))) ] <- paste(sub("amol","",unique(grp9)),"Ups",sep=".") #paste(unique(grp9),"Ups1",sep=".")
vioplotW(sumNAinMQ, halfViolin="pairwise", tit="CV intra replicate, MaxQuant",cexNameSer=0.6)
mtext("left part : all data\nright part: UPS1",adj=0,cex=0.8)
## decent compromise based on CV : focus on 250 amol vs 50000 amol
## ... or for low no of NAs: 2500 amol vs 50000 amol
## for linear modeling rather 500 amol vs 50000 amol (last of 'high' NA counts)
## Ramus: 500 vs 50000 (PD 28/0 NA, MQ 97/1 NA)
## 5000 vs 50000 (PD 0/0 NA, MQ 8/1 NA)
## 12500 vs 25000
One can see that lower concentrations of UPS1 usually have worse CV (coefficient of variation) in the respective samples, this phenomenon also correlates with the content of NAs in the original data.
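For readers less familiar with the CV, the following base R sketch computes it per protein within each group of replicates (standard deviation divided by mean). It is not the code of rowGrpCV() from wrMisc, and the toy values are invented.

```r
## Toy data: 2 proteins x 6 samples in 2 groups of 3 replicates (hypothetical values)
toy <- matrix(c(10, 12, 11, 20, 20, 20,
                 5,  5,  5,  8, 10, 12), nrow=2, byrow=TRUE,
              dimnames=list(c("protA", "protB"), NULL))
toyGrp <- rep(c("low", "high"), each=3)

## CV = standard deviation / mean, computed per row within each group
grpCV <- sapply(unique(toyGrp), function(g) {
  dat <- toy[, toyGrp == g, drop=FALSE]
  apply(dat, 1, sd) / rowMeans(dat)
})
round(grpCV, 3)
```

A CV of 0 means perfectly identical replicates; the larger the CV, the less reproducible the measurements within this group.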
## the quantified UPS1 names
table(dataPD$annot[,"SpecType"]) # 46
#>
#> UPS1 Yeast conta
#> 46 866 1
table(dataMQ$annot[,"SpecType"]) # 48
#>
#> UPS1 Yeast conta
#> 48 1040 9
## extract names of quantified UPS1-proteins
NamesUpsPD <- dataPD$annot[which(dataPD$annot[,"SpecType"]=="UPS1"),"Accession"]
NamesUpsMQ <- dataMQ$annot[which(dataMQ$annot[,"SpecType"]=="UPS1"),"Accession"]
Run linear models, extract slope & pval, plot per UPS1 protein :
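The idea of testing several starting concentrations and keeping the best regression can be sketched with lm() on simulated data. This is only an illustration under our own assumptions : the simulated noise floor, the helper `fitFrom()` and the choice of the p-value as selection criterion are hypothetical and not taken from linModelSelect().

```r
## Toy data: expected concentrations (9 levels x 3 replicates) and a log2 response
conc   <- rep(c(50, 125, 250, 500, 2500, 5000, 12500, 25000, 50000), each=3)
levIdx <- rep(1:9, each=3)
set.seed(7)
## below level 3 the signal is flat (noise floor), from there on it follows the concentration
obs <- pmax(log2(conc), log2(250)) + rnorm(length(conc), sd=0.1)

## regression using only the levels from 'startLev' upwards (hypothetical helper)
fitFrom <- function(startLev) {
  use <- levIdx >= startLev
  fit <- lm(obs[use] ~ log2(conc[use]))
  co <- summary(fit)$coefficients
  c(startLev=startLev, slope=co[2, 1], pVal=co[2, 4])
}
res <- t(sapply(1:5, fitFrom))
res
res[which.min(res[, "pVal"]), ]     # starting level with the best regression
```

Row 2 of the coefficient matrix holds the slope, with the estimate in column 1 and the p-value in column 4, which is the same indexing (`x$coef[2,1]` and `x$coef[2,4]`) used in the summaries of this section.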
## ProteomeDiscoverer
lmPD <- vector("list", length(NamesUpsPD)) # empty list, one element per UPS1 protein
layout(matrix(1:12, ncol=2))
lmPD[1:12] <- lapply(NamesUpsPD[1:12], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
lmPD[13:24] <- lapply(NamesUpsPD[13:24], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
lmPD[25:36] <- lapply(NamesUpsPD[25:36], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
lmPD[37:46] <- lapply(NamesUpsPD[37:46], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
names(lmPD) <- NamesUpsPD
## We make a little summary of regression-results (ProteomeDiscoverer)
lmPDsum <- cbind(pVal=sapply(lmPD,function(x) x$coef[2,4]),logp=NA,slope=sapply(lmPD,function(x) x$coef[2,1]), startFr=sapply(lmPD,function(x) x$startLev), medRawAbund=apply(log2(dataPD$raw[NamesUpsPD,]),1,median,na.rm=TRUE),good=0)
lmPDsum[,"logp"] <- log10(lmPDsum[,"pVal"])
lmPDsum[which(lmPDsum[,"logp"] < -12 & lmPDsum[,"slope"] >0.75),"good"] <- 1
lmPDsum[which(lmPDsum[,"logp"] < -10 & lmPDsum[,"slope"] >0.7),"good"] <- lmPDsum[which(lmPDsum[,"logp"] < -10 & lmPDsum[,"slope"] >0.7),"good"]+ 1
## now we can check the number of high-confidence quantifications (0 means bad linear model)
table(lmPDsum[,"good"]) # 24 good quantifications
#>
#> 0 1 2
#> 20 2 24
## at which concentration of UPS1 did one get the best regression results ?
table(lmPDsum[,"startFr"]) # most starting at 2
#>
#> 1 2 3 4 5
#> 8 17 6 6 9
## a brief summary/overview of regression-results
summary(lmPDsum)
#> pVal logp slope startFr
#> Min. :0.0000000 Min. :-22.5469 Min. :-0.08763 Min. :1.000
#> 1st Qu.:0.0000000 1st Qu.:-17.3973 1st Qu.: 0.05714 1st Qu.:2.000
#> Median :0.0000000 Median :-13.1783 Median : 0.86133 Median :2.000
#> Mean :0.0213440 Mean :-11.4901 Mean : 0.64745 Mean :2.804
#> 3rd Qu.:0.0005831 3rd Qu.: -3.2699 3rd Qu.: 1.08937 3rd Qu.:4.000
#> Max. :0.1993724 Max. : -0.7003 Max. : 1.29960 Max. :5.000
#> medRawAbund good
#> Min. :16.46 Min. :0.000
#> 1st Qu.:18.40 1st Qu.:0.000
#> Median :19.07 Median :2.000
#> Mean :19.09 Mean :1.087
#> 3rd Qu.:19.53 3rd Qu.:2.000
#> Max. :25.16 Max. :2.000
## Now for MaxQuant
lmMQ <- vector("list", length(NamesUpsMQ)) # empty list, one element per UPS1 protein
layout(matrix(1:12, ncol=2))
lmMQ[1:12] <- lapply(NamesUpsMQ[1:12], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
lmMQ[13:24] <- lapply(NamesUpsMQ[13:24], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
lmMQ[25:36] <- lapply(NamesUpsMQ[25:36], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
lmMQ[37:48] <- lapply(NamesUpsMQ[37:48], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
## We make a little summary of regression-results (MaxQuant)
## Regressions with bad slope and/or p-value will be marked as 0
lmMQsum <- cbind(pVal=sapply(lmMQ,function(x) x$coef[2,4]),logp=NA,slope=sapply(lmMQ,function(x) x$coef[2,1]), startFr=sapply(lmMQ,function(x) x$startLev), medRawAbund=apply(log2(dataMQ$raw[NamesUpsMQ,]),1,median,na.rm=TRUE),good=0)
lmMQsum[,"logp"] <- log10(lmMQsum[,"pVal"])
lmMQsum[which(lmMQsum[,"logp"] < -12 & lmMQsum[,"slope"] >0.75),"good"] <- 1
lmMQsum[which(lmMQsum[,"logp"] < -10 & lmMQsum[,"slope"] >0.7),"good"] <- lmMQsum[which(lmMQsum[,"logp"] < -10 & lmMQsum[,"slope"] >0.7),"good"]+ 1
## now we can check the number of high-confidence quantifications (0 means bad linear model)
table(lmMQsum[,"good"]) # 26 good quantifications
#>
#> 0 1 2
#> 15 7 26
## at which concentration of UPS1 did one get the best regression results ?
table(lmMQsum[,"startFr"]) # most starting at 5 !
#>
#> 1 2 3 4 5
#> 3 6 1 7 31
## a brief summary/overview of regression-results
summary(lmMQsum)
#> pVal logp slope startFr
#> Min. :0.0000000 Min. :-23.273 Min. :0.1405 Min. :1.000
#> 1st Qu.:0.0000000 1st Qu.:-13.838 1st Qu.:0.9539 1st Qu.:4.000
#> Median :0.0000000 Median :-12.713 Median :1.2154 Median :5.000
#> Mean :0.0003915 Mean :-12.035 Mean :1.1472 Mean :4.188
#> 3rd Qu.:0.0000000 3rd Qu.: -9.779 3rd Qu.:1.4008 3rd Qu.:5.000
#> Max. :0.0187899 Max. : -1.726 Max. :1.6093 Max. :5.000
#> medRawAbund good
#> Min. :19.94 Min. :0.000
#> 1st Qu.:21.92 1st Qu.:0.000
#> Median :22.96 Median :2.000
#> Mean :22.73 Mean :1.229
#> 3rd Qu.:23.49 3rd Qu.:2.000
#> Max. :25.10 Max. :2.000
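The two-tier scoring behind the column 'good' may be easier to follow on a tiny example. The following is only a minimal sketch using made-up values (the object `toySum` is hypothetical and not part of the dataset); it applies the same cutoffs as above (log10 p-value below -12 or -10, slope above 0.75 or 0.7).

```r
## Toy regression summary (values made up for illustration only)
toySum <- cbind(logp  = c(-15, -11, -11, -5, -13),
                slope = c(0.9, 0.72, 0.5, 1.1, 0.72),
                good  = 0)
## tier 1 (strict) : log10 p-value below -12 and slope above 0.75
strict  <- toySum[,"logp"] < -12 & toySum[,"slope"] > 0.75
## tier 2 (relaxed) : log10 p-value below -10 and slope above 0.7
relaxed <- toySum[,"logp"] < -10 & toySum[,"slope"] > 0.7
## each tier passed adds one point, so the score is 0, 1 or 2
toySum[,"good"] <- strict + relaxed
toySum
```

Since every regression passing the strict tier also passes the relaxed one, a score of 2 means 'passes both', 1 means 'passes only the relaxed criteria' and 0 means 'bad linear model'.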
Next, we can compare the regression models of both quantitation software packages on a global basis :
## summary graphics on all indiv protein regressions for ProteomeDiscoverer
layout(matrix(c(1:3,3), ncol=2, byrow=TRUE))
hist(log10(sapply(lmPD,function(x) x$coef[2,4])), br=15,las=1, main="PD: hist of regr p-values",xlab="log10 p-values") # good p < 1e-12
hist( sapply(lmPD,function(x) x$coef[2,1]), br=15,las=1, main="PD: hist of regr slopes",xlab="slope") # good
tit <- "ProteomeDiscoverer, UPS1 regressions : p-value vs slope"
useCol <- colorAccording2(lmPDsum[,"medRawAbund"], gradTy="rainbow", revCol=TRUE, nEndOmit=14)
plot(lmPDsum[,c(2,3)], main=tit, type="n") #col=1, bg.col=useCol, pch=20+lmPDsum[,"startFr"],
points(lmPDsum[,c(2,3)], col=1, bg=useCol, pch=20+lmPDsum[,"startFr"])
legend("topright",paste("best starting from ",1:5), text.col=1, pch=21:25, col=1, pt.bg="white", cex=0.9, xjust=0.5, yjust=0.5)
mtext("fill color according to median (raw) abundance (violet/blue/low -> green -> red/high)",cex=0.9)
abline(v=c(-12,-10),lty=2,col="grey") ; abline(h=c(0.7,0.75),lty=2,col="grey")
hi1 <- hist(lmPDsum[,"medRawAbund"], plot=FALSE)
legendHist(sort(lmPDsum[,5]), colRamp=useCol[order(lmPDsum[,"medRawAbund"])][cumsum(hi1$counts)], location="bottomleft", legTit="median raw abundance") #
ProteomeDiscoverer shows a bimodal distribution in the histogram of slopes and, less clearly, in that of the p-values : approximately 50% of the proteins were quantified well, the others very poorly.
## now for MaxQuant
layout(matrix(c(1:3,3), ncol=2, byrow=TRUE))
hist(log10(sapply(lmMQ,function(x) x$coef[2,4])), br=15,las=1, main="MQ: hist of regr p-values",xlab="log10 p-values") # good p < 1e-12
hist( sapply(lmMQ,function(x) x$coef[2,1]), br=15,las=1, main="MQ: hist of regr slopes",xlab="slope") # good
tit <- "MaxQuant, UPS1 regressions : p-value vs slope"
useCol <- colorAccording2(lmMQsum[,"medRawAbund"], gradTy="rainbow", revCol=TRUE, nEndOmit=14)
plot(lmMQsum[,c(2,3)], main=tit, type="n") #col=1, bg.col=useCol, pch=20+lmMQsum[,"startFr"],
points(lmMQsum[,c(2,3)], col=1, bg=useCol, pch=20+lmMQsum[,"startFr"])
legend("topright",paste("best starting from ",1:5), text.col=1, pch=21:25, col=1, pt.bg="white", cex=0.9, xjust=0.5, yjust=0.5)
mtext("fill color according to median (raw) abundance (red/high -> blue/low)",cex=0.9)
abline(v=c(-12,-10),lty=2,col="grey") ; abline(h=c(0.7,0.75),lty=2,col="grey")
hi1 <- hist(lmMQsum[,"medRawAbund"], plot=FALSE)
legendHist(sort(lmMQsum[,5]), colRamp=useCol[order(lmMQsum[,"medRawAbund"])][cumsum(hi1$counts)], location="bottomleft", legTit="median raw abundance")
MaxQuant : No bimodal distributions for p-values or slopes; the regressions appear rather uniform, with high slopes that rely on only the last few concentrations !
ROC curves display Sensitivity (True Positive Rate) versus 1-Specificity (False Positive Rate). They are typically used to illustrate and compare the discriminative capacity of a yes/no decision system, see eg also ROC on Wikipedia or the original publication Hand and Till 2001.
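To illustrate how the points of such a curve arise, here is a minimal sketch in base R on simulated p-values. All objects (`isPos`, `pVal`, `thr`) are made up for this illustration and do not come from the UPS1 data; wrProteo is not used here.

```r
## Simulated example : 20 'true positive' and 180 'true negative' proteins (made-up data)
set.seed(2021)
isPos <- rep(c(TRUE, FALSE), c(20, 180))
pVal  <- c(rbeta(20, 0.3, 5), runif(180))       # positives tend to have small p-values
thr   <- seq(0, 1, by=0.01)                     # thresholds to scan
## sensitivity (TPR) : fraction of true positives called at each threshold
sens <- sapply(thr, function(a) mean(pVal[isPos] <= a))
## 1 - specificity (FPR) : fraction of true negatives called at each threshold
fpr  <- sapply(thr, function(a) mean(pVal[!isPos] <= a))
plot(fpr, sens, type="s", las=1, xlab="1 - Specificity", ylab="Sensitivity", main="toy ROC curve")
abline(0, 1, lty=2, col="grey")                 # expected for random guessing
```

Each threshold contributes one point; the further the curve runs towards the upper left corner, the better positives and negatives are separated.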
In this case ROC curves are used to judge how well the heterologous human UPS1 proteins are recognized as differentially abundant, while the constant yeast matrix proteins should not get classified as differential. Finally, ROC curves also show whether the commonly used 5-percent FDR threshold gets the best out of the testing system.
The Ramus et al 2016 dataset contains 9 different levels of UPS1 concentration; in consequence 36 pair-wise comparisons are possible. Plotting all of them would make the plots far too crowded.
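The number of 36 comparisons can be checked quickly in base R; this short sketch only enumerates index pairs for 9 levels (the names `nLev` and `pairs9` are chosen here for illustration).

```r
## number of pair-wise comparisons among 9 concentration levels
nLev <- 9
choose(nLev, 2)
## enumerating the pairs by level index (one column per comparison)
pairs9 <- combn(nLev, 2)
dim(pairs9)
```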
Thus, the graphical comparisons were restricted to three comparisons presented in the original publication by Ramus et al 2016 plus two additional ones. The distribution of intra-group CV-values showed (without major surprise) that the highest UPS1 concentrations replicated best. In consequence, comparisons using this group are expected to have a decent chance to reveal a high number of UPS1 proteins rather specifically.
Initially, a ROC curve can be calculated for each pair-wise comparison where it is known which proteins should be found differential (ie the human UPS1 proteins).
## single comparison data for ROC
rocPD.2 <- summarizeForROC(testPD, annotCol="SpecType", spec=c("Yeast","UPS1"), columnTest=2, tyThr="BH", overl=FALSE, color=5) # 12500amol-25000amol
tail(signif(rocPD.2,3))
#> alph spec sens prec accur FDR n.pos.Yeast n.pos.UPS1
#> [137,] 0.93 0.2030 0.773 0.0245 0.2170 0.976 677 17
#> [138,] 0.95 0.1850 0.818 0.0254 0.2010 0.975 692 18
#> [139,] 0.96 0.1720 0.818 0.0250 0.1880 0.975 703 18
#> [140,] 0.97 0.1630 0.818 0.0247 0.1790 0.975 711 18
#> [141,] 0.98 0.0813 0.955 0.0262 0.1030 0.974 780 21
#> [142,] 1.00 0.0000 1.000 0.0253 0.0253 0.975 849 22
However, since we’re treating a larger data-set, this can be done in batch. Now we are ready to extract the counts for all selected comparisons for constructing the ROC curves.
layout(1)
rocPD <- lapply(table2[,1],function(x) summarizeForROC(testPD, annotCol="SpecType", spec=c("Yeast","UPS1"), columnTest=x, tyThr="BH", plotROC=FALSE))
rocMQ <- lapply(table2[,1],function(x) summarizeForROC(testMQ, annotCol="SpecType", spec=c("Yeast","UPS1"), columnTest=x, tyThr="BH", plotROC=FALSE))
#> -> summarizeForROC : PROBLEM :
#> *** None of the elements annotated as 'positive' species to search for has any valid testing results ! Unable to construct TP ! ***
names(rocPD) <- colnames(testPD$BH)[useCompNo]
names(rocMQ) <- colnames(testMQ$BH)[useCompNo]
And we can plot the ROC curves for ProteomeDiscoverer :
layout(1)
colPanel <- 2:6 #c(grey(0.4),2:5)
methNa <- paste(table2[,1],", ie",table2[,3],"-",table2[,4])
methNa <- paste0(rep(c("PD","MQ"), each=length(useCompNo)), methNa)
plotROC(rocPD[[1]],rocPD[[2]],rocPD[[3]],rocPD[[4]],rocPD[[5]], col=colPanel, methNames=methNa[1:5], pointSi=0.8, tit="ProteomeDiscoverer at 5 ratios",legCex=1)
One can see from the figure that the classical threshold of FDR=0.05 does not cut at the optimal point in this case; lower threshold values would provide a (slightly) better compromise between specificity and sensitivity.
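One possible way to make 'best compromise' concrete is Youden's J (sensitivity + specificity - 1). This criterion is not used by the vignette itself; it is introduced here as an assumption for illustration. The sketch runs on a made-up table (`toyRoc`) that only mimics the columns 'alph', 'spec' and 'sens' of the `summarizeForROC()` output shown earlier.

```r
## Toy table mimicking the columns 'alph', 'spec' and 'sens' (made-up values)
toyRoc <- cbind(alph = c(0.001, 0.01, 0.05, 0.1, 0.5),
                spec = c(0.999, 0.995, 0.960, 0.900, 0.500),
                sens = c(0.60,  0.85,  0.87,  0.90,  0.97))
## Youden's J = sensitivity + specificity - 1 ; larger means a better compromise
youden <- toyRoc[,"sens"] + toyRoc[,"spec"] - 1
best <- which.max(youden)
toyRoc[best, "alph"]      # threshold with the best compromise in this toy table
```

In this toy table the best compromise lies at a threshold below 0.05, which is the kind of situation described above.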
We can see that the comparison 12500 amol vs 25000 amol performed worse than the other ones. Although the proteins were well detected at these high UPS1 concentrations, the statistical test had more problems calling just the UPS1 proteins ‘differential’, since the (theoretical) ratio is only 2. In the other comparisons the (theoretical) ratio was much higher.
Let’s move on with the ROC curves for MaxQuant :
plotROC(rocMQ[[1]],rocMQ[[2]],rocMQ[[3]],rocMQ[[4]],rocMQ[[5]], col=colPanel, methNames=methNa[6:10], pointSi=0.8, xlim=c(0,0.27),txtLoc=c(0.09,0.3,0.03), tit="MaxQuant selected ratios",legCex=1)
Please note that only 4 curves are shown instead of 5 : the comparison of 12500 vs 25000 did not give even a single UPS1 protein as ‘significant’. Thus, the count of true positives (TP) never left 0; in consequence, specificity and sensitivity can’t be calculated.
And the ROC curves for both ProteomeDiscoverer and MaxQuant :
colPan10 <- rainbow(13)[c(-3,-5,-13)]
plotROC(rocPD[[1]],rocPD[[2]],rocPD[[3]],rocPD[[4]],rocPD[[5]], rocMQ[[1]],rocMQ[[2]],rocMQ[[3]],rocMQ[[4]],rocMQ[[5]], col=colPan10, methNames=methNa, pointSi=0.8, tit="PD and MQ at selected ratios",legCex=1)
More quantitation methods will get integrated shortly …
The author wants to acknowledge the support by the IGBMC (CNRS UMR 7104, Inserm U 1258), CNRS, Université de Strasbourg and Inserm, and of course his colleagues from the IGBMC proteomics platform. Furthermore, many very fruitful discussions with colleagues on national and international level have helped to formulate ideas and to improve and disseminate the tools presented here.
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=C LC_CTYPE=French_France.1252
#> [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C
#> [5] LC_TIME=French_France.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] wrGraph_1.0.6 rmarkdown_2.4 knitr_1.30 wrProteo_1.2.0 wrMisc_1.4.0
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.25 R.methodsS3_1.8.1 magrittr_1.5 evaluate_0.14
#> [5] highr_0.8 sm_2.2-5.6 rlang_0.4.8 stringi_1.5.3
#> [9] fdrtool_1.2.15 limma_3.44.3 R.oo_1.24.0 R.utils_2.10.1
#> [13] RColorBrewer_1.1-2 tools_4.0.3 stringr_1.4.0 xfun_0.18
#> [17] yaml_2.2.1 compiler_4.0.3 tcltk_4.0.3 htmltools_0.5.0