Analyzing UPS1 Spike-in Experiments (Example Ramus 2016 Dataset)

Wolfgang Raffelsberger

2020-10-18

Introduction

This vignette shows how UPS1 spike-in experiments may be analyzed using the packages wrProteo, wrMisc and wrGraph, all of which are available on CRAN.

Furthermore, the Bioconductor package limma will be used internally for its robust statistical testing.

# If not already installed, you'll have to install this package and wrMisc first.
install.packages("wrMisc")
install.packages("wrProteo")

# The package wrGraph is recommended for better graphics
install.packages("wrGraph")

# You can start the vignettes for this package by typing :
browseVignettes("wrProteo")    #  ... and then select the html output

Now let’s load the packages needed :

library(wrMisc)
library(wrProteo)
library(wrGraph)

# Version number for wrProteo :
packageVersion("wrProteo")
#> [1] '1.2.0'

Benchmark Tests Experimental Setup

The main aim of the experimental setup in UPS1 spike-in experiments is to provide a framework to test identification and quantitation procedures in proteomics. By mixing known amounts of a collection of human proteins (UPS1) in various concentrations into a yeast protein extract, one expects to find only human proteins varying between samples. In terms of ROC curves the human proteins are expected to show up as true positives (TP). In contrast, all yeast proteins were always added in the same quantity and should thus be observed constant, ie as true negatives (TN).
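The expected classification can be sketched with a small toy example. The vectors `species` and `signif` below are invented for illustration and are not taken from the Ramus data; the sketch only shows how TP, FP, FN and TN would be counted once test results are available.

```r
## Toy example (hypothetical values): classify test outcomes in a spike-in design
species <- c("UPS1","UPS1","UPS1","Yeast","Yeast","Yeast","Yeast","Yeast")
signif  <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)   # called differential ?

isSpike <- species == "UPS1"
confMat <- c(TP=sum(isSpike & signif),  FP=sum(!isSpike & signif),
             FN=sum(isSpike & !signif), TN=sum(!isSpike & !signif))
confMat

## sensitivity (TP rate) and specificity (TN rate), ie one point of a ROC curve
sens <- confMat["TP"] / (confMat["TP"] + confMat["FN"])
spec <- confMat["TN"] / (confMat["TN"] + confMat["FP"])
```

Repeating this count over a range of significance thresholds gives the points of a ROC curve.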

The Ramus Data-Set

The data were published with the article : Ramus et al 2016 Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset. J Proteomics 2016 Jan 30;132:51-62. PMID: 26585461 doi: 10.1016/j.jprot.2015.11.011

This dataset is available on PRIDE as PXD001819 (and/or on ProteomeXchange).

Briefly, this experiment aims to compare quantification of the heterologous spike-in UPS1 in yeast protein extracts serving as a constant matrix.

Additional Functions

## Two small functions we'll need later on

replSpecType <- function(x, annCol="SpecType", replBy=cbind(old=c("mainSpe","species2"), new=c("Yeast","UPS1"))) {
  ## rename $annot[,"SpecType"] to more specific names
  chCol <- annCol[1] %in% colnames(x$annot)
  if(chCol) { chCol <- which(colnames(x$annot)==annCol[1])
    chIt <- replBy[,1] %in% unique(x$annot[,chCol])    # check items to replace if present
    if(any(chIt)) for(i in which(chIt)) {useLi <- which(x$annot[,chCol] %in% replBy[i,1]); cat("useLi",head(useLi),"\n"); x$annot[useLi,chCol] <- replBy[i,2]}
  } else message(" replSpecType: 'annCol' not found in x$annot !")
  x }
  
replNAProtNames <- function(x,annCol=c("ProteinName","Accession","SpecType")) {
  ## replace in $annot missing ProteinNames by concatenating Accession + SpecType (ie 2nd & 3rd of annCol)
  chCol <- annCol %in% colnames(x$annot)
  if(all(chCol)) {
    chNA <- is.na(x$annot[,annCol[1]])
    if(any(chNA)) x$annot[which(chNA),annCol[1]] <- paste(x$annot[which(chNA),annCol[2]],x$annot[which(chNA),annCol[3]],sep="_")
  } else message(" replNAProtNames: not all of the column names 'annCol' found in x$annot !")
  x }

Protein Identification and Initial Quantification

MaxQuant

MaxQuant is free software provided by the Max Planck Institute, see Tyanova et al 2016. By default, MaxQuant exports quantitation data at the level of consensus-proteins to a folder called txt, in a file called proteinGroups.txt. So in a standard case one needs only to provide the path to this file.

path1 <- system.file("extdata", package="wrProteo")
fiNaMa <- "proteinGroups.txt.gz"
specPrefMQ <- c(conta="CON_|LYSC_CHICK", mainSpecies="OS=Saccharomyces cerevisiae", spike="HUMAN_UPS")

dataMQ <- readMaxQuantFile(path1, file=fiNaMa, specPref=specPrefMQ, refLi="mainSpe")
#>    by species : conta: 9  mainSpe: 1040  species2: 48
#>  -> readMaxQuantFile :  normalize using subset of 1040

The data were imported and median-normalized; the protein annotation was parsed to automatically extract IDs, protein-names and species information.
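The idea of median-normalizing to a reference subset (here the constant yeast proteins) can be illustrated on toy data. This is only a sketch of the principle using base R; it is not the implementation used inside readMaxQuantFile(), and the exact centering applied by the package may differ. The matrix `toy` and the index `refRows` are hypothetical.

```r
## Minimal sketch (toy data): median-normalize log2 data using a reference subset of rows
set.seed(1)
toy <- matrix(rnorm(40, mean=20), ncol=4, dimnames=list(paste0("p",1:10), paste0("s",1:4)))
toy[,2] <- toy[,2] + 0.5                  # simulate a global shift in sample 2
refRows <- 1:8                            # hypothetical: rows of the constant (yeast) proteins

## column medians of the reference proteins only
colMed <- apply(toy[refRows,], 2, median, na.rm=TRUE)
## shift each column so that the reference medians become identical
toyNorm <- sweep(toy, 2, colMed - mean(colMed), FUN="-")
apply(toyNorm[refRows,], 2, median)       # now the same value for all samples
```

Since only the reference rows determine the shift, spike-in proteins that truly change between samples do not distort the normalization.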

## a summary of the quantitation data
dim(dataMQ$quant)
#> [1] 1097   27
summary(dataMQ$quant[,1:8])       # the first 8 cols
#>   12500amol_R1    12500amol_R2    12500amol_R3     125amol_R1   
#>  Min.   :17.52   Min.   :15.85   Min.   :15.09   Min.   :15.22  
#>  1st Qu.:22.49   1st Qu.:22.39   1st Qu.:22.40   1st Qu.:22.33  
#>  Median :23.47   Median :23.47   Median :23.47   Median :23.43  
#>  Mean   :23.69   Mean   :23.63   Mean   :23.64   Mean   :23.58  
#>  3rd Qu.:24.86   3rd Qu.:24.81   3rd Qu.:24.80   3rd Qu.:24.86  
#>  Max.   :30.36   Max.   :30.32   Max.   :30.34   Max.   :30.21  
#>  NA's   :79      NA's   :77      NA's   :92      NA's   :118    
#>    125amol_R2      125amol_R3     25000amol_R1    25000amol_R2  
#>  Min.   :15.60   Min.   :14.99   Min.   :15.98   Min.   :15.43  
#>  1st Qu.:22.36   1st Qu.:22.34   1st Qu.:22.46   1st Qu.:22.49  
#>  Median :23.44   Median :23.45   Median :23.52   Median :23.54  
#>  Mean   :23.59   Mean   :23.58   Mean   :23.72   Mean   :23.72  
#>  3rd Qu.:24.88   3rd Qu.:24.87   3rd Qu.:24.97   3rd Qu.:24.96  
#>  Max.   :30.22   Max.   :30.25   Max.   :30.32   Max.   :30.20  
#>  NA's   :113     NA's   :122     NA's   :94      NA's   :98
colnames(dataMQ$annot)[1:12]
#>  [1] "Accession"                           
#>  [2] "ProteinName"                         
#>  [3] "Species"                             
#>  [4] "Contam"                              
#>  [5] "SpecType"                            
#>  [6] "Majority.protein.IDs"                
#>  [7] "Fasta.headers"                       
#>  [8] "Number.of.proteins"                  
#>  [9] "Potential.contaminant"               
#> [10] "Razor...unique.peptides"             
#> [11] "Razor...unique.peptides.12500amol_R1"
#> [12] "Razor...unique.peptides.12500amol_R2"
table(dataMQ$annot[,"Species"])
#> 
#>            Gallus gallus                HUMAN_UPS             Homo sapiens 
#>                        1                        2                       47 
#>             Mus musculus Saccharomyces cerevisiae 
#>                        1                     1040
table(dataMQ$annot[,"SpecType"])
#> 
#>    conta  mainSpe species2 
#>        9     1040       48

ProteomeDiscoverer

ProteomeDiscoverer is commercial software from ThermoFisher (www.thermofisher.com). In this case, the identification was performed using the XCalibur module of ProteomeDiscoverer. In ProteomeDiscoverer, quantitation data at the level of consensus-proteins should be exported to tabulated text files, which can be read by readPDExport(). The resultant data were exported as ‘Proteins’ in tabulated format (the option R-headers was checked; however, data can also be read when this option was not chosen).

path1 <- system.file("extdata", package="wrProteo")
fiNaPd <- "pxd001819_PD2.4_Proteins.txt.gz"
 file.exists(file.path(path1,fiNaPd))
#> [1] TRUE
## Note: data exported from ProteomeDiscoverer does not have proper column-names 
sampNa <- paste(rep(c(50,125,250,500,2500,5000,12500,25000,50000), each=3),"amol_R",rep(1:3,9),sep="") 
specPrefPD <- c(conta="Bos tauris|Gallus", mainSpecies="OS=Saccharomyces cerevisiae", spike="OS=Homo sapiens")

dataPD <- readPDExport(file=fiNaPd, path=path1, sampleNames=sampNa, refLi="mainSpe", specPref=specPrefPD)
#>  -> readPDExport :  Trouble ahead, expecting tabulated text file (this file might not be right format) !!
#>  -> readPDExport :  correcting 'annotCol' to export of R-friendly colnames
#>  -> readPDExport :  Note: 6 (out of 919) unrecognized species
#>  -> readPDExport :    by species : Gallus gallus: 1 ;  Homo sapiens: 46 ;  Saccharomyces cerevisiae: 866 ;
#>  -> readPDExport :  normalize using subset of 866

The data were imported and median-normalized; the protein annotation was parsed to automatically extract IDs, protein-names and species information.

## a summary of the quantitation data
summary(dataPD$quant[,1:8])        # the first 8 cols
#>    50amol_R1       50amol_R2       50amol_R3       125amol_R1   
#>  Min.   :13.72   Min.   :11.67   Min.   :11.28   Min.   :12.52  
#>  1st Qu.:18.73   1st Qu.:18.78   1st Qu.:18.80   1st Qu.:18.81  
#>  Median :19.94   Median :19.94   Median :19.94   Median :19.94  
#>  Mean   :20.00   Mean   :20.02   Mean   :20.06   Mean   :20.03  
#>  3rd Qu.:21.30   3rd Qu.:21.35   3rd Qu.:21.35   3rd Qu.:21.37  
#>  Max.   :26.41   Max.   :26.47   Max.   :26.50   Max.   :26.52  
#>  NA's   :32      NA's   :28      NA's   :28      NA's   :25     
#>    125amol_R2      125amol_R3      250amol_R1      250amol_R2   
#>  Min.   :10.66   Min.   :13.65   Min.   :10.26   Min.   :10.39  
#>  1st Qu.:18.83   1st Qu.:18.80   1st Qu.:18.85   1st Qu.:18.79  
#>  Median :19.94   Median :19.93   Median :19.94   Median :19.93  
#>  Mean   :20.08   Mean   :20.09   Mean   :20.07   Mean   :20.02  
#>  3rd Qu.:21.41   3rd Qu.:21.39   3rd Qu.:21.40   3rd Qu.:21.34  
#>  Max.   :26.52   Max.   :26.57   Max.   :26.62   Max.   :26.55  
#>  NA's   :31      NA's   :30      NA's   :27      NA's   :24
dim(dataPD$quant)
#> [1] 918  27
colnames(dataPD$annot)[]
#>  [1] "Accession"                 "ProteinName"              
#>  [3] "Species"                   "Contam"                   
#>  [5] "SpecType"                  "Description"              
#>  [7] "Contaminant"               "Number.of.Peptides"       
#>  [9] "Number.of.PSMs"            "Number.of.Unique.Peptides"
#> [11] "Number.of.AAs"             "Coverage.in.Percent"      
#> [13] "rowNo"
#head(dataPD$annot)
table(dataPD$annot[,"Species"])
#> 
#>            Gallus gallus             Homo sapiens Saccharomyces cerevisiae 
#>                        1                       46                      866
table(dataPD$annot[,"SpecType"])
#> 
#>    conta  mainSpe species2 
#>        1      866       46

Proline

Proline is free software provided by the Profi-consortium, see Ramus et al 2016 and Bouyssié et al 2020 (Proline: an efficient and user-friendly software suite for large-scale proteomics. Bioinformatics. 2020, PMID: 32096818, DOI: 10.1093/bioinformatics/btaa118).

In Proline quantitation data on level of consensus-proteins can be exported to csv or tabulated text files, which can be treated by this function.

Uniform Re-Arranging of Data

For easy and proper comparisons we need to make sure all columns are in the same order.

# get all results (MaxQuant,ProteomeDiscoverer, ...) in same order
sampNa <- paste0(rep(c(50,125,250,500,2500,5000,12500,25000,50000),each=3),"amol_R",rep(1:3,9))
grp9 <- paste0(rep(c(50,125,250,500,2500,5000,12500,25000,50000),each=3),"amol") 

## it is more convenient to re-order columns this way in each project
dataPD <- corColumnOrder(dataPD,sampNames=sampNa)          # already in good order
#>  -> corColumnOrder :  order already correct !
dataMQ <- corColumnOrder(dataMQ,sampNames=sampNa) 
#dataPL <- corColumnOrder(dataPL,sampNames=sampNa) 

From the protein annotation the membership to 3 groups was extracted : yeast (matrix) as ‘mainSpe’, UPS1 (spike) as ‘species2’ and other contaminants (‘conta’). The first two terms will be replaced by more specific ones (‘Yeast’ and ‘UPS1’) :

#> 
#>    conta  mainSpe species2 
#>        1      866       46
#> useLi 1 2 3 4 5 6 
#> useLi 122 801 802 804 809 814
#> useLi 57 58 59 60 61 62 
#> useLi 3 10 11 12 13 14
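The chunk producing this output is not shown; it is consistent with applying the two helper functions replSpecType() and replNAProtNames() defined earlier to dataPD and dataMQ. The effect of both steps can be sketched on a toy object with base R only. The object `toyDat` and its annotation values are invented for illustration.

```r
## Toy object mimicking the structure of dataPD / dataMQ (values are invented)
toyDat <- list(annot=cbind(Accession=c("P1","P2","P3","P4"),
                           ProteinName=c("Alpha", NA, "Gamma", NA),
                           SpecType=c("mainSpe","species2","conta","species2")))

## rename generic species types to more specific names
lookup <- c(mainSpe="Yeast", species2="UPS1")
hit <- toyDat$annot[,"SpecType"] %in% names(lookup)
toyDat$annot[hit,"SpecType"] <- lookup[toyDat$annot[hit,"SpecType"]]

## fill missing protein names by concatenating Accession and SpecType
isNA <- is.na(toyDat$annot[,"ProteinName"])
toyDat$annot[isNA,"ProteinName"] <- paste(toyDat$annot[isNA,"Accession"],
                                          toyDat$annot[isNA,"SpecType"], sep="_")
toyDat$annot
```

Note that the renaming has to be done before filling in the missing names, otherwise the generic labels would end up in the constructed protein names.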

Data Treatment

Normalization

No additional normalization is needed; all data were already median-normalized to the host proteins (ie Saccharomyces cerevisiae) when importing the initial quantification output using ‘readMaxQuantFile()’ and ‘readPDExport()’.

Presence of NA-values

As mentioned in the (general) vignette ‘wrProteoVignette1’ it is important to investigate the nature of NA-values, in particular the hypothesis that NA-values originate from very low abundance instances (eg none of its peptides identified during the MS1 run).

## Let's inspect NA values as graphic
matrixNAinspect(dataPD$quant, gr=grp9, tit="ProteomeDiscoverer")  # gl(9,3)

## Let's inspect NA values as graphic
matrixNAinspect(dataMQ$quant, gr=gl(9,3), tit="MaxQuant") 

## why only 24 columns => reprocess ?

NA-Imputation and Statistical Testing for Changes in Abundance

NA values represent a challenge for statistical testing. In the sections above we provided evidence that NA-values typically represent proteins with very low abundance that finally ended up as non-detectable (NA). The number of NAs varies between samples : Very low concentrations of UPS1 tend not to get detected and thus contribute largely to the NAs. Since the amount of yeast proteins stays constant, they should always get detected the same way in all samples.

## Let's look at the number of NAs. Is there an accumulation in the lower UPS1 samples ?
sumNAperGroup(dataPD$raw, grp9) 
#>    50amol   125amol   250amol   500amol  2500amol  5000amol 12500amol 25000amol 
#>        88        86        80        51        14         8         8        12 
#> 50000amol 
#>        10
sumNAperGroup(dataMQ$raw, grp9) 
#>    50amol   125amol   250amol   500amol  2500amol  5000amol 12500amol 25000amol 
#>       354       353       353       355       278       277       248       281 
#> 50000amol 
#>       284
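Counting NAs per group of replicates can also be done with a few lines of base R. The sketch below uses an invented matrix `toy`; whether sumNAperGroup() counts in exactly this manner is an assumption.

```r
## Minimal sketch (toy data): count NAs per group of replicates
toy <- matrix(c(NA,  1, 2,  3, 4,  5,
                NA, NA, 2,  3, 4, NA), nrow=2, byrow=TRUE,
              dimnames=list(c("protA","protB"),
                            paste0(rep(c("low","high"), each=3), "_R", 1:3)))
grp <- rep(c("low","high"), each=3)

naPerSample <- colSums(is.na(toy))                  # NAs in each column (sample)
naPerGroup <- tapply(naPerSample, factor(grp, levels=unique(grp)), sum)
naPerGroup
```

Using `factor(grp, levels=unique(grp))` keeps the groups in their original order instead of sorting them alphabetically.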

The function testRobustToNAimputation() from the package wrProteo performs NA-imputation and statistical testing (after repeated imputation) between all groups of samples at the same time (as it would be inefficient to separate these two tasks). The underlying tests apply shrinkage from the empirical Bayes procedure of the Bioconductor package limma. In addition, various formats of multiple test correction can be directly added to the results : Benjamini-Hochberg FDR, local false discovery rate (lfdr, using the package fdrtool, see Strimmer 2008 doi: 10.1093/bioinformatics/btn209), or modified testing by ROTS, etc. One of the advantages of this method is that multiple rounds of imputation are run, so that the final results (including pair-wise testing) get stabilized against (rare) stochastic effects without bias due to low variances.
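The principle of repeating the imputation and summarizing the test results can be illustrated with base R. This is a deliberately simplified sketch on simulated data: it uses plain t-tests instead of the moderated tests from limma, a fixed normal distribution for the imputed values, and the median over rounds as summary. None of these choices is claimed to match the internals of testRobustToNAimputation().

```r
## Simplified illustration (toy data, NOT the algorithm of testRobustToNAimputation())
set.seed(2020)
grp <- rep(c("A","B"), each=3)
toy <- matrix(rnorm(60, mean=20), ncol=6,
              dimnames=list(paste0("p",1:10), paste0(grp,"_R",1:3)))
toy[1:3, 4:6] <- toy[1:3, 4:6] + 3        # three truly changing proteins
toy[cbind(c(5,6,7), c(1,2,5))] <- NA      # a few missing values

nLoop <- 20                               # number of imputation rounds
pMat <- sapply(seq_len(nLoop), function(i) {
  imp <- toy
  ## assumption: NAs stem from low abundance, so draw from a low-valued normal distribution
  imp[is.na(toy)] <- rnorm(sum(is.na(toy)), mean=min(toy, na.rm=TRUE), sd=0.5)
  apply(imp, 1, function(x) t.test(x[grp=="A"], x[grp=="B"])$p.value) })

pStab <- apply(pMat, 1, median)           # summarize over the imputation rounds
BH <- p.adjust(pStab, method="BH")        # Benjamini-Hochberg correction
```

Proteins without any NA give the same p-value in every round, only those with imputed values vary between rounds.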

We are ready to launch the testing :

## Let's run pairwise-testing for ProteomeDiscoverer
testPD <- testRobustToNAimputation(dataPD$quant, gr=grp9, lfdrInclude=TRUE, annot=dataPD$annot)       # gl(9,3)
#> -> testRobustToNAimputation -> matrixNAneighbourImpute :  n.woNA= 24429  n.NA = 357
#>     model 10 %-tile of (min 1 NA/grp) 154 NA-neighbour values
#>     imputation: mean= 14   sd= 0.93
#> -> testRobustToNAimputation -> combineMultFilterNAimput :     at presenceFilt:   917 917 907 917 897 905 916 917 917 917 917 895 907 898 917 917 917 917 917 917 917 916 917 917 917 916 917 917 917 917 917 917 917 917 917 917   out of  918 
#> -> testRobustToNAimputation -> combineMultFilterNAimput :     at abundanceFilt:  877 876 873 877 875 870 903 904 902 903 913 890 898 890 909 911 910 909 909 910 911 910 910 908 909 907 908 910 913 912 908 909 910 910 912 910
#> -> testRobustToNAimputation -> combineMultFilterNAimput :    at NA> mean:   874, 871, 869, 874, 874, 869, 881, 882, 883, 894, 883, 884, 882, 885, 909, 883, 884, 884, 895, 909, 910, 881, 883, 882, 894, 907, 908, 909, 884, 885, 883, 895, 909, 909, 912 and 909
## Let's run pairwise-testing for MaxQuant
testMQ <- testRobustToNAimputation(dataMQ$quant, gr=grp9, lfdrInclude=TRUE, annot=dataMQ$annot) 
#> -> testRobustToNAimputation -> matrixNAneighbourImpute :  n.woNA= 26836  n.NA = 2783
#>     model 10 %-tile of (min 1 NA/grp) 802 NA-neighbour values
#>     imputation: mean= 20.2   sd= 0.76
#> -> testRobustToNAimputation -> combineMultFilterNAimput :     at presenceFilt:   1041 1038 1005 1038 1005 1007 1033 1042 1039 1038 1051 1013 1017 1016 1053 1037 1044 1042 1040 1039 1054 1030 1034 1033 1035 1033 1048 1043 1034 1040 1042 1038 1035 1055 1040 1035   out of  1097 
#> -> testRobustToNAimputation -> combineMultFilterNAimput :     at abundanceFilt:  1003 1004 982 995 982 979 1009 1014 1009 1006 1040 996 997 1000 1026 1025 1031 1025 1028 1018 1032 1020 1023 1020 1025 1022 1029 1025 1023 1029 1026 1024 1021 1031 1028 1022
#> -> testRobustToNAimputation -> combineMultFilterNAimput :    at NA> mean:   943, 938, 948, 934, 944, 949, 935, 946, 945, 945, 934, 939, 939, 939, 973, 937, 955, 957, 952, 982, 981, 932, 944, 949, 948, 981, 972, 990, 923, 936, 943, 937, 963, 965, 980 and 968

From these results we’ll use i) the NA-imputed version of our datasets for plotting principal components and ii) the (stabilized) testing results for counting TP, FP, etc.

To have the statistical testing results in our main object, we’ll copy the imputed data to our initial list :

## recuperate imputed data to main data-object
dataPD$datImp <- testPD$datImp
dataMQ$datImp <- testMQ$datImp

Similarity by PCA

Principal component analysis (PCA) cannot handle NA-values. Either all lines with any NAs have to be excluded, or data after NA-imputation have to be used. Here, we chose the second option. Plots will be made using the package wrGraph.
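For comparison, the first option (excluding all lines with any NA) can be sketched with prcomp() from base R. The matrix `toy` is simulated; in this document the plots are made with plotPCAw() on NA-imputed data instead.

```r
## Minimal sketch (toy data) using prcomp() from base R
set.seed(3)
toy <- matrix(rnorm(80, mean=20), ncol=8,
              dimnames=list(paste0("p",1:10), paste0("s",1:8)))
toy[2,3] <- NA
toy[7,5] <- NA

## option 1 : exclude all lines (proteins) containing any NA
toyCompl <- toy[rowSums(is.na(toy)) == 0, ]
dim(toyCompl)

## samples are the observations, thus transpose before running the PCA
pca <- prcomp(t(toyCompl), center=TRUE, scale.=FALSE)
round(summary(pca)$importance[2, 1:2], 2)     # proportion of variance of PC1 and PC2
```

With real data the first option may discard a considerable number of proteins, in particular low-abundance ones, which is one reason to prefer imputed data here.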

plotPCAw(testPD$datImp, sampleGrp=grp9, tit="PCA on ProteomeDiscoverer (NAs imputed)", rowTyName="proteins", useSymb2=0)

plotPCAw(testMQ$datImp ,sampleGrp=grp9, tit="PCA on MaxQuant (NAs imputed)", rowTyName="proteins", useSymb2=1:9)

Again, since the sample consists predominantly of yeast proteins that are kept constant, one would not expect many sample-related characteristics. In this case we might be rather interested in the (global) characteristics and similarity of the UPS1 proteins :

# limit to UPS1 
plotPCAw(testPD$datImp[which(testPD$annot[,"SpecType"]=="UPS1"),], sampleGrp=grp9, tit="PCA on ProteomeDiscoverer, UPS1 only (NAs imputed)",rowTyName="proteins", useSymb2=0)


plotPCAw(testMQ$datImp[which(testMQ$annot[,"SpecType"]=="UPS1"),], sampleGrp=grp9, tit="PCA on MaxQuant, UPS1 only (NAs imputed)",rowTyName="proteins", useSymb2=1:9)

PCA Only on UPS1 Proteins

Based on PCA one can see that comparisons with concentrations >= 250 amol may be better suited to actually detect differences, as also confirmed by the ROC part later.

Characteristics of Pairwise Comparisons

A very universal and simple way to analyze data is by checking on several pairwise comparisons, in particular if the experimental setup does not include complete multifactorial plans.

This UPS1 spike-in experiment has 27 samples organized (according to meta-information) as 9 groups. Thus one obtains in total 36 pairwise comparisons, which would make plots of all comparisons very crowded. The publication by Ramus et al focussed on 3 pairwise comparisons only. Here we’ll extend this to 5 pairwise comparisons.
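With 9 concentration levels the number of pairwise comparisons is choose(9,2), which can be checked quickly in base R:

```r
## number of pairwise comparisons among the 9 concentration levels
conc <- c(50,125,250,500,2500,5000,12500,25000,50000)
choose(length(conc), 2)
allPairs <- combn(paste0(conc,"amol"), 2)     # each column is one pair
ncol(allPairs)
```

Note that combn() enumerates the pairs in a different order than the column names of the testing results shown further below.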

Pairwise Testing Summary

Thus, the graphical comparisons were restricted to the three comparisons presented in the original publication plus two additional ones. The distribution of intra-group CV-values showed (without major surprise) that the highest UPS1 concentrations replicated best. In consequence, comparisons using this group are expected to have a decent chance to rather specifically reveal a high number of UPS1 proteins.

## The names of all the pair-wise comparisons possible
colnames(testPD$BH)
#>  [1] "12500amol-125amol"   "12500amol-25000amol" "125amol-25000amol"  
#>  [4] "12500amol-2500amol"  "125amol-2500amol"    "25000amol-2500amol" 
#>  [7] "12500amol-250amol"   "125amol-250amol"     "25000amol-250amol"  
#> [10] "2500amol-250amol"    "12500amol-50000amol" "125amol-50000amol"  
#> [13] "25000amol-50000amol" "2500amol-50000amol"  "250amol-50000amol"  
#> [16] "12500amol-5000amol"  "125amol-5000amol"    "25000amol-5000amol" 
#> [19] "2500amol-5000amol"   "250amol-5000amol"    "50000amol-5000amol" 
#> [22] "12500amol-500amol"   "125amol-500amol"     "25000amol-500amol"  
#> [25] "2500amol-500amol"    "250amol-500amol"     "50000amol-500amol"  
#> [28] "5000amol-500amol"    "12500amol-50amol"    "125amol-50amol"     
#> [31] "25000amol-50amol"    "2500amol-50amol"     "250amol-50amol"     
#> [34] "50000amol-50amol"    "5000amol-50amol"     "500amol-50amol"

Now, we’ll construct a table showing all possible pairwise comparisons. Using the function numPairDeColNames() we can easily extract the UPS1 concentrations as numeric content and show the (log-)ratio of the pairwise comparisons (column ‘log2rat’), the concentrations (in amol) and the number of differentially abundant proteins passing 5% FDR (using classical Benjamini-Hochberg FDR or lfdr, Strimmer 2008).
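How the numeric part and the log2-ratio follow from the comparison names can be sketched in base R. This only illustrates the arithmetic on three names taken from the list above; it is not the code of numPairDeColNames().

```r
## Minimal base-R sketch: derive concentrations and log2-ratio from comparison names
compNa <- c("12500amol-125amol", "250amol-50000amol", "12500amol-25000amol")
conc <- t(sapply(strsplit(gsub("amol", "", compNa), "-"), as.numeric))

## lower concentration first, then the absolute log2-ratio
tab <- cbind(conc1=pmin(conc[,1], conc[,2]), conc2=pmax(conc[,1], conc[,2]))
tab <- cbind(log2rat=round(log2(tab[,"conc2"] / tab[,"conc1"]), 3), tab)
rownames(tab) <- compNa
tab
```

For example, 12500 amol versus 125 amol is a 100-fold difference, ie log2(100) or about 6.644.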

## The number of differentially abundant proteins passing 5% FDR (ProteomeDiscoverer and MaxQuant) 
signCount <- cbind( sig.PD.BH=colSums(testPD$BH < 0.05, na.rm=TRUE), sig.PD.lfdr=if("lfdr" %in% names(testPD)) colSums(testPD$lfdr < 0.05, na.rm=TRUE),
  sig.MQ.BH=colSums(testMQ$BH < 0.05, na.rm=TRUE), sig.MQ.lfdr=if("lfdr" %in% names(testMQ)) colSums(testMQ$lfdr < 0.05, na.rm=TRUE) )

table1 <- numPairDeColNames(testPD$BH, stripTxt="amol", sortByAbsRatio=TRUE)
table1 <- cbind(table1, signCount[table1[,1],])
knitr::kable(table1, caption="All pairwise comparisons (extended from Ramus et al)", align="c")
All pairwise comparisons (extended from Ramus et al)

comparison            index  log2rat  conc1  conc2  sig.PD.BH  sig.PD.lfdr  sig.MQ.BH  sig.MQ.lfdr
50000amol-50amol         34    9.966     50  50000        448          401        372          307
25000amol-50amol         31    8.966     50  25000        464          404        357          305
125amol-50000amol        12    8.644    125  50000        296          234        206          158
12500amol-50amol         29    7.966     50  12500        405          353        289          241
125amol-25000amol         3    7.644    125  25000        291          258        127           94
250amol-50000amol        15    7.644    250  50000        312          255        261          199
12500amol-125amol         1    6.644    125  12500        186          145         83           64
25000amol-250amol         9    6.644    250  25000        252          204        117           94
50000amol-500amol        27    6.644    500  50000        311          242        229          164
5000amol-50amol          35    6.644     50   5000        514          479        362          299
12500amol-250amol         7    5.644    250  12500        131          120         40           24
25000amol-500amol        24    5.644    500  25000        306          264         77           62
2500amol-50amol          32    5.644     50   2500        472          429        320          261
125amol-5000amol         17    5.322    125   5000        258          202        100           66
12500amol-500amol        22    4.644    500  12500        170          121         56           41
125amol-2500amol          5    4.322    125   2500        163          140         77           64
2500amol-50000amol       14    4.322   2500  50000        252          195        162          122
250amol-5000amol         20    4.322    250   5000        182          122        100           81
25000amol-2500amol        6    3.322   2500  25000        138           87          6            4
2500amol-250amol         10    3.322    250   2500         95           67         34           26
50000amol-5000amol       21    3.322   5000  50000        314          271        179          137
5000amol-500amol         28    3.322    500   5000        159          128         69           53
500amol-50amol           36    3.322     50    500        387          331        257          233
12500amol-2500amol        4    2.322   2500  12500         16            9          1            0
25000amol-5000amol       18    2.322   5000  25000        254          187         27           21
2500amol-500amol         25    2.322    500   2500        110           85         41           25
250amol-50amol           33    2.322     50    250        326          281        210          173
12500amol-50000amol      11    2.000  12500  50000        174          115        101           66
125amol-500amol          23    2.000    125    500         19           17          4            3
12500amol-5000amol       16    1.322   5000  12500         49           71          4            3
125amol-50amol           30    1.322     50    125        266          244        164          108
12500amol-25000amol       2    1.000  12500  25000         18           17          0            0
125amol-250amol           8    1.000    125    250         11            6          2            1
25000amol-50000amol      13    1.000  25000  50000        124           86         69           47
2500amol-5000amol        19    1.000   2500   5000          8            4          2            1
250amol-500amol          26    1.000    250    500          6            2          3            0

You can see that in numerous cases many more than the 48 UPS1 proteins showed up as significant.

In the Ramus et al paper only 3 pairwise comparisons were further analyzed :

## In Ramus paper selection
colnames(testPD$BH)[c(2,21,27)]   
#> [1] "12500amol-25000amol" "50000amol-5000amol"  "50000amol-500amol"

Finally, let’s use a slightly extended selection of concentrations :

## extended selection
useCompNo <- c(2,21,27, 14,15)
colnames(testPD$BH)[useCompNo]
#> [1] "12500amol-25000amol" "50000amol-5000amol"  "50000amol-500amol"  
#> [4] "2500amol-50000amol"  "250amol-50000amol"

## Let's extract the concentration part to numeric
numNamePart <- numPairDeColNames(testPD$BH, selComp=useCompNo, stripTxt="amol", sortByAbsRatio=TRUE)
head(numNamePart)
#>      index log2rat conc1 conc2
#> [1,]    15   7.644   250 50000
#> [2,]    27   6.644   500 50000
#> [3,]    14   4.322  2500 50000
#> [4,]    21   3.322  5000 50000
#> [5,]     2   1.000 12500 25000

## table with concentrations in selected comparisons
table2 <- cbind(numNamePart, signCount[numNamePart[,1],])
knitr::kable(table2, caption="Selected pairwise comparisons (extended from Ramus et al)", align="c")
Selected pairwise comparisons (extended from Ramus et al)

comparison            index  log2rat  conc1  conc2  sig.PD.BH  sig.PD.lfdr  sig.MQ.BH  sig.MQ.lfdr
250amol-50000amol        15    7.644    250  50000        312          255        261          199
50000amol-500amol        27    6.644    500  50000        311          242        229          164
2500amol-50000amol       14    4.322   2500  50000        252          195        162          122
50000amol-5000amol       21    3.322   5000  50000        314          271        179          137
12500amol-25000amol       2    1.000  12500  25000         18           17          0            0

Pairwise Similarity : Volcano-Plots

Volcano-plots offer more insight in how statistical test results vary in respect to p-values. In addition we can mark the different protein-groups (or species), see also vignette to the package wrGraph.

The PCA plots already told us graphically how strong the differences appear in the various (pairwise) comparisons. Counting the number of proteins passing a classical threshold for differential expression is a good way to start.

The dataset contains 9 different levels of UPS1 concentrations; in consequence, 36 pair-wise comparisons are possible in the data-set from Ramus et al 2016. Plotting all these pair-wise comparisons would make the plots far too crowded.

## the selected comparisons to check
cbind(no=useCompNo, name=colnames(testPD$t)[useCompNo])
#>      no   name                 
#> [1,] "2"  "12500amol-25000amol"
#> [2,] "21" "50000amol-5000amol" 
#> [3,] "27" "50000amol-500amol"  
#> [4,] "14" "2500amol-50000amol" 
#> [5,] "15" "250amol-50000amol"
## check presence and good version of package wrGraph
doVolc <- requireNamespace("wrGraph", quietly=TRUE)
if(doVolc) doVolc <- packageVersion("wrGraph") >= "1.0.6"

## ProteomeDiscoverer
layout(matrix(1:6, ncol=2)) 
if(doVolc) {
  for(i in useCompNo) VolcanoPlotW(testPD, useComp=i, FCthrs=2, FdrThrs=0.05, annColor=c(4,2,3),silent=TRUE)}

## MaxQuant
layout(matrix(1:6, ncol=2))
if(doVolc) {
  for(i in useCompNo) VolcanoPlotW(testMQ, useComp=i, FCthrs=2, FdrThrs=0.05, annColor=c(4,2,3),silent=TRUE)}

Typically a classical proteomics analysis would go from this step into further investigating proteins with significant abundances. However, the UPS1 setup is special since we know in advance which proteins should be differential. In the following section we’ll focus on these UPS1 proteins and we’ll take advantage of the fact that multiple concentrations thereof have been measured.

UPS1: Characteristics of the Data for the Spike-In Proteins (after NA-Imputation)

We know from the experimental setup that there were 48 UPS1 proteins present in the commercial mix. The lowest concentrations are extremely challenging and it is no surprise that many of them were not detected at the lowest concentrations. In order to choose among the various concentrations of UPS1, let’s look at how many NAs are in each group of replicates, and in particular, at the number of NAs among the UPS1 proteins.

## The number of NAs, just the UPS1 proteins (in ProteomeDiscoverer):
sumNAperGroup(dataPD$raw[which(dataPD$annot[,"SpecType"]=="species2"),], grp9) 
#>    50amol   125amol   250amol   500amol  2500amol  5000amol 12500amol 25000amol 
#>         0         0         0         0         0         0         0         0 
#> 50000amol 
#>         0
sumNAperGroup(dataMQ$raw[which(dataMQ$annot[,"SpecType"]=="species2"),], grp9) 
#>    50amol   125amol   250amol   500amol  2500amol  5000amol 12500amol 25000amol 
#>         0         0         0         0         0         0         0         0 
#> 50000amol 
#>         0

As a general indicator for data-quality and -usability let’s look at the intra-replicate variability. Here we plot all intra-group (ie UPS1-concentration) CVs.

In the figure below the complete series (including yeast) is shown on the left side, the human UPS1 proteins only on the right side. Briefly, vioplots show a kernel-estimate for the distribution, in addition, a box-plot is also integrated (see vignette to package wrGraph).
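The quantity being plotted, a CV (sd divided by mean) per protein and per group of replicates, can be sketched in base R on invented data. The code below in this document uses rowGrpCV() from wrMisc for this purpose; the sketch only illustrates the definition.

```r
## Minimal sketch (toy data): row-wise CV within each group of replicates
toy <- matrix(c(10, 11, 12,  100, 100, 100,
                20, 20, 20,   50,  55,  60), nrow=2, byrow=TRUE,
              dimnames=list(c("protA","protB"),
                            paste0(rep(c("low","high"), each=3), "_R", 1:3)))
grp <- rep(c("low","high"), each=3)

grpCV <- sapply(unique(grp), function(g) {
  xGrp <- toy[, grp == g, drop=FALSE]        # columns of the current group
  apply(xGrp, 1, sd) / rowMeans(xGrp) })     # CV = sd / mean, per row
round(grpCV, 3)
```

The result has one row per protein and one column per group, which is the layout needed for showing one distribution per concentration level.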

## combined plot : all data (left), Ups1 (right)
layout(1:2)
sumNAinPD <- vector("list", length=18)
sumNAinPD[2*(1:length(unique(grp9))) -1] <- as.list(as.data.frame(log2(rowGrpCV(testPD$datImp, grp9))))
sumNAinPD[2*(1:length(unique(grp9))) ] <- as.list(as.data.frame(log2(rowGrpCV(testPD$datImp[which(testPD$annot[,"SpecType"]=="UPS1"),], grp9))))
names(sumNAinPD)[2*(1:length(unique(grp9))) -1] <-  sub("amol","",unique(grp9))
names(sumNAinPD)[2*(1:length(unique(grp9))) ] <- paste(sub("amol","",unique(grp9)),"Ups",sep=".")
vioplotW(sumNAinPD, halfViolin="pairwise", tit="CV Intra Replicate, ProteomeDiscoverer", cexNameSer=0.6) 
mtext("left part : all data\nright part: UPS1",adj=0,cex=0.8)

sumNAinMQ <- vector("list", length=18)
sumNAinMQ[2*(1:length(unique(grp9))) -1] <- as.list(as.data.frame(log2(rowGrpCV(testMQ$datImp, grp9))))
sumNAinMQ[2*(1:length(unique(grp9))) ] <- as.list(as.data.frame(log2(rowGrpCV(testMQ$datImp[which(testMQ$annot[,"SpecType"]=="UPS1"),], grp9))))
names(sumNAinMQ)[2*(1:length(unique(grp9))) -1] <- sub("amol","",unique(grp9))                        # paste(unique(grp9),"all",sep=".")
names(sumNAinMQ)[2*(1:length(unique(grp9))) ] <- paste(sub("amol","",unique(grp9)),"Ups",sep=".")      #paste(unique(grp9),"Ups1",sep=".")
vioplotW(sumNAinMQ, halfViolin="pairwise", tit="CV intra replicate, MaxQuant",cexNameSer=0.6) 
mtext("left part : all data\nright part: UPS1",adj=0,cex=0.8)


## decent compromise based on CV : focus on 250 amol  vs 50000 amol
##    ... or for low no of NAs:           2500 amol   vs 50000 amol

 ## for linear modeling rather  500 amol  vs 50000 amol (last of 'high' NA counts)
 
## Ramus:  500  vs  50000    (PD 28/0 NA, MQ 97/1 NA)
##        5000  vs  50000    (PD  0/0 NA, MQ  8/1 NA)
##       12500  vs  25000

One can see that lower concentrations of UPS1 usually have worse CV (coefficient of variation) in the respective samples; this phenomenon also correlates with the content of NAs in the original data.

Testing All Individual UPS1 Proteins By Linear Regression

## the quantified UPS1 names
table(dataPD$annot[,"SpecType"])              # 46
#> 
#>  UPS1 Yeast conta 
#>    46   866     1
table(dataMQ$annot[,"SpecType"])              # 48
#> 
#>  UPS1 Yeast conta 
#>    48  1040     9

## extract names of quantified UPS1-proteins
NamesUpsPD <- dataPD$annot[which(dataPD$annot[,"SpecType"]=="UPS1"),"Accession"]
NamesUpsMQ <- dataMQ$annot[which(dataMQ$annot[,"SpecType"]=="UPS1"),"Accession"]

Run linear models, extract slope & pval, plot per UPS1 protein :

## ProteomeDiscoverer
lmPD <- vector("list", length=length(NamesUpsPD))

layout(matrix(1:12, ncol=2))
lmPD[1:12] <- lapply(NamesUpsPD[1:12], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

lmPD[13:24] <- lapply(NamesUpsPD[13:24], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

lmPD[25:36] <- lapply(NamesUpsPD[25:36], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

lmPD[37:46] <- lapply(NamesUpsPD[37:46], linModelSelect, dat=dataPD, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)
names(lmPD) <- NamesUpsPD

## We make a little summary of regression-results (ProteomeDiscoverer)
lmPDsum <- cbind(pVal=sapply(lmPD,function(x) x$coef[2,4]),logp=NA,slope=sapply(lmPD,function(x) x$coef[2,1]), startFr=sapply(lmPD,function(x) x$startLev), medRawAbund=apply(log2(dataPD$raw[NamesUpsPD,]),1,median,na.rm=TRUE),good=0)

lmPDsum[,"logp"] <- log10(lmPDsum[,"pVal"])
lmPDsum[which(lmPDsum[,"logp"] < -12 & lmPDsum[,"slope"] >0.75),"good"] <- 1
lmPDsum[which(lmPDsum[,"logp"] < -10 & lmPDsum[,"slope"] >0.7),"good"] <- lmPDsum[which(lmPDsum[,"logp"] < -10 & lmPDsum[,"slope"] >0.7),"good"]+ 1

## now we can check the number of high-confidence quantifications (0 means bad linear model) 
table(lmPDsum[,"good"])           # 24 good quantifications
#> 
#>  0  1  2 
#> 20  2 24

## at which concentration of UPS1 did one get the best regression results ?
table(lmPDsum[,"startFr"])        # most starting at 2
#> 
#>  1  2  3  4  5 
#>  8 17  6  6  9

## a brief summary/overview of regression-results
summary(lmPDsum)
#>       pVal                logp              slope             startFr     
#>  Min.   :0.0000000   Min.   :-22.5469   Min.   :-0.08763   Min.   :1.000  
#>  1st Qu.:0.0000000   1st Qu.:-17.3973   1st Qu.: 0.05714   1st Qu.:2.000  
#>  Median :0.0000000   Median :-13.1783   Median : 0.86133   Median :2.000  
#>  Mean   :0.0213440   Mean   :-11.4901   Mean   : 0.64745   Mean   :2.804  
#>  3rd Qu.:0.0005831   3rd Qu.: -3.2699   3rd Qu.: 1.08937   3rd Qu.:4.000  
#>  Max.   :0.1993724   Max.   : -0.7003   Max.   : 1.29960   Max.   :5.000  
#>   medRawAbund         good      
#>  Min.   :16.46   Min.   :0.000  
#>  1st Qu.:18.40   1st Qu.:0.000  
#>  Median :19.07   Median :2.000  
#>  Mean   :19.09   Mean   :1.087  
#>  3rd Qu.:19.53   3rd Qu.:2.000  
#>  Max.   :25.16   Max.   :2.000
## Now for MaxQuant
lmMQ <- vector("list", length=length(NamesUpsMQ))     # pre-allocate an empty list, one slot per UPS1 protein

layout(matrix(1:12, ncol=2))
lmMQ[1:12] <- lapply(NamesUpsMQ[1:12], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

lmMQ[13:24] <- lapply(NamesUpsMQ[13:24], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

lmMQ[25:36] <- lapply(NamesUpsMQ[25:36], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

lmMQ[37:48] <- lapply(NamesUpsMQ[37:48], linModelSelect, dat=dataMQ, expect=grp9, startLev=1:5, cexXAxis=0.7, logExpect=TRUE, silent=TRUE)

names(lmMQ) <- NamesUpsMQ
## We make a little summary of regression-results (MaxQuant)
## Regressions with bad slope and/or p-value will be marked as 0
lmMQsum <- cbind(pVal=sapply(lmMQ,function(x) x$coef[2,4]),logp=NA,slope=sapply(lmMQ,function(x) x$coef[2,1]), startFr=sapply(lmMQ,function(x) x$startLev), medRawAbund=apply(log2(dataMQ$raw[NamesUpsMQ,]),1,median,na.rm=TRUE),good=0)
lmMQsum[,"logp"] <- log10(lmMQsum[,"pVal"])
lmMQsum[which(lmMQsum[,"logp"] < -12 & lmMQsum[,"slope"] >0.75),"good"] <- 1
lmMQsum[which(lmMQsum[,"logp"] < -10 & lmMQsum[,"slope"] >0.7),"good"] <- lmMQsum[which(lmMQsum[,"logp"] < -10 & lmMQsum[,"slope"] >0.7),"good"]+ 1

## now we can check the number of high-confidence quantifications (0 means bad linear model) 
table(lmMQsum[,"good"])           # 26 good quantifications
#> 
#>  0  1  2 
#> 15  7 26

## at which concentration of UPS1 did one get the best regression results ?
table(lmMQsum[,"startFr"])        # most starting at 5 !
#> 
#>  1  2  3  4  5 
#>  3  6  1  7 31

## a brief summary/overview of regression-results
summary(lmMQsum)
#>       pVal                logp             slope           startFr     
#>  Min.   :0.0000000   Min.   :-23.273   Min.   :0.1405   Min.   :1.000  
#>  1st Qu.:0.0000000   1st Qu.:-13.838   1st Qu.:0.9539   1st Qu.:4.000  
#>  Median :0.0000000   Median :-12.713   Median :1.2154   Median :5.000  
#>  Mean   :0.0003915   Mean   :-12.035   Mean   :1.1472   Mean   :4.188  
#>  3rd Qu.:0.0000000   3rd Qu.: -9.779   3rd Qu.:1.4008   3rd Qu.:5.000  
#>  Max.   :0.0187899   Max.   : -1.726   Max.   :1.6093   Max.   :5.000  
#>   medRawAbund         good      
#>  Min.   :19.94   Min.   :0.000  
#>  1st Qu.:21.92   1st Qu.:0.000  
#>  Median :22.96   Median :2.000  
#>  Mean   :22.73   Mean   :1.229  
#>  3rd Qu.:23.49   3rd Qu.:2.000  
#>  Max.   :25.10   Max.   :2.000
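
The scoring used for both summary tables above can be retraced on a single toy regression. The following is a minimal sketch in base R only: the simulated data and the object names (conc, abund, fit, score) are made up for illustration, and it assumes that the coef element returned by linModelSelect() has the same layout as summary(lm)$coefficients (row 2 = slope term, column 1 = estimate, column 4 = p-value), as the indexing x$coef[2,4] above suggests.

```r
## toy data (invented): 9 hypothetical concentration levels, 3 replicates each
set.seed(1)
conc  <- rep(2^(0:8), each=3)
abund <- log2(conc) + rnorm(length(conc), sd=0.2)   # log2 abundance with a slope close to 1

## coefficient table of the linear model: row 2 is the slope term
fit   <- summary(lm(abund ~ log2(conc)))$coefficients
slope <- fit[2, 1]                                  # slope estimate
logp  <- log10(fit[2, 4])                           # log10 of the p-value of the slope

## same two-tier scoring as above: one point per criterion passed (0 = bad, 2 = good)
score <- (logp < -12 & slope > 0.75) + (logp < -10 & slope > 0.7)
score
```

Since the stricter criterion implies the looser one, a protein can only reach a score of 1 by passing the looser criterion alone.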

Next, we can compare the different regression models on a global basis :

## summary graphics on all indiv protein regressions for ProteomeDiscoverer
layout(matrix(c(1:3,3), ncol=2, byrow=TRUE))
hist(log10(sapply(lmPD,function(x) x$coef[2,4])), br=15,las=1, main="PD: hist of regr p-values",xlab="log10 p-values")     # good p < 1e-12
hist( sapply(lmPD,function(x) x$coef[2,1]), br=15,las=1, main="PD: hist of regr slopes",xlab="slope")     # good 

tit <- "ProteomeDiscoverer, UPS1 regressions :  p-value vs slope"
useCol <- colorAccording2(lmPDsum[,"medRawAbund"], gradTy="rainbow", revCol=TRUE, nEndOmit=14)
plot(lmPDsum[,c(2,3)], main=tit, type="n")   #col=1, bg.col=useCol, pch=20+lmPDsum[,"startFr"],
points(lmPDsum[,c(2,3)], col=1, bg=useCol, pch=20+lmPDsum[,"startFr"])
legend("topright",paste("best starting from ",1:5), text.col=1, pch=21:25, col=1, pt.bg="white", cex=0.9, xjust=0.5, yjust=0.5)
mtext("fill color according to median (raw) abundance (violet/blue/low -> green -> red/high)",cex=0.9)
  abline(v=c(-12,-10),lty=2,col="grey") ; abline(h=c(0.7,0.75),lty=2,col="grey")

hi1 <- hist(lmPDsum[,"medRawAbund"], plot=FALSE)
legendHist(sort(lmPDsum[,5]), colRamp=useCol[order(lmPDsum[,"medRawAbund"])][cumsum(hi1$counts)], location="bottomleft", legTit="median raw abundance")  #

ProteomeDiscoverer shows a bimodal distribution in the histogram of slopes and, less clearly, in that of p-values : approximately 50% of the proteins were quantified well, the others very poorly.

## now for MaxQuant
layout(matrix(c(1:3,3), ncol=2, byrow=TRUE))
hist(log10(sapply(lmMQ,function(x) x$coef[2,4])), br=15,las=1, main="MQ: hist of regr p-values",xlab="log10 p-values")     # good p < 1e-12
hist( sapply(lmMQ,function(x) x$coef[2,1]), br=15,las=1, main="MQ: hist of regr slopes",xlab="slope")     # good 

tit <- "MaxQuant, UPS1 regressions :  p-value vs slope"
useCol <- colorAccording2(lmMQsum[,"medRawAbund"], gradTy="rainbow", revCol=TRUE, nEndOmit=14)
plot(lmMQsum[,c(2,3)], main=tit, type="n")   #col=1, bg.col=useCol, pch=20+lmMQsum[,"startFr"],
points(lmMQsum[,c(2,3)], col=1, bg=useCol, pch=20+lmMQsum[,"startFr"])
legend("topright",paste("best starting from ",1:5), text.col=1, pch=21:25, col=1, pt.bg="white", cex=0.9, xjust=0.5, yjust=0.5)
mtext("fill color according to median (raw) abundance (red/high -> blue/low)",cex=0.9)
  abline(v=c(-12,-10),lty=2,col="grey") ; abline(h=c(0.7,0.75),lty=2,col="grey") 

hi1 <- hist(lmMQsum[,"medRawAbund"], plot=FALSE)
legendHist(sort(lmMQsum[,5]), colRamp=useCol[order(lmMQsum[,"medRawAbund"])][cumsum(hi1$counts)], location="bottomleft", legTit="median raw abundance")  

MaxQuant : No bimodal distributions for p-values or slopes; the regressions appear rather uniform, with high slopes obtained using only the last few concentrations !

Comparison Using ROC-Curves

ROC curves display Sensitivity (True Positive Rate) versus 1-Specificity (False Positive Rate). They are typically used to illustrate and compare the discriminative capacity of a yes/no decision system, see eg also ROC on Wikipedia or the original publication Hand and Till 2001.
In this case ROC curves are used to judge how well the heterologous human UPS1 proteins can be recognized as differentially abundant, while the constant yeast matrix proteins should not get classified as differential. Finally, ROC curves also let us gain some additional insight into whether the commonly used 5-percent FDR threshold allows getting the best out of the testing system.
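
As a minimal sketch of what a single point on such a curve represents, sensitivity and specificity can be computed by hand for a few thresholds. The adjusted p-values and species labels below are invented toy values (not taken from the Ramus data), and rocPoint() is a hypothetical helper written only for this illustration.

```r
## toy example: 4 spike-in (UPS1) and 6 matrix (Yeast) proteins with invented adjusted p-values
pAdj    <- c(0.001, 0.004, 0.03, 0.20,   0.02, 0.15, 0.40, 0.60, 0.80, 0.95)
species <- rep(c("UPS1", "Yeast"), c(4, 6))

## sensitivity and specificity at a given threshold alpha
rocPoint <- function(alpha) {
  called <- pAdj <= alpha                           # proteins declared differential
  TP <- sum(called & species == "UPS1")             # spike-in proteins correctly called
  FP <- sum(called & species == "Yeast")            # matrix proteins wrongly called
  c(alpha=alpha, sens=TP/sum(species == "UPS1"), spec=1 - FP/sum(species == "Yeast"))
}

## one row per threshold; plotting sens against 1-spec gives the ROC curve
rocToy <- t(sapply(c(0.01, 0.05, 0.25, 1), rocPoint))
rocToy
```

Relaxing the threshold can only add positives, so sensitivity rises while specificity falls along the rows.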

The Ramus et al 2016 -dataset contains 9 different levels of UPS1 concentrations, in consequence 36 pair-wise comparisons are possible. Plotting all these pair-wise comparisons would make way too crowded plots.
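
The count of 36 follows from choosing 2 out of 9 groups. A quick check in base R, where the level labels are placeholders and not the real concentrations :

```r
nLev <- 9
choose(nLev, 2)                                   # number of unordered pairs of levels
pairs9 <- t(combn(paste0("lev", 1:nLev), 2))      # one row per pair-wise comparison
dim(pairs9)
```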

Thus, the graphical comparisons were restricted to three comparisons presented in the original publication by Ramus et al 2016 plus two additional ones. The distribution of intra-group CV-values showed (without major surprise) that the highest UPS1 concentrations replicated best. In consequence, comparisons using this group are expected to have a decent chance to rather specifically reveal a high number of UPS1 proteins.

ROC for Single Pair

Initially a ROC-curve can be calculated for each pair-wise comparison where it is known which proteins should be found differential (ie human UPS1 proteins).

## single comparison data for ROC
rocPD.2 <- summarizeForROC(testPD, annotCol="SpecType", spec=c("Yeast","UPS1"), columnTest=2, tyThr="BH", overl=FALSE, color=5)       # 12500amol-25000amol

tail(signif(rocPD.2,3))
#>        alph   spec  sens   prec  accur   FDR n.pos.Yeast n.pos.UPS1
#> [137,] 0.93 0.2030 0.773 0.0245 0.2170 0.976         677         17
#> [138,] 0.95 0.1850 0.818 0.0254 0.2010 0.975         692         18
#> [139,] 0.96 0.1720 0.818 0.0250 0.1880 0.975         703         18
#> [140,] 0.97 0.1630 0.818 0.0247 0.1790 0.975         711         18
#> [141,] 0.98 0.0813 0.955 0.0262 0.1030 0.974         780         21
#> [142,] 1.00 0.0000 1.000 0.0253 0.0253 0.975         849         22
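
The columns of this table are related through simple counts. As a sketch, the row for alph=0.98 can be recomputed from its two count columns, assuming the totals of 849 yeast and 22 UPS1 proteins read from the last row (alph=1, where every tested protein is called positive). The variable names are chosen for illustration only.

```r
nYeast <- 849 ;  nUPS1 <- 22        # totals with valid test results (from the row at alph=1)
FP <- 780 ;  TP <- 21               # n.pos.Yeast and n.pos.UPS1 at alph=0.98
TN <- nYeast - FP                   # yeast proteins correctly not called differential

check <- c(spec  = TN / nYeast,
           sens  = TP / nUPS1,
           prec  = TP / (TP + FP),
           accur = (TP + TN) / (nYeast + nUPS1),
           FDR   = FP / (TP + FP))
signif(check, 3)
```

The rounded values agree with row 141 of the output above.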

ROC for Multiple Pairs

However, since we’re treating a larger data-set, this can be done in batch. Now we are ready to extract the counts for each of the selected comparisons for constructing ROC-curves.


layout(1)
rocPD <- lapply(table2[,1],function(x) summarizeForROC(testPD, annotCol="SpecType", spec=c("Yeast","UPS1"), columnTest=x, tyThr="BH", plotROC=FALSE))
rocMQ <- lapply(table2[,1],function(x) summarizeForROC(testMQ, annotCol="SpecType", spec=c("Yeast","UPS1"), columnTest=x, tyThr="BH", plotROC=FALSE))
#>  -> summarizeForROC :  PROBLEM :
#>   ***  None of the elements annotated as 'positive' species to search for has any valid testing results ! Unable to construct TP ! ***

names(rocPD) <- colnames(testPD$BH)[useCompNo] 
names(rocMQ) <- colnames(testMQ$BH)[useCompNo] 

And we can plot the ROC curves for ProteomeDiscoverer :

layout(1)
colPanel <- 2:6                                              #c(grey(0.4),2:5)
methNa <- paste(table2[,1],", ie",table2[,3],"-",table2[,4])
methNa <- paste0(rep(c("PD","MQ"), each=length(useCompNo)), methNa)
plotROC(rocPD[[1]],rocPD[[2]],rocPD[[3]],rocPD[[4]],rocPD[[5]], col=colPanel, methNames=methNa[1:5], pointSi=0.8, tit="ProteomeDiscoverer at 5 ratios",legCex=1)

One can see from the figure that the classical threshold of FDR=0.05 does not cut at the optimal point in this case; lower threshold values would provide a (slightly) better compromise between specificity & sensitivity.
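
One common way to put a number on such a compromise is Youden's index (sensitivity + specificity - 1). This criterion is not used by the vignette itself and is named here only as one possible choice. A minimal sketch on an invented table that merely borrows the column names alph, spec and sens from the output of summarizeForROC() :

```r
## invented values, for illustration only
rocInv <- cbind(alph = c(0.001, 0.01, 0.02, 0.05, 0.10),
                spec = c(1.000, 0.995, 0.990, 0.950, 0.850),
                sens = c(0.40, 0.80, 0.88, 0.90, 0.92))

youden <- rocInv[,"sens"] + rocInv[,"spec"] - 1     # one value per threshold
best   <- rocInv[which.max(youden), "alph"]         # threshold with the highest index
best
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, may select a different threshold.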

We can see that the comparison 12500 amol vs 25000 amol performed worse than the other ones. Although at these high UPS1 concentrations the proteins were well detected, the statistical test had more problems just calling the UPS1 proteins ‘differential’. In the other comparisons the (theoretical) ratio was much higher.

Let’s move on with the ROC curves for MaxQuant :

plotROC(rocMQ[[1]],rocMQ[[2]],rocMQ[[3]],rocMQ[[4]],rocMQ[[5]], col=colPanel, methNames=methNa[6:10], pointSi=0.8, xlim=c(0,0.27),txtLoc=c(0.09,0.3,0.03), tit="MaxQuant selected ratios",legCex=1)

Please note that instead of 5 curves only 4 are shown : The comparison of 12500 vs 25000 did not yield even a single UPS1 protein as ‘significant’. Thus, the true positives (TP) never left the count of 0; in consequence specificity and sensitivity can’t be calculated.

And the ROC curves for both ProteomeDiscoverer and MaxQuant :


colPan10 <- rainbow(13)[c(-3,-5,-13)]
plotROC(rocPD[[1]],rocPD[[2]],rocPD[[3]],rocPD[[4]],rocPD[[5]], rocMQ[[1]],rocMQ[[2]],rocMQ[[3]],rocMQ[[4]],rocMQ[[5]], col=colPan10, methNames=methNa, pointSi=0.8, tit="PD and MQ at selected ratios",legCex=1)

More quantitation methods will get integrated shortly …

Acknowledgements

The author wants to acknowledge the support by the IGBMC (CNRS UMR 7104, Inserm U 1258), CNRS, Universite de Strasbourg and Inserm and of course my colleagues from the IGBMC proteomics platform. Furthermore, many very fruitful discussions with colleagues on national and international level have helped to formulate ideas, improve and disseminate the tools presented here.

Session-Info

#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=C                   LC_CTYPE=French_France.1252   
#> [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
#> [5] LC_TIME=French_France.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] wrGraph_1.0.6  rmarkdown_2.4  knitr_1.30     wrProteo_1.2.0 wrMisc_1.4.0  
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.25      R.methodsS3_1.8.1  magrittr_1.5       evaluate_0.14     
#>  [5] highr_0.8          sm_2.2-5.6         rlang_0.4.8        stringi_1.5.3     
#>  [9] fdrtool_1.2.15     limma_3.44.3       R.oo_1.24.0        R.utils_2.10.1    
#> [13] RColorBrewer_1.1-2 tools_4.0.3        stringr_1.4.0      xfun_0.18         
#> [17] yaml_2.2.1         compiler_4.0.3     tcltk_4.0.3        htmltools_0.5.0