Single-cell analysis is a powerful method that allows for the deconvolution of the effect of treatments on complex populations containing different cell types, which may or may not respond to specific treatments. Depending on the technology used, the analytes can be genes, transcripts, proteins or metabolites. Using mass cytometry, Bodenmiller et al. measured the levels of 9 proteins and 14 post-translational modifications. After using the signal intensity of the 9 proteins (so-called phenotypic markers) to define 14 sub-populations, they monitored the effect of several treatments using the 14 post-translational modifications.
Modeling and visualization of this type of data is challenging: the large number of events measured, combined with the complexity of each sample, makes modeling difficult, while the high dimensionality of the data precludes the use of standard visualizations.
The goal of this package is to enable the development of new methods by providing a curated set of data for testing and benchmarking.
For details on data acquisition, please refer to Bodenmiller et al., Nat Biotech 2012. Briefly, after treatment, cells were profiled using a CyTOF; dead cells and debris were excluded, and live cells were assigned to 1 of the 14 sub-populations using the signal intensity of the 9 phenotypic markers.
Samples corresponding to untreated cells, either unstimulated or stimulated with BCR/FcR-XL, PMA/Ionomycin or vanadate, were downloaded from CytoBank as FCS files. Data were extracted and normalized using the arcsinh function with a cofactor of 5.
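As a minimal sketch of this normalization (not the package's own preprocessing code), the arcsinh transform with a cofactor of 5 can be written in base R as follows. The function name `arcsinh_transform` and the toy intensities are assumptions made for illustration.

```r
# Hypothetical helper: arcsinh transform with a cofactor, as described above.
arcsinh_transform <- function(x, cofactor = 5) {
  asinh(x / cofactor)
}

# Toy raw intensities (made up, not taken from the dataset).
raw <- c(0, 5, 50, 500)
transformed <- arcsinh_transform(raw)
print(transformed)
```

The transform is close to linear near zero and close to logarithmic for large values, so low and high intensities can be shown on the same axis.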
The reference dataset corresponds to unstimulated, untreated samples. Data for the phenotypic markers can be visualized using a simple boxplot. First we turn the dataset into a Tall-Skinny data.frame:
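# Packages assumed by the code in this document
library(bodenmiller) # datasets (assumed package name)
library(ggplot2)     # plotting
library(dplyr)       # %>%, filter, group_by, summarise
library(reshape2)    # melt (assumed source of melt)
data(refPhenoMat)
refPhenoFrame <- melt(refPhenoMat)
names(refPhenoFrame) <- c('cell_id','channel','value')
ggplot(data=refPhenoFrame,aes(x=channel,y=value))+
geom_boxplot()+
theme(axis.text.x = element_text(angle = 45, hjust = 1))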
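To make the Tall-Skinny layout concrete, here is a small base R sketch that reshapes a toy matrix by hand. It is only meant to illustrate the shape of the result; the toy matrix `m` and its values are made up, and the column names follow those used above.

```r
# Toy matrix: 3 cells (rows) x 2 channels (columns); values are made up.
m <- matrix(c(1.0, 2.0, 3.0, 0.1, 0.2, 0.3),
            nrow = 3,
            dimnames = list(c('cell1', 'cell2', 'cell3'), c('CD3', 'CD7')))

# One row per (cell, channel) pair, walking the matrix column by column.
tall <- data.frame(
  cell_id = rep(rownames(m), times = ncol(m)),
  channel = rep(colnames(m), each = nrow(m)),
  value   = as.vector(m)
)
print(tall)
```

Each measurement ends up on its own row, which is the layout ggplot expects when a column such as `channel` is mapped to an axis.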
We can also use boxplots to visualize the intensity of a single channel over all populations. We first load the cell annotations, add them to the phenotypic data, and define a single color per cell type:
data('refAnnots')
refPhenoFrame$Cells <- rep(refAnnots$Cells,ncol(refPhenoMat))
cell.colors <- setNames(c('#9CA5D5','#0015C5','#5B6CB4','#BFC5E8','#C79ED0','#850094',
'#A567B1','#DBBCE2','#D3C6A1','#5E4500','#BBDEB1','#8A1923',
'#B35E62','#CEA191'),
c('cd14-hladr-','cd14-hladrhigh','cd14-hladrmid','cd14-surf-',
'cd14+hladr-','cd14+hladrhigh','cd14+hladrmid','cd14+surf-',
'cd4+','cd8+','dendritic','igm-','igm+','nk'))
And then plot the CD7 intensity for each population:
cd7.pops <- refPhenoFrame %>% filter(channel=='CD7')
ggplot(data=cd7.pops,
aes(x=Cells,y=value,fill=Cells))+
geom_boxplot()+
scale_fill_manual(values=cell.colors)+
guides(fill='none')+ # 'none' replaces the deprecated fill=FALSE
theme(axis.text.x = element_text(angle = 45, hjust = 1))
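The color definition above relies on a named vector: each color is stored under the name of its population, and when a named vector is passed as `values`, `scale_fill_manual()` matches those names against the values of the fill variable. A minimal base R sketch using three of the colors and names from above (the name `toy.colors` is hypothetical):

```r
# Subset of the colors and population names used above.
toy.colors <- setNames(c('#D3C6A1', '#5E4500', '#CEA191'),
                       c('cd4+', 'cd8+', 'nk'))

# Lookup is by population name, independent of the order of the vector.
picked <- toy.colors[c('nk', 'cd4+')]
print(picked)
```

Because the lookup is by name, the colors stay attached to the same populations even when a plot shows only a subset of them.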
Alternatively one can use fan plots to visualize the trends of values of all markers in a population:
ggplot(refPhenoFrame %>%
filter(Cells=='cd4+'),
aes(x=channel,y=value))+
geom_fan()+
facet_wrap(~Cells)+
theme(axis.text.x = element_text(angle = 45, hjust = 1))
For each population, the level of activation of the different pathways monitored using the functional markers can be visualized in the same way:
data(refFuncMat)
refFuncFrame <- melt(refFuncMat)
names(refFuncFrame) <- c('cell_id','channel','value')
refFuncFrame$Cells <- rep(refAnnots$Cells,ncol(refFuncMat))
ggplot(refFuncFrame %>%
filter(Cells=='cd4+'),
aes(x=channel,y=value))+
geom_fan()+
facet_wrap(~Cells)+
theme(axis.text.x = element_text(angle = 45, hjust = 1))
The untreated dataset corresponds to untreated samples that have been stimulated with BCR/FcR-XL, PMA/Ionomycin or vanadate. To visualize the effect of the different stimulations on cell populations, we first turn the dataset into a Tall-Skinny data.frame:
data('untreatedFuncMat')
data('untreatedAnnots')
untreatedFuncFrame <- melt(untreatedFuncMat)
names(untreatedFuncFrame) <- c('cell_id','channel','value')
untreatedFuncFrame$Cells <- rep(untreatedAnnots$Cells,ncol(untreatedFuncMat))
untreatedFuncFrame$Treatment <- rep(untreatedAnnots$Treatment,ncol(untreatedFuncMat))
Then visualize the effects of the different stimulations on cd4+ cells:
refFuncLine <- refFuncFrame %>% filter(Cells=='cd4+') %>% group_by(Cells,channel) %>% summarise(value=median(value))
## `summarise()` has grouped output by 'Cells'. You can override using the `.groups` argument.
refFuncLine <- do.call(rbind,lapply(seq(1,3),function(x) refFuncLine))
refFuncLine$Treatment <- rep(levels(untreatedFuncFrame$Treatment),each=nlevels(refFuncLine$channel))
refFuncLine$percent <- 0
refFuncLine$id <- 'ref'
ggplot(untreatedFuncFrame %>%
filter(Cells=='cd4+'),
aes(x=channel,y=value))+
geom_fan()+
geom_line(data=refFuncLine,
aes(y=value,group=id),
col='black',linetype=4)+
facet_wrap(~Cells*Treatment,ncol=2,scales='free_x')+
theme(axis.text.x = element_text(angle = 45, hjust = 1))
The dot-dashed line corresponds to the median value of the reference sample for each channel.
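The construction of the reference line can be sketched in base R on toy data: compute the per-channel median of the reference sample, then repeat it once per treatment so that every facet gets its own copy. The data, channel names and treatment labels below are made up for illustration, and `aggregate()` stands in for the dplyr pipeline used above.

```r
# Toy reference measurements: 2 channels, 3 cells each (values made up).
ref <- data.frame(
  channel = factor(rep(c('chA', 'chB'), each = 3)),
  value   = c(1, 2, 9, 4, 5, 6)
)

# Per-channel median of the reference sample.
ref_line <- aggregate(value ~ channel, data = ref, FUN = median)

# Hypothetical treatment labels; one copy of the medians per treatment.
treatments <- c('stimA', 'stimB', 'stimC')
ref_line <- do.call(rbind, lapply(treatments, function(tr) {
  cbind(ref_line, Treatment = tr)
}))
print(ref_line)
```

Carrying a `Treatment` column in the reference data is what lets the faceting place a copy of the same line in each treatment panel.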