
GFM: A Simple Transcriptomics Data

Wei Liu

2023-08-10

Load real data

First, we load the ‘GFM’ package and the real data, which can be downloaded from the URL given below. The data are stored in ‘.Rdata’ format and include a gene expression matrix ‘X’ with 3460 rows (cells) and 2000 columns (genes), a vector ‘group’ that assigns each column to one of two variable-type groups, a vector ‘type’ that names those types (‘gaussian’ and ‘poisson’), and a vector ‘y’ holding the cell clusters annotated by experts. We compare the performance of ‘GFM’ and the linear factor model (‘LFM’) in downstream clustering analysis, using the annotated clusters ‘y’ as the benchmark.

githubURL <- "https://github.com/feiyoung/GFM/blob/main/vignettes_data/Brain76.Rdata?raw=true"
download.file(githubURL,"Brain76.Rdata",mode='wb')

Then load the data into R and split the columns of ‘X’ by variable type.

load("Brain76.Rdata")
XList <- list(X[,group==1], X[,group==2])
types <- type
str(XList)

library("GFM")
#load("vignettes_data\\Brain76.Rdata")
#ls() # check the variables
set.seed(2023) # set a random seed for reproducibility.
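
The column split used to build ‘XList’ can be illustrated on a tiny matrix. This is only a sketch: ‘X_toy’, ‘group_toy’ and ‘type_toy’ are hypothetical stand-ins for ‘X’, ‘group’ and ‘type’, and the values are made up so that the block runs without downloading anything.

```r
# Toy stand-in for the real data: 5 cells, 4 variables (made-up values).
X_toy <- matrix(1:20, nrow = 5, ncol = 4)
group_toy <- c(1, 1, 2, 2)             # variable-type index of each column
type_toy  <- c("gaussian", "poisson")  # type name of group 1 and group 2

# One matrix per variable type, in the same order as type_toy.
# drop = FALSE keeps a matrix even if a group has a single column.
XList_toy <- lapply(seq_along(type_toy),
                    function(k) X_toy[, group_toy == k, drop = FALSE])
str(XList_toy)
```

The k-th element of the list holds the columns whose group index is k, so it lines up with the k-th entry of the type vector.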

Fit GFM model

We fit the GFM model using ‘gfm’ function.

q <- 15
system.time(
  gfm1 <- gfm(XList, types, q= q, verbose = TRUE)
)
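
To get a feel for the kind of input ‘gfm’ expects, the following sketch simulates mixed-type data driven by shared latent factors. It rests on assumptions not stated in this vignette: an identity link for the ‘gaussian’ block and a log link for the ‘poisson’ block, with toy sizes chosen for speed. All object names (‘H’, ‘B1’, ‘B2’, ‘XList_sim’, ‘types_sim’) are hypothetical.

```r
set.seed(2023)
n <- 200; q_toy <- 2          # number of cells and latent factors (toy sizes)
p1 <- 10; p2 <- 10            # number of gaussian and poisson variables

H  <- matrix(rnorm(n * q_toy), n, q_toy)               # shared latent factors
B1 <- matrix(rnorm(p1 * q_toy), p1, q_toy)             # loadings, gaussian block
B2 <- matrix(rnorm(p2 * q_toy, sd = 0.3), p2, q_toy)   # loadings, poisson block

# Gaussian block: mean is linear in the factors (assumed identity link).
X1 <- H %*% t(B1) + matrix(rnorm(n * p1), n, p1)
# Poisson block: log of the mean is linear in the factors (assumed log link).
X2 <- matrix(rpois(n * p2, lambda = exp(H %*% t(B2))), n, p2)

XList_sim <- list(X1, X2)
types_sim <- c("gaussian", "poisson")
str(XList_sim)
```

A list of this shape, together with the matching type vector, mirrors the ‘XList’ and ‘types’ objects passed to ‘gfm’ above.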

Compare with LFM in downstream analysis

We cluster the cells using the factors extracted by GFM and evaluate the result with the adjusted Rand index (ARI), computed against the cluster labels annotated by experts.

hH <- gfm1$hH
library(mclust)
set.seed(1)
gmm1 <- Mclust(hH, G=7)
ARI_gfm <- adjustedRandIndex(gmm1$classification, y)
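
For readers unfamiliar with the ARI, here is a minimal base-R sketch of the standard contingency-table formula. The helper ‘ari’ is hypothetical and written for illustration only; it does not handle degenerate cases such as a single cluster, so ‘adjustedRandIndex’ from ‘mclust’ remains the function to use in practice.

```r
# Adjusted Rand index from the contingency table of two labelings.
ari <- function(a, b) {
  tab <- table(a, b)
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))            # pairs grouped together in both
  sum_a  <- sum(choose(rowSums(tab), 2))   # pairs grouped together in a
  sum_b  <- sum(choose(colSums(tab), 2))   # pairs grouped together in b
  expected <- sum_a * sum_b / choose(n, 2) # expected agreement under chance
  (sum_ij - expected) / ((sum_a + sum_b) / 2 - expected)
}

truth      <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
relabelled <- c(3, 3, 3, 1, 1, 1, 2, 2, 2)  # same partition, different names
ari(truth, relabelled)                      # equals 1
ari(c(1, 1, 2, 2), c(1, 2, 1, 2))           # equals -0.5
```

The first call shows that the ARI depends only on the partition, not on how the clusters are numbered, which is why the arbitrary labels returned by ‘Mclust’ can be compared directly with ‘y’.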

We then fit a linear factor model with the same number of factors.

fac <- Factorm(X, q = q)  # same q (15) as used for gfm above
hH_lfm <- fac$hH
set.seed(1)
gmm2 <- Mclust(hH_lfm, G=7)
ARI_lfm <- adjustedRandIndex(gmm2$classification, y)
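
One common way to estimate a linear factor model is through principal components. The sketch below uses simulated data and base R only; it is an assumption that this resembles what ‘Factorm’ does internally, and the names ‘X_sim’, ‘hH_pca’ and ‘hB_pca’ are hypothetical.

```r
set.seed(1)
n <- 100; p <- 20; q_toy <- 3
H_true <- matrix(rnorm(n * q_toy), n, q_toy)   # true factors
B_true <- matrix(rnorm(p * q_toy), p, q_toy)   # true loadings
X_sim  <- H_true %*% t(B_true) + matrix(rnorm(n * p, sd = 0.5), n, p)

# Center the columns, then take the leading q_toy singular vectors.
Xc <- scale(X_sim, center = TRUE, scale = FALSE)
sv <- svd(Xc, nu = q_toy, nv = q_toy)

# Factors scaled so that t(hH_pca) %*% hH_pca / n is the identity;
# loadings absorb the singular values.
hH_pca <- sqrt(n) * sv$u
hB_pca <- sv$v %*% diag(sv$d[1:q_toy]) / sqrt(n)

# Share of the centered data's variation captured by the rank-q_toy fit.
1 - sum((Xc - hH_pca %*% t(hB_pca))^2) / sum(Xc^2)
```

Factors are only identified up to rotation, so ‘hH_pca’ should be compared with ‘H_true’ through the spaces they span rather than column by column.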

Finally, we compare the two ARIs visually.

library(ggplot2)
df1 <- data.frame(ARI = c(ARI_gfm, ARI_lfm),
                  Method = factor(c("GFM", "LFM")))
# One bar per method, with bar height equal to the ARI.
ggplot(data = df1, aes(x = Method, y = ARI, fill = Method)) +
  geom_bar(position = "dodge", stat = "identity", width = 0.5)
