xLDblock | R Documentation |
xLDblock
obtains LD blocks for a list of Lead SNPs
together with their significance levels.
xLDblock(data, include.LD = c("AFR", "AMR", "EAS", "EUR", "SAS"), LD.customised = NULL, LD.r2 = 0.8, GR.SNP = "LDblock_GR", verbose = T, RData.location = "http://galahad.well.ox.ac.uk/bigdata")
data |
a named input vector containing the significance level for nodes (dbSNP). For this named vector, the element names are dbSNP identifiers (starting with rs or in the format of 'chrN:xxx', where N is either 1-22 or X, and xxx is a number; for example, 'chr16:28525386'), and the element values are the significance levels (measured as p-value or fdr). Alternatively, it can be a matrix or data frame with two columns: 1st column for dbSNP, 2nd column for the significance level. |
include.LD |
additional SNPs in LD with Lead SNPs are also included. By default, it is 'NA' to disable this option. Otherwise, LD SNPs will be included based on one or more of 26 populations and 5 super populations from 1000 Genomes Project data (phase 3). The population can be one of 5 super populations ("AFR", "AMR", "EAS", "EUR", "SAS"). Explanations for population codes can be found at http://www.1000genomes.org/faq/which-populations-are-part-your-study |
LD.customised |
a user-input matrix or data frame with 3 compulsory columns: 1st column for Lead SNPs, 2nd column for LD SNPs, and 3rd for LD r2 value. The recommended columns are 'maf', 'distance' (to the nearest gene) and 'cadd'. It is designed to allow users to analyse their precalculated LD info. This customisation (if provided) takes priority over built-in LD SNPs |
LD.r2 |
the LD r2 value. By default, it is 0.8, meaning that SNPs in LD (r2>=0.8) with input SNPs will be considered as LD SNPs. It can be any value from 0.1 to 1 |
GR.SNP |
the genomic regions of SNPs. By default, it is 'LDblock_GR', that is, SNPs from dbSNP (version 150) restricted to GWAS SNPs and their LD SNPs (hg19). Alternatively, the user can directly provide a customised GR object |
verbose |
logical to indicate whether messages will be displayed on the screen. By default, it is set to TRUE for display |
RData.location |
the character string specifying the location of built-in
RData files. See |
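The input formats described for 'data', 'LD.customised' and 'LD.r2' can be sketched in base R as follows. This is only an illustration under stated assumptions: the SNP identifiers, p-values and r2 values are invented, the column names 'SNP', 'Pvalue', 'Lead', 'LD' and 'R2' are hypothetical, and the regular expression is one possible reading of the identifier formats above, not the check performed by xLDblock itself.

```r
# Hypothetical lead SNPs and p-values, invented for illustration only
pvals <- c(rs123 = 1e-8, rs456 = 5e-6, "chr16:28525386" = 3e-4)

# Form 1: a named numeric vector (names are SNP identifiers, values are p-values)
data_vec <- pvals

# Form 2: a two-column data frame (1st column SNP, 2nd column significance)
data_df <- data.frame(SNP = names(pvals), Pvalue = unname(pvals),
                      stringsAsFactors = FALSE)

# Assumed pattern for the two identifier formats: 'rs' followed by digits,
# or 'chrN:xxx' with N in 1-22 or X and xxx a number
id_pattern <- "^(rs[0-9]+|chr([1-9]|1[0-9]|2[0-2]|X):[0-9]+)$"
valid_id <- grepl(id_pattern, data_df$SNP)

# A customised LD table with the 3 compulsory columns, in the stated order:
# Lead SNP, LD SNP, LD r2 value (column names here are hypothetical)
ld_tab <- data.frame(
  Lead = c("rs123", "rs123", "rs456"),
  LD   = c("rs124", "rs125", "rs457"),
  R2   = c(0.95, 0.62, 0.81),
  stringsAsFactors = FALSE
)

# Keep only pairs at or above the threshold (r2 >= 0.8, the stated default)
LD.r2 <- 0.8
ld_kept <- ld_tab[ld_tab$R2 >= LD.r2, ]
```

Here 'data_vec' and 'data_df' carry the same information in the two accepted shapes, and 'ld_kept' shows which customised LD pairs would count as LD SNPs at the default threshold.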
an object of class "bLD", a list with the following components:
best
: a GR object. It has optional meta-columns 'maf',
'distance' (to the nearest gene) and 'cadd', and compulsory
meta-columns 'pval', 'score' (-log10(pval)), 'upstream' (the lower
boundary away from the best SNP, non-positive value), 'downstream' (the
upper boundary away from the best SNP, non-negative value) and 'num'
(the number of SNPs in the block)
block
: a GRL object, each element corresponding to a block
for the best SNP with optional meta-columns 'maf', 'distance' (to the
nearest gene) and 'cadd', and compulsory meta-columns 'pval', 'score'
(-log10(pval)*R2, based on pval for its lead SNP), 'best' (the best
SNP) and 'distance_to_best' (to the best SNP)
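The meta-columns listed above can be illustrated with a small base R sketch. It assumes a single hypothetical block with invented positions, p-value and r2 values; it also assumes that 'distance_to_best' is a signed offset from the best SNP, that the best SNP has r2 of 1 with itself, and that 'num' counts the best SNP. None of these details is confirmed here, and the code does not reproduce how xLDblock builds its GR and GRL objects.

```r
# Hypothetical block: one best (lead) SNP and two LD SNPs; all values invented
lead_pval <- 1e-8
block <- data.frame(
  SNP = c("rs123", "rs124", "rs125"),
  R2  = c(1, 0.95, 0.85),    # r2 with the best SNP (best SNP itself assumed 1)
  pos = c(1000, 400, 1900),  # illustrative genomic positions
  stringsAsFactors = FALSE
)

# 'score' of the best SNP: -log10(pval)
best_score <- -log10(lead_pval)

# 'score' of each SNP in the block: -log10(pval) * R2,
# using the p-value of the lead SNP
block$score <- -log10(lead_pval) * block$R2

# 'distance_to_best': assumed here to be a signed offset from the best SNP
block$distance_to_best <- block$pos - block$pos[block$SNP == "rs123"]

# 'upstream' (non-positive) and 'downstream' (non-negative) boundaries,
# and 'num', the number of SNPs in the block
upstream   <- min(block$distance_to_best)
downstream <- max(block$distance_to_best)
num        <- nrow(block)
```

Under these assumptions, SNPs in weaker LD with the best SNP receive proportionally lower scores, and the block boundaries are simply the extreme offsets.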
None
xLDblock
# Load the XGR package and specify the location of built-in data
library(XGR)
RData.location <- "http://galahad.well.ox.ac.uk/bigdata"

## Not run: 
# a) provide the seed SNPs with the significance info
## load ImmunoBase
data(ImmunoBase)
## get lead SNPs reported in AS GWAS and their significance info (p-values)
gr <- ImmunoBase$AS$variant
data <- GenomicRanges::mcols(gr)[,c('Variant','Pvalue')]

# b) get LD block (EUR population)
bLD <- xLDblock(data, include.LD="EUR", LD.r2=0.8, RData.location=RData.location)

# c1) manhattan plot of the best
best <- bLD$best
best$value <- best$score
gp <- xGRmanhattan(best, top=length(best))
gp

# c2) manhattan plot of all LD block
grl_block <- bLD$block
gr_block <- BiocGenerics::unlist(grl_block,use.names=F)
gr_block$value <- gr_block$score
top.label.query <- names(gr_block)[!is.na(gr_block$pval)]
#gr_block <- gr_block[as.character(GenomicRanges::seqnames(gr_block)) %in% c('chr1','chr2')]
gp <- xGRmanhattan(gr_block, top=length(gr_block), top.label.query=top.label.query)

# c3) karyogram plot of the best
kp <- xGRkaryogram(gr=best,cytoband=T,label=T, RData.location=RData.location)
kp

# c4) circle plot of the best
library(ggbio)
gr_ideo <- xRDataLoader(RData.customised="hg19_ideogram", RData.location=RData.location)$ideogram
#cp <- ggbio() + circle(kp$gr, geom="rect", color="steelblue", size=0.5)
cp <- ggbio() + circle(kp$gr, aes(x=start, y=num), geom="point", color="steelblue", size=0.5)
cp <- cp + circle(gr_ideo, geom="ideo", fill="gray70") + circle(gr_ideo, geom="scale", size=1.5) + circle(gr_ideo, geom="text", aes(label=seqnames), vjust=0, size=3)
cp

# d) track plot of 1st LD block
gr_block <- bLD$block[[1]]
cnames <- c('score','maf','cadd')
ls_gr <- lapply(cnames, function(x) gr_block[,x])
names(ls_gr) <- cnames
ls_gr$score$Label <- names(gr_block)
ls_gr$score$Label[is.na(gr_block$pval)] <-''
GR.score.customised <- ls_gr
## cse.query
df_block <- as.data.frame(gr_block)
chr <- unique(df_block$seqnames)
xlim <- range(df_block$start)
cse.query <- paste0(chr,':',xlim[1],'-',xlim[2])
#cse.query <- paste0(chr,':',xlim[1]-1e4,'-',xlim[2]+1e4)
## xGRtrack
tks <- xGRtrack(cse.query=cse.query, GR.score="RecombinationRate", GR.score.customised=GR.score.customised, RData.location=RData.location)
tks

###############
# Advanced use: get LD block (based on customised LD and SNP data)
###############
LD.customised <- xRDataLoader('LDblock_EUR', RData.location=RData.location)
GR.SNP <- xRDataLoader('LDblock_GR', RData.location=RData.location)
bLD <- xLDblock(data, LD.customised=LD.customised, LD.r2=0.8, GR.SNP=GR.SNP, RData.location=RData.location)
## End(Not run)