SubtypeDrug: a software package for prioritization of candidate cancer subtype-specific drugs

The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Xudong Han
Junwei Han
Chonghui Liu

2024-04-17

Introduction

The SubtypeDrug package is a systematic biological tool to optimize cancer subtype-specific drugs. The main capabilities of this tool are as follows:

1. Identifying drug-regulated subpathways.

2. Infer patient-specific subpathway activity profiles. 

3. Calculating drug-disease reverse association score.

4. Identifying cancer subtype-specific drugs. 

5. Visualization of results.

Identifying drug-regulated subpathways

We downloaded all pathways from the KEGG database in XML format. For each KEGG pathway map, we converted each pathway into a gene-gene network with genes as nodes and bilological relationships as edges. For metabolic pathways, two genes (enzymes) are connected by an edge if there is a common compound in their corresponding reaction. For non-metabolic pathways, two genes are connected by an edge if they have a relationship (such as interaction, binding or modification, etc.) indicated in the relation element of the XML file. We used the k-clique algorithm in SubpathwayMiner package to detect the subpathways, in which the distance between any two nodes is no greater than k (The default parameter k of the method was used in the study) (Li et al. 2009). In order to remove redundant information, we eliminated smaller subpathways with more than 80% overlap between subpathways that belong to the same pathway (Syed et al. 2018). The subpathway data was stored in our developed R-based “SubtypeDrugData” package, which is publicly available on Github (https://github.com/hanjunwei-lab/SubtypeDrugData).
CMap build 02 raw data is downloaded from the CMap website (Lamb et al. 2006). After normalizing gene expression profiles, the fold-change (FC) method was used to calculate differentially expressed levels of genes between the drug treatment (distinguish different concentrations of the same drug) and the control groups. For a given drug at a specific concentration, the genes in each subpathway were mapped to the ranked gene list of the drug repectively. We then used Gene Set Enrichment Analysis (GSEA) (Subramanian et al. 2005) to identify the subpathways regulated by the drug. A subpathway with a greater positive or negative enrichment score (ES) indicates that the drug may activate or inhibit (up- or down-regulate) this subpathway more strongly. An empirical gene-based permutation test procedure was used to estimate the statistically significance (p-value) of the ES of subpathway, which reflects the extent of association between the subpathway and drug. Finally all results of the drug-regulated subpathway were constituted the drug-subpatway association data which was stored in our SubtypeDrugData package (https://github.com/hanjunwei-lab/SubtypeDrugData).
The subpathway data and the drug-subpatway association data can be downloaded and used by the following code:

## Download SubtypeDrugData package from GitHub.
require(devtools)
install_github("hanjunwei-lab/SubtypeDrugData",force = TRUE)
require(SubtypeDrugData)
## Get subpathway list data.
## If the gene expression profile contains gene Symbol.
data(SpwSymbolList)
## If the gene expression profile contains gene Entrezid.
data(SpwEntrezidList)
## Get drug subpathway association data.
data(DrugSpwData)

Infer patient-specific subpathway activity profiles.

To identify cancer subtype-specific drugs, the SubtypeDrug package requires gene expression profiles with normal and cancer samples with subtype labels. The SubtypeDrug package provided two methods: gene set enrichment analysis (GSVA) (Hänzelmann, Castelo, and Guinney 2013) and single sample GSEA (ssGSEA) (Barbie et al. 2009), to infer patient-specific subpathway activity profiles from the gene expression profiles. For each subpathway, the differential activity value of each sample was estimated by comparing its activity with the mean and standard deviation of subpathway activities of accumulated normal samples (Ahn et al. 2014). The formula is as follows: \[{\mathop{{Z}}\nolimits_{{ij}}=\frac{{\mathop{{Sub}}\nolimits_{{ij}}-mean{ \left( {S\mathop{{ub}}\nolimits_{{i,normal}}} \right) }}}{{stdev{ \left( {\mathop{{Sub}}\nolimits_{{i,normal}}} \right) }}}}\] where \(Sub_{ij}\) is the activity value of \(i\) th subpathway in the \(j\) th cancer sample and \(Sub_{i,normal}\) is a vector of activity value of \(i\) th subpahtway in the normal samples. The \(Z_{ij}\) score denotes differential activity extent of the subpathway \(i\) for cancer sample \(j\) relative to normal samples.

Calculating drug-disease reverse association score.

To test if a drug could treat a specific cancer sample, we defined a drug-disease reverse association score (\(RS\)) to reflect the treatment extent of drug at the subpathway level. Specifically, for a given cancer sample \(j\), the subpathways were ranked in descending order based on the differential activity score to form a subpathway list \(L_j\). We mapped the up- and down-regulated subpathways induced by a drug respectively to the ranked list \(L_j\) to calculate RS. We provided two methods to estimate this score:

Non-weighed estimate method. For the set of up-regulated subpathways of the \(d\) th drug, we construct a vector \(V\) of the position (\(1...q\)) of each subpathway tags in the ordered list \(L_j\) and sort these components in ascending order such that \(V(g)\) is the position of tag \(g\) (\(p\) is the number of up-regulated subpathways of the \(d\) th drug and \(g=1,2,...p\)). The Kolmogorov-Smirnov (KS) statistic of the up-regulation subpathway of drug (\(KS_d^{up}\)) is calculated as follows:

\[{D\mathop{{}}\nolimits_{{1}}=max\mathop{{}}\nolimits_{{g=1}}^{{p}}{ \left[ {\frac{{g}}{{p}}-\frac{{V{ \left( {g} \right) }}}{{q}}} \right] }}\]

\[{\mathop{{D}}\nolimits_{{2}}=max\mathop{{}}\nolimits_{{g=1}}^{{p}}{ \left[ {\frac{{V{ \left( {g} \right) }}}{{q}}-\frac{{{ \left( {g-1} \right) }}}{{p}}} \right] }}\]

We set \(KS_d^{up}\)=\(D_{1}\), if \(D_{1}\)>\(D_{2}\) or \(KS_d^{up}\)=\(-D_{2}\), if \(D_{1}\)<\(D_{2}\). Like the above process, \(KS_d^{down}\) is also calculated for the down-regulation subpathways of drug. The drug-disease reverse association score (\(RS_{dj}\)) of the \(d\) th drug in the j th cancer sample is \(RS_{dj}=KS_d^{up}-KS_d^{down}\).

Weighed estimate method. For the ordered subpathway list of sample \(j\), \(L_j\), the drug up- and down-regulated subpathways are mapped to the list \(L_j\) repectively. We use the GSEA strategy to calculate the weight \(KS\) statistics of the drug up- and down-regulated subpathway set (termed \(ES^{up}\) and \(ES^{down}\)) with the subapthway differential activity score as weight. The RS is defined as: \(RS_{dj}=ES_d^{up}-ES_d^{down}\)

The magnitude of the \(RS\) depends on the correlation of the drug with the disease. The greater negative score indicates the drug may treat the disease with a larger extent, and the positive score indicates the drug may promote disease development. To allow inter-drug comparisons with \(RS_{dj}\), we further normalized the \(RS\) for each drug as follows:
\[NS_{dj}=RS_{dj}/|max(RS_d)|, \text{where }RS_{dj}>0\] or \[NS_{dj}=RS_{dj}/|min(RS_d)|, \text{where }RS_{dj}<0\] where max(\(RS_d\)) or min(\(RS_d\)) is the maximum or minimum value of \(RS\) of drug d across all samples.

Identifying cancer subtype-specific drugs

Calculating the cancer subtype-specific drug score

For each drug, we calculated the \(NS\) on each sample through the above method. Thus we obtained a normalized drug-disease reverse association matrix \(M = (NS_{dj})\) with rows as drugs and columns as samples. For a given drug, the samples were ranked in the dataset to form a sample list \(L\) according to decreasing \(NS\). If the samples in a cancer subtype enrich at the negative/positive \(NS\) region on the list \(L\), the drug may potentially treat/promote this cancer subtype. This process implements comparison of \(NS\) for the samples in different subtypes, and the different subtypes are competitive relationships in the list. To perform this sample set enrichment analysis, we defined a subype-specific drug score (\(SDS\)), and the formula of \(SDS\) is as follows:

\[{SDS\mathop{{}}\nolimits_{{t}}=\frac{{1}}{{\mathop{{ \beta }}\nolimits_{{t}}}}{\mathop{ \sum }\limits_{{j \in t}}{NS\mathop{{}}\nolimits_{{j}}}}}\]

where, \(\beta_{t}\) is the number of samples in the t th cancer subtype, \(NS_j\) is the normalized drug-disease reverse association score of j th cancer sample for the t th cancer subtype. The greater the negative \(SDS\) indicates the drug may have potentially stronger the potential therapeutic effect on the cancer subtype \(t\). Conversely, the greater positive \(SDS_t\) indicates that the drug may have stronger side effects on this cancer subtype.

Estimating statistically significance level of sutype-specific drug score

We first estimated the statistical significance (S_Pvalue) of \(SDS\) for each subtype by using an empirical sample-based permutation test procedure. Specifically, for a given drug, we permuted the class labels (subtype labels) of subpathway activity profiles, and recomputed the \(SDS\) for each subtype. In the pakage, the default permutation times are set at 1000. This will produce a null distribution \(SDS_{null}^*\) for all cancer subtypes. Based on law of large numbers, the \(SDS_{null}^*\) follows an approximately normal distribution. Subsequently, the S_Pvalue of the observerd \(SDS\) for each subtype is estimated based on the \(SDS_{null}^*\). To correct for multiple comparisons, we adjusted the S_Pvalues of drugs for each subtype by using the false discovery rate (FDR) method proposed by Benjamini and Hochberg (Benjamini, Hochberg, and etc. 1995).
Furthermore, to evaluate the drug treatment effect, we constructed subpathway activity profiles that contains only one cancer subtype and normal samples. The fold change (FC) of subpathway activity between the cancer subtype and normal samples were calculated, and the subpathways were ranked in descending order of log2FC to form a list. For a given drug, the drug targeted up-regulated and down-regulated subpathways were mapped to the ranked subpathway list repectively, and the \(RS\) was calculated, which reflects treatment effect of the drug on the subtype. Similarly, the permutation test procedure is also used to calculate the statistical significance (E_Pvalue) of \(RS\) for each drug, and the E_FDR was then calculated. If the gene expression data only contains cancer and normal samples, this process could be used to identify cancer candidate drugs. Finally, the drugs with both S_FDR and E_FDR less than a given threshold will be deemed as cancer subtype-specific.

Our method can identify not only cancer subtype-specific drugs in the datasets with multi-phenotype categories but also cancer-related drugs in the datasets with cancer and normal samples. Taking the simulative breast cancer data as an example, breast cancer-related and subtype-specific drug identification are as follows:

require(GSVA)
#> Loading required package: GSVA
require(parallel)
#> Loading required package: parallel
## Get simulated breast cancer gene expression profile data.
Geneexp<-get("Geneexp")
## Obtain sample subtype data and calculate breast cancer subtype-specific drugs.
Subtype_labels<-system.file("extdata", "Subtype_labels.cls", package = "SubtypeDrug")

# Identify breast subtype-specific drugs.
Subtype_drugs<-PrioSubtypeDrug(Geneexp,Subtype_labels,"Control",SpwSymbolList,
                               drug.spw.data=DrugSpwData,E_FDR=1,S_FDR=1)

## Results display.
str(Subtype_drugs)
#> List of 8
#>  $ Basal            :'data.frame':   1409 obs. of  8 variables:
#>   ..$ Drug                 : chr [1:1409] "1,5-isoquinolinediol(1e-04M)" "2-deoxy-D-glucose(0.01M)" "3-aminobenzamide(0.01M)" "4,5-dianilinophthalimide(1e-05M)" ...
#>   ..$ Target_upregulation  : chr [1:1409] "00140_10 00450_2 00590_2 00830_2 04010_7 04012_7 04014_3 04015_3 04020_1 04020_2 04022_1 04022_7 04024_2 04060_"| __truncated__ "00250_1 01524_2 03015_4 04022_1 04060_52 04130_2 04137_2 04137_5 04140_5 04140_13 04141_1 04141_5 04151_8 04210"| __truncated__ "00062_8 00071_1 00310_2 00310_4 00380_2 00380_5 00480_2 00510_8 00565_2 00980_11 03015_1 04014_11 04141_1 04261"| __truncated__ "00601_8 00604_5 00980_11 04010_7 04014_3 04015_3 04020_1 04020_2 04024_2 04060_69 04062_1 04064_9 04064_15 0407"| __truncated__ ...
#>   ..$ Target_downregulation: chr [1:1409] "00030_2 00071_1 00100_2 00100_5 00100_6 00100_7 00230_1 00230_6 00240_8 00250_1 00270_3 00310_2 00410_1 00510_1"| __truncated__ "00051_1 00100_6 00230_1 00240_8 00564_6 00565_3 00590_2 00790_2 00980_10 00980_11 00983_4 04015_9 04060_83 0406"| __truncated__ "00450_1 00562_8 04070_1 04071_8 04110_8 04141_3 04360_5 04360_6 04360_15 04360_17 04390_17 04670_16 04810_3 051"| __truncated__ "00010_2 00330_5 00510_10 00620_2 00640_5 00790_2 00790_8 03013_2 03013_8 03013_11 03013_18 03013_20 03015_2 041"| __truncated__ ...
#>   ..$ SDS                  : num [1:1409] -0.747 0.172 -0.0823 -0.0887 0.586 -0.261 0.0762 0.0304 0.134 -0.0991 ...
#>   ..$ E_Pvalue             : num [1:1409] 0.0391 0.9811 0.8075 0.1268 0.1089 ...
#>   ..$ E_FDR                : num [1:1409] 0.259 0.998 0.96 0.298 0.284 ...
#>   ..$ S_Pvalue             : num [1:1409] 0.165 0.998 0.748 0.604 0.556 0.536 0.682 0.859 0.658 0.541 ...
#>   ..$ S_FDR                : num [1:1409] 0.617 1 0.93 0.857 0.843 0.83 0.905 0.973 0.89 0.833 ...
#>  $ Her2             :'data.frame':   1409 obs. of  8 variables:
#>   ..$ Drug                 : chr [1:1409] "1,5-isoquinolinediol(1e-04M)" "2-deoxy-D-glucose(0.01M)" "3-aminobenzamide(0.01M)" "4,5-dianilinophthalimide(1e-05M)" ...
#>   ..$ Target_upregulation  : chr [1:1409] "00140_10 00450_2 00590_2 00830_2 04010_7 04012_7 04014_3 04015_3 04020_1 04020_2 04022_1 04022_7 04024_2 04060_"| __truncated__ "00250_1 01524_2 03015_4 04022_1 04060_52 04130_2 04137_2 04137_5 04140_5 04140_13 04141_1 04141_5 04151_8 04210"| __truncated__ "00062_8 00071_1 00310_2 00310_4 00380_2 00380_5 00480_2 00510_8 00565_2 00980_11 03015_1 04014_11 04141_1 04261"| __truncated__ "00601_8 00604_5 00980_11 04010_7 04014_3 04015_3 04020_1 04020_2 04024_2 04060_69 04062_1 04064_9 04064_15 0407"| __truncated__ ...
#>   ..$ Target_downregulation: chr [1:1409] "00030_2 00071_1 00100_2 00100_5 00100_6 00100_7 00230_1 00230_6 00240_8 00250_1 00270_3 00310_2 00410_1 00510_1"| __truncated__ "00051_1 00100_6 00230_1 00240_8 00564_6 00565_3 00590_2 00790_2 00980_10 00980_11 00983_4 04015_9 04060_83 0406"| __truncated__ "00450_1 00562_8 04070_1 04071_8 04110_8 04141_3 04360_5 04360_6 04360_15 04360_17 04390_17 04670_16 04810_3 051"| __truncated__ "00010_2 00330_5 00510_10 00620_2 00640_5 00790_2 00790_8 03013_2 03013_8 03013_11 03013_18 03013_20 03015_2 041"| __truncated__ ...
#>   ..$ SDS                  : num [1:1409] -0.766 0.593 -0.451 -0.142 0.711 -0.489 0.133 -0.115 0.114 0.206 ...
#>   ..$ E_Pvalue             : num [1:1409] 0.0202 0.5283 0.1565 0.0894 0.0663 ...
#>   ..$ E_FDR                : num [1:1409] 0.312 0.823 0.363 0.313 0.312 ...
#>   ..$ S_Pvalue             : num [1:1409] 0.13 0.271 0.0416 0.405 0.191 0.0599 0.475 0.503 0.709 0.205 ...
#>   ..$ S_FDR                : num [1:1409] 0.577 0.611 0.576 0.69 0.595 0.576 0.745 0.767 0.861 0.595 ...
#>  $ LumA             :'data.frame':   1409 obs. of  8 variables:
#>   ..$ Drug                 : chr [1:1409] "1,5-isoquinolinediol(1e-04M)" "2-deoxy-D-glucose(0.01M)" "3-aminobenzamide(0.01M)" "4,5-dianilinophthalimide(1e-05M)" ...
#>   ..$ Target_upregulation  : chr [1:1409] "00140_10 00450_2 00590_2 00830_2 04010_7 04012_7 04014_3 04015_3 04020_1 04020_2 04022_1 04022_7 04024_2 04060_"| __truncated__ "00250_1 01524_2 03015_4 04022_1 04060_52 04130_2 04137_2 04137_5 04140_5 04140_13 04141_1 04141_5 04151_8 04210"| __truncated__ "00062_8 00071_1 00310_2 00310_4 00380_2 00380_5 00480_2 00510_8 00565_2 00980_11 03015_1 04014_11 04141_1 04261"| __truncated__ "00601_8 00604_5 00980_11 04010_7 04014_3 04015_3 04020_1 04020_2 04024_2 04060_69 04062_1 04064_9 04064_15 0407"| __truncated__ ...
#>   ..$ Target_downregulation: chr [1:1409] "00030_2 00071_1 00100_2 00100_5 00100_6 00100_7 00230_1 00230_6 00240_8 00250_1 00270_3 00310_2 00410_1 00510_1"| __truncated__ "00051_1 00100_6 00230_1 00240_8 00564_6 00565_3 00590_2 00790_2 00980_10 00980_11 00983_4 04015_9 04060_83 0406"| __truncated__ "00450_1 00562_8 04070_1 04071_8 04110_8 04141_3 04360_5 04360_6 04360_15 04360_17 04390_17 04670_16 04810_3 051"| __truncated__ "00010_2 00330_5 00510_10 00620_2 00640_5 00790_2 00790_8 03013_2 03013_8 03013_11 03013_18 03013_20 03015_2 041"| __truncated__ ...
#>   ..$ SDS                  : num [1:1409] -0.213 0.648 -0.217 0.0296 0.388 -0.0515 -0.276 -0.36 0.097 -0.188 ...
#>   ..$ E_Pvalue             : num [1:1409] 0.1399 0.0464 0.8708 0.8703 0.212 ...
#>   ..$ E_FDR                : num [1:1409] 0.429 0.419 1 1 0.458 ...
#>   ..$ S_Pvalue             : num [1:1409] 0.999 0.144 0.375 0.863 0.96 0.955 0.136 0.0352 0.753 0.247 ...
#>   ..$ S_FDR                : num [1:1409] 1 1 1 1 1 1 1 0.846 1 1 ...
#>  $ LumB             :'data.frame':   1409 obs. of  8 variables:
#>   ..$ Drug                 : chr [1:1409] "1,5-isoquinolinediol(1e-04M)" "2-deoxy-D-glucose(0.01M)" "3-aminobenzamide(0.01M)" "4,5-dianilinophthalimide(1e-05M)" ...
#>   ..$ Target_upregulation  : chr [1:1409] "00140_10 00450_2 00590_2 00830_2 04010_7 04012_7 04014_3 04015_3 04020_1 04020_2 04022_1 04022_7 04024_2 04060_"| __truncated__ "00250_1 01524_2 03015_4 04022_1 04060_52 04130_2 04137_2 04137_5 04140_5 04140_13 04141_1 04141_5 04151_8 04210"| __truncated__ "00062_8 00071_1 00310_2 00310_4 00380_2 00380_5 00480_2 00510_8 00565_2 00980_11 03015_1 04014_11 04141_1 04261"| __truncated__ "00601_8 00604_5 00980_11 04010_7 04014_3 04015_3 04020_1 04020_2 04024_2 04060_69 04062_1 04064_9 04064_15 0407"| __truncated__ ...
#>   ..$ Target_downregulation: chr [1:1409] "00030_2 00071_1 00100_2 00100_5 00100_6 00100_7 00230_1 00230_6 00240_8 00250_1 00270_3 00310_2 00410_1 00510_1"| __truncated__ "00051_1 00100_6 00230_1 00240_8 00564_6 00565_3 00590_2 00790_2 00980_10 00980_11 00983_4 04015_9 04060_83 0406"| __truncated__ "00450_1 00562_8 04070_1 04071_8 04110_8 04141_3 04360_5 04360_6 04360_15 04360_17 04390_17 04670_16 04810_3 051"| __truncated__ "00010_2 00330_5 00510_10 00620_2 00640_5 00790_2 00790_8 03013_2 03013_8 03013_11 03013_18 03013_20 03015_2 041"| __truncated__ ...
#>   ..$ SDS                  : num [1:1409] -0.76 0.661 0.125 -0.0546 0.728 -0.292 0.331 0.548 0.415 0.259 ...
#>   ..$ E_Pvalue             : num [1:1409] 0.0137 0.067 0.9277 0.8274 0.0552 ...
#>   ..$ E_FDR                : num [1:1409] 0.252 0.262 0.987 0.974 0.253 ...
#>   ..$ S_Pvalue             : num [1:1409] 0.141 0.122 0.622 0.75 0.155 0.446 0.0737 0.00135 0.103 0.11 ...
#>   ..$ S_FDR                : num [1:1409] 0.538 0.529 0.812 0.868 0.539 0.704 0.518 0.518 0.52 0.52 ...
#>  $ DrugMatrix       : num [1:1409, 1:32] -0.711 0.798 0.172 -0.137 0.888 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:1409] "1,5-isoquinolinediol(1e-04M)" "2-deoxy-D-glucose(0.01M)" "3-aminobenzamide(0.01M)" "4,5-dianilinophthalimide(1e-05M)" ...
#>   .. ..$ : chr [1:32] "TCGA-C8-A12M-01A" "TCGA-AO-A03L-01A" "TCGA-AO-A0J8-01A" "TCGA-A2-A0T4-01A" ...
#>  $ SubpathwayMatrix : num [1:1113, 1:40] -0.458 0.535 0.38 -0.405 0.659 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:1113] "00010_1" "00010_2" "00010_5" "00010_6" ...
#>   .. ..$ : chr [1:40] "TCGA-BH-A0BZ-11A" "TCGA-BH-A1FR-11B" "TCGA-A7-A0DB-11A" "TCGA-A7-A0CH-11A" ...
#>  $ SampleInformation:'data.frame':   40 obs. of  2 variables:
#>   ..$ sampleId     : chr [1:40] "TCGA-BH-A0BZ-11A" "TCGA-BH-A1FR-11B" "TCGA-A7-A0DB-11A" "TCGA-A7-A0CH-11A" ...
#>   ..$ sampleSubtype: chr [1:40] "Control" "Control" "Control" "Control" ...
#>  $ Parameter        :List of 11
#>   ..$ control.label       : chr "Control"
#>   ..$ spw.min.sz          : num 1
#>   ..$ spw.max.sz          : num Inf
#>   ..$ spw.score.method    : chr "gsva"
#>   ..$ kcdf                : chr "Gaussian"
#>   ..$ drug.p.val.threshold: num 0.05
#>   ..$ drug.spw.min.sz     : num 1
#>   ..$ drug.spw.max.sz     : num Inf
#>   ..$ weighted.drug.score : logi TRUE
#>   ..$ nperm               : num 1000
#>   ..$ parallel.sz         : num 1

The PrioSubtypeDrug() function can also be used to identify breast cancer-related drugs in only two types of samples: breast cancer and normal.

Cancer_normal_labels<-system.file("extdata", "Cancer_normal_labels.cls", package = "SubtypeDrug")
Disease_drugs<-PrioSubtypeDrug(Geneexp,Cancer_normal_labels,"Control",SpwSymbolList,drug.spw.data=DrugSpwData,
                               E_FDR=1,S_FDR=1)

The function PrioSubtypeDrug() can also support user-defined data.

## User-defined drug regulation data should resemble the structure below.
UserDS<-get("UserDS")
str(UserDS[1:5])
#> List of 5
#>  $ 1,5-isoquinolinediol(1e-04M):List of 2
#>   ..$ Target_upregulation  : chr [1:64] "00140_10" "00450_2" "00590_2" "00830_2" ...
#>   ..$ Target_downregulation: chr [1:76] "00030_2" "00071_1" "00100_2" "00100_5" ...
#>  $ 2-deoxy-D-glucose(0.01M)    :List of 2
#>   ..$ Target_upregulation  : chr [1:68] "00250_1" "01524_2" "03013_1" "03015_4" ...
#>   ..$ Target_downregulation: chr [1:43] "00051_1" "00100_6" "00230_1" "00240_8" ...
#>  $ NA                          : NULL
#>  $ NA                          : NULL
#>  $ NA                          : NULL
## Need to load gene set data consistent with drug regulation data.
UserGS<-get("UserGS")
str(UserGS[1:5])
#> List of 5
#>  $ 00140_10: chr [1:49] "CYP11A1" "HSD3B1" "HSD3B2" "CYP17A1" ...
#>  $ 00250_1 : chr [1:17] "NIT2" "ASNS" "NAT8L" "IL4I1" ...
#>  $ 00030_2 : chr [1:28] "GPI" "DERA" "PRPS1L1" "PRPS1" ...
#>  $ 00071_1 : chr [1:17] "ACADSB" "ACADS" "EHHADH" "HADH" ...
#>  $ 00100_2 : chr [1:8] "EBP" "DHCR7" "SC5D" "DHCR24" ...
Drugs<-PrioSubtypeDrug(Geneexp,Subtype_labels,"Control",UserGS,spw.min.sz=1,drug.spw.data=UserDS,drug.spw.min.sz=1,
                       E_FDR=1,S_FDR=1)
#> Warning in .gsva_newAPI(expr = dataMatrix, gset.idx.list = mappedGeneSets, :
#> Some gene sets have size one. Consider setting 'minSize > 1'.
#> Estimating GSVA scores for 13 gene sets.
#> Estimating ECDFs with Gaussian kernels
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |===========                                                           |  15%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |===========================================                           |  62%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |=================================================================     |  92%
  |                                                                            
  |======================================================================| 100%

Visualization

Plot a heat map of normalized drug-disease reverse association scores for cancer samples

require(pheatmap)
## Heat map of normalized disease-drug reverse association scores for all subtype-specific drugs.
plotDScoreHeatmap(data=Subtype_drugs,E_Pvalue.th=0.05,E_FDR.th=1,S_Pvalue.th=0.05,S_FDR.th=1,show.colnames = FALSE)

## Plot only Basal subtype-specific drugs.
plotDScoreHeatmap(Subtype_drugs,subtype.label="Basal",SDS="all",E_Pvalue.th=0.05,E_FDR.th=1,S_Pvalue.th=0.05,S_FDR.th=1,show.colnames = FALSE)

Plot heat map of patient-specific subpathway activity profiles of drug-regulated subpathway

## Plot a heat map of the individualized activity aberrance scores of subpathway regulated by drug pirenperone(1.02e-05M). 
## Basal-specific drugs pirenperone(1.02e-05M) regulated subpathways that show opposite activity from normal samples.
plotDSpwHeatmap(data=Subtype_drugs,drug.label="pirenperone(1.02e-05M)",subtype.label="Basal",show.colnames=FALSE)

Plot a global graph of the drug

## Plot a global graph of the Basal-specific drug pirenperone(1.02e-05M).
plotGlobalGraph(data=Subtype_drugs,drug.label="pirenperone(1.02e-05M)")

Polt a subpathway network graph

require(igraph)
# plot network graph of the subpathway 00020_4.
plotSpwNetGraph(spwid="00020_4")

Visualize the chemical structure of the drug

require(ChemmineR)
require(rvest)
## Plot the chemical structure of drug pirenperone.
Chem_str<-getDrugStructure(drug.label="pirenperone")
plot(Chem_str)

References

Ahn, TaeJin, Eunjin Lee, Nam Huh, and Taesung Park. 2014. “Personalized Identification of Altered Pathways in Cancer Using Accumulated Normal Tissue Data.” Bioinformatics 30 (17): i422–29.

Barbie, David A, Pablo Tamayo, Jesse S Boehm, So Young Kim, Susan E Moody, Ian F Dunn, Anna C Schinzel, et al. 2009. “Systematic RNA Interference Reveals That Oncogenic KRAS-Driven Cancers Require TBK1.” Nature 462 (7269): 108.

Benjamini, Y, Y Hochberg, and etc. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society 57 (1): 289–300.

Hänzelmann, Sonja, Robert Castelo, and Justin Guinney. 2013. “GSVA: Gene Set Variation Analysis for Microarray and RNA-Seq Data.” BMC Bioinformatics 14 (1): 7.

Lamb, Justin, Emily D Crawford, David Peck, Joshua W Modell, Irene C Blat, Matthew J Wrobel, Jim Lerner, et al. 2006. “The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease.” Science 313 (5795): 1929–35.

Li, Chunquan, Xia Li, Yingbo Miao, Qianghu Wang, Wei Jiang, Chun Xu, Jing Li, et al. 2009. “SubpathwayMiner: A Software Package for Flexible Identification of Pathways.” Nucleic Acids Research 37 (19): e131.

Subramanian, Aravind, Pablo Tamayo, Vamsi K Mootha, Sayan Mukherjee, Benjamin L Ebert, Michael A Gillette, Amanda Paulovich, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” Proceedings of the National Academy of Sciences 102 (43): 15545–50.

Syed, Haider, Yao Cindy Q, Sabine Vicky S, and etc. 2018. “Pathway-Based Subnetworks Enable Cross-Disease Biomarker Discovery.” Nat Commun 9 (1): 4746.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.

SubtypeDrug: a software package for prioritization of candidate cancer subtype-specific drugs

Xudong HanJunwei HanChonghui Liu

2024-04-17

Introduction

Identifying drug-regulated subpathways

Infer patient-specific subpathway activity profiles.

Calculating drug-disease reverse association score.

Identifying cancer subtype-specific drugs

Calculating the cancer subtype-specific drug score

Estimating statistically significance level of sutype-specific drug score

Visualization

Plot a heat map of normalized drug-disease reverse association scores for cancer samples

Plot heat map of patient-specific subpathway activity profiles of drug-regulated subpathway

Plot a global graph of the drug

Polt a subpathway network graph

Visualize the chemical structure of the drug

References

Xudong Han
Junwei Han
Chonghui Liu