Author: Tal Galili ( Tal.Galili@gmail.com )

Introduction

A heatmap is a popular graphical method for visualizing high-dimensional data, in which a table of numbers is encoded as a grid of colored cells. The rows and columns of the matrix are ordered to highlight patterns and are often accompanied by dendrograms. Heatmaps are used in many fields for visualizing observations, correlations, missing-value patterns, and more.

Interactive heatmaps allow the inspection of specific values by hovering the mouse over a cell, as well as zooming into a region of the heatmap by dragging a rectangle around the relevant area.

This work is based on the ggplot2 and plotly.js engines. It produces heatmaps similar to those of d3heatmap, with the advantage of speed (plotly.js is able to handle larger matrices) and the ability to zoom from the dendrogram.

Installation

To install the stable version on CRAN:

install.packages('heatmaply')

To install the GitHub version:

# You'll need devtools
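# Install a package only if it cannot be loaded
# (character.only = TRUE is needed because pkg holds the package name as a string)
install.packages.2 <- function(pkg) if (!require(pkg, character.only = TRUE)) install.packages(pkg)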
install.packages.2('devtools')
# make sure you have Rtools installed first! if not, then run:
#install.packages('installr'); install.Rtools()

devtools::install_github("ropensci/plotly") 
devtools::install_github('talgalili/heatmaply')

And then you may load the package using:

library("heatmaply")

Usage

Default

library(heatmaply)
heatmaply(mtcars)

Because the labels are somewhat long, we need to fix the margins manually (hopefully this will be fixed in future versions of plotly):

heatmaply(mtcars, margins = c(40, 130))
# heatmaply(mtcars) %>% layout(margin = list(l = 130, b = 40))

We can also use this with a correlation matrix. Notice the use of limits to set the range of the colors, and of k_col and k_row to color the branches of the dendrograms:

heatmaply(cor(mtcars), margins = c(40, 40),
          k_col = 2, k_row = 2,
          limits = c(-1,1))
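
To make the branch coloring more concrete: k_col and k_row ask for the dendrogram branches to be colored by that many groups. The following base-R sketch lists which variables would fall into each of two groups. It assumes the default Euclidean distance (dist) and complete-linkage clustering (hclust); it only mimics the grouping and is not how heatmaply itself draws the colors.

```r
# Correlation matrix of the mtcars variables (11 x 11)
cm <- cor(mtcars)

# Hierarchical clustering of the rows, assuming the defaults dist() and hclust()
hc <- hclust(dist(cm))

# Cut the tree into k = 2 groups, analogous to k_row = 2
groups <- cutree(hc, k = 2)

# List the variable names in each group
split(names(groups), groups)
```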

Various seriation options

heatmaply uses the seriation package to find an optimal ordering of rows and columns. Optimal here means minimizing the Hamiltonian path length, subject to the restriction imposed by the dendrogram structure. In other words, the branches are rotated so that the sum of distances between adjacent leaves (labels) is minimized. This is related to a restricted version of the travelling salesman problem. The default option is “OLO” (optimal leaf ordering), which optimizes the above-mentioned criterion (it works in O(n^4)). Another option is “GW” (Gruvaeus and Wainer), which aims for the same goal but uses a (possibly faster) heuristic. The option “mean” gives the output we would get by default from heatmap functions in other packages, such as gplots::heatmap.2. The option “none” gives us the dendrograms without any rotation.

# The default of heatmaply:
heatmaply(mtcars[1:10,], margins = c(40, 130),
          seriate = "OLO")
# Similar to OLO but less optimal (since it is a heuristic)
heatmaply(mtcars[1:10,], margins = c(40, 130),
          seriate = "GW")
# the default of gplots::heatmap.2
heatmaply(mtcars[1:10,], margins = c(40, 130),
          seriate = "mean")
# the default output from hclust
heatmaply(mtcars[1:10,],  margins = c(40, 130),
          seriate = "none")
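
The criterion described above (the sum of distances between adjacent leaves) can be computed by hand for any leaf ordering. The sketch below uses only base R; path_length is a hypothetical helper written for this illustration and is not part of heatmaply or seriation. It compares the ordering returned by hclust with an ordering rotated by row means, in the spirit of seriate = "none" and seriate = "mean", assuming the default dist and hclust settings.

```r
# Sum of distances between adjacent leaves for a given leaf ordering.
# `path_length` is a hypothetical helper, not part of heatmaply or seriation.
path_length <- function(d, ord) {
  m <- as.matrix(d)
  sum(m[cbind(ord[-length(ord)], ord[-1])])
}

x  <- as.matrix(mtcars[1:10, ])
d  <- dist(x)
hc <- hclust(d)

# Ordering produced by hclust without any rotation
ord_none <- hc$order

# Ordering after rotating the branches according to the row means
ord_mean <- order.dendrogram(reorder(as.dendrogram(hc), rowMeans(x)))

# Smaller values mean that neighbouring rows are more similar
c(none = path_length(d, ord_none), mean = path_length(d, ord_mean))
```

Both orderings respect the same dendrogram; only the rotation of the branches differs, which is exactly the freedom that “OLO” and “GW” exploit.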

This work relies heavily on the seriation package (its vignette is well worth the read), and also lightly on the dendextend package (see its vignette).

Changing color palettes

We can use colors other than the default viridis. For example, we may want to use other color palettes in order to get divergent colors for the correlations (sadly, these will be less friendly to color-blind people):

# divergent_viridis_magma <- c(rev(viridis(100, begin = 0.3)), magma(100, begin = 0.3))
# rwb <- colorRampPalette(colors = c("darkred", "white", "darkgreen"))
library(RColorBrewer)
# display.brewer.pal(11, "BrBG")
BrBG <- colorRampPalette(brewer.pal(11, "BrBG"))
Spectral <- colorRampPalette(brewer.pal(11, "Spectral"))

heatmaply(cor(mtcars), margins = c(40, 40),
          k_col = 2, k_row = 2,
          colors = BrBG(256),
          limits = c(-1,1))
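
A divergent palette can also be built with base R alone. In the sketch below, blue_white_red is a hypothetical name and the three anchor colors are an arbitrary choice made for illustration. Using an odd number of colors places the middle anchor exactly at the center of the palette, so with symmetric limits a correlation of 0 maps to that central color.

```r
# Hypothetical palette function; the three anchor colors are an arbitrary choice
blue_white_red <- colorRampPalette(c("darkblue", "white", "darkred"))

# An odd number of colors puts the middle anchor exactly at the center
pal <- blue_white_red(255)
pal[128]  # the central color of the palette

# With symmetric limits, a correlation of 0 maps to the central color
if (requireNamespace("heatmaply", quietly = TRUE)) {
  p <- heatmaply::heatmaply(cor(mtcars), colors = pal, limits = c(-1, 1))
}
```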

Another example for using colors:

heatmaply(mtcars, margins = c(40, 130),
          colors = heat.colors(100))

Or even more customized colors using scale_fill_gradient_fun:

heatmaply(mtcars, margins = c(40, 130),
          scale_fill_gradient_fun = ggplot2::scale_fill_gradient2(
            low = "blue", high = "red",
            midpoint = 200, limits = c(0, 500)))

Missing values

Reviewing missing values:

library(heatmaply)

# warning - using grid_color cannot handle a large matrix!
airquality[1:10,] %>% is.na10 %>% 
  heatmaply(colors = c("white","black"), grid_color = "grey",
            k_col =3, k_row = 3,
            margins = c(40, 50)) 
# airquality %>% is.na10 %>% 
#   heatmaply(colors = c("grey80", "grey20"), # grid_color = "grey",
#             k_col =3, k_row = 3,
#             margins = c(40, 50)) 
# 
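
To see what is being plotted here, the following base-R sketch builds the same kind of 0/1 matrix by hand. It assumes that is.na10 encodes a missing value as 1 and an observed value as 0; the names aq and na_mat are introduced only for this illustration.

```r
# First ten rows of the built-in airquality data
aq <- airquality[1:10, ]

# is.na() gives a logical matrix; adding 0 converts TRUE/FALSE to 1/0
na_mat <- is.na(aq) + 0

# Number of missing values per column
colSums(na_mat)
```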

Replicating the dendrogram ordering of heatmap.2

The following example shows how to get the same result in heatmaply as with heatmap.2:

x  <- as.matrix(datasets::mtcars)
gplots::heatmap.2(x, trace = "none", col = viridis(100), key = FALSE)

And with heatmaply (the only difference is the side of the row dendrogram; it might become possible to modify this in future versions of ggplot2/plotly/heatmaply):

heatmaply::heatmaply(x, seriate = "mean")

Adding additional factors using RowSideColors

With heatmap.2

# Example for using RowSideColors

x  <- as.matrix(datasets::mtcars)
rc <- colorspace::rainbow_hcl(nrow(x))

library(gplots)
#> 
#> Attaching package: 'gplots'
#> The following object is masked from 'package:stats':
#> 
#>     lowess
library(viridis)
heatmap.2(x, trace = "none", col = viridis(100),
          RowSideColors=rc, key = FALSE)

With heatmaply

heatmaply(x, seriate = "mean",
          RowSideColors=rc)

A more sophisticated heatmap (the hover at the top row doesn’t work due to an issue with plotly; we hope this will be resolved in the future):

heatmaply(x[,-c(8,9)], seriate = "mean",
          col_side_colors = c(rep(0,5), rep(1,4)),
          row_side_colors = x[,8:9])
#> Warning in heatmaply.heatmapr(hm, scale_fill_gradient_fun =
#> scale_fill_gradient_fun, : The hover text for col_side_colors is currently
#> not implemented (due to an issue in plotly). We hope this would get
#> resolved in future releases.

Credit

This package exists thanks to the amazing work done by MANY people in the open source community. Beyond the many people working on the pipeline of R, thanks should go to the plotly team, and especially to Carson Sievert and others working on the R package of plotly. Also, many of the design elements were inspired by the work done on heatmap, heatmap.2 and d3heatmap, so special thanks go to the R core team, Gregory R. Warnes, and Joe Cheng from RStudio. The dendrogram side of the package is based on the work in dendextend, for which special thanks should go to Andrie de Vries for his original work on bringing dendrograms to ggplot2 (which evolved into the richer ggdend objects, as implemented in dendextend).

Contact

You are welcome to contact the author by e-mail (Tal.Galili@gmail.com).

Latest news

You can see the most recent changes to the package in the NEWS.md file.

Session info

sessionInfo()
#> R version 3.3.0 (2016-05-03)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 7 x64 (build 7601) Service Pack 1
#> 
#> locale:
#> [1] LC_COLLATE=C                   LC_CTYPE=Hebrew_Israel.1255   
#> [3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C                  
#> [5] LC_TIME=Hebrew_Israel.1255    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] gplots_3.0.1       RColorBrewer_1.1-2 knitr_1.13        
#> [4] heatmaply_0.7.0    viridis_0.3.4      plotly_4.5.5.9000 
#> [7] ggplot2_2.1.0.9001
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtools_3.5.0       modeltools_0.2-21  reshape2_1.4.2    
#>  [4] purrr_0.2.2        kernlab_0.9-24     lattice_0.20-33   
#>  [7] colorspace_1.2-7   htmltools_0.3.5    stats4_3.3.0      
#> [10] viridisLite_0.1.3  yaml_2.1.13        base64enc_0.1-3   
#> [13] DBI_0.5-1          prabclus_2.2-6     registry_0.3      
#> [16] fpc_2.1-10         foreach_1.4.3      plyr_1.8.4        
#> [19] robustbase_0.92-5  stringr_1.1.0      munsell_0.4.3     
#> [22] gtable_0.2.0       caTools_1.17.1     htmlwidgets_0.7   
#> [25] mvtnorm_1.0-5      codetools_0.2-14   evaluate_0.9      
#> [28] labeling_0.3       seriation_1.2-0    flexmix_2.3-13    
#> [31] class_7.3-14       DEoptimR_1.0-4     trimcluster_0.1-2 
#> [34] Rcpp_0.12.7        KernSmooth_2.23-15 scales_0.4.0.9003 
#> [37] diptest_0.75-7     formatR_1.4        gdata_2.17.0      
#> [40] jsonlite_1.1       gridExtra_2.2.1    digest_0.6.10     
#> [43] gclus_1.3.1        stringi_1.1.2      dplyr_0.5.0       
#> [46] grid_3.3.0         bitops_1.0-6       tools_3.3.0       
#> [49] magrittr_1.5       lazyeval_0.2.0     tibble_1.2        
#> [52] cluster_2.0.4      whisker_0.3-2      tidyr_0.6.0       
#> [55] dendextend_1.4.0   MASS_7.3-45        assertthat_0.1    
#> [58] rmarkdown_0.9.6    httr_1.2.1         iterators_1.0.8   
#> [61] R6_2.2.0           TSP_1.1-4          mclust_5.2        
#> [64] nnet_7.3-12