Abstract
The pkggraph package is meant for interactively exploring the dependencies of one or more packages (on CRAN-like repositories) and analysing them using the tidy philosophy. Most of the functions return a tibble object (an enhancement of the data frame) which can be used for further analysis. The package also offers functions to produce network and igraph dependency graphs. The plot method produces a static plot based on ggnetwork, and the plotd3 function produces an interactive D3 plot based on networkD3.
suppressPackageStartupMessages(library("dplyr")) # for tidy data manipulations
suppressPackageStartupMessages(library("magrittr")) # for friendly piping
suppressPackageStartupMessages(library("network")) # for plotting
suppressPackageStartupMessages(library("sna")) # for plotting
suppressPackageStartupMessages(library("statnet.common")) # for plotting
suppressPackageStartupMessages(library("networkD3")) # for plotting
suppressPackageStartupMessages(library("igraph")) # for graph computations
suppressPackageStartupMessages(library("ggplot2")) # for the bar chart of authority scores
suppressPackageStartupMessages(library("pkggraph")) # attach the package
suppressMessages(init(local = TRUE)) # initiate the package
get_neighborhood("mlr") # a tibble, every row indicates a dependency
## # A tibble: 445 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 ada Depends rpart
## 2 adabag Depends rpart
## 3 adabag Depends mlbench
## 4 adabag Depends caret
## 5 bartMachine Depends randomForest
## 6 batchtools Depends data.table
## 7 bst Depends gbm
## 8 caret Depends ggplot2
## 9 clusterSim Depends cluster
## 10 clusterSim Depends MASS
## # ... with 435 more rows
# observe only 'Imports' and reverse 'Imports'
neighborhood_graph("mlr", relation = "Imports") %>%
  plot()
# observe the neighborhood of 'tidytext' package
get_neighborhood("tidytext") %>%
  make_neighborhood_graph() %>%
  plot()
# interact with the neighborhood of 'tm' package
# legend does not appear in the vignette, but it appears directly
neighborhood_graph("tm") %>%
  plotd3(700, 700)
# which packages work as 'hubs' or 'authorities' in the above graph
neighborhood_graph("tidytext", type = "igraph") %>%
  extract2(1) %>%
  authority_score() %>%
  extract2("vector") %>%
  tibble(package = names(.), score = .) %>%
  top_n(10, score) %>%
  ggplot(aes(reorder(package, score), score)) +
  geom_bar(stat = "identity") +
  xlab("package") +
  ylab("score") +
  coord_flip()
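To give a feel for what an authority score measures, here is a minimal base R sketch of the underlying idea (repeated multiplication by the transposed and original adjacency matrix, rescaled each round). The four packages A to D and their edges are invented, and authority_by_power_iteration is a hypothetical helper, not a pkggraph or igraph function; igraph's own routine is not reproduced here.

```r
# Toy dependency graph: a 1 in row i, column j means "i depends on j".
# Packages A-D and their edges are made up for illustration.
pkgs <- c("A", "B", "C", "D")
adj <- matrix(0, nrow = 4, ncol = 4, dimnames = list(pkgs, pkgs))
adj["A", "B"] <- 1
adj["A", "C"] <- 1
adj["B", "C"] <- 1
adj["D", "C"] <- 1

# Hypothetical helper: approximate authority scores by power iteration.
authority_by_power_iteration <- function(adj, iterations = 100) {
  score <- rep(1, ncol(adj))
  for (i in seq_len(iterations)) {
    hub <- adj %*% score                 # hub score: sum of authorities pointed to
    score <- as.vector(t(adj) %*% hub)   # authority: sum of hubs pointing in
    score <- score / max(score)          # rescale so the largest score is 1
  }
  names(score) <- colnames(adj)
  score
}

scores <- authority_by_power_iteration(adj)
print(round(scores, 3))
```

In this toy graph, C is depended upon by every other package and therefore ends up with the largest authority score, while A and D, which nothing depends on, score zero.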
The pkggraph package aims to provide a consistent and intuitive platform to explore the dependencies of packages in CRAN-like repositories.
The package attempts to strike a balance between two aspects:
The aim is that we neither fail to see the trees for the forest nor see only the forest!
The important features of pkggraph are:
Most functions return a tibble with three columns (pkg_1, relation, pkg_2). The first row in the table below indicates that the dplyr package ‘Imports’ the assertthat package.
get_imports(c("dplyr", "tidyr"))
## # A tibble: 20 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 dplyr Imports assertthat
## 2 dplyr Imports bindrcpp
## 3 dplyr Imports glue
## 4 dplyr Imports magrittr
## 5 dplyr Imports methods
## 6 dplyr Imports pkgconfig
## 7 dplyr Imports rlang
## 8 dplyr Imports R6
## 9 dplyr Imports Rcpp
## 10 dplyr Imports tibble
## 11 dplyr Imports utils
## 12 tidyr Imports dplyr
## 13 tidyr Imports glue
## 14 tidyr Imports magrittr
## 15 tidyr Imports purrr
## 16 tidyr Imports rlang
## 17 tidyr Imports Rcpp
## 18 tidyr Imports stringi
## 19 tidyr Imports tibble
## 20 tidyr Imports tidyselect
Functions that return a tibble, e.g. get_reverse_depends.
Functions that return a pkggraph object containing a network or an igraph object, e.g. neighborhood_graph.
A plot method, which uses the ggnetwork package to generate a static plot.
The plotd3 function, which uses networkD3 to produce an interactive D3 plot.
The five different types of dependencies a package can have over another are: Depends, Imports, LinkingTo, Suggests and Enhances.
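The three-column layout and the five dependency types can be pictured with a small base R data frame. The rows below are invented and a plain data.frame stands in for the tibble; this is only a sketch of the data shape, not output from pkggraph.

```r
# Toy edge list in the (pkg_1, relation, pkg_2) layout; the rows are made up.
relation_levels <- c("Depends", "Imports", "LinkingTo", "Suggests", "Enhances")
edges <- data.frame(
  pkg_1    = c("A", "A", "B", "C", "C"),
  relation = factor(c("Depends", "Imports", "Imports", "Suggests", "LinkingTo"),
                    levels = relation_levels),
  pkg_2    = c("B", "C", "C", "A", "D"),
  stringsAsFactors = FALSE
)

# Count edges per dependency type; types that do not occur are counted as zero.
relation_counts <- table(edges$relation)
print(relation_counts)

# Keep only the rows of one type, much as one would with dplyr::filter().
imports_only <- edges[edges$relation == "Imports", ]
print(imports_only)
```

Because every row is one directed edge, the usual data frame verbs (filtering, counting, joining) apply directly to dependency data.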
init
Always begin with init(). This creates two variables, deptable and packmeta, in the environment where it is called. The variables are either loaded from a local copy or computed after downloading from the internet (when local = FALSE, the default value). It is suggested to use init(local = FALSE) to get up-to-date dependencies.
library("pkggraph")
init(local = FALSE)
The repository argument takes CRAN, bioconductor and omegahat repositories. For other CRAN-like repositories not listed in repository, an additional argument named repos is required.
get family
These functions return a tibble.
They take packages as their first argument.
They take a level argument (default value is 1).
get_imports("ggplot2")
## # A tibble: 10 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 ggplot2 Imports digest
## 2 ggplot2 Imports grid
## 3 ggplot2 Imports gtable
## 4 ggplot2 Imports MASS
## 5 ggplot2 Imports plyr
## 6 ggplot2 Imports reshape2
## 7 ggplot2 Imports scales
## 8 ggplot2 Imports stats
## 9 ggplot2 Imports tibble
## 10 ggplot2 Imports lazyeval
Let's observe packages that ‘Suggest’ knitr.
get_reverse_suggests("knitr", level = 1)
## # A tibble: 2,213 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 abbyyR Suggests knitr
## 2 ABC.RAP Suggests knitr
## 3 ABHgenotypeR Suggests knitr
## 4 AbSim Suggests knitr
## 5 ACMEeqtl Suggests knitr
## 6 acmeR Suggests knitr
## 7 acnr Suggests knitr
## 8 ACSNMineR Suggests knitr
## 9 adaptiveGPCA Suggests knitr
## 10 additivityTests Suggests knitr
## # ... with 2,203 more rows
By setting level = 2, observe that packages from the first level (first column of the previous table) and their suggestors are captured.
get_reverse_suggests("knitr", level = 2)
## # A tibble: 5,387 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 abbyyR Suggests knitr
## 2 ABCoptim Suggests covr
## 3 ABC.RAP Suggests knitr
## 4 abctools Suggests ggplot2
## 5 abd Suggests ggplot2
## 6 abd Suggests Hmisc
## 7 ABHgenotypeR Suggests knitr
## 8 AbSim Suggests knitr
## 9 acebayes Suggests R.rsp
## 10 ACMEeqtl Suggests knitr
## # ... with 5,377 more rows
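The effect of the level argument can be sketched on a toy edge list: each extra level repeats the reverse lookup for the packages found at the previous level. The function reverse_by_level and the packages A to E are hypothetical and written in base R; this is an assumption about the general idea, not pkggraph's implementation.

```r
# Toy 'Suggests' edges: pkg_1 suggests pkg_2. All names are made up.
edges <- data.frame(
  pkg_1    = c("B", "C", "D", "E"),
  relation = "Suggests",
  pkg_2    = c("A", "A", "B", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collect reverse edges of `pkg` up to `level` steps away.
reverse_by_level <- function(edges, pkg, level = 1) {
  frontier <- pkg                      # packages whose suggestors we look up next
  seen <- character(0)                 # packages already expanded
  keep <- rep(FALSE, nrow(edges))      # rows captured so far
  for (i in seq_len(level)) {
    hit <- edges$pkg_2 %in% frontier
    keep <- keep | hit
    seen <- union(seen, frontier)
    frontier <- setdiff(unique(edges$pkg_1[hit]), seen)
    if (length(frontier) == 0L) break
  }
  edges[keep, , drop = FALSE]
}

level_1 <- reverse_by_level(edges, "A", level = 1)
level_2 <- reverse_by_level(edges, "A", level = 2)
print(level_1)
print(level_2)
```

At level 1 only the direct suggestors B and C are captured; level 2 adds D (which suggests B), while E, three steps away from A, is still left out.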
What if we need to capture dependencies of more than one type, say both Depends and Imports?
get_all_dependencies and get_all_reverse_dependencies
These functions capture direct and reverse dependencies up to the specified level for any subset of dependency types.
get_all_dependencies("mlr", relation = c("Depends", "Imports"))
## # A tibble: 9 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 mlr Depends ParamHelpers
## 2 mlr Imports BBmisc
## 3 mlr Imports backports
## 4 mlr Imports ggplot2
## 5 mlr Imports stringi
## 6 mlr Imports checkmate
## 7 mlr Imports data.table
## 8 mlr Imports parallelMap
## 9 mlr Imports survival
get_all_dependencies("mlr", relation = c("Depends", "Imports"), level = 2)
## # A tibble: 303 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 ada Depends rpart
## 2 adabag Depends rpart
## 3 adabag Depends mlbench
## 4 adabag Depends caret
## 5 bartMachine Depends rJava
## 6 bartMachine Depends bartMachineJARs
## 7 bartMachine Depends car
## 8 bartMachine Depends randomForest
## 9 bartMachine Depends missForest
## 10 batchtools Depends data.table
## # ... with 293 more rows
Observe that ada ‘Depends’ on rpart.
Sometimes we would like to capture only the specified dependencies recursively. In this case, say that at the second level we would like to capture only the ‘Depends’ and ‘Imports’ of packages which were themselves ‘Depends’ or ‘Imports’ of mlr. Then set strict = TRUE.
get_all_dependencies("mlr"
, relation = c("Depends", "Imports")
, level = 2
, strict = TRUE)
## # A tibble: 28 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 mlr Depends ParamHelpers
## 2 BBmisc Imports checkmate
## 3 checkmate Imports backports
## 4 ggplot2 Imports digest
## 5 ggplot2 Imports grid
## 6 ggplot2 Imports gtable
## 7 ggplot2 Imports MASS
## 8 ggplot2 Imports plyr
## 9 ggplot2 Imports reshape2
## 10 ggplot2 Imports scales
## # ... with 18 more rows
Notice that ada is ‘Suggest’ed by mlr. That is why it appeared when strict was FALSE (the default).
What if we need to capture both dependencies and reverse dependencies up to a specified level?
get_neighborhood
This function captures both dependencies and reverse dependencies up to a specified level for a given subset of dependency types.
get_neighborhood("hash", level = 2)
## # A tibble: 62 x 3
## pkg_1 relation pkg_2
## <chr> <fct> <chr>
## 1 BOG Depends hash
## 2 COMBIA Depends hash
## 3 GABi Depends hash
## 4 HAP.ROR Depends hash
## 5 neuroim Depends hash
## 6 orderbook Depends hash
## 7 rpartitions Depends hash
## 8 Rtextrankr Depends KoNLP
## 9 CITAN Imports hash
## 10 covr Imports crayon
## # ... with 52 more rows
get_neighborhood("hash", level = 2) %>%
  make_neighborhood_graph %>%
  plot()
Observe that the testthat family appears because of Suggests. Let's look at Depends and Imports only:
get_neighborhood("hash"
  , level = 2
  , relation = c("Imports", "Depends")
  , strict = TRUE) %>%
  make_neighborhood_graph %>%
  plot()
Observe that the graph below captures the fact that parallelMap ‘Imports’ BBmisc.
get_neighborhood("mlr", relation = "Imports") %>%
  make_neighborhood_graph() %>%
  plot()
get_neighborhood checks whether any packages found up to the specified level depend on each other at one level higher. This can be turned off by setting interconnect = FALSE.
get_neighborhood("mlr", relation = "Imports", interconnect = FALSE) %>%
  make_neighborhood_graph() %>%
  plot()
neighborhood_graph and make_neighborhood_graph
neighborhood_graph creates a graph object of class pkggraph for a set of packages. It takes the same arguments as get_neighborhood and, additionally, type. The type argument defaults to igraph; the alternative is network.
neighborhood_graph("caret", relation = "Imports") %>%
  plot()
make_neighborhood_graph accepts the output of any get_* function as input and produces a graph object.
Essentially, you can obtain the information you need from a get_ function after some trial and error, then create a graph object for further analysis or plotting.
get_all_reverse_dependencies("rpart", relation = "Imports") %>%
  make_neighborhood_graph() %>%
  plot()
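As a rough picture of what turning a get_* edge list into a graph involves, the sketch below builds an adjacency matrix in base R from a made-up three-column data frame. It only illustrates the data transformation and makes no claim about how make_neighborhood_graph is implemented; the packages A, B and C are invented.

```r
# Toy edge list in the (pkg_1, relation, pkg_2) layout; names are made up.
edges <- data.frame(
  pkg_1    = c("A", "A", "B"),
  relation = c("Imports", "Imports", "Imports"),
  pkg_2    = c("B", "C", "C"),
  stringsAsFactors = FALSE
)

# Every package that appears on either side of an edge becomes a node.
nodes <- sort(unique(c(edges$pkg_1, edges$pkg_2)))

# Adjacency matrix: row = dependent package, column = package depended upon.
adj <- matrix(0L, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))
adj[cbind(edges$pkg_1, edges$pkg_2)] <- 1L   # index by (row name, column name) pairs

# Simple graph summaries that follow directly from the matrix.
in_degree  <- colSums(adj)   # how many packages depend on each node
out_degree <- rowSums(adj)   # how many packages each node depends on
print(adj)
print(in_degree)
```

Once the edge list is in graph form, node-level summaries such as in-degree fall out of simple row and column sums.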
relies
For quick dependency checks, one can use the infix operators %depends%, %imports%, %linkingto%, %suggests% and %enhances%.
"dplyr" %imports% "tibble"
## [1] TRUE
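For readers unfamiliar with infix operators in R, the sketch below defines a toy one over an invented edge list. The operator name %imports_toy% and the data are hypothetical, chosen so as not to clash with pkggraph's own operators; it only shows the mechanism of a user-defined %...% function.

```r
# Toy edge list; the packages and relations are made up.
edges <- data.frame(
  pkg_1    = c("A", "A", "B"),
  relation = c("Imports", "Depends", "Imports"),
  pkg_2    = c("B", "C", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical infix operator: TRUE if `lhs` directly 'Imports' `rhs`.
`%imports_toy%` <- function(lhs, rhs) {
  any(edges$pkg_1 == lhs & edges$relation == "Imports" & edges$pkg_2 == rhs)
}

a_imports_b <- "A" %imports_toy% "B"   # TRUE: the row (A, Imports, B) exists
a_imports_c <- "A" %imports_toy% "C"   # FALSE: A only 'Depends' on C
print(c(a_imports_b, a_imports_c))
```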
A package A is said to rely on package B if A either ‘Depends’, ‘Imports’ or ‘LinkingTo’ B, recursively. The relies function captures this.
relies("glmnet")[[1]]
## [1] "Matrix" "utils" "foreach" "methods" "graphics"
## [6] "grid" "stats" "lattice" "codetools" "iterators"
## [11] "grDevices"
# level 1 dependencies of "glmnet" are:
get_all_dependencies("glmnet", relation = c("Imports", "Depends", "LinkingTo"))[[3]]
## [1] "Matrix" "foreach"
"glmnet" %relies% "grid"
## [1] TRUE
reverse_relies("tokenizers")[[1]]
## [1] "covfefe" "ptstem" "tidytext" "statquotes" "widyr"
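The recursive definition of "relies" amounts to a transitive closure over ‘Depends’, ‘Imports’ and ‘LinkingTo’ edges. A minimal base R sketch on an invented edge list follows; relies_toy and the packages A to E are hypothetical and the code is not taken from pkggraph.

```r
# Toy edge list with a cycle (B -> C -> D -> B) and one 'Suggests' edge.
edges <- data.frame(
  pkg_1    = c("A", "B", "C", "A", "D"),
  relation = c("Imports", "Depends", "LinkingTo", "Suggests", "Imports"),
  pkg_2    = c("B", "C", "D", "E", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: every package reachable from `pkg` through
# 'Depends', 'Imports' or 'LinkingTo' edges, followed recursively.
relies_toy <- function(edges, pkg) {
  hard <- edges[edges$relation %in% c("Depends", "Imports", "LinkingTo"), ]
  found <- character(0)
  frontier <- pkg
  while (length(frontier) > 0L) {
    direct <- unique(hard$pkg_2[hard$pkg_1 %in% frontier])
    frontier <- setdiff(direct, c(found, pkg))  # only packages not seen before
    found <- c(found, frontier)
  }
  found
}

a_relies_on <- relies_toy(edges, "A")
print(a_relies_on)
```

Tracking the packages already found keeps the loop from running forever on the cycle, and E is excluded because a ‘Suggests’ edge does not count towards reliance.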
plot and its handles
plot produces a static plot from a pkggraph object. The available handles are:
pkggraph::neighborhood_graph("hash") %>%
  plot()
pkggraph::neighborhood_graph("hash") %>%
  plot(nodeImportance = "in", background = "white")
pkggraph::neighborhood_graph("hash") %>%
  plot(nodeImportance = "none", background = "white")
plotd3
For interactive exploration of large graphs, plotd3 might be better than static plots. Note that,
# legend does not appear in the vignette, but it appears directly
plotd3(neighborhood_graph("tibble"), height = 1000, width = 1000)
Package authors Srikanth KS and Nikhil Singh would like to thank R core, Hadley Wickham for the tidyverse framework, and the fantastic R community!