Visualizing Scientific Replication with repfigure

Prasad Patil

2017-07-18

scifigure contains simple utilites for generating visual representations of scientific reproduction and replication efforts. This package is a companion to an article (Patil, Peng, and Leek 2016) that defines these efforts and explains why such visualizations are helpful. In this vignette, we will demonstrate how to use the functions in this package and recreate the main figure in (Patil, Peng, and Leek 2016).

Basic usage

We have defined a general set of steps that comprise the conduct of a scientific study, and have compiled a set of icons to represent each step. Let us visualize these steps and icons for two studies using the repfigure package:

library(scifigure)
exps <- init_experiments(2)
hide_stages = NULL
sci_figure(exps)    

To produce this figure, we first initialize a data frame using init_experiments, passing the parameter value 2 to indicate that the data frame should represent two studies (with default names for each experiment). We then pass this data frame to sci_figure, which renders the figure based on the contents of exps. Note that a legend is automatically generated, indicating that each step in each of the experiments is “observed”. We can modify the contents of exps to change change how elements of each experiment are displayed.

exps <- init_experiments(2, c("Brady et. al.", "Thomas et. al."))
exps["analysis_plan", 1] <- "unobserved"
exps[c("experimenter", "analyst", "estimate"), 2] <- "different"
sci_figure(exps, hide_stages <- c("population", "hypothesis"))

Above we first specified experiment names for each experiment when we initialized the data frame. These names will be the column headers in the figure. Next, we modified some of the elements of the data frame: the analysis plan in the first study is missing, and the experimenter, analyst, and estimate in the second study differed from the first. Finally, when generating the visualization we hide the population and hypothesis stages.

We have also included convenience wrapper functions to generate visualizations of the definitions of reproducibility and replicability given in (Patil, Peng, and Leek 2016):

reproduce_figure()

replicate_figure()

Recreating Figure 1

The following code snippet recreates the columns in Figure 1 of (Patil, Peng, and Leek 2016):

exps <- init_experiments(9, c("Original", "Reproducible", "Orignal", "Replicable", "Begley", "Payne et. al.", "Vianello (OSF)", "Potti", "Baggerly & Coombs"))
exps["analyst", 2] <- "different" # Reproducible
exps[c("experimenter", "data", "analyst", "code", "estimate", "claim"), 4] <- "different" # Replicable
exps[c("population", "hypothesis", "experimental_design", "experimenter", "data", "analysis_plan", "analyst", "code", "estimate"), 5] <- "unobserved" # Begley
exps[c("population", "experimenter", "data", "analyst", "code", "estimate", "claim"), 7] <- "different" # Vianello (OSF)
exps[c("data", "code"), 8] <- "incorrect" # Potti
exps[c("data", "code"), 9] <- "different" # Baggerly & Coombes

sci_figure(exps)

References

Patil, Prasad, Roger D Peng, and Jeffrey Leek. 2016. “A Statistical Definition for Reproducibility and Replicability.” BioRxiv. Cold Spring Harbor Labs Journals, 066803.