Package Versions Matter

Gabriel Becker, Michael Lawrence

1 Reproducible Data Analyses

1.1 Four pillars of Data Analysis

  • Data
  • Code
  • Statistical Methods
  • Software Used

1.2 Our Focus

  • Data
  • Code
  • Statistical Methods
  • Software Used
    • including specific versions used

2 Package Cohorts

2.1 But package versions don't live in isolation

  • A result depends on the versions of a package's dependencies, their dependencies, etc.
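
To illustrate why a single package version is not enough, here is a minimal Python sketch that collects every package a result depends on, directly or indirectly. The dependency graph and all package names are hypothetical, invented for illustration only; they are not taken from any real repository.

```python
from typing import Dict, List, Set

# Hypothetical dependency graph: package name -> direct dependencies.
DEPENDS: Dict[str, List[str]] = {
    "analysisPkg": ["modelPkg", "plotPkg"],
    "modelPkg": ["matrixPkg"],
    "plotPkg": ["matrixPkg", "colorPkg"],
    "matrixPkg": [],
    "colorPkg": [],
}


def cohort_of(pkg: str, depends: Dict[str, List[str]]) -> Set[str]:
    """Return pkg plus every package it depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [pkg]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(depends.get(current, []))
    return seen


# Every package listed here can influence a result computed with analysisPkg.
print(sorted(cohort_of("analysisPkg", DEPENDS)))
```

Pinning only the top-level package would leave the versions of the other packages in this set free to vary.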

2.2 Package Cohorts are crucial

  • Collaborations
    • Syncing package cohorts helps ensure comparability of results
  • Package maintenance
    • Differentiating and switching between development and analysis cohorts
  • Large organizations/departments
    • Specify/provide canonical package cohorts for use by all members
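
As a rough sketch of what "syncing" cohorts could involve, the Python snippet below compares two cohorts, each modeled as a mapping from package name to version string, and reports where they disagree. The `Cohort` type, the `cohort_diff` helper, and the package names and versions are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# A cohort is modeled here as: package name -> version string (an assumption).
Cohort = Dict[str, str]


def cohort_diff(a: Cohort, b: Cohort) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each package whose version differs to (version in a, version in b).

    None marks a package that is missing from one of the cohorts.
    """
    diffs: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for name in sorted(set(a) | set(b)):
        version_a, version_b = a.get(name), b.get(name)
        if version_a != version_b:
            diffs[name] = (version_a, version_b)
    return diffs


# Hypothetical cohorts of two collaborators.
alice = {"modelPkg": "1.2.0", "plotPkg": "0.9.1", "matrixPkg": "2.0.0"}
bob = {"modelPkg": "1.3.0", "plotPkg": "0.9.1"}

for name, (va, vb) in cohort_diff(alice, bob).items():
    print(f"{name}: {va} vs {vb}")
```

An empty difference suggests the two collaborators are working with the same cohort, at least as far as package versions go.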

3 User needs

3.1 Users need tools

To allow effective management of packages at the cohort level

  • Package libraries
    • Create, populate, and switch between
  • Generalized installation
    • Version-specific
      • Past releases and devel versions
    • Repo and non-repo sources
  • Describing cohorts
    • Define versioned or non-versioned cohorts
    • Publish cohorts
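
One possible way to describe and publish a cohort is sketched below in Python: each entry names a package, optionally pins a version, and records where it comes from, and the whole cohort is serialized to JSON so it can be shared. The `PackageSpec` class, its field names, the `"repo"`/`"vcs"` source labels, and the JSON layout are hypothetical choices for illustration, not an existing tool's format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class PackageSpec:
    name: str
    version: Optional[str] = None  # None means non-versioned (any version)
    source: str = "repo"           # hypothetical label, e.g. "repo" or "vcs"


def is_versioned(cohort: List[PackageSpec]) -> bool:
    """True when every package in the cohort is pinned to a version."""
    return all(spec.version is not None for spec in cohort)


def publish(cohort: List[PackageSpec]) -> str:
    """Serialize a cohort to JSON text that can be shared with others."""
    return json.dumps([asdict(spec) for spec in cohort], indent=2, sort_keys=True)


def load(text: str) -> List[PackageSpec]:
    """Rebuild a cohort from text produced by publish()."""
    return [PackageSpec(**entry) for entry in json.loads(text)]


# Hypothetical cohort mixing a pinned release and an unpinned development package.
cohort = [
    PackageSpec("modelPkg", "1.2.0"),
    PackageSpec("plotPkg", None, source="vcs"),
]
print(publish(cohort))
```

Loading the published text reproduces the same list of specs, so a collaborator can start from exactly the cohort description that was shared.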