SCIproj

Author: Saskia Otto
License: MIT

An R package for the initialization and organization of a scientific project following reproducible research and FAIR principles.

Overview

SCIproj is an R package that lets users initialize a scientific project through its function create_proj() and manage it as an R package or research compendium. This approach combines structure (where files are located) with workflow (how analyses are reproduced or replicated).

The package is built on modern reproducibility standards and guidelines such as:

Defaults

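The package has some default settings to ensure reproducibility. These include renv for dependency management, a targets pipeline template (_targets.R), a CITATION.cff file, a DATA_SOURCES.md template for data provenance, and a local git repository.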

Project structure

your-project/
├── DESCRIPTION             # Project metadata, dependencies, and author info (with ORCID).
├── README.Rmd              # Top-level project description.
├── your-project.Rproj      # RStudio project file.
├── CITATION.cff            # Machine-readable citation metadata for FAIR compliance.
├── CONTRIBUTING.md         # Contribution guidelines.
├── LICENSE.md              # Full license text (optional, requires add_license).
├── NAMESPACE               # Auto-generated by roxygen2 (do not edit by hand).
│
├── data-raw/               # Raw data files and pre-processing scripts.
│   ├── clean_data.R        # Script template for data cleaning.
│   ├── DATA_SOURCES.md     # Data provenance: source, license, DOI, download date.
│   └── ...
│
├── data/                   # Cleaned datasets stored as .rda files.
│
├── R/                      # Custom R functions and dataset documentation.
│   ├── function_ex.R       # Template for custom functions.
│   ├── data.R              # Template for dataset documentation.
│   └── ...
│
├── analyses/               # R scripts or R Markdown/Quarto documents for analyses.
│   ├── figures/            # Generated plots.
│   └── ...
│
├── docs/                   # Publication-ready documents (article, report, presentation).
├── trash/                  # Temporary files that can be safely deleted.
│
├── _targets.R              # Pipeline definition for reproducible workflow (default).
├── renv/                   # renv library and settings (default).
├── renv.lock               # Lockfile for reproducible package versions (default).
└── Dockerfile              # Container definition for full reproducibility (optional).
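
The layout above can be checked programmatically, for example before archiving a project. The following is a minimal base-R sketch; check_skeleton() is a hypothetical helper and not part of SCIproj, and the list of expected entries is an assumption taken from the tree above.

```r
# Hypothetical helper (not part of SCIproj): report which core entries of the
# project skeleton exist under `path`.
check_skeleton <- function(path = ".") {
  expected <- c("DESCRIPTION", "README.Rmd", "CITATION.cff",
                "data-raw", "data", "R", "analyses", "docs")
  # file.exists() returns TRUE for both files and directories.
  present <- file.exists(file.path(path, expected))
  setNames(present, expected)
}

# Demonstration on a throwaway directory that contains only two entries.
demo <- tempfile("demo-project")
dir.create(file.path(demo, "R"), recursive = TRUE)
invisible(file.create(file.path(demo, "DESCRIPTION")))

status <- check_skeleton(demo)
names(status)[!status]  # entries that are still missing
```

Calling check_skeleton() without arguments from the project root reports on the current project.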

Why an R package as research compendium?

Installation and usage

Install the development version from GitHub:

### Using remotes
# install.packages("remotes")
remotes::install_github("saskiaotto/SCIproj")

### Or better: using the new pak package
# install.packages("pak")
pak::pkg_install("saskiaotto/SCIproj")

Creating the project

library("SCIproj")
create_proj("my_research_project")

This creates a project with renv, targets, CITATION.cff, and DATA_SOURCES.md by default.

Customize with parameters:

### Full-featured project with GitHub, CI, and ORCID
create_proj("my_research_project",
  add_license = "MIT",
  license_holder = "Jane Doe",
  orcid = "0000-0001-2345-6789",
  create_github_repo = TRUE,
  ci = "gh-actions"
)

### Minimal project without workflow tools
create_proj("my_research_project",
  use_renv = FALSE,
  use_targets = FALSE
)

Parameters

Parameter           Default      Description
data_raw            TRUE         Add data-raw/ folder with templates
makefile            FALSE        Add makefile.R template
testthat            FALSE        Add testthat infrastructure
use_pipe            FALSE        Add magrittr pipe (native |> recommended)
add_license         NULL         License type: "MIT", "GPL", "Apache", etc.
license_holder      "Your name"  License holder / project author
orcid               NULL         ORCID iD for CITATION.cff
use_git             TRUE         Initialize local git repo
create_github_repo  FALSE        Create GitHub repo (needs GITHUB_PAT)
ci                  "none"       CI type: "none" or "gh-actions"
use_renv            TRUE         Initialize renv for dependency management
use_targets         TRUE         Add _targets.R pipeline template
use_docker          FALSE        Add Dockerfile template
open_proj           FALSE        Open new project in RStudio

Developing the project

  1. Create the project with create_proj().

  2. Edit DESCRIPTION with project metadata: title, summary, contributors (with ORCID), license, dependencies.

  3. Edit README.Rmd with project details: objectives, timeline, workflow.

  4. Document your data provenance in data-raw/DATA_SOURCES.md: source, license, download date, DOI for each dataset.

  5. Place original (raw) data in data-raw/. Use clean_data.R (or additional scripts) for pre-processing, and store the cleaned datasets with usethis::use_data().

  6. Document clean datasets using roxygen in R/ (see template data.R). For details, see Documenting data.

  7. Place custom functions in R/ with roxygen documentation. See the documentation chapter in the R Packages book.

  8. Write tests for your functions in tests/ (set testthat = TRUE in create_proj()). See Testing basics.

  9. Place analysis scripts/notebooks in analyses/. Save plots in analyses/figures/.

  10. Place final manuscripts, reports, and presentations in docs/. Use R Markdown, Quarto, or templates from rticles, thesisdown, or Quarto journal extensions.

  11. Keep dependencies in sync: usethis::use_package() for DESCRIPTION, renv::snapshot() for the lockfile.

  12. Update CITATION.cff when you archive your project or publish.
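
Steps 5 to 8 above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the dataset clean_obs, its columns, and the function celsius_to_kelvin() are hypothetical names and do not come from the SCIproj templates. The usethis call is commented out because it only works inside a package project, and the test runs only if testthat is installed.

```r
# --- data-raw/clean_data.R (step 5) -----------------------------------------
# Stand-in for reading a raw file, e.g. read.csv("data-raw/observations.csv").
raw_obs <- data.frame(
  site   = c("A", "B", NA),
  temp_c = c("12.1", "13.4", "11.9"),
  stringsAsFactors = FALSE
)
clean_obs <- raw_obs[!is.na(raw_obs$site), ]      # drop rows without a site
clean_obs$temp_c <- as.numeric(clean_obs$temp_c)  # text to numeric
# Inside the project, store the result as data/clean_obs.rda:
# usethis::use_data(clean_obs, overwrite = TRUE)

# --- R/data.R (step 6) -------------------------------------------------------
#' Cleaned temperature observations (hypothetical example dataset)
#'
#' @format A data frame with 2 rows and 2 variables:
#' \describe{
#'   \item{site}{Site identifier}
#'   \item{temp_c}{Temperature in degrees Celsius}
#' }
#' @source See data-raw/DATA_SOURCES.md
"clean_obs"

# --- R/function_ex.R (step 7) ------------------------------------------------
#' Convert degrees Celsius to Kelvin
#'
#' @param x Numeric vector of temperatures in degrees Celsius.
#' @return Numeric vector of temperatures in Kelvin.
#' @export
celsius_to_kelvin <- function(x) {
  stopifnot(is.numeric(x))
  x + 273.15
}

# --- tests/testthat/test-function_ex.R (step 8) ------------------------------
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("celsius_to_kelvin() shifts by 273.15", {
    testthat::expect_equal(celsius_to_kelvin(0), 273.15)
    testthat::expect_error(celsius_to_kelvin("warm"))
  })
}
```

In a real project each part lives in its own file, as the comment headers indicate. The roxygen comments are turned into help pages and NAMESPACE entries when the documentation is regenerated, for example with devtools::document().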

Workflow

For a detailed introduction to targets, see the user manual.
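
As an illustration of what a pipeline definition can look like, here is a minimal _targets.R sketch. It is not the template that SCIproj generates; the file name data-raw/observations.csv and the columns temp_c and depth_m are placeholders chosen for this example.

```r
# _targets.R: a hypothetical four-step pipeline (all names are placeholders)
library(targets)

# Packages that the targets may use; adjust to the project's dependencies.
tar_option_set(packages = c("stats", "utils"))

# The script must end with a list of target definitions.
list(
  # Track the raw file itself so that edits to it invalidate later steps.
  tar_target(raw_file, "data-raw/observations.csv", format = "file"),
  # Read the raw data.
  tar_target(raw_data, read.csv(raw_file)),
  # Keep only complete rows.
  tar_target(clean_data, raw_data[complete.cases(raw_data), ]),
  # Fit a simple linear model on the cleaned data.
  tar_target(model, lm(temp_c ~ depth_m, data = clean_data))
)
```

Running targets::tar_make() from the project root builds only the targets that are out of date, and targets::tar_read(model) retrieves a stored result.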

For maximum reproducibility, consider also using Docker (use_docker = TRUE). See the Rocker Project for R-specific Docker images.

Archiving and DOI

When your project is finalized:

  1. Archive the GitHub repo to make it read-only.
  2. Get a DOI via Zenodo (integrates directly with GitHub) or another DOI Registration Agency.
  3. Update CITATION.cff with the DOI.
  4. Optionally, generate a codemeta.json with codemetar::write_codemeta() for richer metadata.
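
Step 3 can be scripted. The sketch below uses base R only and treats CITATION.cff as plain text; set_cff_doi() is a hypothetical helper, the DOI is a placeholder, and a line-based edit like this assumes that any existing doi: key sits on its own line at the top level of the file.

```r
# Hypothetical helper (not part of SCIproj): set or replace the top-level
# `doi:` entry of a CITATION.cff file.
set_cff_doi <- function(path, doi) {
  lines <- readLines(path, warn = FALSE)
  doi_line <- paste0("doi: ", doi)
  idx <- grep("^doi:", lines)
  if (length(idx) > 0) {
    lines[idx[1]] <- doi_line   # replace the existing entry
  } else {
    lines <- c(lines, doi_line) # append a new top-level entry
  }
  writeLines(lines, path)
  invisible(lines)
}

# Demonstration on a temporary file with minimal, made-up metadata.
cff <- tempfile(fileext = ".cff")
writeLines(c(
  "cff-version: 1.2.0",
  "message: \"If you use this project, please cite it.\"",
  "title: \"My research project\""
), cff)

set_cff_doi(cff, "10.5281/zenodo.0000000")  # placeholder DOI
readLines(cff)
```

Because the helper replaces an existing entry rather than appending a second one, it can be rerun safely when a new version receives a new DOI.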

Useful resources

Guidelines and standards

R packages and tools

Research compendium concept

Credits
