The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

redquack

Transfer REDCap data to DuckDB with minimal memory overhead, designed for large datasets that exceed available RAM.

Features

Installation

From CRAN:

install.packages("redquack")

Development version:

# install.packages("pak")
pak::pak("dylanpieper/redquack")

Basic Usage

Data from REDCap is transferred to DuckDB in configurable chunks of record IDs:

library(redquack)

con <- redcap_to_duckdb(
  redcap_uri = "https://redcap.example.org/api/",
  token = "YOUR_API_TOKEN",
  record_id_name = "record_id",
  chunk_size = 1000  
  # Increase chunk size for memory-efficient systems (faster)
  # Decrease chunk size for memory-constrained systems (slower)
)

Working with the data

Query the data with dplyr:

library(dplyr)

demographics <- tbl(con, "data") |>
  filter(demographics_complete == 2) |>
  select(record_id, age, race, gender) |>
  collect()

age_summary <- tbl(con, "data") |>
  group_by(gender) |>
  summarize(
    n = n(),
    mean_age = mean(age, na.rm = TRUE),
    median_age = median(age, na.rm = TRUE)
  ) |>
  collect()

Create a Parquet file directly from DuckDB (efficient for sharing data):

DBI::dbExecute(con, "COPY (SELECT * FROM data) TO 'redcap.parquet' (FORMAT PARQUET)")

Remember to close the connection when finished:

DBI::dbDisconnect(con, shutdown = TRUE)

Database structure

The DuckDB database created by redcap_to_duckdb() contains two tables:

  1. data: Contains all exported REDCap records with optimized column types

    DBI::dbGetQuery(con, "SELECT * FROM data LIMIT 10")
  2. log: Contains timestamped logs of the transfer process for troubleshooting

    DBI::dbGetQuery(con, "SELECT timestamp, type, message FROM log ORDER BY timestamp")

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.