The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
db2pq is an R package for moving data from PostgreSQL
into Apache Parquet files. It is designed for both general PostgreSQL
sources and the WRDS PostgreSQL service.
last_modified metadata embedded in
Parquet files.# install.packages("pak")
pak::pak("iangow/db2pqr")You can pass a WRDS username directly for a first call:
library(db2pq)
wrds_update_pq("dsi", "crsp", wrds_id = "your_wrds_id")For repeated use, configure the WRDS username and PostgreSQL password
outside the call. The authentication
article documents the WRDS_ID, .pgpass,
and wrds::wrds_set_credentials() paths. The remaining WRDS
examples assume that setup is in place.
wrds_update_pq("dsi", "crsp", force = TRUE)wrds_update_pq("dsi", "crsp", use_sas = TRUE)SSH setup is only needed for this SAS metadata path. See the WRDS SSH setup article for the key-based setup used by that option.
wrds_schema_to_pq("crsp")wrds_sql_to_pq(
"SELECT permno, date, ret FROM crsp.dsf WHERE date >= '2024-01-01'",
table_name = "dsf_recent",
schema = "crsp"
)db_to_pq(
table_name = "company",
schema = "comp",
keep = c("gvkey", "conm"),
rename = c(conm = "company_name")
)pq_last_modified(schema = "crsp")The stable default transfer path uses DBI/RPostgres. The optional
ADBC path can be selected with transfer_method = "adbc"
when adbi and a PostgreSQL ADBC driver are installed:
adbc_diagnostics()
wrds_update_pq("dsi", "crsp", transfer_method = "adbc")If ADBC reports an SSL/libpq error, use
transfer_method = "dbi" or install a current SSL-capable
adbcpostgresql build.
Files are organized as:
<DATA_DIR>/<schema>/<table>.parquet
For example:
~/pq_data/crsp/dsi.parquet
The DATA_DIR environment variable sets the root
directory. It can also be passed directly as data_dir to
any function.
When archive = TRUE, replaced files are moved to:
<DATA_DIR>/<schema>/archive/<table>_<timestamp>.parquet
MIT License. See LICENSE.md.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.