hubData dependency

v3 if they are v3.0.0 or above, not just v3.0.0. Thanks to @M-7th for
reporting.

hubAdmin Suggests dependency by moving test hub configuration
validation to CI (resolved: @annakrystalli,
https://github.com/hubverse-org/hubUtils/issues/158)

read_config_file() helper function to read a JSON config file from a
file path.

extract_schema_version() helper function to extract the schema version
from a schema id or config schema_version property character string.

is_v3_config, is_v3_config_file and is_v3_config_hub to check whether a
config object, file or hub is using schema version 3.

(jsonlite) bug fix.

hubUtils package containing significant breaking changes. Much of the
package has been moved and split across two smaller and more dedicated
packages:
hubData package: contains functions for connecting to and interacting
with hub data. Exported functions moved to hubData: connect_hub(),
connect_model_output(), expand_model_out_val_grid(),
create_model_out_submit_tmpl(), coerce_to_character(),
coerce_to_hub_schema() and create_hub_schema().

hubUtils functions re-exported to hubData:
as_model_out_tbl(), validate_model_out_tbl(), model_id_split() and
model_id_merge().

hubAdmin package: contains functions for administering Hubs, in
particular creating and validating hub configuration files. Exported
functions moved to hubAdmin:
create_config(), create_model_task(), create_model_tasks(),
create_output_type(), create_output_type_cdf(),
create_output_type_mean(), create_output_type_median(),
create_output_type_pmf(), create_output_type_quantile(),
create_output_type_sample(), create_round(), create_rounds(),
create_target_metadata(), create_target_metadata_item(),
create_task_id(), create_task_ids().

validate_config(), validate_model_metadata_schema(),
validate_hub_config(), view_config_val_errors().

tasks.json config files programmatically (#127).

connect_hub() and connect_model_output() now identify and report on
files that are present and should have been
opened but for which a connection was not successful (#124).

validate_model_metadata_schema() function and included it as part of
validate_hub_config() (#110 & #112).

load_model_metadata() function to compile hub model metadata.

coerce_to_character() function for coercing all model output columns to
character. This can be much faster than coercing with
coerce_to_hub_schema(), especially for dates.

expand_model_out_val_grid():
  all_character: allow for returning all character columns.
  as_arrow_table: allow for returning an arrow data table.
  bind_model_tasks: allow for returning a list of model task level
    grids.

expand_model_out_val_grid() when required_vals_only = TRUE yet required
task ID columns are not consistent across modeling tasks. The function
now pads missing task
ID column values with NAs.

coerce_to_hub_schema() function and applied it to
create_model_out_submit_tmpl() & expand_model_out_val_grid() to ensure
column data types in returned tibbles are consistent with the hub’s
schema (#100).

mean/median output types were being included erroneously when
required_vals_only = TRUE.

get_round_task_id_names() (#99).

read_config() (#101).

connect_hub() to error when "csv" was an accepted hub file format but
there were no CSV files in the model output directory. Now connect_hub()
checks for the presence of files of each accepted file format and only
opens datasets for file formats for which files exist. If there are no
files of any accepted
file_format in the model output directory, the S3 hub_connection object
returned consists of an empty list.

hubUtils to be loaded for std_colnames to be internally available.

create_model_out_submit_tmpl(). Function now, by default, returns rows
of complete cases only and the behavior is controlled by argument
complete_cases_only. Argument remove_empty_cols was also removed.

create_model_out_submit_tmpl() for generating round-specific model
output template tibbles (#82).

expand_model_out_val_grid() for creating an expanded grid of valid task
ID and output type ID across round modeling tasks and
output types.

get_round_idx(): for getting an integer index of the element in
config_tasks$rounds that a character round identifier maps to.

get_round_ids(): for getting a list or character vector of Hub round
IDs.

tasks.json validation checks via validate_config():
  required and optional properties.
  round_id_from_variable is TRUE, check that the specification of the
    task_id set as round_id is consistent across modeling tasks.
  round_id values are unique across rounds.

std_colnames which contains standard column names used in hubverse
model output data files, for use in other
hubverse packages (#88).

as_model_out_tbl() function to standardize model output data by
converting to a model_out_tbl S3 class object (#32, #33, #63, #64,
#66).

model_id_merge() and model_id_split() to create model_id column from
separate team_abbr and model_abbr columns and vice versa (#63).

output_type_id_datatype to connect_hub()
to allow overriding default behavior of automatically detecting the
output_type_id column data type from the tasks.json config file (#70).

create_hub_schema() argument partitions to connect_hub() function to
accommodate custom hub partitioning.

partition_names to connect_model_output() to accommodate custom hub
partitioning.

schema to connect_model_output() to allow for overriding default arrow
schema auto-detection.

jsonvalidate package to Imports so Hub administrator functionality is
accessible through standard installation.

format from create_hub_schema() which now creates the same schema from
a tasks.json config file, regardless of the data file format (#80).

validate_hub_config() allows maintainers to check the validity of hub
config files in a single call. Function
view_config_val_errors() was also modified to create a combined report
for hub config files from the output of validate_hub_config().

model-output data are expected to have output_type & output_type_id
instead of type & type_id respectively.

connect_hub() now automatically determines the output_type_id column
data type from the tasks.json config file, coercing to the highest
possible data type, “character” being the lowest denominator.

create_hub_schema() for determining the schema for data in a hub’s
model-output directory from a
tasks.json config file.

connect_hub() now allows establishing connections to hubs with multiple
file type formats.

create_output_type_categorical() function was renamed to
create_output_type_pmf().

model-output data directory partitions, was renamed from “model” to
“model_id”.

connect_hub() function to open connection to model-output data
implemented through an arrow FileSystemDataset object. This allows users
to create custom dplyr queries to access model output data.

validate_config() function to validate JSON configuration files against
Hub schema as well as function view_config_val_errors() for viewing a
concise and easier to navigate table of validation errors.

NEWS.md file to track changes to the
package.
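
Two of the entries above describe extracting a schema version from a
schema id and treating a config as v3 when that version is v3.0.0 or
above. The Python sketch below illustrates that rule only; it is not the
package's R implementation. The function names extract_version and
is_v3_or_above, the regular expression, and the example URL are
assumptions made for illustration.

```python
import re

# Assumption: the version appears in the schema id as "vMAJOR.MINOR.PATCH".
VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def extract_version(schema_id: str) -> str:
    """Return the first 'vX.Y.Z' substring found in a schema id."""
    match = VERSION_RE.search(schema_id)
    if match is None:
        raise ValueError(f"no schema version found in {schema_id!r}")
    return match.group(0)


def is_v3_or_above(version: str) -> bool:
    """True when the version is v3.0.0 or any later version."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a vX.Y.Z version string: {version!r}")
    # Compare as integer tuples, not as strings.
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch) >= (3, 0, 0)


# Hypothetical schema id, used only to exercise the functions.
example_id = "https://example.org/schemas/v3.0.1/tasks-schema.json"
example_version = extract_version(example_id)
example_is_v3 = is_v3_or_above(example_version)
```

Comparing the parsed integers rather than the raw strings avoids treating
"v10.0.0" as older than "v3.0.0", which a plain string comparison would
do.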