
econid

Overview

The econid R package is a foundational building block of the econdataverse family of packages, which helps economists and financial professionals work with sovereign-level economic data. It is designed for domain experts in economics and finance who need to analyze and join data across multiple sources but who aren’t necessarily expert R programmers.

Motivation

Economic and financial datasets present unique challenges when working with country-level data:

  1. Mixed Entity Types

Datasets often combine different types of entities in the same “country” column:

  2. Inconsistent Naming

The same entity might appear in various formats:

  3. Complex Analysis Needs

Researchers often need to:

econid addresses these challenges through:

Design Philosophy

The package follows tidyverse design principles and the tidy tools manifesto. We strive to practice human-centered design, with clear documentation and examples and graceful handling of edge cases. We invite you to submit suggestions for improvements and extensions on the package’s GitHub Issues page.

We have designed the package to handle only the most common entities financial and economic professionals might encounter in a dataset (249 in total), not to handle every edge case. However, the package allows users to extend the standardization list with custom entities to flexibly accommodate any unconventional use case.

Installation

To install the package from CRAN, you can use the install.packages() function:

install.packages("econid")

To install a development version from GitHub, you can use the remotes package:

remotes::install_github("Teal-Insights/r-econid")

Then, load the package in your R session or Quarto or RMarkdown notebook:

library(econid)

Usage

Below is a high-level overview of how econid works in practice, followed by a more detailed description of the main function and its parameters. The examples and tests illustrate typical usage patterns.

Use these patterns to explore the package and integrate it into your data cleaning workflows. For finer-grained operations (e.g., fuzzy filtering and search), watch the package for future enhancements.

Package Summary

  1. Input validation
    The package checks if your input dataset and specified columns exist. It also ensures you only request valid output columns (e.g., "entity_name", "entity_id", "entity_type", "iso2c", and "iso3c"). Any invalid columns raise an error.

  2. Name and code matching
    The function standardize_entity() looks in your dataset for names (and optionally codes) that might match an entity. It:

  3. Merging standardized columns
    Once the function finds a match, it returns a new or augmented data frame with standardized columns (e.g., "entity_id", "entity_name", "entity_type", etc.). You control exactly which standardized columns appear via the output_cols argument.

  4. Handling missing and custom cases
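
As a rough illustration of steps 1 and 3, the sketch below requests a subset of the documented output columns and then deliberately asks for an unsupported one. The column name "not_a_column" is made up for the example, and the exact wording of the error message is not assumed; the code only captures whatever message is raised.

```r
library(econid)

df <- data.frame(entity = c("United States", "China"))

# Request only two of the documented standardized columns.
ok <- standardize_entity(df, entity, output_cols = c("entity_id", "iso3c"))
names(ok)

# "not_a_column" is a hypothetical, unsupported name. Per the summary above,
# it should raise an error; tryCatch() captures the message instead of
# stopping the script. NA_character_ is returned if no error occurs.
msg <- tryCatch(
  {
    standardize_entity(df, entity, output_cols = c("entity_id", "not_a_column"))
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
msg
```

Wrapping the call in tryCatch() is only a convenience for exploring the behavior interactively; in a real pipeline you would usually let the error stop execution.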

Workflow

standardize_entity() Function

# Basic example
df <- data.frame(
  entity = c("United States", "China", "NotACountry"),
  code = c("USA", "CHN", "ZZZ"),
  obs_value = c(1, 2, 3)
)

# Using with dplyr pipeline
library(dplyr)

df |>
  standardize_entity(entity, code) |>
  filter(!is.na(entity_id)) |>
  mutate(entity_category = case_when(
    entity_type == "economy" ~ "Country",
    TRUE ~ "Other"
  )) |>
  select(entity_name, entity_category, obs_value)
##     entity_name entity_category obs_value
## 1 United States         Country         1
## 2         China         Country         2

You can also use the function directly without a pipeline:

standardize_entity(
  data = df,
  entity, code,
  output_cols = c("entity_id", "entity_name", "entity_type"),
  fill_mapping = c(entity_name = "entity"),
  default_entity_type = NA_character_,
  warn_ambiguous = TRUE
)
##   entity_id   entity_name entity_type        entity code obs_value
## 1       USA United States     economy United States  USA         1
## 2       CHN         China     economy         China  CHN         2
## 3      <NA>   NotACountry        <NA>   NotACountry  ZZZ         3

Parameters

Returns

A data frame (or tibble) with the same number of rows as data, augmented with the requested standardized columns.
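
Because the result keeps one row per input row, unmatched entities can be found by looking for missing IDs. The following is a minimal sketch that reuses the example data from above and relies on the default output columns including entity_id, as the earlier output suggests; the variable names result and unmatched are illustrative.

```r
library(econid)

df <- data.frame(
  entity = c("United States", "China", "NotACountry"),
  code = c("USA", "CHN", "ZZZ"),
  obs_value = c(1, 2, 3)
)

result <- standardize_entity(df, entity, code)

# The row count is preserved; only columns are added.
nrow(result) == nrow(df)

# Rows whose entity_id is NA did not match any known entity.
unmatched <- result[is.na(result$entity_id), c("entity", "code")]
unmatched
```

Reviewing the unmatched rows before filtering them out is a cheap way to catch misspellings or entities that may need a custom pattern.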

Working with Multiple Entities

The standardize_entity() function can be used to standardize multiple entities in the same dataset by using the prefix parameter:

df <- data.frame(
  country_name = c("United States", "France"),
  counterpart_name = c("China", "Germany")
)

df |>
  standardize_entity(country_name) |>
  standardize_entity(counterpart_name, prefix = "counterpart")
##   counterpart_entity_id counterpart_entity_name counterpart_entity_type
## 1                   CHN                   China                 economy
## 2                   DEU                 Germany                 economy
##   entity_id   entity_name entity_type  country_name counterpart_name
## 1       USA United States     economy United States            China
## 2       FRA        France     economy        France          Germany
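
One possible use of the two sets of standardized columns is to build a key for bilateral data. The sketch below assumes the column names shown in the output above; pair_id is a hypothetical column name introduced only for this example.

```r
library(econid)

df <- data.frame(
  country_name = c("United States", "France"),
  counterpart_name = c("China", "Germany")
)

pairs <- df |>
  standardize_entity(country_name) |>
  standardize_entity(counterpart_name, prefix = "counterpart")

# Combine the two standardized ID columns into a single bilateral key.
pairs$pair_id <- paste(pairs$entity_id, pairs$counterpart_entity_id, sep = "-")

pairs[, c("country_name", "counterpart_name", "pair_id")]
```

A key like this can make it easier to join bilateral datasets whose sources spell country names differently.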

add_entity_pattern() Function

The add_entity_pattern() function allows you to add custom entity patterns to the package. This is useful if you need to standardize entities that are not in the default list.

add_entity_pattern(
  "BJ-CITY",
  "Beijing City",
  entity_type = "economy",
  aliases = c("Beijing Municipality")
)

df_custom <- data.frame(entity = c("United States", "Beijing Municipality"))
result_custom <- standardize_entity(df_custom, entity)
print(result_custom)
##   entity_id   entity_name entity_type               entity
## 1       USA United States     economy        United States
## 2   BJ-CITY  Beijing City     economy Beijing Municipality

reset_custom_entity_patterns() Function

The reset_custom_entity_patterns() function allows you to clear all custom entity patterns that have been added during the current R session. This is useful when you want to start fresh with only the default entity patterns.
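
A minimal sketch of the reset, reusing the custom "BJ-CITY" entity from the add_entity_pattern() example. It assumes that, once the custom pattern is cleared, "Beijing Municipality" no longer matches any default entity; the before and after variable names are illustrative.

```r
library(econid)

# Start from a clean state so the custom ID is not registered twice.
reset_custom_entity_patterns()

# Register a custom entity, as in the add_entity_pattern() example.
add_entity_pattern(
  "BJ-CITY",
  "Beijing City",
  entity_type = "economy",
  aliases = c("Beijing Municipality")
)

df_custom <- data.frame(entity = c("United States", "Beijing Municipality"))
before <- standardize_entity(df_custom, entity)

# Clear every custom pattern added during this session.
reset_custom_entity_patterns()

after <- standardize_entity(df_custom, entity)

# Compare the matched IDs before and after the reset.
data.frame(
  entity = df_custom$entity,
  before = before$entity_id,
  after = after$entity_id
)
```

Calling the reset at the top of a script or test is a simple way to keep results reproducible when earlier code in the same session may have added patterns.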

Contributing

We welcome your feedback and contributions! Please submit suggestions for improvements and extensions on the package’s GitHub Issues page.
