Primary Keys
dm offers dm_enum_pk_candidates() to identify primary keys and dm_add_pk() to add them.
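A minimal sketch of the candidate search for the planes table might look as follows. It assumes the dm and nycflights13 packages are installed and that the dm object is built with dm(); the names flights_dm and pk_candidates are hypothetical and do not come from the original text.

```r
library(dm)
library(nycflights13)

# Assumption: the dm object is built directly from the nycflights13 data frames.
flights_dm <- dm(airlines, airports, flights, planes, weather)

# Test every column of `planes` for uniqueness; only unique columns
# qualify as primary key candidates.
pk_candidates <- dm_enum_pk_candidates(flights_dm, planes)
pk_candidates
```

The result has one row per column of the table, a logical candidate flag, and a why column explaining each rejection.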
## # A tibble: 9 x 3
##   columns     candidate why
##   <keys>      <lgl>     <chr>
## 1 tailnum     TRUE      ""
## 2 engine      FALSE     "has duplicate values: 4 Cycle, Reciprocating, Tur…
## 3 engines     FALSE     "has duplicate values: 1, 2, 3, 4"
## 4 manufactur… FALSE     "has duplicate values: AIRBUS, AIRBUS INDUSTRIE, A…
## 5 model       FALSE     "has duplicate values: 717-200, 737-301, 737-3G7, …
## 6 seats       FALSE     "has duplicate values: 2, 4, 5, 6, 7, … (>= 7 tota…
## 7 speed       FALSE     "has duplicate values: 90, 105, 162, 432, NA"
## 8 type        FALSE     "has duplicate values: Fixed wing multi engine, Fi…
## 9 year        FALSE     "has duplicate values: 1959, 1963, 1975, 1976, 197…
Now, add the primary keys that you have identified:
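As a sketch under the same assumptions (dm and nycflights13 installed, a hypothetical object name flights_dm), the three primary keys that appear in this section could be added like this. The table and column are passed positionally, since the name of the column argument has varied between dm versions.

```r
library(dm)
library(nycflights13)

# Assumption: the dm object is built directly from the nycflights13 data frames.
flights_dm <- dm(airlines, airports, flights, planes, weather)

# Add one primary key per lookup table; each call returns a new dm object.
flights_dm <- dm_add_pk(flights_dm, airlines, carrier)
flights_dm <- dm_add_pk(flights_dm, airports, faa)
flights_dm <- dm_add_pk(flights_dm, planes, tailnum)

# Printing the object shows the summary with the updated key counts.
flights_dm
```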
## ── Table source ───────────────────────────────────────────────────────────
## src: <environment: R_GlobalEnv>
## ── Metadata ───────────────────────────────────────────────────────────────
## Tables: `airlines`, `airports`, `flights`, `planes`, `weather`
## Columns: 53
## Primary keys: 3
## Foreign keys: 0
To review the primary keys after setting them, call dm_get_all_pks().
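A short sketch of that review step, again assuming dm and nycflights13 are installed; flights_dm and all_pks are hypothetical names.

```r
library(dm)
library(nycflights13)

# Assumption: same setup as before, with three primary keys defined.
flights_dm <- dm(airlines, airports, flights, planes, weather)
flights_dm <- dm_add_pk(flights_dm, airlines, carrier)
flights_dm <- dm_add_pk(flights_dm, airports, faa)
flights_dm <- dm_add_pk(flights_dm, planes, tailnum)

# One row per table that has a primary key.
all_pks <- dm_get_all_pks(flights_dm)
all_pks
```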
## # A tibble: 3 x 2
##   table    pk_col
##   <chr>    <keys>
## 1 airlines carrier
## 2 airports faa
## 3 planes   tailnum
Foreign Keys
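dm_enum_fk_candidates() plays the same role for foreign keys: it checks which columns of one table could refer to the primary key of another. The sketch below checks flights against airlines, which is what the candidate listing in this section appears to show; the object names are hypothetical, and the referenced table needs a primary key before the check can run.

```r
library(dm)
library(nycflights13)

# Assumption: the dm object is built directly from the nycflights13 data frames.
flights_dm <- dm(airlines, airports, flights, planes, weather)

# The referenced table must already have a primary key.
flights_dm <- dm_add_pk(flights_dm, airlines, carrier)

# Test every column of `flights` as a possible reference to `airlines`.
fk_candidates <- dm_enum_fk_candidates(flights_dm, flights, airlines)
fk_candidates
```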
## # A tibble: 19 x 3
##    columns      candidate why
##    <keys>       <lgl>     <chr>
##  1 carrier      TRUE      ""
##  2 tailnum      FALSE     "334264 entries (99.3%) of `flights$tailnum` not…
##  3 dest         FALSE     "336776 entries (100%) of `flights$dest` not in …
##  4 origin       FALSE     "336776 entries (100%) of `flights$origin` not i…
##  5 air_time     FALSE     "Can't join on 'value' x 'value' because of inco…
##  6 arr_delay    FALSE     "Can't join on 'value' x 'value' because of inco…
##  7 arr_time     FALSE     "Can't join on 'value' x 'value' because of inco…
##  8 day          FALSE     "Can't join on 'value' x 'value' because of inco…
##  9 dep_delay    FALSE     "Can't join on 'value' x 'value' because of inco…
## 10 dep_time     FALSE     "Can't join on 'value' x 'value' because of inco…
## 11 distance     FALSE     "Can't join on 'value' x 'value' because of inco…
## 12 flight       FALSE     "Can't join on 'value' x 'value' because of inco…
## 13 hour         FALSE     "Can't join on 'value' x 'value' because of inco…
## 14 minute       FALSE     "Can't join on 'value' x 'value' because of inco…
## 15 month        FALSE     "Can't join on 'value' x 'value' because of inco…
## 16 sched_arr_t… FALSE     "Can't join on 'value' x 'value' because of inco…
## 17 sched_dep_t… FALSE     "Can't join on 'value' x 'value' because of inco…
## 18 time_hour    FALSE     "cannot join a POSIXct object with an object tha…
## 19 year         FALSE     "Can't join on 'value' x 'value' because of inco…
To define how your tables are related, use dm_add_fk() to add foreign keys. First, name the two tables you wish to connect through the table and ref_table arguments of dm_add_fk(). Then indicate in column which column of table refers to the primary key of ref_table, which you defined above. Use check = FALSE to omit consistency checks.
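The steps above can be sketched as follows. The text does not say which three relations were added, so the ones below (tailnum to planes, carrier to airlines, origin to airports) are an assumption chosen to match the key count in the summary; flights_dm is a hypothetical name. Arguments are passed positionally in the order table, column, ref_table.

```r
library(dm)
library(nycflights13)

# Assumption: same setup as before, with three primary keys defined.
flights_dm <- dm(airlines, airports, flights, planes, weather)
flights_dm <- dm_add_pk(flights_dm, airlines, carrier)
flights_dm <- dm_add_pk(flights_dm, airports, faa)
flights_dm <- dm_add_pk(flights_dm, planes, tailnum)

# Not every flights$tailnum value occurs in planes, so skip the
# consistency check for this relation.
flights_dm <- dm_add_fk(flights_dm, flights, tailnum, planes, check = FALSE)
flights_dm <- dm_add_fk(flights_dm, flights, carrier, airlines)
flights_dm <- dm_add_fk(flights_dm, flights, origin, airports)

flights_dm
```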
## ── Table source ───────────────────────────────────────────────────────────
## src: <environment: R_GlobalEnv>
## ── Metadata ───────────────────────────────────────────────────────────────
## Tables: `airlines`, `airports`, `flights`, `planes`, `weather`
## Columns: 53
## Primary keys: 3
## Foreign keys: 3
Retrieving Keys
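To retrieve your keys later on, use dm_get_all_pks() for primary keys and dm_get_all_fks() for foreign keys, or dm_get_fk() for the singular version of the latter. The listing below shows the primary keys.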
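A minimal sketch of key retrieval, assuming dm and nycflights13 are installed and using a reduced, hypothetical setup with a single relation between flights and airlines:

```r
library(dm)
library(nycflights13)

# Assumption: a small dm with one primary key and one foreign key.
flights_dm <- dm(airlines, flights)
flights_dm <- dm_add_pk(flights_dm, airlines, carrier)
flights_dm <- dm_add_fk(flights_dm, flights, carrier, airlines)

# Each function returns a tibble with one row per key.
all_pks <- dm_get_all_pks(flights_dm)
all_fks <- dm_get_all_fks(flights_dm)

all_pks
all_fks
```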
## # A tibble: 3 x 2
##   table    pk_col
##   <chr>    <keys>
## 1 airlines carrier
## 2 airports faa
## 3 planes   tailnum
Voilà, here’s your dm object that you can work with:
## ── Table source ───────────────────────────────────────────────────────────
## src: <environment: R_GlobalEnv>
## ── Metadata ───────────────────────────────────────────────────────────────
## Tables: `airlines`, `airports`, `flights`, `planes`, `weather`
## Columns: 53
## Primary keys: 3
## Foreign keys: 3