The format* family of functions can be combined to turn a summarised_result object (see the R package omopgenerics) into a tidy dataframe, flextable or gt table for display. In what follows, we show the pipeline for formatting a summarised_result using these functions.
First, we load the relevant libraries and generate a mock summarised_result.
library(visOmopResults)
library(dplyr)
mock_sr <- mockSummarisedResult()
mock_sr |> glimpse()
#> Rows: 126
#> Columns: 16
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock…
#> $ result_type <chr> "mock_summarised_result", "mock_summarised_result", "…
#> $ package_name <chr> "visOmopResults", "visOmopResults", "visOmopResults",…
#> $ package_version <chr> "0.2.0", "0.2.0", "0.2.0", "0.2.0", "0.2.0", "0.2.0",…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1"…
#> $ strata_name <chr> "overall", "age_group &&& sex", "age_group &&& sex", …
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&& Male", "<40 &&& …
#> $ variable_name <chr> "number subjects", "number subjects", "number subject…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "count", "count", "count", "count", "count", "count",…
#> $ estimate_type <chr> "integer", "integer", "integer", "integer", "integer"…
#> $ estimate_value <chr> "3141771", "8701935", "2161016", "905399", "434671", …
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
formatEstimateValue() formats the estimate_value column. It lets you change the number of decimals to display by estimate_type or estimate_name (decimals), and the decimal and thousand/million separator marks (decimalMark and bigMark respectively). By default, integer values are shown with 0 decimals, numeric with 2, percentage with 1, and proportion with 3. The default decimal mark is “.” and the default thousand/million separator is “,”.
mock_sr <- mock_sr |> formatEstimateValue()
mock_sr |> glimpse()
#> Rows: 126
#> Columns: 16
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock…
#> $ result_type <chr> "mock_summarised_result", "mock_summarised_result", "…
#> $ package_name <chr> "visOmopResults", "visOmopResults", "visOmopResults",…
#> $ package_version <chr> "0.2.0", "0.2.0", "0.2.0", "0.2.0", "0.2.0", "0.2.0",…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1"…
#> $ strata_name <chr> "overall", "age_group &&& sex", "age_group &&& sex", …
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&& Male", "<40 &&& …
#> $ variable_name <chr> "number subjects", "number subjects", "number subject…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "count", "count", "count", "count", "count", "count",…
#> $ estimate_type <chr> "integer", "integer", "integer", "integer", "integer"…
#> $ estimate_value <chr> "3,141,771", "8,701,935", "2,161,016", "905,399", "43…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
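As a further illustration of these arguments, the sketch below applies a European-style layout to a fresh mock result, so mock_sr is left untouched. The values passed to decimals, decimalMark and bigMark are our own choices for illustration, and we assume that decimals accepts a vector named by estimate_type, mirroring the defaults described above.

```r
library(visOmopResults)

# Fresh mock, so the mock_sr object used in this walkthrough is not modified
european_style <- mockSummarisedResult() |>
  formatEstimateValue(
    # assumption: one entry per estimate_type, as in the documented defaults
    decimals = c(integer = 0, numeric = 1, percentage = 0, proportion = 3),
    decimalMark = ",", # decimal separator
    bigMark = "."      # thousand/million separator
  )

# estimate_value stays a character column; only its formatting changes
head(european_style$estimate_value)
```

Only the display of the values changes; the number of rows and the remaining columns are left as they were.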
formatEstimateName() helps to manipulate the estimate_name and estimate_value columns. For instance, if we want every variable that has both a count and a percentage to be displayed in a single row as “N (%)”, we can do it with this function.
The estimateNameFormat argument is where all combinations or renamings of estimates are specified. Values of the estimate_name column must be written between <…>. The names of the vector are used as the new estimate_name values; for unnamed elements, the value itself is used.
mock_sr <- mock_sr |>
  formatEstimateName(
    estimateNameFormat = c(
      "N (%)" = "<count> (<percentage>%)",
      "N" = "<count>",
      "Mean (SD)" = "<mean> (<sd>)"
    ),
    keepNotFormatted = FALSE,
    useFormatOrder = FALSE
  )
mock_sr |> glimpse()
#> Rows: 72
#> Columns: 16
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock…
#> $ result_type <chr> "mock_summarised_result", "mock_summarised_result", "…
#> $ package_name <chr> "visOmopResults", "visOmopResults", "visOmopResults",…
#> $ package_version <chr> "0.2.0", "0.2.0", "0.2.0", "0.2.0", "0.2.0", "0.2.0",…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1"…
#> $ strata_name <chr> "overall", "age_group &&& sex", "age_group &&& sex", …
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&& Male", "<40 &&& …
#> $ variable_name <chr> "number subjects", "number subjects", "number subject…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N"…
#> $ estimate_type <chr> "character", "character", "character", "character", "…
#> $ estimate_value <chr> "3,141,771", "8,701,935", "2,161,016", "905,399", "43…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
Additional arguments are keepNotFormatted, which specifies whether rows that are not formatted should be returned or dropped, and useFormatOrder, which defines whether rows should be sorted as in estimateNameFormat or kept in their original order. In the latter case, when more than one estimate is combined, the new estimate takes the position of the first of the estimates being merged.
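To make the <…> placeholder notation concrete, here is a minimal base R sketch of how such a template can be filled for a single row. The helper fill_template is hypothetical and written only for illustration; it is not part of visOmopResults and does not claim to reproduce how formatEstimateName works internally.

```r
# Hypothetical helper (not part of visOmopResults): replace each <name>
# placeholder in `template` with the matching value from `estimates`.
fill_template <- function(template, estimates) {
  out <- template
  for (nm in names(estimates)) {
    # fixed = TRUE treats "<count>" as a literal string, not a regex
    out <- gsub(paste0("<", nm, ">"), estimates[[nm]], out, fixed = TRUE)
  }
  out
}

# Already formatted values for one row of the mock result
estimates <- c(count = "60,330", percentage = "95.8")

filled <- fill_template("<count> (<percentage>%)", estimates)
filled
#> [1] "60,330 (95.8%)"
```

The same idea applies to the other templates: "<mean> (<sd>)" with mean = "23.59" and sd = "6.94" yields "23.59 (6.94)".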
formatHeader() helps to create a header for flextable, gt and other table-formatting packages.
To this end, it pivots the columns specified in header, “widening” the table. The names of the new columns can be controlled with the arguments header, delim, includeHeaderName, and includeHeaderKey, so that they can later be converted into the header of the formatted table (flextable or gt table).
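There are 3 different types of headers, identified with the keys “header”, “header_name”, and “header_level”. The key “header” marks labels passed to header that are not part of the input table (they are neither column names nor values), “header_name” marks labels that are column names of the input table, and “header_level” marks labels that are values of those columns.
For instance, we might want to pivot by “group_level” and have an upper header called “Names of the cohorts”. To do that we would proceed as follows: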
mock_sr |>
  formatHeader(
    header = c("Names of the cohorts", "group_level"),
    delim = "\n",
    includeHeaderName = TRUE,
    includeHeaderKey = TRUE
  ) |>
  glimpse()
#> Rows: 36
#> Columns: 16
#> $ result_id <int> …
#> $ cdm_name <chr> …
#> $ result_type <chr> …
#> $ package_name <chr> …
#> $ package_version <chr> …
#> $ group_name <chr> …
#> $ strata_name <chr> …
#> $ strata_level <chr> …
#> $ variable_name <chr> …
#> $ variable_level <chr> …
#> $ estimate_name <chr> …
#> $ estimate_type <chr> …
#> $ additional_name <chr> …
#> $ additional_level <chr> …
#> $ `[header]Names of the cohorts\n[header_name]group_level\n[header_level]cohort1` <chr> …
#> $ `[header]Names of the cohorts\n[header_name]group_level\n[header_level]cohort2` <chr> …
The keys that indicate the header type in the new column names can be removed by setting includeHeaderKey = FALSE. However, keeping these keys allows the different header types to be styled separately in the next step (fxTable and gtTable).
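The structure of these column names can be inspected with a few lines of base R. The sketch below takes one of the column names printed above, splits it on the delimiter, and separates each key from its label. It only illustrates how the names are laid out; the variable names are our own and this is not how gtTable or fxTable are implemented.

```r
# One of the column names produced by formatHeader() above
col_name <- "[header]Names of the cohorts\n[header_name]group_level\n[header_level]cohort1"

# Split on the delimiter passed as `delim`
parts <- strsplit(col_name, "\n", fixed = TRUE)[[1]]

# Separate the key inside [ ] from the label that follows it
header_parts <- data.frame(
  key = sub("^\\[([^]]*)\\].*$", "\\1", parts),
  label = sub("^\\[[^]]*\\]", "", parts),
  stringsAsFactors = FALSE
)
header_parts
#>            key                label
#> 1       header Names of the cohorts
#> 2  header_name          group_level
#> 3 header_level              cohort1
```

Each part carries its own key, which is what makes it possible to apply a different style to each header type later on.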
Continuing with our example, we want to pivot by strata (name and level), but we do not want the column names to appear in the header:
mock_sr <- mock_sr |>
  mutate(across(c("strata_name", "strata_level"), ~ gsub("&&&", "and", .x))) |>
  formatHeader(
    header = c("Stratifications", "strata_name", "strata_level"),
    delim = "\n",
    includeHeaderName = FALSE,
    includeHeaderKey = TRUE
  )
mock_sr |> glimpse()
#> Rows: 8
#> Columns: 22
#> $ result_id <int> …
#> $ cdm_name <chr> …
#> $ result_type <chr> …
#> $ package_name <chr> …
#> $ package_version <chr> …
#> $ group_name <chr> …
#> $ group_level <chr> …
#> $ variable_name <chr> …
#> $ variable_level <chr> …
#> $ estimate_name <chr> …
#> $ estimate_type <chr> …
#> $ additional_name <chr> …
#> $ additional_level <chr> …
#> $ `[header]Stratifications\n[header_level]overall\n[header_level]overall` <chr> …
#> $ `[header]Stratifications\n[header_level]age_group and sex\n[header_level]<40 and Male` <chr> …
#> $ `[header]Stratifications\n[header_level]age_group and sex\n[header_level]>=40 and Male` <chr> …
#> $ `[header]Stratifications\n[header_level]age_group and sex\n[header_level]<40 and Female` <chr> …
#> $ `[header]Stratifications\n[header_level]age_group and sex\n[header_level]>=40 and Female` <chr> …
#> $ `[header]Stratifications\n[header_level]sex\n[header_level]Male` <chr> …
#> $ `[header]Stratifications\n[header_level]sex\n[header_level]Female` <chr> …
#> $ `[header]Stratifications\n[header_level]age_group\n[header_level]<40` <chr> …
#> $ `[header]Stratifications\n[header_level]age_group\n[header_level]>=40` <chr> …
Notice how we replace the keyword “&&&” with “and” to get a more readable header.
Finally, the functions gtTable and fxTable transform our tibble into a gt or flextable object respectively. These functions provide several options to customise the formatted table.
Let’s start by manipulating the dataframe to keep only the columns that we want to display, and then use gtTable with default values:
# first we select the columns we want:
mock_sr <- mock_sr |>
  splitGroup() |>
  select(!all_of(c("cdm_name", "result_type", "package_name",
                   "package_version", "estimate_type", "result_id",
                   "additional_name", "additional_level")))
mock_sr |> gtTable()
| cohort_name | variable_name | variable_level | estimate_name | Stratifications / overall / overall | Stratifications / age_group and sex / <40 and Male | Stratifications / age_group and sex / >=40 and Male | Stratifications / age_group and sex / <40 and Female | Stratifications / age_group and sex / >=40 and Female | Stratifications / sex / Male | Stratifications / sex / Female | Stratifications / age_group / <40 | Stratifications / age_group / >=40 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cohort1 | number subjects | - | N | 3,141,771 | 8,701,935 | 2,161,016 | 905,399 | 434,671 | 7,226,341 | 3,877,885 | 4,173,733 | 875,670 |
| cohort2 | number subjects | - | N | 7,125,706 | 1,654,423 | 2,374,407 | 7,145,022 | 4,761,928 | 7,017,924 | 6,051,193 | 7,218,595 | 6,793,597 |
| cohort1 | age | - | Mean (SD) | 23.59 (6.94) | 84.29 (9.06) | 20.37 (0.56) | 14.01 (6.38) | 86.75 (3.79) | 14.68 (2.34) | 90.14 (5.48) | 33.34 (8.71) | 95.88 (6.79) |
| cohort2 | age | - | Mean (SD) | 20.92 (3.19) | 41.16 (0.99) | 66.77 (9.83) | 19.81 (5.60) | 23.74 (5.34) | 48.83 (3.41) | 16.68 (7.16) | 31.49 (7.75) | 28.06 (0.11) |
| cohort1 | Medications | Amoxiciline | N (%) | 60,330 (95.8%) | 45,647 (78.2%) | 27,825 (99.4%) | 90,490 (4.6%) | 30,157 (47.9%) | 23,308 (57.9%) | 26,575 (11.9%) | 3,985 (95.7%) | 48,569 (90.3%) |
| cohort2 | Medications | Amoxiciline | N (%) | 57,652 (99.5%) | 42,721 (39.0%) | 65,450 (4.8%) | 40,650 (13.3%) | 31,039 (40.5%) | 36,190 (5.1%) | 80,676 (79.1%) | 60,151 (44.4%) | 78,586 (92.2%) |
| cohort1 | Medications | Ibuprofen | N (%) | 41,669 (97.6%) | 38,727 (97.5%) | 65,786 (66.3%) | 22,238 (90.7%) | 52,168 (87.1%) | 22,996 (37.9%) | 68,207 (9.6%) | 91,644 (96.5%) | 22,343 (59.1%) |
| cohort2 | Medications | Ibuprofen | N (%) | 12,688 (80.5%) | 57,463 (73.5%) | 48,475 (61.0%) | 73,907 (52.4%) | 3,664 (89.9%) | 82,134 (75.6%) | 18,684 (74.9%) | 78,177 (54.1%) | 77,862 (66.6%) |
Now, we want to group results by “cohort_name”. More specifically, we want a row with the name of each cohort before the results of that cohort, with cohort1 coming before cohort2. Additionally, we want to merge cells that contain the same information in consecutive rows, for all the columns. To get this table we use gtTable as follows:
mock_sr |>
  gtTable(
    groupNameCol = "cohort_name",
    groupNameAsColumn = FALSE,
    groupOrder = c("cohort1", "cohort2"),
    colsToMergeRows = "all_columns"
  )
| variable_name | variable_level | estimate_name | Stratifications / overall / overall | Stratifications / age_group and sex / <40 and Male | Stratifications / age_group and sex / >=40 and Male | Stratifications / age_group and sex / <40 and Female | Stratifications / age_group and sex / >=40 and Female | Stratifications / sex / Male | Stratifications / sex / Female | Stratifications / age_group / <40 | Stratifications / age_group / >=40 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| cohort1 | | | | | | | | | | | |
| number subjects | - | N | 3,141,771 | 8,701,935 | 2,161,016 | 905,399 | 434,671 | 7,226,341 | 3,877,885 | 4,173,733 | 875,670 |
| age | - | Mean (SD) | 23.59 (6.94) | 84.29 (9.06) | 20.37 (0.56) | 14.01 (6.38) | 86.75 (3.79) | 14.68 (2.34) | 90.14 (5.48) | 33.34 (8.71) | 95.88 (6.79) |
| Medications | Amoxiciline | N (%) | 60,330 (95.8%) | 45,647 (78.2%) | 27,825 (99.4%) | 90,490 (4.6%) | 30,157 (47.9%) | 23,308 (57.9%) | 26,575 (11.9%) | 3,985 (95.7%) | 48,569 (90.3%) |
| | Ibuprofen | N (%) | 41,669 (97.6%) | 38,727 (97.5%) | 65,786 (66.3%) | 22,238 (90.7%) | 52,168 (87.1%) | 22,996 (37.9%) | 68,207 (9.6%) | 91,644 (96.5%) | 22,343 (59.1%) |
| cohort2 | | | | | | | | | | | |
| number subjects | - | N | 7,125,706 | 1,654,423 | 2,374,407 | 7,145,022 | 4,761,928 | 7,017,924 | 6,051,193 | 7,218,595 | 6,793,597 |
| age | - | Mean (SD) | 20.92 (3.19) | 41.16 (0.99) | 66.77 (9.83) | 19.81 (5.60) | 23.74 (5.34) | 48.83 (3.41) | 16.68 (7.16) | 31.49 (7.75) | 28.06 (0.11) |
| Medications | Amoxiciline | N (%) | 57,652 (99.5%) | 42,721 (39.0%) | 65,450 (4.8%) | 40,650 (13.3%) | 31,039 (40.5%) | 36,190 (5.1%) | 80,676 (79.1%) | 60,151 (44.4%) | 78,586 (92.2%) |
| | Ibuprofen | N (%) | 12,688 (80.5%) | 57,463 (73.5%) | 48,475 (61.0%) | 73,907 (52.4%) | 3,664 (89.9%) | 82,134 (75.6%) | 18,684 (74.9%) | 78,177 (54.1%) | 77,862 (66.6%) |
We might also want to modify the default style of the table. For instance, we might want to highlight the cohort_name labels with a blue background, have the body text in red, and use a combination of orange and yellow for the header. We can do it with the style argument:
mock_sr |>
  gtTable(
    style = list(
      "header" = list(gt::cell_text(weight = "bold"),
                      gt::cell_fill(color = "orange")),
      "header_level" = list(gt::cell_text(weight = "bold"),
                            gt::cell_fill(color = "yellow")),
      "column_name" = gt::cell_text(weight = "bold"),
      "group_label" = list(gt::cell_fill(color = "blue"),
                           gt::cell_text(color = "white", weight = "bold")),
      "body" = gt::cell_text(color = "red")
    ),
    groupNameCol = "cohort_name",
    groupNameAsColumn = FALSE,
    groupOrder = c("cohort1", "cohort2"),
    colsToMergeRows = "all_columns"
  )
| variable_name | variable_level | estimate_name | Stratifications / overall / overall | Stratifications / age_group and sex / <40 and Male | Stratifications / age_group and sex / >=40 and Male | Stratifications / age_group and sex / <40 and Female | Stratifications / age_group and sex / >=40 and Female | Stratifications / sex / Male | Stratifications / sex / Female | Stratifications / age_group / <40 | Stratifications / age_group / >=40 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| cohort1 | | | | | | | | | | | |
| number subjects | - | N | 3,141,771 | 8,701,935 | 2,161,016 | 905,399 | 434,671 | 7,226,341 | 3,877,885 | 4,173,733 | 875,670 |
| age | - | Mean (SD) | 23.59 (6.94) | 84.29 (9.06) | 20.37 (0.56) | 14.01 (6.38) | 86.75 (3.79) | 14.68 (2.34) | 90.14 (5.48) | 33.34 (8.71) | 95.88 (6.79) |
| Medications | Amoxiciline | N (%) | 60,330 (95.8%) | 45,647 (78.2%) | 27,825 (99.4%) | 90,490 (4.6%) | 30,157 (47.9%) | 23,308 (57.9%) | 26,575 (11.9%) | 3,985 (95.7%) | 48,569 (90.3%) |
| | Ibuprofen | N (%) | 41,669 (97.6%) | 38,727 (97.5%) | 65,786 (66.3%) | 22,238 (90.7%) | 52,168 (87.1%) | 22,996 (37.9%) | 68,207 (9.6%) | 91,644 (96.5%) | 22,343 (59.1%) |
| cohort2 | | | | | | | | | | | |
| number subjects | - | N | 7,125,706 | 1,654,423 | 2,374,407 | 7,145,022 | 4,761,928 | 7,017,924 | 6,051,193 | 7,218,595 | 6,793,597 |
| age | - | Mean (SD) | 20.92 (3.19) | 41.16 (0.99) | 66.77 (9.83) | 19.81 (5.60) | 23.74 (5.34) | 48.83 (3.41) | 16.68 (7.16) | 31.49 (7.75) | 28.06 (0.11) |
| Medications | Amoxiciline | N (%) | 57,652 (99.5%) | 42,721 (39.0%) | 65,450 (4.8%) | 40,650 (13.3%) | 31,039 (40.5%) | 36,190 (5.1%) | 80,676 (79.1%) | 60,151 (44.4%) | 78,586 (92.2%) |
| | Ibuprofen | N (%) | 12,688 (80.5%) | 57,463 (73.5%) | 48,475 (61.0%) | 73,907 (52.4%) | 3,664 (89.9%) | 82,134 (75.6%) | 18,684 (74.9%) | 78,177 (54.1%) | 77,862 (66.6%) |
To obtain a similar result as a flextable object, we can use fxTable with the same arguments as before. However, style must be adapted to use objects from the officer package, since that is what flextable accepts.
mock_sr |>
  fxTable(
    style = list(
      "header" = list(
        "cell" = officer::fp_cell(background.color = "orange"),
        "text" = officer::fp_text(bold = TRUE)),
      "header_level" = list(
        "cell" = officer::fp_cell(background.color = "yellow"),
        "text" = officer::fp_text(bold = TRUE)),
      "column_name" = list("text" = officer::fp_text(bold = TRUE)),
      "group_label" = list(
        "cell" = officer::fp_cell(background.color = "blue"),
        "text" = officer::fp_text(bold = TRUE, color = "white")),
      "body" = list("text" = officer::fp_text(color = "red"))
    ),
    groupNameCol = "cohort_name",
    groupNameAsColumn = FALSE,
    groupOrder = c("cohort1", "cohort2"),
    colsToMergeRows = "all_columns"
  )
| cohort_name | variable_name | variable_level | estimate_name | Stratifications / overall / overall | Stratifications / age_group and sex / <40 and Male | Stratifications / age_group and sex / >=40 and Male | Stratifications / age_group and sex / <40 and Female | Stratifications / age_group and sex / >=40 and Female | Stratifications / sex / Male | Stratifications / sex / Female | Stratifications / age_group / <40 | Stratifications / age_group / >=40 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cohort1 | | | | | | | | | | | | |
| | number subjects | - | N | 3,141,771 | 8,701,935 | 2,161,016 | 905,399 | 434,671 | 7,226,341 | 3,877,885 | 4,173,733 | 875,670 |
| | age | - | Mean (SD) | 23.59 (6.94) | 84.29 (9.06) | 20.37 (0.56) | 14.01 (6.38) | 86.75 (3.79) | 14.68 (2.34) | 90.14 (5.48) | 33.34 (8.71) | 95.88 (6.79) |
| | Medications | Amoxiciline | N (%) | 60,330 (95.8%) | 45,647 (78.2%) | 27,825 (99.4%) | 90,490 (4.6%) | 30,157 (47.9%) | 23,308 (57.9%) | 26,575 (11.9%) | 3,985 (95.7%) | 48,569 (90.3%) |
| | | Ibuprofen | N (%) | 41,669 (97.6%) | 38,727 (97.5%) | 65,786 (66.3%) | 22,238 (90.7%) | 52,168 (87.1%) | 22,996 (37.9%) | 68,207 (9.6%) | 91,644 (96.5%) | 22,343 (59.1%) |
| cohort2 | | | | | | | | | | | | |
| | number subjects | - | N | 7,125,706 | 1,654,423 | 2,374,407 | 7,145,022 | 4,761,928 | 7,017,924 | 6,051,193 | 7,218,595 | 6,793,597 |
| | age | - | Mean (SD) | 20.92 (3.19) | 41.16 (0.99) | 66.77 (9.83) | 19.81 (5.60) | 23.74 (5.34) | 48.83 (3.41) | 16.68 (7.16) | 31.49 (7.75) | 28.06 (0.11) |
| | Medications | Amoxiciline | N (%) | 57,652 (99.5%) | 42,721 (39.0%) | 65,450 (4.8%) | 40,650 (13.3%) | 31,039 (40.5%) | 36,190 (5.1%) | 80,676 (79.1%) | 60,151 (44.4%) | 78,586 (92.2%) |
| | | Ibuprofen | N (%) | 12,688 (80.5%) | 57,463 (73.5%) | 48,475 (61.0%) | 73,907 (52.4%) | 3,664 (89.9%) | 82,134 (75.6%) | 18,684 (74.9%) | 78,177 (54.1%) | 77,862 (66.6%) |