Conditioned data frames, or cnd_df, are a powerful tool in the {sdtm.oak}
package designed to facilitate conditional transformations on data frames.
This article explains how to create and use conditioned data frames,
particularly in the context of SDTM domain derivations.
A conditioned data frame is a regular data frame extended with a logical
vector cnd that marks rows for subsequent conditional transformations. The
condition_add() function is used to create these conditioned data frames.
Consider a simple data frame df:
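The definition of df is not shown here. The printed output that follows is consistent with a tibble built along these lines; this is a sketch inferred from that output, not necessarily the original call.

```r
# Assumed definition, reconstructed from the printed output:
# an integer column `x` and a character column `y`.
df <- tibble::tibble(x = 1L:3L, y = letters[1L:3L])
df
```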
## # A tibble: 3 × 2
##       x y
##   <int> <chr>
## 1     1 a
## 2     2 b
## 3     3 c
We can create a conditioned data frame in which only the rows where x > 1
are marked:
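A call of roughly this shape yields such a conditioned data frame. It is a sketch that assumes condition_add() takes the data frame as its first argument, followed by a logical condition written in terms of its columns; df is redefined so the snippet runs on its own.

```r
library(sdtm.oak)

df <- tibble::tibble(x = 1L:3L, y = letters[1L:3L])

# Mark the rows where `x > 1`; the columns themselves are left unchanged.
cnd_df <- condition_add(df, x > 1L)
cnd_df
```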
## # A tibble: 3 × 2
## # Cond. tbl: 2/1/0
##         x y
##     <int> <chr>
## 1 F     1 a
## 2 T     2 b
## 3 T     3 c
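Here, only the second and third rows are marked as TRUE. The header line
Cond. tbl: 2/1/0 gives the number of rows for which the condition is TRUE,
FALSE, and NA, respectively.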
The real power of conditioned data frames manifests when they are used with
functions such as assign_no_ct(), assign_ct(), hardcode_no_ct(), and
hardcode_ct(). These functions perform derivations only for the records that
match the pattern of TRUE values in conditioned data frames.
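Conceptually, this row-wise restriction resembles assigning through a logical index. The base R sketch below only illustrates the idea, using a hypothetical column z; it is not how {sdtm.oak} implements these functions.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Logical vector playing the role of the condition (`cnd`).
cnd <- df$x > 1L

# Start with all-missing values, then fill in only the marked rows.
df$z <- NA_character_
df$z[cnd] <- toupper(df$y[cnd])
df
```

Rows where the condition is FALSE keep NA in z, which mirrors the NA values seen for unmarked records in a conditional derivation.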
Consider a simplified dataset of concomitant medications, where we want to
derive a new variable CMGRPID (Concomitant Medication Group ID) based on the
condition that the medication treatment (CMTRT) is "BENADRYL".
Here is a simplified raw Concomitant Medications data set (cm_raw):
cm_raw <- tibble::tibble(
  oak_id = seq_len(14L),
  raw_source = "ConMed",
  patient_number = c(375L, 375L, 376L, 377L, 377L, 377L, 377L, 378L, 378L, 378L, 378L, 379L, 379L, 379L),
  MDNUM = c(1L, 2L, 1L, 1L, 2L, 3L, 5L, 4L, 1L, 2L, 3L, 1L, 2L, 3L),
  MDRAW = c(
    "BABY ASPIRIN", "CORTISPORIN", "ASPIRIN",
    "DIPHENHYDRAMINE HCL", "PARCETEMOL", "VOMIKIND",
    "ZENFLOX OZ", "AMITRYPTYLINE", "BENADRYL",
    "DIPHENHYDRAMINE HYDROCHLORIDE", "TETRACYCLINE",
    "BENADRYL", "SOMINEX", "ZQUILL"
  )
)
cm_raw
## # A tibble: 14 × 5
##    oak_id raw_source patient_number MDNUM MDRAW
##     <int> <chr>               <int> <int> <chr>
##  1      1 ConMed                375     1 BABY ASPIRIN
##  2      2 ConMed                375     2 CORTISPORIN
##  3      3 ConMed                376     1 ASPIRIN
##  4      4 ConMed                377     1 DIPHENHYDRAMINE HCL
##  5      5 ConMed                377     2 PARCETEMOL
##  6      6 ConMed                377     3 VOMIKIND
##  7      7 ConMed                377     5 ZENFLOX OZ
##  8      8 ConMed                378     4 AMITRYPTYLINE
##  9      9 ConMed                378     1 BENADRYL
## 10     10 ConMed                378     2 DIPHENHYDRAMINE HYDROCHLORIDE
## 11     11 ConMed                378     3 TETRACYCLINE
## 12     12 ConMed                379     1 BENADRYL
## 13     13 ConMed                379     2 SOMINEX
## 14     14 ConMed                379     3 ZQUILL
To derive the CMTRT variable we use the assign_no_ct() function to map the
MDRAW variable to the CMTRT variable:
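The call behind this step is along these lines. It is a sketch that assumes tgt_dat may be omitted when the first target variable is derived, and that the default ID variables (oak_id, raw_source, patient_number) are carried over to the result; cm_raw is repeated in compact form so the snippet runs on its own.

```r
library(sdtm.oak)

# Same raw data as above, written compactly.
cm_raw <- tibble::tibble(
  oak_id = seq_len(14L),
  raw_source = "ConMed",
  patient_number = c(375L, 375L, 376L, rep(377L, 4L), rep(378L, 4L), rep(379L, 3L)),
  MDNUM = c(1L, 2L, 1L, 1L, 2L, 3L, 5L, 4L, 1L, 2L, 3L, 1L, 2L, 3L),
  MDRAW = c(
    "BABY ASPIRIN", "CORTISPORIN", "ASPIRIN",
    "DIPHENHYDRAMINE HCL", "PARCETEMOL", "VOMIKIND",
    "ZENFLOX OZ", "AMITRYPTYLINE", "BENADRYL",
    "DIPHENHYDRAMINE HYDROCHLORIDE", "TETRACYCLINE",
    "BENADRYL", "SOMINEX", "ZQUILL"
  )
)

# Map the raw variable MDRAW to the target variable CMTRT, without
# controlled terminology (assumption: no `tgt_dat` is needed here).
tgt_dat <- assign_no_ct(
  tgt_var = "CMTRT",
  raw_dat = cm_raw,
  raw_var = "MDRAW"
)
tgt_dat
```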
## # A tibble: 14 × 4
##    oak_id raw_source patient_number CMTRT
##     <int> <chr>               <int> <chr>
##  1      1 ConMed                375 BABY ASPIRIN
##  2      2 ConMed                375 CORTISPORIN
##  3      3 ConMed                376 ASPIRIN
##  4      4 ConMed                377 DIPHENHYDRAMINE HCL
##  5      5 ConMed                377 PARCETEMOL
##  6      6 ConMed                377 VOMIKIND
##  7      7 ConMed                377 ZENFLOX OZ
##  8      8 ConMed                378 AMITRYPTYLINE
##  9      9 ConMed                378 BENADRYL
## 10     10 ConMed                378 DIPHENHYDRAMINE HYDROCHLORIDE
## 11     11 ConMed                378 TETRACYCLINE
## 12     12 ConMed                379 BENADRYL
## 13     13 ConMed                379 SOMINEX
## 14     14 ConMed                379 ZQUILL
Then we create a conditioned data frame from the target data set (tgt_dat)
in which only the rows with CMTRT equal to "BENADRYL" are marked:
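A sketch of this step follows. It assumes condition_add() accepts the target data set followed by the condition, and it rebuilds tgt_dat from the output printed earlier so that the snippet is self-contained; the result is stored as cnd_tgt_dat, the name used in the final derivation.

```r
library(sdtm.oak)

# Target data set rebuilt from the printed output of the previous step.
tgt_dat <- tibble::tibble(
  oak_id = seq_len(14L),
  raw_source = "ConMed",
  patient_number = c(375L, 375L, 376L, rep(377L, 4L), rep(378L, 4L), rep(379L, 3L)),
  CMTRT = c(
    "BABY ASPIRIN", "CORTISPORIN", "ASPIRIN",
    "DIPHENHYDRAMINE HCL", "PARCETEMOL", "VOMIKIND",
    "ZENFLOX OZ", "AMITRYPTYLINE", "BENADRYL",
    "DIPHENHYDRAMINE HYDROCHLORIDE", "TETRACYCLINE",
    "BENADRYL", "SOMINEX", "ZQUILL"
  )
)

# Mark only the records whose treatment is exactly "BENADRYL".
cnd_tgt_dat <- condition_add(tgt_dat, CMTRT == "BENADRYL")
cnd_tgt_dat
```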
## # A tibble: 14 × 4
## # Cond. tbl: 2/12/0
##      oak_id raw_source patient_number CMTRT
##       <int> <chr>               <int> <chr>
##  1 F      1 ConMed                375 BABY ASPIRIN
##  2 F      2 ConMed                375 CORTISPORIN
##  3 F      3 ConMed                376 ASPIRIN
##  4 F      4 ConMed                377 DIPHENHYDRAMINE HCL
##  5 F      5 ConMed                377 PARCETEMOL
##  6 F      6 ConMed                377 VOMIKIND
##  7 F      7 ConMed                377 ZENFLOX OZ
##  8 F      8 ConMed                378 AMITRYPTYLINE
##  9 T      9 ConMed                378 BENADRYL
## 10 F     10 ConMed                378 DIPHENHYDRAMINE HYDROCHLORIDE
## 11 F     11 ConMed                378 TETRACYCLINE
## 12 T     12 ConMed                379 BENADRYL
## 13 F     13 ConMed                379 SOMINEX
## 14 F     14 ConMed                379 ZQUILL
Finally, we derive the CMGRPID variable conditionally. Using assign_no_ct()
with the conditioned target data set, we derive CMGRPID, the group ID for
the medication, from the raw variable MDNUM. Only the rows marked TRUE
receive a value; all other rows are set to NA:
derived_tgt_dat <- assign_no_ct(
  tgt_dat = cnd_tgt_dat,
  tgt_var = "CMGRPID",
  raw_dat = cm_raw,
  raw_var = "MDNUM"
)
derived_tgt_dat
## # A tibble: 14 × 5
##    oak_id raw_source patient_number CMTRT                         CMGRPID
##     <int> <chr>               <int> <chr>                           <int>
##  1      1 ConMed                375 BABY ASPIRIN                       NA
##  2      2 ConMed                375 CORTISPORIN                        NA
##  3      3 ConMed                376 ASPIRIN                            NA
##  4      4 ConMed                377 DIPHENHYDRAMINE HCL                NA
##  5      5 ConMed                377 PARCETEMOL                         NA
##  6      6 ConMed                377 VOMIKIND                           NA
##  7      7 ConMed                377 ZENFLOX OZ                         NA
##  8      8 ConMed                378 AMITRYPTYLINE                      NA
##  9      9 ConMed                378 BENADRYL                            1
## 10     10 ConMed                378 DIPHENHYDRAMINE HYDROCHLORIDE      NA
## 11     11 ConMed                378 TETRACYCLINE                       NA
## 12     12 ConMed                379 BENADRYL                            1
## 13     13 ConMed                379 SOMINEX                            NA
## 14     14 ConMed                379 ZQUILL                             NA
Conditioned data frames in the {sdtm.oak} package provide a flexible way to
perform conditional transformations on data sets. By marking specific rows
for transformation, users can efficiently derive SDTM variables, ensuring
that only relevant records are processed.