Multiple Assignment with unpack
Nina Zumel and John Mount
2023-08-19
In R there are many functions that return named lists or other structures keyed by names. Often, you want to unpack the elements of such a list into separate variables, for ease of use. One example is the use of split() to partition a larger data frame into a named list of smaller data frames, each corresponding to some grouping.
library(wrapr)
# example data
d <- data.frame(
  x = 1:9,
  group = c('train', 'calibrate', 'test'),
  stringsAsFactors = FALSE)
knitr::kable(d)
| x | group     |
|---|-----------|
| 1 | train     |
| 2 | calibrate |
| 3 | test      |
| 4 | train     |
| 5 | calibrate |
| 6 | test      |
| 7 | train     |
| 8 | calibrate |
| 9 | test      |
# split d by group
(parts <- split(d, d$group))
## $calibrate
##   x     group
## 2 2 calibrate
## 5 5 calibrate
## 8 8 calibrate
##
## $test
##   x group
## 3 3  test
## 6 6  test
## 9 9  test
##
## $train
##   x group
## 1 1 train
## 4 4 train
## 7 7 train
train_data <- parts$train
calibrate_data <- parts$calibrate
test_data <- parts$test
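knitr::kable(train_data)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(calibrate_data)

|   | x | group     |
|---|---|-----------|
| 2 | 2 | calibrate |
| 5 | 5 | calibrate |
| 8 | 8 | calibrate |

knitr::kable(test_data)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |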
A multiple assignment notation allows us to assign all the smaller data frames to variables in one step, and avoid leaving a possibly large temporary variable such as parts in our environment. One such notation is unpack().
Basic unpack() example
# clear out the earlier results
rm(list = c('train_data', 'calibrate_data', 'test_data', 'parts'))
# split d and unpack the smaller data frames into separate variables
unpack(split(d, d$group),
       train_data = train,
       test_data = test,
       calibrate_data = calibrate)
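knitr::kable(train_data)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(calibrate_data)

|   | x | group     |
|---|---|-----------|
| 2 | 2 | calibrate |
| 5 | 5 | calibrate |
| 8 | 8 | calibrate |

knitr::kable(test_data)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |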
You can also use unpack with an assignment notation similar to the notation used with the zeallot::%<-% operator:
# split d and unpack the smaller data frames into separate variables
unpack[traind = train, testd = test, cald = calibrate] := split(d, d$group)
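knitr::kable(traind)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(cald)

|   | x | group     |
|---|---|-----------|
| 2 | 2 | calibrate |
| 5 | 5 | calibrate |
| 8 | 8 | calibrate |

knitr::kable(testd)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |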
Reusing the list names as variables
If you are willing to assign the elements of the list into variables
with the same names, you can just use the names:
unpack(split(d, d$group), train, test, calibrate)
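knitr::kable(train)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(calibrate)

|   | x | group     |
|---|---|-----------|
| 2 | 2 | calibrate |
| 5 | 5 | calibrate |
| 8 | 8 | calibrate |

knitr::kable(test)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |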
# try the unpack[] assignment notation
rm(list = c('train', 'test', 'calibrate'))
unpack[test, train, calibrate] := split(d, d$group)
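knitr::kable(train)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(calibrate)

|   | x | group     |
|---|---|-----------|
| 2 | 2 | calibrate |
| 5 | 5 | calibrate |
| 8 | 8 | calibrate |

knitr::kable(test)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |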
Mixed notation is allowed:
rm(list = c('train', 'test', 'calibrate'))
unpack(split(d, d$group), train, holdout=test, calibrate)
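knitr::kable(train)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(calibrate)

|   | x | group     |
|---|---|-----------|
| 2 | 2 | calibrate |
| 5 | 5 | calibrate |
| 8 | 8 | calibrate |

knitr::kable(holdout)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |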
Unpacking only parts of a list
You can also unpack only a subset of the list’s elements:
rm(list = c('train', 'holdout', 'calibrate'))
unpack(split(d, d$group), train, test)
knitr::kable(train)

|   | x | group |
|---|---|-------|
| 1 | 1 | train |
| 4 | 4 | train |
| 7 | 7 | train |

knitr::kable(test)

|   | x | group |
|---|---|-------|
| 3 | 3 | test  |
| 6 | 6 | test  |
| 9 | 9 | test  |
# we didn't unpack the calibrate set
calibrate
## Error in eval(expr, envir, enclos): object 'calibrate' not found
unpack checks for unknown elements
If unpack is asked to unpack an element it doesn’t recognize, it throws an error. In this case, none of the elements are unpacked, as unpack is deliberately an atomic operation.
# the split call will not return an element called "holdout"
unpack(split(d, d$group), training = train, testing = holdout)
## Error in write_values_into_env(unpack_environment = unpack_environment, : wrapr::unpack all source names must be in value, missing: 'holdout'.
# train was not unpacked either
training
## Error in eval(expr, envir, enclos): object 'training' not found
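The all-or-nothing behavior described above can be illustrated with a small base R sketch. This is not wrapr's implementation: `unpack_named()`, its `mapping` argument, and the `demo_parts` and `target_env` objects are hypothetical names introduced only for this illustration. The idea is simply to validate every requested source name before writing any variable.

```r
# Hypothetical helper (NOT part of wrapr): an all-or-nothing named unpack.
# `mapping` is a named character vector: names are destinations, values are sources.
unpack_named <- function(value, mapping, envir) {
  # validate every source name before assigning anything
  missing_names <- setdiff(unname(mapping), names(value))
  if (length(missing_names) > 0) {
    stop("missing source names: ", paste(missing_names, collapse = ", "))
  }
  # only reached when all names are present, so assignments cannot stop halfway
  for (dest in names(mapping)) {
    assign(dest, value[[mapping[[dest]]]], envir = envir)
  }
  invisible(value)
}

demo_parts <- list(train = 1:3, test = 4:6)
target_env <- new.env()

# 'holdout' is not an element of demo_parts, so the call fails before any assignment
result <- tryCatch(
  unpack_named(demo_parts, c(training = "train", testing = "holdout"), target_env),
  error = function(e) conditionMessage(e))

# FALSE: 'training' was not created, even though 'train' was available
exists("training", envir = target_env, inherits = FALSE)
```

Checking first and assigning second is one simple way to avoid leaving an environment in a half-updated state; the real unpack may achieve the same guarantee differently.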
Other multiple assignment packages
zeallot
The zeallot package already supplies excellent positional or ordered unpacking. The primary difference between zeallot’s %<-% operator and unpack is that %<-% is a positional unpacker: you must unpack the list based on the order of the elements in the list. This style may be more appropriate in the Python world, where many functions return unnamed tuples of results.
unpack is a named unpacker: assignments are based on the names of elements in the list, and the assignments can be in any order. We feel this is more appropriate for R, as R has not emphasized positional unpacking; R functions tend to return named lists or named structures. For named lists or named structures it may not be safe to rely on value positions.
For unpacking named lists, we recommend unpack. For unpacking unnamed lists, use %<-%.
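To see why relying on positions can be unsafe for named lists, consider the split() example from earlier. The sketch below uses only base R; the names `d_demo`, `parts_demo`, `first_by_position`, and `train_by_name` are introduced here for illustration. The groups appear in the data in the order train, calibrate, test, but split() returns them ordered by factor level.

```r
# same shape as the example data used earlier
d_demo <- data.frame(
  x = 1:9,
  group = c("train", "calibrate", "test"),
  stringsAsFactors = FALSE)

parts_demo <- split(d_demo, d_demo$group)

# elements are ordered by factor level, not by first appearance in the data
names(parts_demo)

# positional access: assuming the first element is the training set is wrong here
first_by_position <- parts_demo[[1]]
unique(first_by_position$group)   # "calibrate"

# named access does not depend on element order
train_by_name <- parts_demo[["train"]]
unique(train_by_name$group)       # "train"
```

A positional unpacker applied to this list would silently bind the calibration data to whatever variable is listed first, which is the kind of mistake a named unpacker is meant to prevent.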
vadr
vadr::bind supplies named unpacking, but appears to use a “SOURCE = DESTINATION” notation. That is the reverse of the “DESTINATION = SOURCE” order in which both R assignments and argument bindings are already written.
tidytidbits
tidytidbits supplies positional unpacking with a %=% notation.