Author: Mark Rieke
License: MIT
{nplyr} is a grammar of nested data manipulation that allows users to perform dplyr-like manipulations on data frames nested within a list-col of another data frame. Most dplyr verbs have nested equivalents in nplyr. A (non-exhaustive) list of examples:
nest_mutate() is the nested equivalent of mutate()
nest_select() is the nested equivalent of select()
nest_filter() is the nested equivalent of filter()
nest_summarise() is the nested equivalent of summarise()
nest_group_by() is the nested equivalent of group_by()
As of version 0.2.0, nplyr also supports nested versions of some tidyr functions:
nest_drop_na() is the nested equivalent of drop_na()
nest_extract() is the nested equivalent of extract()
nest_fill() is the nested equivalent of fill()
nest_replace_na() is the nested equivalent of replace_na()
nest_separate() is the nested equivalent of separate()
nest_unite() is the nested equivalent of unite()
nplyr is largely a wrapper for dplyr. For the most up-to-date information on dplyr, please visit dplyr’s website. If you are new to dplyr, the best place to start is the data transformation chapter in R for Data Science.
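Conceptually, a nested verb applies the corresponding verb to every data frame stored in the list-col. The snippet below is a minimal base R sketch of that idea, written without any packages so it runs on its own; it is not nplyr's actual source, and the names toy, x, and x2 are made up for illustration.

```r
# Toy nested data frame (hypothetical): a list-col holding small data frames
toy <- data.frame(group = c("a", "b"))
toy$data <- list(
  data.frame(x = 1:3),
  data.frame(x = 4:5)
)

# Apply the same mutate-like step to every nested data frame
toy$data <- lapply(toy$data, function(d) transform(d, x2 = x * 2))

# Each nested data frame now carries the new column
toy$data[[2]]
```

With nplyr attached, the equivalent step should be expressible as nest_mutate(toy, data, x2 = x * 2), which avoids writing the loop over the list-col by hand.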
You can install the released version of nplyr from CRAN or the development version from GitHub with the devtools or remotes package:
# install from CRAN
install.packages("nplyr")
# install from github
devtools::install_github("markjrieke/nplyr")
To get started, we’ll create a nested column for the country data within each continent from the gapminder dataset.
library(nplyr)
gm_nest <-
  gapminder::gapminder_unfiltered %>%
  tidyr::nest(country_data = -continent)
gm_nest
#> # A tibble: 6 × 2
#>   continent country_data
#>   <fct>     <list>
#> 1 Asia      <tibble [578 × 5]>
#> 2 Europe    <tibble [1,302 × 5]>
#> 3 Africa    <tibble [637 × 5]>
#> 4 Americas  <tibble [470 × 5]>
#> 5 FSU       <tibble [139 × 5]>
#> 6 Oceania   <tibble [187 × 5]>
dplyr can perform operations on the top-level data frame, but with nplyr, we can perform operations on the nested data frames:
gm_nest_example <-
  gm_nest %>%
  nest_filter(country_data, year == max(year)) %>%
  nest_mutate(country_data, pop_millions = pop/1000000)
# each nested tibble is now filtered to the most recent year
gm_nest_example
#> # A tibble: 6 × 2
#>   continent country_data
#>   <fct>     <list>
#> 1 Asia      <tibble [43 × 6]>
#> 2 Europe    <tibble [34 × 6]>
#> 3 Africa    <tibble [53 × 6]>
#> 4 Americas  <tibble [33 × 6]>
#> 5 FSU       <tibble [9 × 6]>
#> 6 Oceania   <tibble [11 × 6]>
# if we unnest, we can see that a new column for pop_millions has been added
gm_nest_example %>%
  slice_head(n = 1) %>%
  tidyr::unnest(country_data)
#> # A tibble: 43 × 7
#>    continent country           year lifeExp        pop gdpPercap pop_millions
#>    <fct>     <fct>            <int>   <dbl>      <int>     <dbl>        <dbl>
#>  1 Asia      Afghanistan       2007    43.8   31889923      975.         31.9
#>  2 Asia      Azerbaijan        2007    67.5    8017309     7709.         8.02
#>  3 Asia      Bahrain           2007    75.6     708573    29796.        0.709
#>  4 Asia      Bangladesh        2007    64.1  150448339     1391.         150.
#>  5 Asia      Bhutan            2007    65.6    2327849     4745.         2.33
#>  6 Asia      Brunei            2007    77.1     386511    48015.        0.387
#>  7 Asia      Cambodia          2007    59.7   14131858     1714.         14.1
#>  8 Asia      China             2007    73.0 1318683096     4959.        1319.
#>  9 Asia      Hong Kong, China  2007    82.2    6980412    39725.         6.98
#> 10 Asia      India             2007    64.7 1110396331     2452.        1110.
#> # … with 33 more rows
nplyr also supports grouped operations with nest_group_by():
gm_nest_example <-
  gm_nest %>%
  nest_group_by(country_data, year) %>%
  nest_summarise(
    country_data,
    n = n(),
    lifeExp = median(lifeExp),
    pop = median(pop),
    gdpPercap = median(gdpPercap)
  )
gm_nest_example
#> # A tibble: 6 × 2
#>   continent country_data
#>   <fct>     <list>
#> 1 Asia      <tibble [58 × 5]>
#> 2 Europe    <tibble [58 × 5]>
#> 3 Africa    <tibble [13 × 5]>
#> 4 Americas  <tibble [57 × 5]>
#> 5 FSU       <tibble [44 × 5]>
#> 6 Oceania   <tibble [56 × 5]>
# unnesting shows summarised tibbles for each continent
gm_nest_example %>%
  slice(2) %>%
  tidyr::unnest(country_data)
#> # A tibble: 58 × 6
#>    continent  year     n lifeExp      pop gdpPercap
#>    <fct>     <int> <int>   <dbl>    <dbl>     <dbl>
#>  1 Europe     1950    22    65.8  7408264     6343.
#>  2 Europe     1951    18    65.7  7165515     6509.
#>  3 Europe     1952    31    65.9  7124673     5210.
#>  4 Europe     1953    17    67.3  7346100     6774.
#>  5 Europe     1954    17    68.0  7423300     7046.
#>  6 Europe     1955    17    68.5  7499400     7817.
#>  7 Europe     1956    17    68.5  7575800     8224.
#>  8 Europe     1957    31    67.5  7363802     6093.
#>  9 Europe     1958    18    69.6 8308052.     8833.
#> 10 Europe     1959    18    69.6 8379664.     9088.
#> # … with 48 more rows
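The nested tidyr verbs listed earlier follow the same calling pattern: the data frame first, then the list-col, then the arguments of the underlying tidyr function. The snippet below is a small sketch on made-up data (toy and its columns x and y are not part of the package), assuming nest_drop_na() and nest_replace_na() pass their remaining arguments on to drop_na() and replace_na().

```r
library(nplyr)

# Toy data (hypothetical): one nested tibble per group, with some missing values
toy <- tibble::tibble(
  group = c("a", "b"),
  data = list(
    tibble::tibble(x = c(1, NA, 3), y = c("p", "q", NA)),
    tibble::tibble(x = c(NA, 5), y = c("r", "s"))
  )
)

# Drop the rows with a missing x inside every nested tibble
dropped <- nest_drop_na(toy, data, x)

# Or keep all rows and fill the gaps with explicit values instead
filled <- nest_replace_na(toy, data, replace = list(x = 0, y = "unknown"))
```

Both calls return the top-level data frame with the same number of rows; only the contents of the nested tibbles change.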
More examples can be found in the package vignettes and function documentation.
If you notice a bug, want to request a new feature, or have recommendations on improving documentation, please open an issue in the package repository.