Adding new segregation indices is not much trouble. Please open an issue on GitHub to request that an index be added.
If you use the dplyr package, one pattern that works well is to use group_modify. Here, we compute the pairwise Black-White dissimilarity index for each state separately:
library("segregation")
library("dplyr")

schools00 %>%
  filter(race %in% c("black", "white")) %>%
  group_by(state) %>%
  group_modify(~ dissimilarity(
    data = .x,
    group = "race",
    unit = "school",
    weight = "n"
  ))
#> # A tibble: 3 × 3
#> # Groups: state [3]
#> state stat est
#> <fct> <chr> <dbl>
#> 1 A D 0.706
#> 2 B D 0.655
#> 3 C D 0.704
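For intuition about what each per-state estimate measures, the two-group dissimilarity index is commonly written as D = 0.5 * sum over units of |b_i / B - w_i / W|, where b_i and w_i are the group counts in unit i and B and W are the group totals. Below is a minimal base-R sketch of that textbook formula. The helper name dissimilarity_by_hand and the toy counts are made up for illustration and are not part of the segregation package.

```r
# Toy data (hypothetical): counts of two groups in three schools
toy <- data.frame(
  school = c("s1", "s2", "s3"),
  black  = c(90, 10, 0),
  white  = c(10, 40, 50)
)

# Textbook two-group dissimilarity index:
# D = 0.5 * sum(|a_i / A - b_i / B|)
dissimilarity_by_hand <- function(a, b) {
  0.5 * sum(abs(a / sum(a) - b / sum(b)))
}

d_toy <- dissimilarity_by_hand(toy$black, toy$white)
print(d_toy)
```

D ranges from 0 (both groups are distributed identically across units) to 1 (no unit contains members of both groups).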
A similar pattern also works well with data.table:
library("data.table")

schools00 <- as.data.table(schools00)
schools00[
  race %in% c("black", "white"),
  dissimilarity(data = .SD, group = "race", unit = "school", weight = "n"),
  by = .(state)
]
#> state stat est
#> <fctr> <char> <num>
#> 1: A D 0.7063595
#> 2: B D 0.6548485
#> 3: C D 0.7042057
To compute many decompositions at once, it’s easiest to combine the data for the two time points. For instance, here’s a dplyr solution to decompose the state-specific M indices between 2000 and 2005:
# helper function for decomposition
diff <- function(df, group) {
  data1 <- filter(df, year == 2000)
  data2 <- filter(df, year == 2005)
  mutual_difference(data1, data2, group = "race", unit = "school", weight = "n")
}

# add year indicators
schools00$year <- 2000
schools05$year <- 2005
combine <- bind_rows(schools00, schools05)

combine %>%
  group_by(state) %>%
  group_modify(diff) %>%
  head(5)
#> # A tibble: 5 × 3
#> # Groups: state [1]
#> state stat est
#> <fct> <chr> <dbl>
#> 1 A M1 0.409
#> 2 A M2 0.445
#> 3 A diff 0.0359
#> 4 A additions -0.0159
#> 5 A removals 0.0390
Here is the equivalent data.table solution:
setDT(combine)
combine[, diff(.SD), by = .(state)] %>% head(5)
#> state stat est
#> <fctr> <char> <num>
#> 1: A M1 0.40859652
#> 2: A M2 0.44454379
#> 3: A diff 0.03594727
#> 4: A additions -0.01585879
#> 5: A removals 0.03903106
How can I use tidycensus to compute segregation indices?
Here are a few examples, thanks to Kyle Walker, the author of the tidycensus package.
First, download the data:
library("tidycensus")

cook_data <- get_acs(
  geography = "tract",
  variables = c(
    white = "B03002_003",
    black = "B03002_004",
    asian = "B03002_006",
    hispanic = "B03002_012"
  ),
  state = "IL",
  county = "Cook"
)
#> Getting data from the 2017-2021 5-year ACS
Because this data is in “long” format, it’s easy to compute segregation indices:
# compute index of dissimilarity
cook_data %>%
  filter(variable %in% c("black", "white")) %>%
  dissimilarity(
    group = "variable",
    unit = "GEOID",
    weight = "estimate"
  )
#> stat est
#> <char> <num>
#> 1: D 0.7855711
# compute multigroup M/H indices
cook_data %>%
  mutual_total(
    group = "variable",
    unit = "GEOID",
    weight = "estimate"
  )
#> stat est
#> <char> <num>
#> 1: M 0.5114435
#> 2: H 0.4089561
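To make the two numbers above less of a black box, here is a minimal base-R sketch of how M and H can be computed by hand from long-format counts. It assumes the standard definition of M as the mutual information between units and groups, and it assumes that H is M divided by the entropy of the group distribution. The toy data and all object names are made up for illustration.

```r
# Toy data (hypothetical): long format, one row per unit-group combination
toy <- data.frame(
  unit  = c("u1", "u1", "u2", "u2"),
  group = c("a", "b", "a", "b"),
  n     = c(50, 0, 0, 50)
)

# Joint distribution of units and groups, plus both marginals
p <- xtabs(n ~ unit + group, data = toy) / sum(toy$n)
p_unit  <- rowSums(p)
p_group <- colSums(p)

# Mutual information: sum of p * log(p / (p_unit * p_group)),
# skipping empty cells (they contribute zero)
expected <- outer(p_unit, p_group)
nonzero <- p > 0
M <- sum(p[nonzero] * log(p[nonzero] / expected[nonzero]))

# Assumed normalization: divide by the entropy of the group distribution
entropy_group <- -sum(p_group * log(p_group))
H <- M / entropy_group

print(c(M = M, H = H))
```

In this toy example the two groups never share a unit, so M reaches the group entropy and H reaches its maximum under the assumed normalization.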
Producing a map of local segregation scores is also not hard:
library("tigris")
library("ggplot2")

local_seg <- mutual_local(cook_data,
  group = "variable",
  unit = "GEOID",
  weight = "estimate",
  wide = TRUE
)

# download shapefile
seg_geom <- tracts("IL", "Cook", cb = TRUE, progress_bar = FALSE) %>%
  left_join(local_seg, by = "GEOID")
#> Retrieving data for the year 2021
ggplot(seg_geom, aes(fill = ls)) +
geom_sf(color = NA) +
coord_sf(crs = 3435) +
scale_fill_viridis_c() +
theme_void() +
labs(
title = "Local segregation scores for Cook County, IL",
fill = NULL
)
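If you are wondering what the mapped ls values represent, the following base-R sketch may help. It assumes the usual definition of a unit's local segregation score as the divergence between the unit's group composition and the overall group composition, ls_u = sum over groups of p(g|u) * log(p(g|u) / p(g)). Under that definition, the average of the local scores, weighted by unit size, equals M. The toy counts and object names are made up for illustration.

```r
# Toy data (hypothetical): rows are units, columns are groups
counts <- matrix(
  c(40, 10,
    10, 40,
    25, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(unit = c("u1", "u2", "u3"), group = c("a", "b"))
)

p_unit  <- rowSums(counts) / sum(counts)     # share of each unit
p_group <- colSums(counts) / sum(counts)     # overall group shares
p_group_in_unit <- counts / rowSums(counts)  # group shares within each unit

# Local segregation: ls_u = sum_g p(g|u) * log(p(g|u) / p(g))
ratio <- sweep(p_group_in_unit, 2, p_group, "/")
terms <- ifelse(p_group_in_unit > 0, p_group_in_unit * log(ratio), 0)
ls <- rowSums(terms)

# The unit-weighted average of the local scores recovers M
M <- sum(p_unit * ls)

print(ls)
print(M)
```

Unit u3 mirrors the overall composition exactly, so its local score is zero; u1 and u2 deviate by the same amount in opposite directions and therefore get identical scores.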
When using mutual_difference, supply method = "shapley_detailed" to get two different margins-adjusted local segregation scores (one comes from adjusting forward, the other from adjusting backward). Averaging them yields a single margins-adjusted local segregation score:
diff <- mutual_difference(schools00, schools05, "race", "school",
  weight = "n", method = "shapley_detailed"
)

diff[stat %in% c("ls_diff1", "ls_diff2"),
  .(ls_diff_adjusted = mean(est)),
  by = .(school)
]
#> school ls_diff_adjusted
#> <fctr> <num>
#> 1: A1_3 -0.088983164
#> 2: A2_2 -0.044338042
#> 3: A2_3 -0.101696519
#> 4: A2_4 -0.020134162
#> 5: A2_6 -0.138567163
#> ---
#> 1706: C164_2 -0.031329845
#> 1707: C165_1 -0.023978101
#> 1708: C165_3 0.003781632
#> 1709: C166_1 0.010270713
#> 1710: C167_1 -0.002663687
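If you prefer not to use data.table syntax, the same filter-then-average step can be sketched in base R. The data frame below is a made-up stand-in with the same school, stat, and est columns as the output above; its values and the "other" stat label are purely illustrative.

```r
# Toy stand-in (hypothetical values) for the detailed decomposition output
toy_diff <- data.frame(
  school = c("s1", "s1", "s1", "s2", "s2", "s2"),
  stat   = c("ls_diff1", "ls_diff2", "other", "ls_diff1", "ls_diff2", "other"),
  est    = c(-0.10, -0.06, 0.50, 0.02, 0.04, 0.50)
)

# Keep only the two margins-adjusted scores, then average them per school
adjusted <- aggregate(
  est ~ school,
  data = subset(toy_diff, stat %in% c("ls_diff1", "ls_diff2")),
  FUN = mean
)
names(adjusted)[2] <- "ls_diff_adjusted"

print(adjusted)
```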