This worked example documents a common workflow a user might follow when using the fingertipsR package.
Suppose you want to plot healthy life expectancy and life expectancy against deprivation for a given year of data held in Fingertips. The first question is where to start.
There is one function in the fingertipsR package that extracts data from the Fingertips API: fingertips_data(). The inputs relevant to this example are IndicatorID, DomainID, ProfileID, AreaCode, AreaTypeID and ParentAreaTypeID.
One of IndicatorID, DomainID or ProfileID must be supplied. AreaCode is only needed if you are extracting data for a particular area or group of areas. AreaTypeID determines the geography to extract the data for. In this case we want County and Unitary Authority level. ParentAreaTypeID requires an area type code that the AreaTypeID maps to, though if it is left out one will be chosen automatically.
Therefore, the inputs to the fingertips_data() function that we need to find are the ID codes for the indicators (IndicatorID), the area type (AreaTypeID) and the parent area type (ParentAreaTypeID).
We need to begin by loading the fingertipsR package:
library(fingertipsR)
There are two indicators we are interested in for this exercise. Without consulting the Fingertips website, we know approximately what they are called: healthy life expectancy and life expectancy.
We can use the indicators() function to return a list of all the indicators within Fingertips. We can then filter the name field for the term life expectancy (note, the IndicatorName field has been converted to lower case in the following code chunk so that matches are not missed because of upper case letters).
inds <- indicators()
life_expectancy <- inds[grepl("life expectancy", tolower(inds$IndicatorName)),]
# Because the same indicators are used in multiple profiles, there are many repeated indicators in this table (some with varying IndicatorName but same IndicatorID)
# This returns a record for each IndicatorID
life_expectancy <- unique(life_expectancy[!duplicated(life_expectancy$IndicatorID),
                                          c("IndicatorID", "IndicatorName")])
knitr::kable(life_expectancy, row.names = FALSE) #note, this line will only work in a markdown file (*.Rmd). It presents the table for a report
IndicatorID | IndicatorName |
---|---|
90362 | Healthy life expectancy at birth: the average number of years a person would expect to live in good health based on contemporary mortality rates and prevalence of self-reported good health. (in PHOF 0.1i) |
90366 | Life expectancy at birth: the average number of years a person would expect to live based on contemporary mortality rates. (in PHOF 0.1ii) |
92901 | Slope index of inequality in life expectancy at birth within English local authorities, based on local deprivation deciles: the range in years of life expectancy across the social gradient within each local authority, from most to least deprived. (In PHOF 0.2iii) |
92031 | Slope index of inequality in healthy life expectancy within local authorities, based on deprivation within Middle Super Output Areas (MSOAs): the range in years of life expectancy across the social gradient within each local authority. (in PHOF 0.2vi) |
91102 | 0.1ii - Life expectancy at 65: the average number of years a person would expect to live based on contemporary mortality rates. |
90365 | 0.2iv - The gap in years between overall life expectancy at birth in each English local authority and life expectancy at birth for England as a whole. |
650 | Life expectancy - MSOA based |
90823 | 0.2ii - Number of upper tier local authorities for which the local slope index of inequality in life expectancy (as defined in indicator 0.2iii) has decreased |
90825 | 0.2v - Slope index of inequality in healthy life expectancy at birth based on national deprivation deciles within England: the range in years of life expectancy across the social gradient, from most to least deprived. |
92900 | 0.2i - Slope index of inequality in life expectancy at birth based on national deprivation deciles within England: the range in years of life expectancy across the social gradient, from most to least deprived. |
92902 | 0.2vii - Slope index of inequality in life expectancy at birth within English region, based on regional deprivation deciles: the range in years of life expectancy across the social gradient within each local authority, from most to least deprived. |
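The filter-and-deduplicate step can be tried offline on a small made-up data frame. This is only a sketch: `inds_mock` is a hypothetical stand-in for the table returned by indicators(), and its ProfileID values and the row with IndicatorID 1 are invented for illustration.

```r
# Hypothetical stand-in for the output of indicators(); ProfileID values and
# the IndicatorID 1 row are invented for this illustration.
inds_mock <- data.frame(
  IndicatorID   = c(90362, 90362, 90366, 1),
  IndicatorName = c("Healthy life expectancy at birth",
                    "0.1i - Healthy Life Expectancy at birth",
                    "Life expectancy at birth",
                    "Some unrelated indicator"),
  ProfileID     = c(19, 26, 19, 19),
  stringsAsFactors = FALSE
)

# Case-insensitive match on the indicator name
matches <- inds_mock[grepl("life expectancy", tolower(inds_mock$IndicatorName)), ]

# Keep the first row seen for each IndicatorID
matches <- matches[!duplicated(matches$IndicatorID), c("IndicatorID", "IndicatorName")]
print(matches)
```

Lower-casing the names before matching is what lets the second row ("Healthy Life Expectancy") match, and `!duplicated()` then collapses the two 90362 rows into one.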
The two indicators we are interested in from this table are 90362 (healthy life expectancy at birth) and 90366 (life expectancy at birth).
We can find the AreaTypeID codes we are interested in using the area_types() function. We have decided to produce the graph at County and Unitary Authority level. As noted in the section Where to start, we need codes for AreaTypeID and ParentAreaTypeID.
areaTypes <- area_types()
DT::datatable(areaTypes, filter = "top", rownames = FALSE) #note, this line will only work in a markdown file (*.Rmd). It presents the table for a report
The table shows that the AreaTypeID for County and Unitary Authority level is 102. The third column, ParentAreaTypeID, shows the IDs of the area types that these map to. In the case of County and Unitary Authorities, these are:
ParentAreaTypeID | ParentAreaTypeName |
---|---|
6 | Government Office Region |
42 | 2013 PHE Regions (4) |
10039 | Depriv. decile (IMD 2015) |
103 | PHEC 2013 only plus PHEC unchanged |
10002 | Depriv. decile (IMD 2010) |
104 | PHEC 2015 new plus PHEC 2013 unchanged |
126 | Combined authorities |
ParentAreaTypeID is 6 by default in the fingertips_data() function for an AreaTypeID of 102 (though it changes if different AreaTypeIDs are entered), so we can stick with that in this example.
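Looking up the parent area types for a given AreaTypeID is a simple subset. The sketch below uses a hypothetical `area_types_mock` data frame rather than the real area_types() output: the three rows for AreaTypeID 102 are copied from the table above, while the row for AreaTypeID 101 is invented so that the filter has something to exclude.

```r
# Hypothetical stand-in for the output of area_types(); the 101 row is
# invented for illustration, the 102 rows come from the table above.
area_types_mock <- data.frame(
  AreaTypeID         = c(102, 102, 102, 101),
  ParentAreaTypeID   = c(6, 10039, 126, 102),
  ParentAreaTypeName = c("Government Office Region",
                         "Depriv. decile (IMD 2015)",
                         "Combined authorities",
                         "County & UA"),
  stringsAsFactors = FALSE
)

# Parent area types that County and Unitary Authority (102) maps to
parents <- area_types_mock[area_types_mock$AreaTypeID == 102,
                           c("ParentAreaTypeID", "ParentAreaTypeName")]
print(parents)
```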
We want to plot life expectancy against deprivation information. The package has a deprivation_decile() function that allows us to return this information. This is populated from the Department for Communities and Local Government Indices of Multiple Deprivation (IMD). Note, there is only information for upper and lower tier local authorities (AreaTypeID = 102 and 101 respectively). IMD has only been produced for the years 2010 and 2015.
dep <- deprivation_decile(AreaTypeID = 102, Year = 2015)
DT::datatable(dep, filter = "top", rownames = FALSE) #note, this line will only work in a markdown file (*.Rmd). It presents the table for a report
Finally, we can use the fingertips_data() function with the inputs we have determined previously.
indicators <- c(90362, 90366)
data <- fingertips_data(IndicatorID = indicators,
                        AreaTypeID = 102)
pander::pandoc.table(tail(data),
                     style = "rmarkdown",
                     split.tables = 90,
                     keep.line.breaks = TRUE) #note, this line will only work in a markdown file (*.Rmd). It presents the table for a report
##
##
## | | IndicatorID | IndicatorName | ParentCode |
## |:----------:|:-------------:|:---------------------------------------:|:------------:|
## | **6151** | 90362 | 0.1i - Healthy life expectancy at birth | E12000005 |
## | **6152** | 90362 | 0.1i - Healthy life expectancy at birth | E12000006 |
## | **6153** | 90362 | 0.1i - Healthy life expectancy at birth | E12000008 |
## | **6154** | 90362 | 0.1i - Healthy life expectancy at birth | E12000005 |
## | **6155** | 90362 | 0.1i - Healthy life expectancy at birth | E12000008 |
## | **6156** | 90362 | 0.1i - Healthy life expectancy at birth | E12000005 |
##
## Table: Table continues below
##
##
##
## | | ParentName | AreaCode | AreaName | AreaType |
## |:----------:|:----------------------:|:----------:|:--------------:|:-----------:|
## | **6151** | West Midlands region | E10000028 | Staffordshire | County & UA |
## | **6152** | East of England region | E10000029 | Suffolk | County & UA |
## | **6153** | South East region | E10000030 | Surrey | County & UA |
## | **6154** | West Midlands region | E10000031 | Warwickshire | County & UA |
## | **6155** | South East region | E10000032 | West Sussex | County & UA |
## | **6156** | West Midlands region | E10000034 | Worcestershire | County & UA |
##
## Table: Table continues below
##
##
##
## | | Sex | Age | CategoryType | Category | Timeperiod |
## |:----------:|:------:|:--------:|:--------------:|:----------:|:------------:|
## | **6151** | Female | All ages | NA | NA | 2013 - 15 |
## | **6152** | Female | All ages | NA | NA | 2013 - 15 |
## | **6153** | Female | All ages | NA | NA | 2013 - 15 |
## | **6154** | Female | All ages | NA | NA | 2013 - 15 |
## | **6155** | Female | All ages | NA | NA | 2013 - 15 |
## | **6156** | Female | All ages | NA | NA | 2013 - 15 |
##
## Table: Table continues below
##
##
##
## | | Value | LowerCIlimit | UpperCIlimit | Count | Denominator |
## |:----------:|:-------:|:--------------:|:--------------:|:-------:|:-------------:|
## | **6151** | 63.83 | 62.29 | 65.37 | NA | NA |
## | **6152** | 66.7 | 65.16 | 68.24 | NA | NA |
## | **6153** | 68.76 | 67.54 | 69.99 | NA | NA |
## | **6154** | 67.62 | 66 | 69.24 | NA | NA |
## | **6155** | 66.57 | 64.96 | 68.17 | NA | NA |
## | **6156** | 67.67 | 66.02 | 69.32 | NA | NA |
##
## Table: Table continues below
##
##
##
## | | Valuenote | RecentTrend |
## |:----------:|:-----------:|:--------------------:|
## | **6151** | NA | Cannot be calculated |
## | **6152** | NA | Cannot be calculated |
## | **6153** | NA | Cannot be calculated |
## | **6154** | NA | Cannot be calculated |
## | **6155** | NA | Cannot be calculated |
## | **6156** | NA | Cannot be calculated |
##
## Table: Table continues below
##
##
##
## | | ComparedtoEnglandvalueorpercentiles |
## |:----------:|:-------------------------------------:|
## | **6151** | Same |
## | **6152** | Better |
## | **6153** | Better |
## | **6154** | Better |
## | **6155** | Better |
## | **6156** | Better |
##
## Table: Table continues below
##
##
##
## | | Comparedtosubnationalparentvalueorpercentiles | TimeperiodSortable |
## |:----------:|:-----------------------------------------------:|:--------------------:|
## | **6151** | Same | 20130000 |
## | **6152** | Same | 20130000 |
## | **6153** | Better | 20130000 |
## | **6154** | Better | 20130000 |
## | **6155** | Same | 20130000 |
## | **6156** | Better | 20130000 |
The data frame returned by fingertips_data() contains 22 variables, as shown in the output above. For this exercise, we are only interested in a few of them:
cols <- c("IndicatorID", "AreaCode", "Sex", "Timeperiod", "Value")
data <- data[data$AreaType == "County & UA" & data$Timeperiod == "2012 - 14", cols]
# merge deprivation onto data
data <- merge(data, dep, by.x = "AreaCode", by.y = "AreaCode", all.x = TRUE)
# remove NA values
data <- data[complete.cases(data),]
DT::datatable(data, filter = "top", rownames = FALSE) #note, this line will only work in a markdown file (*.Rmd). It presents the table for a report
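The effect of `all.x = TRUE` followed by `complete.cases()` is easiest to see on a toy example. This is a minimal sketch with made-up inputs: `values_mock` and `dep_mock` are hypothetical stand-ins for the filtered data and the deprivation table, the IMDscore values are invented, and the area code E99999999 is a fictional code with no deprivation record.

```r
# Hypothetical stand-in for the filtered indicator data
values_mock <- data.frame(
  AreaCode = c("E10000028", "E10000029", "E99999999"),  # last code is fictional
  Value    = c(63.83, 66.70, 70.00),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the deprivation table; IMDscore values are invented
dep_mock <- data.frame(
  AreaCode = c("E10000028", "E10000029"),
  IMDscore = c(16.4, 18.3),
  stringsAsFactors = FALSE
)

# Left join: every row of values_mock is kept, unmatched areas get NA
merged <- merge(values_mock, dep_mock, by = "AreaCode", all.x = TRUE)

# Drop rows with any NA, which removes the area without a deprivation score
merged_complete <- merged[complete.cases(merged), ]
print(merged_complete)
```

Note that `complete.cases()` drops a row if any column is NA, so it is worth selecting only the columns you need before this step, as the code above does with `cols`.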
Using ggplot2 it is possible to plot the outputs:
library(ggplot2)

# One point per area, coloured by indicator, with a loess trend line per indicator
p <- ggplot(data, aes(x = IMDscore, y = Value, col = factor(IndicatorID)))
p <- p +
  geom_point() +
  geom_smooth(se = FALSE, method = "loess") +
  facet_wrap(~ Sex) +
  scale_colour_manual(name = "Indicator",
                      breaks = c("90366", "90362"),
                      labels = c("Life expectancy", "Healthy life expectancy"),
                      values = c("#128c4a", "#88c857")) +
  scale_x_reverse() +
  labs(x = "IMD deprivation",
       y = "Age",
       title = "Life expectancy and healthy life expectancy at birth \nfor Upper Tier Local Authorities (2012 - 2014)") +
  theme_bw()
print(p)