| Type: | Package |
| Title: | Retrieve, Harmonise and Map Open Data Regarding the Italian School System |
| Version: | 0.2.8 |
| Author: | Leonardo Cefalo |
| Maintainer: | Leonardo Cefalo <leonardo.cefalo@uniba.it> |
| Description: | Compiles and displays the available data sets regarding the Italian school system, with a focus on the infrastructural aspects. Input datasets are downloaded from the web, with the aim of keeping everything up to date in real time. The functions are divided into four main modules, namely 'Get', to scrape raw data from the web; 'Util', various utilities needed to process raw data; 'Group', to aggregate data at the municipality or province level; and 'Map', to visualize the output datasets. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| URL: | https://github.com/lcef97/SchoolDataIT |
| LazyData: | true |
| RoxygenNote: | 7.3.2 |
| Imports: | curl, dplyr, ggplot2, grDevices, httr, leafpop, magrittr, mapview, readr, rlang, rvest, sf, stringr, tidyr, utils, xml2 |
| Suggests: | knitr, readxl, rmarkdown, testthat (≥ 3.0.0), tidyverse |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 2.10) |
| NeedsCompilation: | no |
| Packaged: | 2025-10-02 14:05:51 UTC; Leonardo |
| Repository: | CRAN |
| Date/Publication: | 2025-10-02 14:30:02 UTC |
SchoolDataIT: Retrieve, Harmonise and Map Open Data Regarding the Italian School System
Description
Compiles and displays the available data sets regarding the Italian school system, with a focus on the infrastructural aspects. Input datasets are downloaded from the web, with the aim of keeping everything up to date in real time. The functions are divided into four main modules, namely 'Get', to scrape raw data from the web; 'Util', various utilities needed to process raw data; 'Group', to aggregate data at the municipality or province level; and 'Map', to visualize the output datasets.
Author(s)
Maintainer: Leonardo Cefalo leonardo.cefalo@uniba.it (ORCID)
Other contributors:
Alessio Pollice alessio.pollice@uniba.it (ORCID) [contributor, thesis advisor]
Paolo Maranzano pmaranzano.ricercastatistica@gmail.com (ORCID) [contributor]
See Also
Useful links: <https://github.com/lcef97/SchoolDataIT>
Download the names and codes of Italian LAU and NUTS-3 administrative units
Description
This function downloads a file provided by the Italian National Institute of Statistics (ISTAT) that includes all the codes of administrative units in Italy. As of today, it is the easiest way to map cadastral codes directly to municipality codes.
Usage
Get_AdmUnNames(Date = Sys.Date(), autoAbort = FALSE)
Arguments
Date |
Character. The date for which administrative unit codes are sought. Important: it must be in the format "yyyy-mm-dd". Current date by default. |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
An object of class tbl_df, tbl and data.frame, including: NUTS-3 code, NUTS-3 abbreviation,
LAU code, LAU name (description) and cadastral code. All variables are characters except for the NUTS-3 code.
Source
<https://situas.istat.it/web/#/territorio>
Examples
Get_AdmUnNames("2025-01-01", autoAbort = TRUE)
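Since Date must follow the "yyyy-mm-dd" format, it can be convenient to check a string before passing it on. The following is a minimal base R sketch; the helper name is_valid_date_string is hypothetical and is not part of SchoolDataIT, and the package may perform its own checks differently.

```r
# Hypothetical helper, not part of SchoolDataIT:
# returns TRUE only for a single string in the "yyyy-mm-dd" format.
is_valid_date_string <- function(x) {
  if (!is.character(x) || length(x) != 1L) return(FALSE)
  # Four digits, two digits, two digits, separated by hyphens
  if (!grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)) return(FALSE)
  # The string must also parse to an actual calendar date
  !is.na(as.Date(x, format = "%Y-%m-%d"))
}

is_valid_date_string("2025-01-01")
is_valid_date_string("01/01/2025")
is_valid_date_string("2025-13-01")
```

A string that passes this check can be supplied as the Date argument of Get_AdmUnNames.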
Download the data regarding the broadband connection activation in Italian schools
Description
Retrieves the data regarding the activation date of the ultra-broadband connection in schools and indicates whether the connection was activated or not at a certain date.
Usage
Get_BroadBand(
Date = as.Date(format(as.Date(format(Sys.Date(), "%Y-%m-01")) - 1, "%Y-%m-01")),
include_municipality_code = TRUE,
input_School2mun = NULL,
input_Registry = NULL,
input_AdmUnNames = NULL,
verbose = TRUE,
autoAbort = FALSE
)
Arguments
Date |
Object of class |
include_municipality_code |
Logical. Whether to include municipality codes.
|
input_School2mun |
Object of class |
input_Registry |
If |
input_AdmUnNames |
If |
verbose |
Logical. If |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Details
Ultra-broadband is defined as a permanent internet connection with a maximum speed of 1 gigabit per second and a minimum guaranteed speed of 100 megabits per second for both upload and download, up to the peering point, as declared on the data provider's website: <https://bandaultralarga.italia.it/scuole-voucher/progetto-scuole/>. The example shows the broadband availability at the beginning of school year 2022/23 (1 September 2022).
Value
An object of class tbl_df, tbl and data.frame.
The variables BB_Activation_date and BB_Activation_staus indicate
the activation date and activation status of the broadband connection at the selected date.
Source
Broadband dashboard: <https://bandaultralarga.italia.it/scuole-voucher/dashboard-scuole/>
Examples
Broadband_220901 <- Get_BroadBand(Date = as.Date("2022-09-01"), autoAbort = TRUE)
Broadband_220901
Broadband_220901[, c(9,6,13,14)]
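The default value of Date in the usage above resolves to the first day of the month preceding the current one. The base R sketch below unpacks that nested expression step by step; the helper name first_of_previous_month is hypothetical and only serves illustration.

```r
# Hypothetical helper mirroring the default `Date` expression of Get_BroadBand.
first_of_previous_month <- function(today = Sys.Date()) {
  # First day of the current month
  month_start <- as.Date(format(today, "%Y-%m-01"))
  # Stepping back one day lands in the previous month
  last_of_previous <- month_start - 1
  # Truncate again to the first day of that month
  as.Date(format(last_of_previous, "%Y-%m-01"))
}

first_of_previous_month(as.Date("2025-10-02"))  # 2025-09-01
first_of_previous_month(as.Date("2025-01-15"))  # 2024-12-01
```

The second call shows that the expression also handles the change of year.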
Download the database of Italian public school buildings
Description
This function downloads the School Buildings Open Database provided by the Italian Ministry of Education, University and Research.
It is one of the main sources of information regarding the infrastructure system of public schools in Italy. For a given year, all available data are downloaded (except for the structural units section, which has a different level of detail) and gathered into a single dataframe.
Usage
Get_DB_MIUR(
Year = 2023,
verbose = TRUE,
input_Registry = NULL,
input_AdmUnNames = NULL,
show_col_types = FALSE,
certifications = FALSE,
autoAbort = FALSE
)
Arguments
Year |
Numeric or character value. Reference school year (last available is 2023).
Available in the formats: |
verbose |
Logical. If |
input_Registry |
Object of class |
input_AdmUnNames |
Object of class |
show_col_types |
Logical. If |
certifications |
Logical. From year 2021/22 onwards, whether to include some safety certifications in the database.
Given the particular level of definition of this file, it requires extra computational time (other than the downloading time). |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Details
This function downloads the raw data; missing observations are not edited; all variables are characters.
Since certifications are defined at the level of structural units of the single buildings, here
the fields read as the percentage of structural units in a building having a given certificate.
To edit the output of this function and convert the relevant variables to numeric or Boolean, please see Util_DB_MIUR_num.
Schools other than primary, middle or high schools are classified as "NR". In the example, the data for school year 2022/23 are retrieved.
Value
An object of class tbl_df, tbl and data.frame.
Source
Examples
input_DB23_MIUR <- Get_DB_MIUR(2023, autoAbort = TRUE)
input_DB23_MIUR[-c(1,4,6,9)]
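As noted in the Details, certification fields read as the percentage of structural units in a building holding a given certificate. The toy base R sketch below illustrates that kind of summary; the data, the column names and the aggregation call are assumptions made for illustration and do not reproduce the package's internal code.

```r
# Toy data: one row per structural unit; all names and values are made up.
units <- data.frame(
  building_id     = c("B1", "B1", "B2", "B2", "B2", "B2"),
  has_certificate = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Percentage of structural units holding the certificate, by building
cert_share <- aggregate(
  has_certificate ~ building_id,
  data = units,
  FUN = function(x) 100 * mean(x)
)
names(cert_share)[2] <- "certificate_pct"
cert_share
```

Here building B1 has one certified unit out of two and B2 has three out of four.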
Download the classification of peripheral municipalities
Description
Retrieves the classification of Italian municipalities into six categories; classes D, E, and F are the so-called internal/inner areas; classes A, B and C are the central areas.
Usage
Get_InnerAreas(verbose = TRUE, autoAbort = FALSE)
Arguments
verbose |
Logical. Whether to keep track of computational time. |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Details
Classes are defined according to these criteria; see the methodological note (in Italian) for more detail:
A - Standalone pole municipalities, the highest degree of centrality; they are characterised by a thorough and self-sufficient combined endowment of school, health and transport infrastructure, i.e. there are at least a lyceum and a technical high school; a railway station of medium dimensions and a hospital provided with an emergency ward.
B - Intermunicipality poles; the endowment of such infrastructures is complete if a small set of contiguous municipalities is considered.
The remaining classes are defined in terms of the national distribution of the road distances from a municipality to the closest pole:
C - Belt municipalities, travel time below the median (< 27'42”).
D - Intermediate municipalities, travel time between the median and the third quartile (27'42” - 40'54”).
E - Peripheral municipalities, travel time between the third quartile and 97.5th percentile (40'54” - 1h 6' 54”).
F - Ultra-peripheral municipalities, travel time over the 97.5th percentile (>1h 6' 54”).
For more information regarding the dataset, it is possible to check the ISTAT methodological note (in Italian) available at <https://www.istat.it/it/files//2022/07/FOCUS-AREE-INTERNE-2021.pdf>
Value
An object of class tbl_df, tbl and data.frame.
Source
<https://www.istat.it/notizia/la-geografia-delle-aree-interne-nel-2020-vasti-territori-tra-potenzialita-e-debolezze/>
Examples
InnerAreas <- Get_InnerAreas(autoAbort = TRUE)
InnerAreas[, c(1,9,13)]
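The travel-time criteria for classes C to F listed in the Details can be sketched in base R as follows. The helper classify_by_travel_time is hypothetical, the treatment of values falling exactly on a threshold is an assumption, and classes A and B are not covered because they depend on infrastructure endowment rather than travel time.

```r
# Hypothetical helper: assigns classes C to F from travel time to the closest pole.
# Thresholds are the quantiles quoted in the Details, converted to minutes.
classify_by_travel_time <- function(minutes) {
  breaks <- c(-Inf, 27 + 42 / 60, 40 + 54 / 60, 66 + 54 / 60, Inf)
  # right = FALSE makes intervals closed on the left; this choice is an assumption
  as.character(cut(minutes, breaks = breaks,
                   labels = c("C", "D", "E", "F"), right = FALSE))
}

travel_minutes <- c(10, 30, 50, 80)   # made-up travel times
classes <- classify_by_travel_time(travel_minutes)
classes

# Classes D, E and F are the inner areas
is_inner <- classes %in% c("D", "E", "F")
is_inner
```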
Download the Invalsi census survey data
Description
Downloads the full database of the Invalsi scores, detailed either at the municipality or province level.
Usage
Get_Invalsi_IS(
level = "LAU",
verbose = TRUE,
show_col_types = FALSE,
multiple_out = TRUE,
autoAbort = FALSE,
category = FALSE
)
Arguments
level |
Character. The level of aggregation of Invalsi census data. Either |
verbose |
Logical. If |
show_col_types |
Logical. If |
multiple_out |
Logical. Whether to keep
multiple dataframes as outputs (thus overriding the |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
category |
Logical. Whether to focus on a specific category of students participating in the census survey. Warning: experimental. |
Details
Numeric variables provided are:
- Average_percentage_score: Average direct score (percentage of sufficient tests)
- Std_dev_percentage_score: Standard deviation of the direct score
- WLE_average_score: Average WLE score. The WLE score is calculated through the Rasch's psychometric model and is suitable for middle and high schools in that it is cleaned from the effect of cheating (which would affect both the average score and the score variability). By construction it has a mean around 200 points.
- Std_dev_WLE_score: Standard deviation of the WLE score. By construction it ranges around 40 points at the school level.
- Students_coverage: Students coverage percentage
Additional numeric variables, not always available for all observational units, are:
- Mean and SD of ESCS indicator
- First-Fifth_Level: Distribution of the proficiency level of students
- Targets_percentage: Percentage of students reaching targets
Numeric codes 888 and 999 denote not applicable and not available fields respectively.
If multiple_out == TRUE, provides the following datasets:
- Municipality_data: LAU-level data
- Province_data: NUTS-3-level data
- Region_data: NUTS-2-level data
- LLS_data: data at the level of local labour systems (Sistemi Locali del Lavoro; see ISTAT webpage for details)
- Inner_Areas_2021_data: aggregated data for inner areas according to the 2020 taxonomy
- Inner_Areas_2014_data: aggregated data for inner areas according to the former 2014 taxonomy
- Macroarea_data: data aggregated for North-West, North-East, Center, South and Islands
Value
Unless multiple_out == TRUE, an object of class tbl_df, tbl and data.frame.
Otherwise, a list including objects of the aforementioned classes.
Source
<https://serviziostatistico.invalsi.it/en/archivio-dati/?_sft_invalsi_ss_data_collective=open-data>
Examples
Get_Invalsi_IS(level = "NUTS-3", autoAbort = TRUE, verbose = FALSE)
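Because the numeric codes 888 and 999 stand for not applicable and not available fields, they should not enter summary statistics as if they were scores. A minimal base R sketch of one way to handle them follows; recode_special_codes is a hypothetical helper and the values are made up.

```r
# Hypothetical helper: turns the special codes 888 and 999 into NA.
recode_special_codes <- function(x, codes = c(888, 999)) {
  x[x %in% codes] <- NA
  x
}

scores <- c(61.5, 888, 70.5, 999)   # made-up values
clean_scores <- recode_special_codes(scores)
clean_scores
mean(clean_scores, na.rm = TRUE)
```

Whether the distinction between the two codes should be preserved depends on the analysis; this sketch merges them into NA.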
Download the registry of Italian public schools from the school registry section
Description
This function returns two main pieces of information regarding Italian schools, namely:
The denomination of the region, province and municipality to which the school belongs.
The mechanographical code of the reference institute of each school.
It is possible to access schools in all the national territory, including the autonomous provinces of Aosta, Trento and Bozen.
Usage
Get_Registry(
Year = 2023,
filename = c("SCUANAGRAFESTAT", "SCUANAAUTSTAT"),
show_col_types = FALSE,
autoAbort = FALSE
)
Arguments
Year |
Numeric or character. Reference school year (last available is 2024).
Available in the formats: |
filename |
Character. A string included in the name of the file to download, identifying the schools included.
By default it is For the registry of private schools, either in all the national territory except for the aforementioned provinces, and for these provinces, please use |
show_col_types |
Logical. If |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Details
Schools other than primary, middle or high schools are classified as "NR".
Value
An object of class tbl_df, tbl and data.frame.
Source
Examples
Get_Registry(2024, filename = "SCUANAGRAFESTAT", autoAbort = TRUE)
Associate a Municipality (LAU) code to each school
Description
This function associates the relevant municipality codes to all the schools listed in the two main registries provided by the Italian Ministry of Education, University and Research, namely:
The registry of school buildings, here referred to as Registry_from_buildings (see Get_DB_MIUR)
The official schools registry, here referred to as Registry_from_registry (see Get_Registry)
Usage
Get_School2mun(
Year = 2023,
show_col_types = FALSE,
verbose = TRUE,
input_AdmUnNames = NULL,
input_Registry = NULL,
autoAbort = FALSE
)
Arguments
Year |
Numeric or character value (last available is 2023).
Available in the formats: |
show_col_types |
Logical. If |
verbose |
Logical. If |
input_AdmUnNames |
Object of class |
input_Registry |
Object of class |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
An object of class list, including 4 elements:
- $Registry_from_buildings: Object of class tbl_df, tbl and data.frame: the schools listed in the buildings registry
- $Registry_from_registry: Object of class tbl_df, tbl and data.frame: the schools listed in the schools registry
- $Any: Object of class tbl_df, tbl and data.frame: schools listed anywhere
- $Both: Object of class tbl_df, tbl and data.frame: schools listed in both the sections
Source
Buildings registry (2021 onwards); Buildings registry (until 2019); Schools registry
Examples
Get_School2mun(Year = 2023, autoAbort = TRUE)
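The relationship between the $Any and $Both elements can be read as a union and an intersection of the two registries. The base R sketch below illustrates this reading on made-up school codes; it is an interpretation of the Value section, not the package's implementation.

```r
# Made-up school codes from the two registries
buildings_codes <- c("S001", "S002", "S003")
registry_codes  <- c("S002", "S003", "S004")

# Schools listed anywhere (analogous to $Any)
any_codes <- union(buildings_codes, registry_codes)
# Schools listed in both sections (analogous to $Both)
both_codes <- intersect(buildings_codes, registry_codes)

any_codes
both_codes
```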
Download the shapefiles of Italian NUTS-3 and LAU administrative units
Description
Downloads either the boundaries or the centroids of the relevant administrative units, either provinces or municipalities, from the ISTAT website. Geometries are expressed in meters.
Usage
Get_Shapefile(
Year,
level = "LAU",
lightShp = TRUE,
autoAbort = FALSE,
centroids = FALSE
)
Arguments
Year |
Numeric. Reference year for the administrative units. |
level |
Character. Either |
lightShp |
Logical. If |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
centroids |
Logical. Whether to switch from polygon geometry to point geometry. In the latter case, the point is located at the centroid of the relevant area. |
Value
A spatial data frame of class data.frame and sf.
Source
<https://www.istat.it/it/archivio/222527>
Examples
library(magrittr)
Prov23_shp <- Get_Shapefile(2023, lightShp = TRUE, level = "NUTS-3", autoAbort = TRUE)
ggplot2::ggplot() + ggplot2::geom_sf(data = Prov23_shp) +
ggplot2::ggtitle("Italian provinces in 2023/01/01")
Download students' number data
Description
This function downloads the data regarding the number of students from the open website of the Italian Ministry of Education, University and Research.
Usage
Get_nstud(
Year = 2023,
filename = c("ALUCORSOETASTA", "ALUCORSOINDCLASTA"),
verbose = TRUE,
show_col_types = FALSE,
autoAbort = FALSE
)
Arguments
Year |
Numeric or character. Reference school year (last available is 2023).
Available in the formats: |
filename |
Character. A string included in the name of the file to download.
By default it is Other file names are the following. The output is not currently supported by the remainder of the functions involving the number of students.
|
verbose |
Logical. If |
show_col_types |
Logical. If |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
By default, a list of two tbl_df, tbl and data.frame objects:
- $ALUCORSOETASTA: The number of students by school, school grade and age. It covers a higher number of schools than the other element
- $ALUCORSOINDCLASTA: The number of students and classes by school and school grade. This is a long-format dataframe.
Source
Examples
Get_nstud(2023, filename = "ALUCORSOINDCLASTA", autoAbort = TRUE)
Download the number of teachers in Italian schools by province
Description
This function downloads the number of teachers by province from the open website of the Italian Ministry of Education, University and Research.
Usage
Get_nteachers_prov(
Year = 2023,
verbose = TRUE,
show_col_types = FALSE,
filename = c("DOCTIT", "DOCSUP"),
autoAbort = FALSE
)
Arguments
Year |
Numeric or character value. Reference school year for the school registry data (last available is 2023).
Available in the formats: |
verbose |
Logical. If |
show_col_types |
Logical. If |
filename |
Character. Which data to retrieve among the province counts of teachers/school personnel.
By default it is
|
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Details
Please note that, by default, the function returns the count of tenured and temporary teachers.
If either the count of non-teaching personnel or the count of a single category of teaching personnel is needed, please adapt
the filename argument accordingly.
Value
An object of class tbl_df, tbl and data.frame.
Source
Examples
nteachers23 <- Get_nteachers_prov(2023, filename = "DOCTIT", autoAbort = TRUE)
nteachers23[, c(3,4,5)]
Aggregate the database of Italian public school buildings at the municipality and province level
Description
This function aggregates the output of the Util_DB_MIUR_num function (which is detailed at the level of single school buildings) to the municipality/LAU and province/NUTS-3 level.
It also allows the user to classify the degree of centrality of municipalities through the variable Inner_area.
Usage
Group_DB_MIUR(
data = NULL,
Year = 2023,
count_units = TRUE,
countname = "nbuildings",
count_missing = TRUE,
verbose = TRUE,
track_deleted = TRUE,
InnerAreas = TRUE,
ord_InnerAreas = FALSE,
input_InnerAreas = NULL,
autoAbort = FALSE,
...
)
Arguments
data |
Object of class |
Year |
Numeric or Character. The reference school year, if either |
count_units |
Logical. Whether the rows to aggregate at each level must be counted or not. True by default. |
countname |
Character. The name of the variable indicating the number of schools included in each municipality or province,
if the argument 'count' is |
count_missing |
Logical. Whether the function should return two dataframes including the percentage of NAs in the |
verbose |
Logical. If |
track_deleted |
Logical. If |
InnerAreas |
Logical. Whether an indicator of the percentage of schools belonging to peripheral (Inner) areas must be included or not. |
ord_InnerAreas |
Logical. Whether the Inner areas classification should be treated as an ordinal variable rather than as a binary one (see |
input_InnerAreas |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
... |
Additional arguments to the function |
Details
Numerical variables are summarised by the mean; Boolean variables are summarised by the mean as well, thus they become frequency indicators. Qualitative values, if included, are summarised by the mode. Summary measures do not include NAs. The output dataframes are also detailed at the school order level (i.e. Primary, Middle, High school, or different orders). This means that rows are unique combinations of territorial units and school order.
Value
An object of class list including:
- $Municipality_data: object of class tbl_df, tbl and data.frame, the output dataframe detailed at the municipality level; all variables besides the first 5 (which identify the record) are numeric
- $Province_data: object of class tbl_df, tbl and data.frame, the output dataframe detailed at the province level; all variables besides the first 3 (which identify the record) are numeric
- $Municipality_missing: (Only if count_missing == TRUE); object of class tbl_df, tbl and data.frame, the percentage of NAs in each variable at the municipality level.
- $Province_missing: (Only if count_missing == TRUE); object of class tbl_df, tbl and data.frame, the percentage of NAs in each variable at the province level.
- $deleted: character vector. The schools removed from the original dataframe for data quality reasons. This object is returned only if track_deleted == TRUE
Examples
library(magrittr)
DB23_MIUR <- example_input_DB23_MIUR %>% Util_DB_MIUR_num(verbose = FALSE) %>%
Group_DB_MIUR(InnerAreas = FALSE)
DB23_MIUR$Municipality_data[, -c(1,2,4)]
summary(DB23_MIUR$Municipality_data)
DB23_MIUR$Province_data[, -c(1,3)]
summary(DB23_MIUR$Province_data)
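The summary rules stated in the Details (mean for numeric and Boolean variables, mode for qualitative ones, NAs excluded) can be illustrated on toy data with base R. The column names, the values and the stat_mode helper below are hypothetical, and the package may implement the aggregation differently.

```r
# Most frequent non-missing value; ties go to the first level in sorted order
stat_mode <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0L) return(NA_character_)
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# Toy building-level data; all names and values are made up
buildings <- data.frame(
  municipality = c("M1", "M1", "M1", "M2", "M2"),
  floors       = c(2, 4, NA, 3, 1),
  school_bus   = c(TRUE, FALSE, TRUE, NA, TRUE),
  heating      = c("gas", "gas", "oil", "oil", NA),
  stringsAsFactors = FALSE
)

# One summary row per municipality
by_mun <- split(buildings, buildings$municipality)
summary_tab <- do.call(rbind, lapply(by_mun, function(d) {
  data.frame(
    municipality = d$municipality[1],
    floors       = mean(d$floors, na.rm = TRUE),      # numeric: mean
    school_bus   = mean(d$school_bus, na.rm = TRUE),  # Boolean: frequency
    heating      = stat_mode(d$heating),              # qualitative: mode
    nbuildings   = nrow(d),                           # count of aggregated rows
    stringsAsFactors = FALSE
  )
}))
summary_tab
```

The Boolean column becomes a share between 0 and 1, which is what makes it a frequency indicator after aggregation.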
Aggregate the students number data by class at the municipality and province level
Description
This function creates two dataframes with the number of students, classes and students per class, aggregated at the province and municipality level.
Usage
Group_nstud(
data = NULL,
Year = 2023,
check = TRUE,
verbose = TRUE,
check_registry = "Any",
InnerAreas = TRUE,
ord_InnerAreas = FALSE,
check_ggplot = FALSE,
missing_to_1 = FALSE,
input_Registry = NULL,
input_InnerAreas = NULL,
input_Prov_shp = NULL,
input_School2mun = NULL,
input_AdmUnNames = NULL,
autoAbort = FALSE,
...
)
Arguments
data |
Either an object of class |
Year |
Numeric or character value. The reference school year, if either of the |
check |
Logical. If |
verbose |
Logical. If |
check_registry |
Character. If |
InnerAreas |
Logical. If |
ord_InnerAreas |
Logical. If |
check_ggplot |
Logical. If |
missing_to_1 |
Logical. Only needed if |
input_Registry |
Object of class |
input_InnerAreas |
Object of class |
input_Prov_shp |
Object of class |
input_School2mun |
Object of class |
input_AdmUnNames |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
... |
Additional arguments to the function |
Details
Numerical variables are summarised by the mean; Boolean variables are summarised by the mean as well, thus they become frequency indicators. Qualitative values, if included, are summarised by the mode. Summary measures do not include NAs.
Value
An object of class list including:
- $Municipality_data: object of class tbl_df, tbl and data.frame, the output dataframe detailed at the municipality level
- $Province_data: object of class tbl_df, tbl and data.frame, the output dataframe detailed at the province level
Examples
Year <- 2023
nstud23_aggr <- Group_nstud(data = example_input_nstud23, Year = Year,
input_Registry = example_input_Registry23,
InnerAreas = FALSE,
input_School2mun = example_School2mun23)
summary(nstud23_aggr$Municipality_data[,c(46,47,48)])
summary(nstud23_aggr$Province_data[,c(44,45,46)])
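To give an idea of what a students-per-class indicator at the municipality level looks like, the toy base R sketch below divides total students by total classes within each municipality. This is only one plausible aggregation rule, chosen here as an assumption; the data and column names are made up and the package's exact computation is not reproduced.

```r
# Toy school-level data; all names and values are made up
school_counts <- data.frame(
  municipality = c("M1", "M1", "M2"),
  students     = c(40, 60, 45),
  classes      = c(2, 3, 3)
)

# Sum students and classes by municipality, then take the ratio
totals <- aggregate(cbind(students, classes) ~ municipality,
                    data = school_counts, FUN = sum)
totals$students_per_class <- totals$students / totals$classes
totals
```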
Arrange the number of teachers per student in public Italian schools at the province level
Description
This function provides the average number of teachers per student in Italian public schools at the province level.
Usage
Group_teachers4stud(
Year = 2023,
input_nteachers = NULL,
nteachers_filename = c("DOCTIT", "DOCSUP"),
verbose = TRUE,
input_nstud_raw = NULL,
input_nstud_aggr = NULL,
autoAbort = FALSE,
...
)
Arguments
Year |
Numeric or character value. Reference school year for the school registry data (last available is 2022).
Available in the formats: |
input_nteachers |
Object of class |
nteachers_filename |
Character. If |
verbose |
Logical. If |
input_nstud_raw |
Object of class 'list', including two objects of class |
input_nstud_aggr |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
... |
Arguments to |
Value
An object of class tbl_df, tbl and data.frame
Examples
input_nstud23 <- Get_nstud(2023, filename ="ALUCORSOINDCLASTA", autoAbort = TRUE)
Registry23 <- Get_Registry(2023, autoAbort = TRUE)
School2mun23 <- Get_School2mun(2023, input_Registry = Registry23, autoAbort = TRUE)
nstud23.aggr <- Group_nstud(Year = 2023, data = input_nstud23,
input_Registry = Registry23, input_School2mun = School2mun23,
autoAbort = TRUE)
input_nteachers23 <- Get_nteachers_prov(2023, autoAbort = TRUE)
teachers4stud <- Group_teachers4stud(Year = 2023,
input_nteachers = input_nteachers23,
input_nstud_aggr = nstud23.aggr, autoAbort = TRUE)
teachers4stud[, -c(1, 2, 10, 11)]
summary(teachers4stud)
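Conceptually, a teachers-per-student indicator combines province-level teacher counts with province-level student counts. The base R sketch below shows that join-and-divide logic on made-up figures; the column names and the inner join are assumptions and do not mirror the output structure of Group_teachers4stud.

```r
# Made-up province-level counts
teachers <- data.frame(province = c("BA", "MI"),
                       n_teachers = c(1200, 3000))
students <- data.frame(province = c("BA", "MI", "RM"),
                       n_students = c(12000, 36000, 50000))

# Inner join: provinces missing from either table are dropped
ratio <- merge(teachers, students, by = "province")
ratio$teachers_per_student <- ratio$n_teachers / ratio$n_students
ratio
```

Province "RM" is dropped here because no teacher count is available for it in the toy data.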
Map school data
Description
This function displays a map of the data arranged through the function Set_DB.
It supports two kinds of map:
Interactive map (default option), which allows the user to visualize all the data in scope through the interactive popup, and
Static map (ggplot), which can be easily exported as .pdf objects.
The user must select a variable to display.
It is possible to supply either a ready-made database obtained through the function Set_DB or the basic inputs to plug into that function, in addition to an input shapefile. Relevant arguments not provided by the user will be downloaded automatically, but not saved into the global environment. However, we suggest plugging in at least some inputs, as otherwise the running time may be long.
This function generalises the functionalities of the more data-specific functions Map_School_Buildings and Map_Invalsi.
Usage
Map_DB(
data = NULL,
Year = 2023,
field,
level = "LAU",
plot = "mapview",
popup_height = 200,
col_rev = FALSE,
pal = "viridis",
input_shp = NULL,
region_code = c(1:20),
main_pos = "top",
main = "",
order = NULL,
autoAbort = FALSE,
only_observed = FALSE,
...
)
Arguments
data |
Object of class |
Year |
Numeric or Character. The reference school year, needed if either |
field |
Character. The variable to display in the map. |
level |
Character. The administrative level of detail at which the target variable must be displayed. Either |
plot |
Character. The type of map to display; either |
popup_height |
Numeric. The height of the popup table in terms of pixels if the |
col_rev |
Logical. Whether the scale of the colour palette should be reversed or not. |
pal |
Character. The palette to use if the |
input_shp |
Object of class |
region_code |
Numeric. The NUTS-2 codes of the units that must be displayed.
If the level is set to |
main_pos |
Character. Where the header should be placed if the |
main |
Character. The title to display in the |
order |
Character. The educational level. Either |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
only_observed |
Logical. Whether to remove unobserved areas from the plot. |
... |
Additional arguments for the input database, if not provided; see |
Value
If plot == "mapview", an object of class mapview. Otherwise, if plot == "ggplot", an object of class gg and ggplot.
Examples
DB23 <- Set_DB(Year = 2023, level = "NUTS-3",
Invalsi_grade = c(10,13), NA_autoRM = TRUE,
input_Invalsi_IS = example_Invalsi23_prov, input_nstud = example_input_nstud23,
input_InnerAreas = example_InnerAreas,
input_School2mun = example_School2mun23,
input_AdmUnNames = example_AdmUnNames20220630,
nteachers = FALSE, BroadBand = FALSE, SchoolBuildings = FALSE)
Map_DB(DB23, field = "Students_per_class_13", input_shp = example_Prov22_shp, level = "NUTS-3",
col_rev = TRUE, plot = "ggplot")
Map_DB(DB23, field = "Inner_area", input_shp = example_Prov22_shp, order = "High",
level = "NUTS-3",col_rev = TRUE, plot = "ggplot")
Map_DB(DB23, field = "M_Mathematics_10", input_shp = example_Prov22_shp, level = "NUTS-3",
plot = "ggplot")
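The effect of the pal and col_rev arguments can be pictured as building a colour vector and optionally reversing it. The sketch below uses grDevices::hcl.colors (available from R 3.6.0) purely for illustration; make_palette is a hypothetical helper and the way Map_DB actually builds its palette is not documented here.

```r
# Hypothetical helper: builds n colours and reverses them when col_rev is TRUE.
make_palette <- function(n, pal = "viridis", col_rev = FALSE) {
  colours <- grDevices::hcl.colors(n, palette = pal)
  if (col_rev) rev(colours) else colours
}

default_colours  <- make_palette(5)
reversed_colours <- make_palette(5, col_rev = TRUE)

default_colours
reversed_colours
```

With col_rev = TRUE the lowest values of the mapped variable receive the colour that would otherwise mark the highest ones.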
Display a map of Invalsi scores
Description
This function displays either a static or interactive map of the Invalsi scores, either at the municipality or province level. It supports two kinds of map:
Interactive map (default option), which allows the user to visualize all the data in scope through the interactive popup, and
Static map (ggplot), which can be easily exported as .pdf objects.
Usage
Map_Invalsi(
data = NULL,
Year = 2023,
subj_toplot = "ITA",
grade = 8,
level = "LAU",
main = "",
main_pos = "top",
region_code = c(1:20),
plot = "mapview",
pal = "viridis",
WLE = FALSE,
col_rev = FALSE,
popup_height = 200,
only_observed = FALSE,
verbose = TRUE,
input_shp = NULL,
autoAbort = FALSE
)
Arguments
data |
Object of class |
Year |
Numeric or character value. Reference school year for the data (last available is 2022/23).
Available in the formats: |
subj_toplot |
Character. The school subject to display in the map, one among:
|
grade |
Numeric. The school grade to choose. Either |
level |
Character. The level of aggregation of Invalsi census data. Either |
main |
Character. A custom title for the map. If |
main_pos |
Character. Where the header should be placed if the |
region_code |
Numeric. The NUTS-2 codes of the units that must be displayed.
If the level is set to |
plot |
Character. The type of map to display; either |
pal |
Character. The palette to use if the |
WLE |
Logical. Whether the variable to choose should be the average WLE score rather than the percentage of sufficient tests, if both are available. |
col_rev |
Logical. Whether the scale of the colour palette should be reversed or not, if the |
popup_height |
Numeric. The height of the popup table in terms of pixels if the |
only_observed |
Logical. Whether to remove unobserved areas from the plot. |
verbose |
Logical. If |
input_shp |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
If plot == "mapview", an object of class mapview. Otherwise, if plot == "ggplot", an object of class gg and ggplot.
Examples
Map_Invalsi(subj = "Italian", grade = 13, level = "NUTS-3", Year = 2023, WLE = FALSE,
data = example_Invalsi23_prov, input_shp = example_Prov22_shp, plot = "ggplot")
Map_Invalsi(subj = "Italian", grade = 5, level = "NUTS-3", Year = 2023, WLE = TRUE,
data = example_Invalsi23_prov, input_shp = example_Prov22_shp, plot = "ggplot")
Display data from the school buildings database
Description
This function displays a map of the data downloaded through the Get_DB_MIUR function.
It supports two kinds of map:
Interactive map (default option), which allows the user to visualize all the data in scope through the interactive popup, and
Static map (ggplot), which can be easily exported as .pdf objects.
Usage
Map_School_Buildings(
data = NULL,
field,
order = NULL,
level = "LAU",
region_code = c(1:20),
plot = "mapview",
pal = "viridis",
col_rev = FALSE,
popup_height = 200,
main_pos = "top",
main = "",
only_observed = FALSE,
verbose = TRUE,
input_shp = NULL,
autoAbort = FALSE,
...
)
Arguments
data |
Object of class |
field |
Character. The variable to display in the map. |
order |
Character. The school order. Either |
level |
Character. The administrative level of detail at which the target variable must be displayed.
Either |
region_code |
Numeric. The NUTS-2 codes of the units that must be displayed.
If the level is set to |
plot |
Character. The type of map to display; either |
pal |
Character. The palette to use if the |
col_rev |
Logical. Whether the scale of the colour palette should be reverted or not, if the |
popup_height |
Numeric. The height of the popup table in terms of pixels if the |
main_pos |
Character. Where the header should be placed if the |
main |
Character. The custom title to display in the |
only_observed |
Logical. Whether to remove unobserved areas from the plot. |
verbose |
Logical. If |
input_shp |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
... |
If |
Value
If plot == "mapview", an object of class mapview. Otherwise, if plot == "ggplot", an object of class gg and ggplot.
Examples
library(magrittr)
DB23_MIUR <- example_input_DB23_MIUR %>%
Util_DB_MIUR_num(track_deleted = FALSE) %>%
Group_DB_MIUR(InnerAreas = FALSE, count_missing = FALSE)
DB23_MIUR %>% Map_School_Buildings(field = "School_bus",
  order = "Primary", level = "NUTS-3", plot = "ggplot",
  input_shp = example_Prov22_shp)
DB23_MIUR %>% Map_School_Buildings(field = "Railway_transport",
  order = "High", level = "NUTS-3", plot = "ggplot",
  input_shp = example_Prov22_shp)
DB23_MIUR %>% Map_School_Buildings(field = "Context_without_disturbances",
  order = "Middle", level = "NUTS-3", plot = "ggplot",
  input_shp = example_Prov22_shp, col_rev = TRUE)
Build up a comprehensive database regarding the school system
Description
This function generates a single dataframe of school system data from a custom choice of the available datasets. The user can aggregate the desired datasets, when available, among these:
Invalsi census survey
School buildings
Number of students and school classes
Number of teachers
Broadband connection availability
To save as much time as possible, ready-made input data can be plugged in; otherwise they are downloaded automatically but not saved in the global environment. When a new dataset is joined to the existing ones, some observations may be missing from it. In this case, by default, the user is left to choose between keeping as many observational units as possible and removing the units with missing variables.
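The trade-off between keeping every unit and dropping incomplete ones can be sketched with base R joins. The two toy dataframes and their column names below are hypothetical, and the code only illustrates the idea; it is not meant to reproduce the internal logic of Set_DB.

```r
# Toy inputs (hypothetical): province 3 has no Invalsi record.
buildings <- data.frame(Province_code = c(1, 2, 3),
                        School_bus = c(0.8, 0.5, 0.9))
invalsi <- data.frame(Province_code = c(1, 2),
                      M_ITA_5 = c(61.2, 58.7))

# Keep as many units as possible: full join, unmatched values become NA.
keep_all <- merge(buildings, invalsi, by = "Province_code", all = TRUE)

# Remove the units with missing variables: keep complete rows only.
complete_only <- keep_all[complete.cases(keep_all), ]

nrow(keep_all)
nrow(complete_only)
```

In this sketch the first option retains province 3 with a missing Invalsi score, while the second drops it.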
Usage
Set_DB(
Year = 2023,
level = "LAU",
conservative = TRUE,
Invalsi = TRUE,
SchoolBuildings = TRUE,
nstud = TRUE,
nteachers = TRUE,
BroadBand = TRUE,
verbose = TRUE,
show_col_types = FALSE,
Invalsi_subj = c("ELI", "ERE", "ITA", "MAT"),
Invalsi_grade = c(2, 5, 8, 10, 13),
Invalsi_WLE = FALSE,
SchoolBuildings_certifications = FALSE,
SchoolBuildings_include_numerics = TRUE,
SchoolBuildings_include_qualitatives = FALSE,
SchoolBuildings_row_cutout = FALSE,
SchoolBuildings_unique_buildings = TRUE,
SchoolBuildings_col_cut_thresh = 20000,
SchoolBuildings_flag_outliers = TRUE,
SchoolBuildings_count_missing = FALSE,
nstud_imputation_thresh = 19,
nstud_missing_to_1 = FALSE,
UB_nstud_byclass = 99,
LB_nstud_byclass = 1,
UB_nstud_byclass_grade = NULL,
LB_nstud_byclass_grade = NULL,
nstud_filter_by_grade = FALSE,
InnerAreas = TRUE,
ord_InnerAreas = FALSE,
nstud_check = TRUE,
nstud_check_registry = "Any",
BroadBand_impute_missing = TRUE,
Date = as.Date(paste0(substr(year.patternA(Year), 1, 4), "-09-01")),
NA_autoRM = NULL,
input_Invalsi_IS = NULL,
input_Registry = NULL,
input_SchoolBuildings = NULL,
input_nstud = NULL,
input_School2mun = NULL,
input_AdmUnNames = NULL,
input_InnerAreas = NULL,
input_teachers4student = NULL,
input_nteachers = NULL,
input_BroadBand = NULL,
autoAbort = FALSE
)
Arguments
Year |
Numeric or Character. The relevant school year. Available in the formats: |
level |
Character. The administrative level of detail at which data must be aggregated.
Either |
conservative |
Logical. If |
Invalsi |
Logical. Whether the Invalsi census data must be included (see |
SchoolBuildings |
Logical. Whether the school buildings dataset must be included (see |
nstud |
Logical. Whether the students number per class must be included (see |
nteachers |
Logical. Whether the number of teachers by province must be included (see |
BroadBand |
Logical. Whether the broadband availability in schools must be included (see |
verbose |
Logical. If |
show_col_types |
Logical. If |
Invalsi_subj |
Character. If |
Invalsi_grade |
Numeric. If |
Invalsi_WLE |
Logical. Whether to express Invalsi scores as average WLE score rather than the percentage of sufficient tests, if both are Invalsi_grade is either or |
SchoolBuildings_certifications |
Logical. If the school buildings database has to be downloaded, whether to include safety certifications. Only relevant from school year 2020/21 onwards (see |
SchoolBuildings_include_numerics |
Logical. Whether to include strictly numeric variables alongside Boolean ones in the school buildings database (see |
SchoolBuildings_include_qualitatives |
Logical. Whether to include qualitative variables alongside Boolean ones in the school buildings database (see |
SchoolBuildings_row_cutout |
Logical. Whether to filter out rows including missing fields in the school buildings database (see |
SchoolBuildings_unique_buildings |
Logical. If school buildings DB is included at the building level,
whether to remove records in which the building code is duplicated and all other fields are as well.
As rows are combinations of building ID and school ID, if a school is hosted by several buildings, and each field other than
|
SchoolBuildings_col_cut_thresh |
Numeric. The threshold of missing values allowed for each variable in the school buildings database (see |
SchoolBuildings_flag_outliers |
Logical. Whether to assign NA to outliers in numeric variables; see |
SchoolBuildings_count_missing |
Logical. Whether the function should return the percentage of NAs in the input school buildings database (see also |
nstud_imputation_thresh |
Numeric. If |
nstud_missing_to_1 |
Logical. If |
UB_nstud_byclass |
Numeric. Either a unique value for all school orders, or a vector of three order-specific values in the order: primary, middle, high.
If focus is on class size, the upper limit of the acceptable school-level (if |
LB_nstud_byclass |
Numeric. Either a unique value for all school orders, or a vector of three order-specific values in the order: primary, middle, high.
If focus is on class size, the lower limit of the acceptable school-level (if |
UB_nstud_byclass_grade |
Numeric. If |
LB_nstud_byclass_grade |
Numeric. If |
nstud_filter_by_grade |
Logical. If focus is on class size, whether to remove all school grades with average class size outside of the acceptance boundaries. |
InnerAreas |
Logical. Whether the percentage of schools belonging to inner/internal areas must be included (see |
ord_InnerAreas |
Logical. If |
nstud_check |
Logical. If |
nstud_check_registry |
Character. If |
BroadBand_impute_missing |
Whether the schools not included in the Broadband dataset must be considered in the total of schools (i.e. the denominator to the Broadband availability indicator). |
Date |
Character or Date. The threshold date to broadband activation to consider it activated for a school, i.e. the date before which the works of broadband activation must be finished in order to consider a school as provided with the broadband. By default, September 1st at the beginning of the school year. |
NA_autoRM |
Logical. Either |
input_Invalsi_IS |
Object of class |
input_Registry |
Object of class |
input_SchoolBuildings |
Object of class |
input_nstud |
Object of class |
input_School2mun |
Object of class |
input_AdmUnNames |
Object of class |
input_InnerAreas |
Object of class |
input_teachers4student |
Object of class |
input_nteachers |
Object of class |
input_BroadBand |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
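The default of the Date argument relies on the helper year.patternA, which is not documented in this section. A minimal sketch of the same idea follows, assuming that Year = 2023 denotes school year 2022/23 so that the school year begins in the previous calendar year. The function default_date, the activation dates and the inclusive comparison are all hypothetical choices made for illustration.

```r
# Assumption: Year = 2023 denotes school year 2022/23, which begins in 2022.
default_date <- function(Year) {
  as.Date(paste0(Year - 1, "-09-01"))
}

threshold <- default_date(2023)

# Made-up activation dates; NA means the works are not finished.
activation <- as.Date(c("2022-03-15", "2022-11-30", NA))

# A school counts as provided with broadband only if the works ended
# by the threshold date (inclusive comparison is an assumption).
provided <- !is.na(activation) & activation <= threshold

threshold
provided
```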
Value
An object of class tbl_df, tbl and data.frame
See Also
Util_DB_MIUR_num, Group_DB_MIUR, Group_nstud, Util_Check_nstud_availability, Get_School2mun
for similar arguments.
Examples
DB23_prov <- Set_DB(Year = 2023, level = "NUTS-3", Invalsi_grade = c(5, 8, 13),
  Invalsi_subj = "Italian", nteachers = FALSE, BroadBand = FALSE,
  SchoolBuildings_count_missing = FALSE, NA_autoRM = TRUE,
  input_SchoolBuildings = example_input_DB23_MIUR[, -c(11:18, 10:27)],
  input_Invalsi_IS = example_Invalsi23_prov,
  input_nstud = example_input_nstud23,
  input_InnerAreas = example_InnerAreas,
  input_School2mun = example_School2mun23,
  input_AdmUnNames = example_AdmUnNames20220630)
DB23_prov
summary(DB23_prov[, -c(22:62)])
Map schools included in the ultra-broadband plan to their LAU codes.
Description
Helper function to provide the ultra-broadband dataset obtained with Get_BroadBand
with the statistical codes of the relevant municipalities, obtained with Get_School2mun,
in case the ultra-broadband dataset has been downloaded with argument include_municipality_code = FALSE.
Usage
Util_BroadBand2mun(
data,
input_School2mun = NULL,
input_Registry = NULL,
input_AdmUnNames = NULL,
verbose = FALSE,
autoAbort = FALSE
)
Arguments
data |
Object of class |
input_School2mun |
Object of class |
input_Registry |
If |
input_AdmUnNames |
If |
verbose |
Logical. If |
autoAbort |
Logical. Whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Details
See Get_BroadBand.
Value
An object of class tbl_df, tbl and data.frame,
identical to the output of Get_BroadBand with an additional column for LAU codes
Source
Broadband dashboard: <https://bandaultralarga.italia.it/scuole-voucher/dashboard-scuole/> . ISTAT LAU codes: <https://situas.istat.it/web/#/territorio>
Check how many schools in the school registries are included in the students count dataframe
Description
This function checks for which schools listed in the two registries (the buildings registry and the schools registry proper)
the count of students is available. The first registry is referred to as Registry_from_buildings and the second one as Registry_from_registry.
Usage
Util_Check_nstud_availability(
data,
Year,
cutout = c("IC", "IS", "NR"),
verbose = TRUE,
ggplot = TRUE,
toplot_registry = "Any",
InnerAreas = TRUE,
ord_InnerAreas = FALSE,
input_Registry = NULL,
input_InnerAreas = NULL,
input_Prov_shp = NULL,
input_AdmUnNames = NULL,
input_School2mun = NULL,
autoAbort = FALSE
)
Arguments
data |
Object of class |
Year |
Numeric or character value. Reference school year.
Available in the formats: |
cutout |
Character. The types of schools not to be taken into account (because not relevant or because they are out of scope in the students number section). By default |
verbose |
Logical. If |
ggplot |
Logical. If |
toplot_registry |
Character. If the |
InnerAreas |
Logical. Whether it must be checked if municipalities belong to inner areas or not. |
ord_InnerAreas |
Logical. Whether the inner areas classification should be treated as an ordinal variable rather than as a categorical one (see |
input_Registry |
Object of class |
input_InnerAreas |
Object of class |
input_Prov_shp |
Object of class |
input_AdmUnNames |
Object of class |
input_School2mun |
Object of class |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
An object of class list including two elements:
- $Municipality_data
- $Province_data
Both elements are objects of class list including four elements:
- $Registry_from_buildings: object of class tbl_df, tbl and data.frame: the availability of the number of students in the schools listed in the buildings section.
- $Registry_from_registry: object of class tbl_df, tbl and data.frame: the availability of the number of students in the schools listed in the registry section.
- $Any: object of class tbl_df, tbl and data.frame: the availability of the number of students in the schools listed anywhere.
- $Both: object of class tbl_df, tbl and data.frame: the availability of the number of students in the schools listed in both sections.
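The four elements correspond to four sets of schools. The relationship between them can be sketched with base R set operations. The school codes below are made up, and the availability share is a simplified stand-in for the tables actually returned by the function.

```r
# Hypothetical school codes listed in each registry.
from_buildings <- c("SCH001", "SCH002", "SCH003")
from_registry  <- c("SCH002", "SCH003", "SCH004")

any_registry  <- union(from_buildings, from_registry)      # listed anywhere
both_registry <- intersect(from_buildings, from_registry)  # listed in both

# Hypothetical codes for which a students count exists.
with_count <- c("SCH002", "SCH004")

# Share of schools with an available students count, per set.
availability <- sapply(
  list(Registry_from_buildings = from_buildings,
       Registry_from_registry = from_registry,
       Any = any_registry,
       Both = both_registry),
  function(codes) mean(codes %in% with_count)
)
availability
```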
Source
Buildings Registry; Schools Registry
Examples
nstud23 <- Util_nstud_wide(example_input_nstud23, verbose = FALSE)
Util_Check_nstud_availability(nstud23, Year = 2023,
input_Registry = example_input_Registry23, InnerAreas = FALSE,
input_School2mun = example_School2mun23, input_Prov_shp = example_Prov22_shp)
Clean and convert the raw school buildings data to Boolean variables
Description
This function cleans the output of the Get_DB_MIUR function from missing values in two steps:
First, it deletes both the columns exceeding a threshold of missing values (1000 by default) and the columns that cannot be converted into Boolean variables
Then, it deletes the rows in which missing values remain
Finally, the remaining data are converted into Boolean variables. It is possible to keep track of the deleted rows.
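The steps above can be sketched in base R as follows. The toy dataframe, its column names and the "SI"/"NO" coding of the raw values are assumptions made for illustration, and the threshold is lowered to fit the toy data; the actual function may differ in its details.

```r
# Toy raw data (hypothetical column names and "SI"/"NO" coding).
raw <- data.frame(
  School_code  = c("A", "B", "C", "D"),
  School_bus   = c("SI", "NO", NA, "SI"),
  Bicycle_lane = c(NA, NA, NA, "NO"),
  stringsAsFactors = FALSE
)
col_cut_thresh <- 2  # toy threshold; the documented default is 10^3

# Step 1: drop the columns with more missing values than the threshold.
n_missing <- colSums(is.na(raw))
kept <- raw[, n_missing <= col_cut_thresh, drop = FALSE]

# Step 2: drop the rows in which missing values remain, tracking them.
incomplete <- !complete.cases(kept)
deleted <- kept$School_code[incomplete]
kept <- kept[!incomplete, , drop = FALSE]

# Step 3: convert the remaining variable to a 0/1 indicator.
kept$School_bus <- as.numeric(kept$School_bus == "SI")

kept
deleted
```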
Usage
Util_DB_MIUR_bool(
data = NULL,
cutout = NULL,
col_cut_thresh = 10^3,
verbose = TRUE,
track_deleted = TRUE,
autoAbort = autoAbort,
...
)
Arguments
data |
Object of class |
cutout |
Character. The columns to cut out. If |
col_cut_thresh |
Numeric. The threshold of missing values allowed for each variable.
If a variable has a higher number of missing observations, then it is cut out. |
verbose |
Logical. If |
track_deleted |
Logical. If |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
If track_deleted == TRUE, an object of class list including two objects:
- $data: object of class tbl_df, tbl and data.frame, the output dataframe. All variables besides the first 8 ones (which identify the record) are numeric.
- $deleted: character. The school codes corresponding to deleted rows.
If track_deleted == FALSE, the output is only the first element of the list.
Convert the raw school buildings data to numeric or Boolean variables
Description
This function transforms the output variables of the Get_DB_MIUR function into Boolean or numeric variables.
Additionally, it removes the columns with an excessive number of missing observations (20,000 by default), and if required it may also delete the rows including missing fields.
In this case, it is possible to keep track of the deleted rows.
Usage
Util_DB_MIUR_num(
data = NULL,
include_numerics = TRUE,
include_qualitatives = FALSE,
row_cutout = FALSE,
track_deleted = TRUE,
verbose = TRUE,
col_cut_thresh = 20000,
unique_buildings = TRUE,
flag_outliers = TRUE,
autoAbort = FALSE,
...
)
Arguments
data |
Object of class |
include_numerics |
Logical. Whether to include strictly numeric variables alongside Boolean ones. |
include_qualitatives |
Logical. Whether to include qualitative variables alongside Boolean ones. |
row_cutout |
Logical. Whether to filter out rows including missing fields. |
track_deleted |
Logical. If |
verbose |
Logical. If |
col_cut_thresh |
Numeric. The threshold of missing values allowed for each variable.
If a variable has a higher number of missing observations, then it is cut out. |
unique_buildings |
Logical. Whether to remove records in which the building code is duplicated and all other fields are as well.
As rows are combinations of building ID and school ID, if a school is hosted by several buildings, and each field other than
|
flag_outliers |
Logical. Whether to assign NA to outliers in numeric variables. |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
... |
Additional arguments to the function |
Details
The outliers to be set to NA if flag_outliers is active are defined as follows: school area or free area surface of less than 50 square meters,
building volume of less than 150 cubic meters, 0 floors in the building.
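These rules can be illustrated with a short base R sketch. The column names in the toy dataframe are hypothetical and may not match those of the actual database.

```r
# Toy data with hypothetical column names.
buildings <- data.frame(
  School_area_surface = c(1200, 30, 800),   # square meters
  Free_area_surface   = c(400, 500, 10),    # square meters
  Building_volume     = c(9000, 120, 5000), # cubic meters
  Floors_number       = c(2, 3, 0)
)

flagged <- buildings
# Surfaces of less than 50 square meters are set to NA.
flagged$School_area_surface[flagged$School_area_surface < 50] <- NA
flagged$Free_area_surface[flagged$Free_area_surface < 50] <- NA
# Volumes of less than 150 cubic meters are set to NA.
flagged$Building_volume[flagged$Building_volume < 150] <- NA
# Buildings reported with 0 floors are set to NA.
flagged$Floors_number[flagged$Floors_number == 0] <- NA

flagged
```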
Value
If track_deleted == TRUE, an object of class list including two objects:
- $data: object of class tbl_df, tbl and data.frame, the output dataframe.
- $deleted: object of class tbl_df, tbl and data.frame. The school IDs of the deleted units.
If track_deleted == FALSE, the output is only the first element of the list.
Examples
library(magrittr)
DB23_MIUR_num <- example_input_DB23_MIUR %>% Util_DB_MIUR_num(track_deleted = FALSE)
DB23_MIUR_num[, -c(1,4,6,8,9,10)]
summary(DB23_MIUR_num)
Filter the Invalsi data by subject, school grade and year.
Description
This function filters the database of Invalsi scores (see Get_Invalsi_IS) by school year, education grade and subject and returns a dataframe in wide format.
Each row corresponds to one territorial unit (either municipality or province); the numerical variables are three (the mean score, the score's standard deviation and the students coverage percentage) for each selected subject.
Usage
Util_Invalsi_filter(
data = NULL,
subj = c("ELI", "ERE", "ITA", "MAT"),
grade = 8,
level = "LAU",
WLE = FALSE,
Year = 2023,
verbose = TRUE,
autoAbort = FALSE
)
Arguments
data |
Object of class |
subj |
Character. The school subject(s) to include, among |
grade |
Numeric. The school grade to choose. Either |
level |
Character. The level of aggregation of Invalsi census data. Either |
WLE |
Logical. Whether the variable to choose should be the average WLE score rather than the percentage of sufficient tests, if both are available. |
Year |
Numeric or character value. Reference school year for the data (last available is 2022/23).
Available in the formats: |
verbose |
Logical. If |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
Value
An object of class tbl_df, tbl and data.frame. For all subjects and school grades, the variables indicate:
- M: the mean score, either WLE or percentage of sufficient tests
- S: the standard deviation of the score
- C: the students coverage percentage (expressed in the scale 1 - 100)
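The move from one row per unit and subject to one row per unit can be sketched with base R's reshape. The toy values and the column names (Mean, SD, Coverage, and the resulting Mean_ITA and so on) are hypothetical; the names produced by the actual function may follow a different convention.

```r
# Toy long-format scores: one row per province and subject.
long <- data.frame(
  Province_code = c(70, 70, 71, 71),
  Subject       = c("ITA", "MAT", "ITA", "MAT"),
  Mean          = c(60.1, 55.3, 62.4, 57.0),
  SD            = c(10.2, 11.5, 9.8, 12.1),
  Coverage      = c(95, 94, 97, 96)
)

# Wide format: one row per province, three variables per subject.
wide <- reshape(long, idvar = "Province_code", timevar = "Subject",
                direction = "wide", sep = "_")

names(wide)
nrow(wide)
```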
Examples
Util_Invalsi_filter(subj = c("Italian", "Mathematics"), grade = 5, level = "NUTS-3", Year = 2023,
WLE = FALSE, data = example_Invalsi23_prov)
Util_Invalsi_filter(subj = c("Italian", "Mathematics"), grade = 5, level = "NUTS-3", Year = 2023,
WLE = TRUE, data = example_Invalsi23_prov)
Invalsi23_high <- Util_Invalsi_filter(subj = "Italian", grade = c(10,13), level = "NUTS-3",
Year = 2023, data = example_Invalsi23_prov)
summary(Invalsi23_high)
Clean the raw dataframe of the number of students and arrange it in a wide format
Description
This function rearranges the output of the Get_nstud function so as to represent the
counts of students and, if required, either the number of students by class and number of classes, or
the counts of students per school timetable (running time) in a single observation per school.
If the focus is on class size, this function first cleans the data from the outliers in terms of
average number of students by class at the school level and imputes the number of classes to 1 when missing.
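The class-size cleaning can be sketched in base R as follows. The toy dataframe and its column names are hypothetical. Two details of the sketch are assumptions rather than documented behaviour: the imputation threshold is treated as inclusive, and schools whose average class size remains undefined are dropped.

```r
# Toy school-level counts (hypothetical column names).
schools <- data.frame(
  School_code = c("A", "B", "C", "D"),
  Students    = c(120, 15, 400, 30),
  Classes     = c(6, NA, 2, NA)
)
missing_to_1 <- TRUE
nstud_imputation_thresh <- 19
UB_nstud_byclass <- 99
LB_nstud_byclass <- 1

# Impute the number of classes to 1 when it is missing and the school is small.
impute <- missing_to_1 & is.na(schools$Classes) &
  schools$Students <= nstud_imputation_thresh
schools$Classes[impute] <- 1

# Average class size at the school level.
schools$Students_per_class <- schools$Students / schools$Classes

# Keep the schools whose average class size lies within the bounds.
in_bounds <- !is.na(schools$Students_per_class) &
  schools$Students_per_class >= LB_nstud_byclass &
  schools$Students_per_class <= UB_nstud_byclass
cleaned <- schools[in_bounds, ]

cleaned
```

Here school C is removed as an outlier (200 students per class) and school D because its number of classes is neither reported nor imputed.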
Usage
Util_nstud_wide(
data = NULL,
missing_to_1 = FALSE,
nstud_imputation_thresh = 19,
UB_nstud_byclass = 99,
LB_nstud_byclass = 1,
filter_by_grade = FALSE,
UB_nstud_byclass_grade = NULL,
LB_nstud_byclass_grade = NULL,
verbose = TRUE,
autoAbort = FALSE,
...
)
Arguments
data |
Object of class |
missing_to_1 |
Logical. If focus is on class size, whether the number of classes should be imputed to 1 when it is missing and the number of students is below a threshold (argument |
nstud_imputation_thresh |
Numeric. If focus is on class size, the minimum threshold below which the number of classes is imputed to 1 if missing, if |
UB_nstud_byclass |
Numeric. Either a unique value for all school orders, or a vector of three order-specific values in the order: primary, middle, high.
If focus is on class size, the upper limit of the acceptable school-level (if |
LB_nstud_byclass |
Numeric. Either a unique value for all school orders, or a vector of three order-specific values in the order: primary, middle, high.
If focus is on class size, the lower limit of the acceptable school-level (if |
filter_by_grade |
Logical. If focus is on class size, whether to remove all school grades with average class size outside of the acceptance boundaries. |
UB_nstud_byclass_grade |
Numeric. If |
LB_nstud_byclass_grade |
Numeric. If |
verbose |
Logical. If |
autoAbort |
Logical. In case any data must be retrieved, whether to automatically abort the operation and return NULL in case of missing internet connection or server response errors. |
... |
Arguments to |
Details
In the example, we compare the dataframe obtained with the default settings with the one obtained by imposing narrower inclusion criteria.
Value
An object of class tbl_df, tbl and data.frame
Examples
nstud.default <- Util_nstud_wide(example_input_nstud23)
nstud.narrow <- Util_nstud_wide(example_input_nstud23,
UB_nstud_byclass = 35, LB_nstud_byclass = 5 )
nrow(nstud.default)
nrow(nstud.narrow)
nstud.default
summary(nstud.default)
Subset of the administrative codes of municipalities
Description
This table includes the administrative codes of the municipalities from four regions: Molise, Campania, Apulia and Basilicata,
as of June 30th 2022; some strings in field Municipality_description including accents have been forced to ASCII.
The whole dataset can be retrieved with the command Get_AdmUnNames(Date = "2022-06-30")
Usage
example_AdmUnNames20220630
Format
## 'example_AdmUnNames20220630' A data frame with 1,074 rows and 5 columns:
- Province_code: Numeric; the NUTS-3 administrative code.
- Province_initials: Character; abbreviated NUTS-3 denomination.
- Municipality_code: Character; the ISTAT LAU (municipality) ID.
- Municipality_description: Character; the municipality name.
- Cadastral_code: Character; a LAU-level ID code, different from the official ISTAT municipality code. It is used in the school registry (see example_input_Registry23).
Source
<https://www.istat.it/it/archivio/6789>
See Also
Subset of the inner areas classification of municipalities
Description
This dataframe includes the classification of municipalities from four regions: Molise, Campania, Apulia and Basilicata.
Only the first 10 columns are included;
some strings in field Municipality_description including accents have been forced to ASCII.
The whole dataset can be retrieved with the command Get_InnerAreas().
For the definition of ISTAT inner areas class, see Get_InnerAreas
Usage
example_InnerAreas
Format
## 'example_InnerAreas' A data frame with 1074 rows and 10 columns:
- Municipality_code: Character; the ISTAT LAU (municipality) ID.
- Municipality_code_numeric: Numeric; the ISTAT LAU (municipality) ID in numeric format.
- Cadastral_code: Character; a LAU-level ID code, different from the official ISTAT municipality code.
- Region_code: Numeric; the region (NUTS-2 administrative level) ID.
- Region_description: Character; the region (NUTS-2 administrative level) name.
- Province_code: Numeric; the NUTS-3 administrative code.
- Province_initials: Character; abbreviated NUTS-3 denomination.
- Province_description: Character; the province (NUTS-3 administrative level) denomination.
- Municipality_description: Character; the municipality name.
- Inner_area_code_2014_2020: Character; the ISTAT inner areas classification between 2014 and 2020.
- Inner_area_description_2014_2020: Character; the description of the classes identified in the previous column.
- Inner_area_code_2021_2027: Character; the ISTAT inner areas classification between 2021 and 2027.
- Inner_area_description_2021_2027: Character; the description of the classes identified in the previous column.
- Destination_municipality_code: Character; for non-central municipalities (classes C, D, E, F), the ID of the closest pole municipality according to the 2021-2027 classification.
- Destination_municipality_code: Character; the denomination of the municipalities in the previous column.
- Destination_pole_code: Character; an internal ID convention for the destination poles; it includes a letter (the class of the destination pole, either A or B), a number of two digits (the region code of the destination pole) and the progressive number of poles within a region.
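The structure of Destination_pole_code can be illustrated by splitting a code into its three parts. The example codes below are made up, and the sketch assumes that the three parts are concatenated without separators, which the description above does not state explicitly.

```r
# Made-up pole codes: class letter, two-digit region code, progressive number.
pole_codes <- c("A1503", "B1601")

parsed <- data.frame(
  Pole_class  = substr(pole_codes, 1, 1),
  Region_code = as.numeric(substr(pole_codes, 2, 3)),
  Pole_number = as.numeric(substr(pole_codes, 4, nchar(pole_codes)))
)
parsed
```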
Source
<https://www.istat.it/it/archivio/273176>
See Also
Subset of the Invalsi scores in school year 2022/23
Description
This dataframe includes the Invalsi scores of the schools from four regions: Molise, Campania, Apulia and Basilicata, for the school year 2022/23.
The whole dataset can be retrieved with the command Get_Invalsi_IS(level = "NUTS-3")
Usage
example_Invalsi23_prov
Format
## 'example_Invalsi23_prov' A data frame with 240 rows and 11 columns:
- Year: Character; the school year.
- Grade: Numeric; the school grade; only includes the school grades subjected to the Invalsi survey. Either 2, 5, 8, 10 or 13.
- Subject: Character; the school subject in which the test is taken; either Italian, Mathematics, English reading or English listening.
- Province_code: Numeric; the NUTS-3 administrative code.
- Province_initials: Character; abbreviated NUTS-3 denomination.
- Province_description: Character; the province (NUTS-3 administrative level) denomination.
- Average_percentage_score: Numeric; the province-level percentage of sufficient tests, only for primary schools; ranges 0-100.
- Std_dev_percentage_score: Numeric; the standard deviation of the percentage of sufficient tests, only for primary schools.
- WLE_average_score: Numeric; the province-level average WLE (Weighted Likelihood Estimator) score.
- Std_dev_WLE_score: Numeric; the standard deviation of WLE scores.
- Students_coverage: Numeric; the percentage of students for which the Invalsi tests are reported.
Source
<https://serviziostatistico.invalsi.it/en/archivio-dati/?_sft_invalsi_ss_data_collective=open-data>
See Also
Subset of Italian provinces shapefile
Description
This is the shapefile for the provinces belonging to four regions: Molise, Campania, Apulia and Basilicata,
as of January 1st 2022. These are the latest administrative units boundaries relevant at the beginning of the school year 2022/23.
The whole shapefile can be retrieved with the command Get_Shapefile(Year = 2022, level = "NUTS-3")
Usage
example_Prov22_shp
Format
## 'example_Prov22_shp' A Spatial polygon data frame with 13 rows/polygons and 15 columns:
- COD_RIP: Numeric; the code for the macroarea (1 for Northwest, 2 for Northeast, 3 for Center, 4 for South and 5 for Isles).
- COD_REG: Numeric; the region (NUTS-2 administrative level) ID.
- COD_PROV: Numeric; the NUTS-3 administrative code.
- COD_CM: Numeric; the administrative code for Metropolitan Cities (which are always at the NUTS-3 level), obtained as 200 + NUTS-3 code, if the unit is a Metropolitan City; 0 otherwise.
- COD_UTS: Numeric; the administrative code for Metropolitan Cities if the unit is a Metropolitan City; the province code otherwise.
- DEN_PROV: Character; the province (NUTS-3 administrative level) name, if the unit is not a Metropolitan City; blank otherwise.
- DEN_CM: Character; the Metropolitan City (NUTS-3 administrative level) name, if the unit is a Metropolitan City; blank otherwise.
- DEN_UTS: Character; the province or Metropolitan City (NUTS-3 administrative level) name.
- SIGLA: Character; abbreviated NUTS-3 denomination.
- TIPO_UTS: Character; the NUTS-3 type of the unit; either "Provincia" (Province) or "Citta metropolitana" (Metropolitan City).
- Shape_Leng: Numeric; the polygon perimeter.
- Shape_Area: Numeric; the polygon area.
- geometry: the polygon geometry.
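The relationship between COD_PROV, COD_CM and COD_UTS described above can be checked with a small worked example. The two units and the helper column is_metropolitan are toy values introduced for illustration only.

```r
# Toy units: the first is a Metropolitan City, the second an ordinary province.
units <- data.frame(
  COD_PROV        = c(63, 76),
  is_metropolitan = c(TRUE, FALSE)
)

# COD_CM is 200 + NUTS-3 code for Metropolitan Cities, 0 otherwise.
units$COD_CM <- ifelse(units$is_metropolitan, 200 + units$COD_PROV, 0)

# COD_UTS is the Metropolitan City code when present, else the province code.
units$COD_UTS <- ifelse(units$is_metropolitan, units$COD_CM, units$COD_PROV)

units
```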
Source
<https://www.istat.it/it/archivio/222527>
See Also
Association of the municipality code to a subset of public schools 2022/23
Description
This list maps the IDs of the schools from four regions (Molise, Campania, Apulia and Basilicata) to the corresponding LAU codes.
The whole dataset can be retrieved with the command Get_School2mun(2023)
Usage
example_School2mun23
Format
## 'example_School2mun23' A list of four elements
- Registry_from_buildings: A data frame of 5527 rows and 5 columns, including the schools listed in the buildings registry.
- Registry_from_registry: A data frame of 5929 rows and 5 columns, including the schools listed in the schools registry.
- Any: A data frame of 5954 rows and 5 columns, including schools listed in any of the registries.
- Both: A data frame of 5510 rows and 5 columns, including schools listed in both registries.
For each element, rows correspond to school IDs; the columns are:
- School_code: Character; the school ID.
- Province_code: Numeric; the NUTS-3 administrative code.
- Province_initials: Character; abbreviated NUTS-3 denomination.
- Municipality_code: Character; the ISTAT LAU (municipality) ID.
- Municipality_description: Character; the municipality name.
Source
Buildings registry (2021 onwards); Buildings registry (until 2019); Schools registry
See Also
Subset of the school buildings database in school year 2022/23
Description
This dataframe includes the schools directly identifiable as primary, middle or high school, from four regions: Molise, Campania, Apulia and Basilicata.
Only the first 35 columns are included. Some strings including accents in fields Other_disturbances_proximity,
Other_specific_criticalities and Other have been forced to ASCII.
The whole dataset can be retrieved with the command Get_DB_MIUR(2023)
Usage
example_input_DB23_MIUR
Format
## 'example_input_DB23_MIUR' A data frame with 7479 rows and 35 columns:
-
YearNumeric; the school year. -
School_codeCharacter; the school ID. -
OrderCharacter; the school order, either primary, middle or high school. -
Reference_institute_codeCharacter; the ID of the reference institute. -
Building_codeCharacter; the building ID; the first 6 digits usually identify the municipality. -
Municipality_codeCharacter; the ISTAT LAU (municipality) ID. -
Municipality_descriptionCharacter; the municipality name. -
Province_initialsCharacter; abbreviated NUTS-3 denomination. -
Postal_codeCharacter; the ZIP code; slightly finer than municipality boundaries. for big municipalities. -
Context_without_disturbancesCharacter; whether the school belongs to an environment devoid of disturbances; otherwise, the types of disturbances are listed in columns 11 - 18. -
Dumps_proximityCharacter; whether the school is close to dumps (disturbance element). -
Pollutant_industries_proximityCharacter; whether the school is close to pollutant industries (disturbance element). -
Pollutant_waters_proximityCharacter; whether the school is close to pollutant or stagnant streams or ponds (disturbance element). -
Air_pollution_sourcer_proximityCharacter; whether the school is close to sources of air pollution (disturbance element). -
Acoustic_pollution_sourcer_proximityCharacter; whether the school is close to sources of acoustic pollution (disturbance element). -
Electromagnetic_radiation_sources_proximityCharacter; whether the school is close to sources of electromagnetic radiation (disturbance element). -
Graveyards_proximityCharacter; whether the school is close to a graveyard (disturbance element). -
Other_disturbances_proximityCharacter; other disturbance elements to which the school is close, other than those already listed. -
School_area_specific_criticalitiesCharacter; whether any specific criticality element occurs inside the school area; specified in columns 20 - 27. -
Layby absenceCharacter; whether the access to the area pertaining to the school building lacks a lay-by or pitch (school area criticality element). -
Unfenced areaCharacter; whether the school building area lacks fences or enclosures (school area criticality element). -
Large_trafficCharacter; whether the school area is close to large traffic streams (school area criticality element). -
Railway_trafficCharacter; whether the school area is close to railway traffic streams (school area criticality element). -
Abandoned_industriesCharacter; whether the school area is located on the site of pre-existing abandoned industries (school area criticality element). -
Decayed_urban_areaCharacter; whether the school belongs to or is close to a decayed urban area (school area criticality element). -
Risky_industries_proximityCharacter; whether the school is close to hazardous industrial areas (school area criticality element). -
Other_specific_criticalitiesCharacter; specific criticality elements regarding the school area, other than those already listed. -
School_busCharacter; whether the school is reached by school-bus service. -
Urban_public_transportCharacter; whether the school is served by an urban public transport station within 250 meters. -
Interurban_public_transportCharacter; whether the school is served by an inter-urban public transport station within 500 meters. -
Railway_transportCharacter; whether the school is within 500 meters of a train station. -
Private_transportCharacter; whether the school can be reached by private transport. -
Disabled_people_transportCharacter; whether the school is provided with transport specifically for disabled people. -
Bicycle_laneCharacter; whether the building is close to a bicycle lane. -
OtherCharacter; whether the building can be reached in any other specific way.
Source
Homepage; in more detail, the dataset blocks are downloaded from, respectively: cols 10-18; cols 20-27; cols 28-35
See Also
Subset of the school registry in school year 2022/23
Description
This dataframe includes the schools directly identifiable as primary, middle or high school, from four regions: Molise, Campania, Apulia and Basilicata.
Only the first 10 columns are included.
The whole dataset can be retrieved with the command Get_Registry(2023)
Usage
example_input_Registry23
Format
## 'example_input_Registry23' A data frame with 5929 rows and 10 columns:
-
YearNumeric; the school year. -
AreaCharacter; the macro-area of the municipality, i.e. North, Center or South. -
Region_descriptionCharacter; the region (NUTS-2 administrative level) name. -
Province_descriptionCharacter; the province (NUTS-3 administrative level) name. -
Reference_institute_codeCharacter; the ID of the reference institute. -
School_codeCharacter; the school ID. -
Cadastral_codeCharacter; a LAU-level ID code, different from the official LAU municipality code. The Italian Ministry of Education provides this code in place of the LAU code for both the schools registry and the early school buildings DBs. -
Municipality_descriptionCharacter; the municipality name. -
School_addressCharacter; the school physical address. -
Postal_codeCharacter; the ZIP code, slightly finer than municipality boundaries for big municipalities.
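As a minimal sketch of how the columns listed above might be used, the base R snippet below counts distinct schools per province. The data frame `toy_registry` and all its values (school codes, region and province labels) are invented for illustration: they only mimic the documented column names and are not taken from `example_input_Registry23`, whose actual value coding may differ.

```r
# Toy data mimicking a few of the documented columns.
# All values are invented (assumption), not taken from the package data.
toy_registry <- data.frame(
  Region_description   = c("Puglia", "Puglia", "Basilicata"),
  Province_description = c("Bari", "Bari", "Potenza"),
  School_code          = c("SCH001", "SCH002", "SCH003"),
  stringsAsFactors     = FALSE
)

# Count distinct school IDs within each region/province pair
n_schools <- aggregate(
  School_code ~ Region_description + Province_description,
  data = toy_registry,
  FUN  = function(x) length(unique(x))
)

# Give the count column a descriptive (hypothetical) name
names(n_schools)[3] <- "N_schools"

print(n_schools)
```

The same call should apply to the example dataset itself, provided the column names match those documented here.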
Source
See Also
Subset of the students and classes counts in school year 2022/23
Description
This dataframe includes student and class counts for the schools from four regions: Molise, Campania, Apulia and Basilicata.
The whole dataset can be retrieved with the command Get_nstud(2023, filename = "ALUCORSOINDCLASTA")
Usage
example_input_nstud23
Format
## 'example_input_nstud23' A data frame with 21208 rows and 7 columns:
-
YearNumeric; the school year. -
School_codeCharacter; the school ID. -
OrderCharacter; the school order, either primary, middle or high school. -
GradeNumeric; the school grade. -
ClassesNumeric; the count of classes of a given grade in each school. -
Male_studentsNumeric; the count of male students in all classes of a given educational grade in each school. -
Female_studentsNumeric; the count of female students in all classes of a given educational grade in each school.
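Since the counts above are recorded per school and grade, a common first step is to total them per school. The following base R sketch illustrates this under stated assumptions: `toy_nstud` and its values are invented and only reproduce a subset of the documented column names, and the derived columns `Students` and `Students_per_class` are hypothetical names introduced here.

```r
# Toy data mimicking a subset of the documented columns.
# All values are invented (assumption), not taken from the package data.
toy_nstud <- data.frame(
  School_code      = c("SCH001", "SCH001", "SCH002"),
  Grade            = c(1, 2, 1),
  Classes          = c(2, 1, 3),
  Male_students    = c(20, 11, 30),
  Female_students  = c(22, 9, 31),
  stringsAsFactors = FALSE
)

# Total students in each school/grade row
toy_nstud$Students <- toy_nstud$Male_students + toy_nstud$Female_students

# Sum classes and students over all grades within each school
per_school <- aggregate(
  cbind(Classes, Students) ~ School_code,
  data = toy_nstud,
  FUN  = sum
)

# Average number of students per class in each school
per_school$Students_per_class <- per_school$Students / per_school$Classes

print(per_school)
```

With real data, grouping by both `School_code` and `Order` may be preferable if a single school code can span more than one school order.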