The goal of ucimlrepo is to download and import data
sets directly into R from the UCI
Machine Learning Repository.
[!IMPORTANT]
This package is an unofficial port of the Python
ucimlrepo package.
[!NOTE]
Want datasets that come with a help documentation entry?
Check out the
{ucidata} R package! It provides a small selection of data sets from the UC Irvine Machine Learning Repository alongside help entries.
You can install the development version of ucimlrepo from GitHub with:
# install.packages("remotes")
remotes::install_github("coatless-rpkg/ucimlrepo")

To use ucimlrepo, load the package using:
library(ucimlrepo)

With the package now loaded, we can download a dataset using the
fetch_ucirepo() function or use the
list_available_datasets() function to view a list of
available datasets.

For example, to download the iris dataset, we can
use:
# Fetch a dataset by name
iris_by_name <- fetch_ucirepo(name = "iris")
names(iris_by_name)
#> [1] "data"      "metadata"  "variables"

There are many levels to the data returned. For example, we can
extract the original data frame containing the iris dataset
using:
iris_uci <- iris_by_name$data$original
head(iris_uci)
#>   sepal length sepal width petal length petal width       class
#> 1          5.1         3.5          1.4         0.2 Iris-setosa
#> 2          4.9         3.0          1.4         0.2 Iris-setosa
#> 3          4.7         3.2          1.3         0.2 Iris-setosa
#> 4          4.6         3.1          1.5         0.2 Iris-setosa
#> 5          5.0         3.6          1.4         0.2 Iris-setosa
#> 6          5.4         3.9          1.7         0.4 Iris-setosa

Alternatively, we could retrieve two data frames, one for the features and one for the targets:
iris_features <- iris_by_name$data$features
iris_targets <- iris_by_name$data$targets

We can then view the first few rows of each data frame:
head(iris_features)
#>   sepal length sepal width petal length petal width
#> 1          5.1         3.5          1.4         0.2
#> 2          4.9         3.0          1.4         0.2
#> 3          4.7         3.2          1.3         0.2
#> 4          4.6         3.1          1.5         0.2
#> 5          5.0         3.6          1.4         0.2
#> 6          5.4         3.9          1.7         0.4

head(iris_targets)
#>         class
#> 1 Iris-setosa
#> 2 Iris-setosa
#> 3 Iris-setosa
#> 4 Iris-setosa
#> 5 Iris-setosa
#> 6 Iris-setosa

Alternatively, you can query a dataset directly by its ID, which you can find
using list_available_datasets() or by looking up the
dataset on the UCI ML Repo website:
# Fetch a dataset by id
iris_by_id <- fetch_ucirepo(id = 53)

We can also view a list of data sets available for download using the
list_available_datasets() function:
# List available datasets
list_available_datasets()

[!NOTE]
Not all 600+ datasets on UCI ML Repo are available for download using the package. The current list of available datasets can be viewed here.
If you would like to see a specific dataset added, please submit a comment on an issue ticket in the upstream repository.
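
Returning to the features and targets split shown earlier: if you later need both pieces in a single data frame again, they can be joined column-wise with base R. The sketch below is illustrative only. The helper combine_features_targets() and the object iris_stub are hypothetical names, not part of ucimlrepo, and the stub is an offline stand-in that only mimics the data$features and data$targets layout using the first two iris rows printed above.

```r
# Hypothetical helper (not part of ucimlrepo): join the feature and target
# data frames of a fetched dataset back into one data frame.
combine_features_targets <- function(dataset) {
  features <- dataset$data$features
  targets <- dataset$data$targets
  # Both pieces must describe the same observations, row for row
  stopifnot(nrow(features) == nrow(targets))
  # cbind() on data frames keeps column names such as "sepal length" intact
  cbind(features, targets)
}

# Offline stand-in with the same list layout as the fetched object,
# filled with the first two iris rows printed earlier in this README.
iris_stub <- list(
  data = list(
    features = data.frame(
      "sepal length" = c(5.1, 4.9),
      "sepal width" = c(3.5, 3.0),
      "petal length" = c(1.4, 1.4),
      "petal width" = c(0.2, 0.2),
      check.names = FALSE
    ),
    targets = data.frame(class = c("Iris-setosa", "Iris-setosa"))
  )
)

iris_combined <- combine_features_targets(iris_stub)
names(iris_combined)
```

With a real download, the same helper could be called as combine_features_targets(iris_by_name), assuming the fetched object exposes data$features and data$targets as shown above.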