01 Get started

1) Installation and Technical Requirements

Introduction

Several packages allow users to use machine learning directly in R such as nnet for single layer neural nets, rpart for decision trees, and ranger for random forests. Furthermore, with mlr3verse a series of packages exists for managing different algorithms with a unified interface.

These packages can be used with a ‘normal’ computer and provide an easy installation. In terms of natural language processing, these approaches are currently limited. State-of-the-art approaches rely on neural nets with multiple layers and consist of a huge number of parameters making them computationally demanding. With specialized libraries such as keras and tensorflow, graphical processing units (gpu) can help to speed up computations significantly. However, many of these specialized libraries for machine learning are written in python. Fortunately, an interface to python is provided via the R package reticulate.

The R package Artificial Intelligence for Education (aifeducation) aims to provide educators, educational researchers, and social researchers a convincing interface to these state-of-the-art models for natural language processing and tries to address the special needs and challenges of the educational and social sciences. The package currently supports the application of Artificial Intelligence (AI) for tasks such as text embedding, classification, and question answering.

Since state-of-the-art approaches in natural language processing rely on large models compared to classical statistical methods (e.g., latent class analysis, structural equation modeling) and are based largely on python, some additional installation steps are necessary.

If you would like to train and to develop your own models and AIs, a compatible graphic device is necessary. Even a low performing graphic device can speed up computations significantly. If you prefer using pre-trained models however, this is not necessary. In this case a ‘normal’ office computer without a graphic device should be sufficient in most cases.

Step 1 - Install the R Package

In order to use the package, you first need to install it. This can be done by:

# install.packages("devtools")
devtools::install_github("FBerding/aifeducation",
                         dependencies = TRUE)

With this command, all necessary R packages are installed on your machine.

Step 2 - Install Python

Since natural language processing with neural nets is based on models which are computationally intensive, keras and tensorflow are used within this package together with some other specialized python libraries. To install them, you need to install python on your machine first. This may take some time.

reticulate::install_python()

You can check if everything is working by using the function reticulate::py_available(). This should return TRUE.

reticulate::py_available(initialize = TRUE)

Step 3 - Install Miniconda

The next step is to install miniconda since aifeducation uses conda environments for managing the different modules.

reticulate::install_miniconda()

Step 4 - Install Support for Graphic Devices

If you would like to use a graphic card for computations you need to install some further software. A list with links to downloads can be found here: https://www.tensorflow.org/install/pip#linux

In general you need

Step 5 - Install Specialized Python Libraries

If everything is working, you can now install the remaining python libraries. For convenience, aifeducation comes with an auxiliary function install_py_modules() doing that for you.

install_py_modules(envname="aifeducation")

This function installs the following python modules:

and its dependencies in the environment “aifeducation”.

If you would like to use aifeducation with other packages or within other environments, please ensure that these python modules are available.

With check_aif_py_modules() you can check, if all modules are successfully installed.

aifeducation::check_aif_py_modules(print=TRUE)

Now everything is ready to use the package.

When you start a new R session, please note that you have to call reticulate::use_condaenv(condaenv = "aifeducation") to make the python modules available for work.

2) Configuration

In general, educators and educational researchers neither have access to high performance computing nor do they own computers with a performing graphic device for their work. Thus, some additional configuration can be done to get computations working on your machine.

If you do use a computer that does not own a graphic device, you can disable the graphic device support of tensorflow with the function set_config_cpu_only().

aifeducation::set_config_cpu_only()

Now your machine only uses cpus for computations.

If your machine has a graphic card with limited memory, it is recommended to change the configuration of the memory usage with set_config_gpu_low_memory()

aifeducation::set_config_gpu_low_memory()

This enables your machine to compute ‘large’ models with limited resources. For ‘small’ models, this option is not relevant since it decreases the computational speed.

Finally, in some cases you might want to disable tensorflow to print information on the console. You can change the behavior with the function set_config_tf_logger().

aifeducation::set_config_tf_logger()

You can choose between five levels “FATAL”, “ERROR”, “WARN”, “INFO”, and “DEBUG”, setting the minimal level for logging.