Creating a Singularity Container to Run HuggingFace Transformers Models in R

Singularity is a container engine that offers an alternative to Docker. Singularity containers are well suited to the requirements of high-performance computing (HPC) workloads.

A container packages code together with all of its dependencies so that an application runs reliably across different computers or computing environments. Containers can be used to run code on servers or to support computational reproducibility (that is, ensuring that the code runs on other systems and continues to run in the future). For an introduction to the concept of containers, see Computational Reproducibility via Containers in Psychology. Below is code to build a Singularity container that sets up transformers language models from HuggingFace and runs the text package.

Code to build a Singularity container with HuggingFace models in R

Bootstrap: docker
From: ubuntu:20.04

%environment
  export LANG=C.UTF-8 LC_ALL=C.UTF-8
  export XDG_RUNTIME_DIR=/tmp/.run_$(uuidgen)

%post
    # Refresh the package index
    apt-get -y update

    export R_VERSION=4.2.2
    echo "export R_VERSION=${R_VERSION}" >> $SINGULARITY_ENVIRONMENT

     # Install R (the package index was already refreshed above)
     apt-get install -y --no-install-recommends software-properties-common dirmngr wget uuid-runtime
     wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | \
       tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
     add-apt-repository \
       "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
     apt-get install -y --no-install-recommends \
     r-base=${R_VERSION}* \
     r-base-core=${R_VERSION}* \
     r-base-dev=${R_VERSION}* \
     r-recommended=${R_VERSION}* \
     r-base-html=${R_VERSION}* \
     r-doc-html=${R_VERSION}* \
     libcurl4-openssl-dev \
     libharfbuzz-dev \
     libfribidi-dev \
     libgit2-dev \
     libxml2-dev \
     libfontconfig1-dev \
     libssl-dev \
     libfreetype6-dev \
     libpng-dev \
     libtiff5-dev \
     libjpeg-dev
     
     # Add a default CRAN mirror
     echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site

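     # Fix R package libpaths (helps RStudio Server find the right directories)
     mkdir -p /usr/lib64/R/etc
     echo "R_LIBS_USER='/usr/lib64/R/library'" >> /usr/lib64/R/etc/Renviron
     # Note: R_PACKAGE_DIR is not set anywhere in this file, so unless it is
     # exported before this point, the next line writes an empty R_LIBS_SITE.
     echo "R_LIBS_SITE='${R_PACKAGE_DIR}'" >> /usr/lib64/R/etc/Renviron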
     # Install python3
     apt-get -y install python3 wget

     # Clean up the apt cache and package lists (done last, because
     # apt-get install relies on the package lists)
     apt-get -y clean
     rm -rf /var/lib/apt/lists/*

     # Install Miniconda
     cd /
     wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
     bash Miniconda3-latest-Linux-x86_64.sh -b -p /miniconda

/bin/bash <<EOF
     rm Miniconda3-latest-Linux-x86_64.sh
     source /miniconda/etc/profile.d/conda.sh
     conda update -y conda
     # Install R packages: supporting packages first, then reticulate and text
     Rscript -e 'install.packages("pkgdown")'
     Rscript -e 'install.packages("ragg")'
     Rscript -e 'install.packages("textshaping")'
     Rscript -e 'install.packages("reticulate")'
     Rscript -e 'install.packages("devtools")'
     Rscript -e 'install.packages("glmnet")'
     Rscript -e 'install.packages("tidyverse")'
     # CRAN release of text (disabled); the GitHub version is installed instead
     # Rscript -e 'install.packages("text")'
     Rscript -e 'devtools::install_github("oscarkjell/text")'
     # Create the Conda environment in a system folder, with pinned Python packages
     Rscript -e 'text::textrpp_install(prompt = FALSE, rpp_version = c("torch==1.11.0", "transformers==4.19.2", "numpy", "nltk"))'
     # Initialize the environment and save the settings to the R profile
     Rscript -e 'text::textrpp_initialize(save_profile = TRUE, prompt = FALSE, textEmbed_test = TRUE)'
     # Run a test embedding with each model (this fetches the model files at build time)
     Rscript -e 'text::textEmbed("hello", model = "distilbert-base-uncased", layers = 5)'
     Rscript -e 'text::textEmbed("hello", model = "roberta-base", layers = 11)'
EOF
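
Once the definition has been saved to a file, the image can be built and checked roughly as follows. This is a minimal sketch and not part of the original instructions: the file names `text.def` and `text.sif` are placeholders, and `--fakeroot` assumes that your Singularity installation allows unprivileged builds (otherwise drop the flag and run the build with `sudo`). The script skips the build when Singularity or the definition file is not available.

```shell
#!/bin/sh
# Hypothetical file names (not from the original text); adjust as needed.
DEF_FILE="text.def"   # the definition file containing the code above
SIF_FILE="text.sif"   # the image file to create

BUILD_STATUS="skipped"

if command -v singularity >/dev/null 2>&1 && [ -f "$DEF_FILE" ]; then
    # Build the image. --fakeroot assumes unprivileged builds are enabled;
    # otherwise remove the flag and run this script with sudo.
    if singularity build --fakeroot "$SIF_FILE" "$DEF_FILE"; then
        BUILD_STATUS="built"
        # Check that R and the text package are available inside the image.
        singularity exec "$SIF_FILE" Rscript -e 'packageVersion("text")' \
            || echo "text package check failed"
    else
        BUILD_STATUS="failed"
    fi
fi

echo "Build status: $BUILD_STATUS"
```

Other R code can be run inside the image in the same way by changing the expression passed to `Rscript -e`. Expect the build to take a while, since the R packages and the Python environment are installed during `%post`.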
