1. Introduction

geeLite simplifies the process of building, managing, and updating local SQLite databases containing geospatial features extracted from Google Earth Engine (GEE). This vignette covers the installation, configuration, data collection, and analysis workflows using the geeLite package.

For more detailed information and updates, visit the geeLite GitHub repository (https://github.com/mtkurbucz/geeLite).

2. Installation

To install geeLite from GitHub, use the following commands:

# Install devtools if not installed
# install.packages("devtools")

# Install geeLite from GitHub
devtools::install_github("mtkurbucz/geeLite")

# Install Python dependencies and set up rgee
geeLite::gee_install()
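Before running the installation commands, it can help to check which packages are already present. The following is a minimal sketch in base R; the helper name is_installed() is hypothetical and not part of geeLite.

```r
# Hypothetical helper (not part of geeLite): TRUE if a package is installed.
# system.file() returns "" when the package cannot be found.
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

# Packages used by the installation commands in this section
needed <- c("devtools", "geeLite")

# Keep only the packages that still have to be installed
to_install <- needed[!vapply(needed, is_installed, logical(1))]
print(to_install)
```

If "devtools" appears in to_install, run install.packages("devtools") first; if "geeLite" appears, run the install_github() call shown above.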

3. Workflow

The workflow for setting up and using geeLite consists of four steps: configuration, data collection, configuration updates, and data analysis.

3.1 Configuration

The first step in using geeLite is setting up a configuration file. This file specifies the regions of interest, the datasets to collect from GEE, and other parameters for data collection.

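# Load the package
library(geeLite)

# Define the path for the SQLite database
# (path_to_db is a placeholder; replace it with a directory of your choice)
path <- path_to_db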

# Create a configuration for Somalia (SO) and Yemen (YE) to collect NDVI data
set_config(
  path = path,
  regions = c("SO", "YE"),
  source = list(
    "MODIS/061/MOD13A2" = list(
      "NDVI" = c("mean", "sd")
    )
  ),
  resol = 3,
  start = "2020-01-01"
)

This call writes a JSON configuration file under the specified path (config/config.json in the example output of Section 4.1) that defines the parameters for collecting data from GEE.
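The source argument is a nested list: dataset ID, then band, then the statistics to compute. The sketch below uses base R only and enumerates every dataset/band/statistic combination in such a list. The "dataset/band/statistic" naming pattern is an assumption inferred from the table name `MODIS/061/MOD13A2/NDVI/mean` used in Section 4.2, and source_spec is an illustrative variable name.

```r
# Nested specification: dataset -> band -> statistics (same shape as `source`)
source_spec <- list(
  "MODIS/061/MOD13A2" = list(
    "NDVI" = c("mean", "sd")
  )
)

# Enumerate all dataset/band/statistic combinations as slash-separated labels
combos <- unlist(lapply(names(source_spec), function(dataset) {
  unlist(lapply(names(source_spec[[dataset]]), function(band) {
    paste(dataset, band, source_spec[[dataset]][[band]], sep = "/")
  }))
}))

print(combos)
```

Listing the combinations this way is a quick check that the nesting is what you intended before passing the list to set_config().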

3.2 Data Collection

Once the configuration is set, you can collect data from GEE using the run_geelite() function. This function retrieves data based on the configuration file and stores it in a local SQLite database.

# Collect the data and store it in the SQLite database
run_geelite(path = path)

The function writes the collected data to the SQLite database (data/geelite.db in the example output of Section 4.1), updates a state file, and logs its progress.
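Because the result is an ordinary SQLite file, it can also be inspected with general-purpose database tools. The sketch below assumes the DBI and RSQLite packages are installed; list_db_tables() is a hypothetical helper, and the demonstration uses a temporary database with a made-up table rather than real geeLite output.

```r
# Hypothetical helper (not part of geeLite): list the tables in an SQLite file.
# Assumes the DBI and RSQLite packages are installed.
list_db_tables <- function(db_file) {
  con <- DBI::dbConnect(RSQLite::SQLite(), db_file)
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbListTables(con)
}

# Demonstration on a temporary database with one toy table
demo_file <- tempfile(fileext = ".db")
con <- DBI::dbConnect(RSQLite::SQLite(), demo_file)
DBI::dbWriteTable(con, "grid", data.frame(id = 1:2))
DBI::dbDisconnect(con)

print(list_db_tables(demo_file))
```

For a database built by run_geelite(), the call would presumably be list_db_tables(file.path(path, "data", "geelite.db")), based on the file location shown in the example output of Section 4.1.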

3.3 Modifying Configuration

If you want to modify the configuration (e.g., to add new statistics or datasets), you can use the modify_config() function to make updates without rebuilding the entire database.

# Add more statistics to the NDVI band and include EVI data
modify_config(
  path = path,
  keys = list(
    c("source", "MODIS/061/MOD13A2", "NDVI"),
    c("source", "MODIS/061/MOD13A2", "EVI")
  ),
  new_values = list(
    c("mean", "min", "max"),
    c("mean", "sd")
  )
)

After modifying the configuration, call run_geelite() again to update the database with the new settings.
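Each element of keys is a path into the nested configuration, and the element of new_values at the same position is what gets stored there. The base R sketch below illustrates that addressing scheme on a plain list. It is not how modify_config() is implemented; set_in() and the cfg list are hypothetical stand-ins.

```r
# Hypothetical helper: set a value at a key path inside a nested list
set_in <- function(x, key, value) {
  if (is.null(x)) x <- list()          # create missing intermediate levels
  if (length(key) == 1) {
    x[[key]] <- value                  # last key: store the value
  } else {
    x[[key[1]]] <- set_in(x[[key[1]]], key[-1], value)  # descend one level
  }
  x
}

# Stand-in for the relevant part of the configuration
cfg <- list(source = list("MODIS/061/MOD13A2" = list(NDVI = c("mean", "sd"))))

keys <- list(
  c("source", "MODIS/061/MOD13A2", "NDVI"),
  c("source", "MODIS/061/MOD13A2", "EVI")
)
new_values <- list(
  c("mean", "min", "max"),
  c("mean", "sd")
)

# Apply each key path with its matching value
for (i in seq_along(keys)) {
  cfg <- set_in(cfg, keys[[i]], new_values[[i]])
}

str(cfg$source)
```

The first key path overwrites an existing entry (NDVI), while the second creates a new one (EVI) under the same dataset.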

3.4 Reading and Analyzing Data

Once the data has been collected and stored in the database, you can read and analyze it using the read_db() function. This function allows you to aggregate data at different frequencies (e.g., daily, monthly) and apply preprocessing functions.

# Read the data from the database, aggregate to monthly frequency by default
db <- read_db(path = path)

# Read the data with custom aggregation functions
db <- read_db(
  path = path,
  aggr_funs = list(
    function(x) mean(x, na.rm = TRUE),
    function(x) sd(x, na.rm = TRUE)
  )
)
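To make the idea of frequency aggregation concrete, the sketch below aggregates a toy series to monthly values in base R, using the same two aggregation functions. It only illustrates the concept, makes no claim about how read_db() works internally, and uses made-up values rather than real NDVI data.

```r
# Toy observations at 16-day spacing (values are invented for illustration)
dates  <- as.Date("2020-01-01") + c(0, 16, 32, 48)
values <- c(0.2, 0.4, 0.3, NA)

# Month label ("YYYY-MM") for each observation
month <- format(dates, "%Y-%m")

# Aggregation functions that ignore missing values
aggr_funs <- list(
  mean = function(x) mean(x, na.rm = TRUE),
  sd   = function(x) sd(x, na.rm = TRUE)
)

# One row per month, one column per aggregation function
monthly <- data.frame(
  month = sort(unique(month)),
  mean  = as.vector(tapply(values, month, aggr_funs$mean)),
  sd    = as.vector(tapply(values, month, aggr_funs$sd))
)

print(monthly)
```

Note that the standard deviation is NA for a month with a single non-missing observation, which is worth keeping in mind when choosing aggregation functions for sparse series.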

4. Usage Example

This example demonstrates how to use geeLite to gather NDVI data for Somalia and Yemen, aggregate it monthly, and visualize the results using the leaflet package.

4.1 Collecting the Data

# Define the path for the SQLite database
# (path_to_db is a placeholder; replace it with a directory of your choice)
path <- path_to_db

# Set the configuration file for NDVI data collection
set_config(
  path = path,
  regions = c("SO", "YE"),
  source = list(
    "MODIS/061/MOD13A2" = list(
      "NDVI" = c("mean", "sd")
    )
  ),
  resol = 3,
  start = "2020-01-01"
)
#> ℹ Config file generated: 'config/config.json'.

# Collect the data
run_geelite(path = path)
#> 
#> ────────────────────────────────────────────────────────────────────────────────
#> geeLite R Package - Version: 1.0.2 
#> 
#> ────────────────────────────────────────────────────────────────────────────────
#> 
#> ── rgee 1.1.7 ─────────────────────────────────────── earthengine-api 0.1.370 ── 
#> ✔ User: not defined 
#> ✔ Initializing Google Earth Engine: DONE! 
#> ✔ Earth Engine account: users/testgeelite 
#> ✔ Python path: C:/Users/Marcell/AppData/Local/r-miniconda/envs/rgee/python.exe 
#> ──────────────────────────────────────────────────────────────────────────────── 
#> 
#> > Extracting data from Earth Engine...
#> ℹ Database successfully updated: 'data/geelite.db'.
#> ℹ State file updated: 'state/state.json'.
#> ℹ CLI scripts updated: 'cli/R functions'.

# Read the data from the database, aggregated to monthly frequency
db <- read_db(path = path, freq = "month")

4.2 Visualizing the Data

Once the data is collected, you can visualize it using the leaflet package. The following code plots the mean NDVI value of each grid cell for a selected date.

# Load necessary packages
library(leaflet)
#> Warning: package 'leaflet' was built under R version 4.3.3
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE

# Merge the grid with the mean NDVI table by cell id
sf <- merge(db$grid, db$`MODIS/061/MOD13A2/NDVI/mean`, by = "id")

# Select the date to visualize
ndvi <- sf$`2020-01-01`

# Create a color palette function based on the values
color_pal <- colorNumeric(palette = "viridis", domain = ndvi)

# Create the leaflet map
leaflet(data = sf) %>%
  addTiles() %>%                            # Add base tiles
  addPolygons(
    fillColor = color_pal(ndvi),            # Fill color
    color = "#BDBDC3",                      # Border color
    weight = 1,                             # Border weight
    opacity = 1,                            # Border opacity
    fillOpacity = 0.9                       # Fill opacity
  ) %>%
  addScaleBar(position = "bottomleft") %>%  # Add scale bar
  addLegend(
    pal = color_pal,                        # Color palette
    values = ndvi,                          # Data values to map
    title = "Mean NDVI",                    # Legend title
    position = "bottomright"                # Legend position
  )
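The map code selects a date by column name, which suggests the merged table holds one column per date. Under that assumption, the sketch below picks the most recent date column programmatically instead of hard-coding it. The toy data frame and its values are invented for illustration.

```r
# Toy stand-in for the merged table: an id column plus one column per date
toy <- data.frame(
  id = c("a", "b"),
  `2020-01-01` = c(0.2, 0.5),
  `2020-02-01` = c(0.3, 0.6),
  check.names = FALSE
)

# Column names that look like ISO dates (YYYY-MM-DD)
date_cols <- grep("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", names(toy), value = TRUE)

# Most recent date, then the values stored in that column
latest <- max(as.Date(date_cols))
ndvi_latest <- toy[[format(latest)]]

print(latest)
print(ndvi_latest)
```

The resulting vector can be used in place of the hard-coded date column when building the color palette and the polygons.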

5. Extras

Additional features of the geeLite package include a drive mode for efficiently handling large data requests, as well as command-line interface (CLI) support for automation and integration with job scheduling systems such as cron.

5.1 Drive Mode

To efficiently handle large data requests, drive mode exports data in parallel batches to Google Drive before importing it into your local SQLite database. Ensure that adequate Google Drive storage is available before using drive mode.

# Collect and store data using drive mode
run_geelite(path = path, mode = "drive")

5.2 Command-Line Interface (CLI)

geeLite provides a CLI that allows you to run the main functions of the package directly from the command line. This is useful for automating workflows or integrating geeLite into larger systems.

# Set up the CLI files
Rscript /path/to/geeLite/cli/set_cli.R --path "path/to/db"

# Change directory to where the database will be generated
cd "path/to/db"

# Set up the configuration via CLI
Rscript cli/set_config.R --regions "SO YE" --source "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))" --resol 3 --start "2020-01-01"

# Collecting GEE data via CLI
Rscript cli/run_geelite.R

# Modifying the configuration via CLI
Rscript cli/modify_config.R --keys "list(c('source', 'MODIS/061/MOD13A2', 'NDVI'), c('source', 'MODIS/061/MOD13A2', 'EVI'))" --new_values "list(c('mean', 'min', 'max'), c('mean', 'sd'))"

# Updating the database via CLI
Rscript cli/run_geelite.R
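The --source, --keys, and --new_values arguments above are R expressions passed as quoted strings, so a quoting mistake is easy to make. The sketch below checks in an R session that such a string parses to the intended structure before it is placed in a shell command. Whether the CLI scripts parse their arguments in exactly this way is an assumption.

```r
# The string passed to --source in the example above
source_arg <- "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))"

# Parse and evaluate it to confirm it is valid R and has the expected shape
source_spec <- eval(parse(text = source_arg))
str(source_spec)

# The string passed to --regions, split on spaces (assumed separator)
regions <- strsplit("SO YE", " ", fixed = TRUE)[[1]]
print(regions)
```

Only evaluate strings you wrote yourself this way, since eval(parse()) executes arbitrary R code.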

5.3 Automating Data Collection with Cron

You can automate database updates using the Linux cron job scheduler. Here’s an example cron job that updates the database monthly:

# Open the cron jobs file
crontab -e

# Add the following line to schedule monthly updates (runs at midnight on the 1st of every month)
0 0 1 * * Rscript /path/to/db/cli/run_geelite.R
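A crontab entry consists of five schedule fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below splits the entry above into those parts so that the schedule is easier to read; cron_fields() is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: split a crontab entry into schedule fields and command
cron_fields <- function(entry) {
  parts <- strsplit(trimws(entry), "[[:space:]]+")[[1]]
  stopifnot(length(parts) >= 6)  # five schedule fields plus a command
  list(
    schedule = setNames(
      parts[1:5],
      c("minute", "hour", "day_of_month", "month", "day_of_week")
    ),
    command = paste(parts[-(1:5)], collapse = " ")
  )
}

entry <- cron_fields("0 0 1 * * Rscript /path/to/db/cli/run_geelite.R")
print(entry$schedule)
print(entry$command)
```

Here minute 0, hour 0, and day_of_month 1 with wildcards in the remaining fields mean 00:00 on the first day of every month.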
