Manual Symbolic Regression: Testing Hypotheses

Introduction

In addition to automated symbolic regression, leaf allows users to define their own candidate equations using the "manual" engine. This enables direct testing of hypotheses and incorporation of prior knowledge, while still leveraging leaf’s tools for parameter fitting, evaluation, and multi-view modeling.

Installation

Before using leaf, ensure the Python backend is installed:

leaf::install_leaf()

Load package

library(leaf)
if (!backend_available()) {
  message("Install backend with leaf::install_leaf()")
}

Define the formula and custom equations

User-defined equations are specified as character strings. These can include:

x1, x2, … referring to inputs defined in the formula (by position)
Variable names directly, corresponding to column names in the dataframe
u1, u2, … for group-specific parameters
c1, c2, … for global parameters
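One way to picture the difference between the two parameter kinds is the base R sketch below. It is only an illustration and does not use leaf: the toy data frame, the parameter values, and the equation `u1 + c1 * x` are all made up, and leaf's internal handling of parameters may differ.

```r
# Toy data: two groups (all values invented for illustration)
d <- data.frame(
  group = c("a", "a", "b", "b"),
  x     = c(1, 2, 1, 2)
)

u1 <- c(a = 0.5, b = 2.0)  # group-specific: one value per group
c1 <- 3.0                  # global: a single value shared by every row

# Evaluate y = u1 + c1 * x, looking u1 up by each row's group
d$y_hat <- unname(u1[as.character(d$group)]) + c1 * d$x
print(d)
```

Rows in the same group share `u1`, while `c1` is identical across all rows.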

model_formula <- "y ~ f(log(A), T, T**2, A | Archipelago, species)"
eqs <- c(
  "T**2*(u1 + u2*log(A) + u3*T)",
  "x3*(u1 + u2*x1 + u3*x2)",  # same as above
  "exp(u1 + u2*log(T) + u3*A*x2)"  # names and x-aliases can be mixed, but a variable used by name (here A) must also appear in the formula
)
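The positional aliases can be checked by hand. The sketch below is a hand-rolled illustration in base R, not leaf's own parser: it pulls the inputs out of the formula string, numbers them x1, x2, … in order, and substitutes them into the second equation. The helper `expand_aliases()` is hypothetical and exists only for this example.

```r
model_formula <- "y ~ f(log(A), T, T**2, A | Archipelago, species)"

# Text between "f(" and the final ")"
inner <- sub("^.*~\\s*f\\((.*)\\)\\s*$", "\\1", model_formula, perl = TRUE)

# Inputs are the comma-separated terms to the left of "|"
parts  <- strsplit(inner, "|", fixed = TRUE)[[1]]
inputs <- trimws(strsplit(parts[1], ",", fixed = TRUE)[[1]])

# Positional aliases x1, x2, ... in the order the inputs appear
aliases <- setNames(inputs, paste0("x", seq_along(inputs)))
print(aliases)

# Hypothetical helper: replace each alias with the term it stands for
expand_aliases <- function(eq, aliases) {
  for (nm in names(aliases)) {
    # \\b keeps "x1" from matching inside other tokens such as "u1"
    eq <- gsub(paste0("\\b", nm, "\\b"), aliases[[nm]], eq, perl = TRUE)
  }
  eq
}

expanded <- expand_aliases("x3*(u1 + u2*x1 + u3*x2)", aliases)
print(expanded)
```

The expanded string is the same as the first equation in `eqs`, which is why the two are marked as equivalent.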

Define the manual search

regressor <- SymbolicRegressor$new(
  engine = "manual",
  loss = "PoissonDeviance",
  equation_list = eqs
)
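For orientation, the textbook Poisson deviance can be written in a few lines of base R. This is a sketch of the standard definition only; whether leaf's "PoissonDeviance" loss uses exactly this form or scaling is an assumption, and `poisson_deviance()` is a hypothetical helper, not a leaf function.

```r
# Standard Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))
poisson_deviance <- function(y, mu) {
  # y * log(y / mu) is taken as 0 when y == 0 (its limiting value)
  term <- ifelse(y == 0, 0, y * log(y / mu))
  2 * sum(term - (y - mu))
}

y <- c(0, 1, 3, 5)  # made-up counts

# Strictly positive predicted means give a finite, non-negative deviance
dev_ok <- poisson_deviance(y, mu = y + 0.5)

# A non-positive predicted mean makes log() undefined, so the result is not finite
dev_bad <- suppressWarnings(poisson_deviance(y, mu = c(1, 1, -2, 5)))

print(c(ok = dev_ok, bad = dev_bad))
```

Under this definition, an equation that can predict zero or negative values for count data does not get a finite loss, whereas a form such as `exp(...)` is always positive.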

Load the data

train_data <- leaf_data("GMD")
#> Warning in leaf_data("GMD"): Invalid data name. Run leaf_data() for a
#> full list of options.
head(train_data)
#> NULL
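In the run shown above, the dataset name was not recognised and train_data is NULL, which is why the later calls fail. A small guard can stop the script at this point instead. The helper below is hypothetical (not part of leaf) and uses base R only.

```r
# Hypothetical helper: stop early if the data did not load
check_training_data <- function(data) {
  if (is.null(data)) {
    stop("Training data is NULL; the dataset name was probably not recognised.")
  }
  if (!is.data.frame(data) || nrow(data) == 0) {
    stop("Training data must be a non-empty data frame.")
  }
  invisible(data)
}

# NULL stands in for a failed load, as in the output above
result <- tryCatch(
  check_training_data(NULL),
  error = function(e) conditionMessage(e)
)
print(result)
```

Calling `check_training_data(train_data)` right after loading would surface the problem before any equations are registered.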

Register equations

Even in manual mode, search_equations() is used to register and preprocess the equations. No search is performed.

regressor$search_equations(
  data = train_data,
  formula = model_formula
)
#> Error in `py_call_impl()`:
#> ! TypeError: object of type 'NoneType' has no len()
#> Run `reticulate::py_last_error()` for details.

Fit parameters and inspect results

# Only one equation gets a finite loss
fit_results <- regressor$fit(data = train_data)
#> Error in `py_call_impl()`:
#> ! RuntimeError: You must run equation_search() before fitting parameters.
#> Run `reticulate::py_last_error()` for details.
pareto_front <- regressor$evaluate(metrics = c("RMSE", "PseudoR2"))
#> Error in `py_call_impl()`:
#> ! RuntimeError: You must run equation_search() before scoring.
#> Run `reticulate::py_last_error()` for details.
head(pareto_front)
#> Error:
#> ! object 'pareto_front' not found
