FeatureFinder is designed to give comprehensive and accurate sets of features that can be used in modelling, either to build a new model or to enhance and diagnose an existing model. Both methods are available through a single function, FindFeatures. The following sections give an example of each method.
When building a new model, a typical modelling scenario involves a table consisting of a set of predictors \(\{x_i\}\) and a model target \(y\).
When enhancing or diagnosing an existing model, a typical modelling scenario involves a table consisting of a set of predictors \(\{x_i\}\) and a model residual \(r = y - p\), where \(p\) is a prediction from a previously fitted model and \(y\) is the model target.
For a model target, we simply define \(r=y\), and for a model residual we use the residual \(r = y - p\). We supply a single table consisting of all predictors together with \(r\) and call the FindFeatures function.
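As a minimal sketch of how the two input tables could be prepared, the following uses the built-in mtcars data, with a simple linear model standing in for the previously fitted model. The data set, the choice of lm, and the names tab_new and tab_resid are assumptions made for illustration; they are not part of FeatureFinder.

```r
# Illustrative data: mtcars ships with base R (datasets package).
dat <- mtcars
dat$cyl <- factor(dat$cyl)   # a factor-valued column that could define partitions

# Assumption: a simple linear model stands in for the previously fitted model.
fit <- lm(mpg ~ wt + hp, data = dat)
p <- as.numeric(predict(fit, newdata = dat))   # prediction p
y <- dat$mpg                                   # model target y

predictors <- setdiff(names(dat), "mpg")

# Case 1: building a new model, so r is the target itself.
tab_new <- dat[, predictors]
tab_new$r <- y

# Case 2: enhancing an existing model, so r is the residual y - p.
tab_resid <- dat[, predictors]
tab_resid$r <- y - p
```

Either table then holds all predictors together with \(r\), which is the form described above.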
The function generates a decision tree for the entire table, as well as a decision tree for each subset (partition) of the table. Subsets are defined by the levels of factor-valued columns in the data. These columns can either be user-defined or already included as predictors.
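The partitioning idea can be sketched directly with rpart: fit one tree to the whole table, then one tree per level of a factor-valued column. This is an illustration of the technique under stated assumptions, not the package's internal code; the data (mtcars), the partition variable (cyl), the control settings and the names trees and ctrl are all chosen for the example.

```r
library(rpart)

dat <- mtcars
dat$cyl <- factor(dat$cyl)                # partition variable (categorical factor)
names(dat)[names(dat) == "mpg"] <- "r"    # new-model case: r is the target

# Assumed control settings, small because the example data is small.
ctrl <- rpart.control(minbucket = 3, cp = 0.01)

# Tree for the entire table.
trees <- list(ALL = rpart(r ~ ., data = dat, method = "anova", control = ctrl))

# One tree per level of the partition variable, fitted on that subset only.
for (lev in levels(dat$cyl)) {
  part <- dat[dat$cyl == lev, setdiff(names(dat), "cyl")]
  trees[[paste0("cyl_", lev)]] <-
    rpart(r ~ ., data = part, method = "anova", control = ctrl)
}

names(trees)
```

A subset with too few rows to split simply yields a root-only tree, so small partitions do not cause an error.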
Each decision tree will consist of an rpart tree as shown:
requireNamespace("png", quietly = TRUE)
img <- png::readPNG(paste0(getwd(),"/../inst/extdata/test/ALL_ALL_medres2.png"))
grid::grid.raster(img)
A summary of the residual nodes that meet user-specified criteria for residual value and leaf volume will also be written to txt files (for example treesAll.txt and allfactors.txt). These contain a summary of each significant term with its definition, volume and other parameters as shown:
Partition variable (categorical factor)
The level of the partition variable
Residual
Leaf volume vs partition volume (percentage)
Leaf volume
Partition volume
Expected value average in leaf
Actual value average in leaf
Residual (checkval)
Leaf definition within this partition of the partition variable
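The fields above can be computed by hand for a single partition, which may help when reading the summary files. The sketch below is an assumption-laden illustration, not the package's own output code: the data, the stand-in model, the column names of summary_tab and the two thresholds are all hypothetical.

```r
library(rpart)

dat <- mtcars
fit <- lm(mpg ~ wt, data = dat)            # stand-in for the existing model
dat$p <- as.numeric(predict(fit, dat))     # expected value
dat$r <- dat$mpg - dat$p                   # residual r = y - p

# One partition: the rows where the partition variable cyl has level 8.
part <- dat[dat$cyl == 8, ]
tree <- rpart(r ~ hp + disp + qsec, data = part,
              control = rpart.control(minbucket = 3))

# tree$where gives, for each row, the row of tree$frame holding its leaf.
leaf_id <- tree$where

summary_tab <- do.call(rbind, lapply(split(part, leaf_id), function(leaf) {
  data.frame(
    leaf_volume      = nrow(leaf),
    partition_volume = nrow(part),
    volume_pct       = 100 * nrow(leaf) / nrow(part),
    expected         = mean(leaf$p),     # expected value average in leaf
    actual           = mean(leaf$mpg),   # actual value average in leaf
    residual         = mean(leaf$mpg) - mean(leaf$p)
  )
}))

# Hypothetical selection criteria for residual value and leaf volume.
min_abs_residual <- 0.5
min_leaf_volume  <- 3
significant <- summary_tab[abs(summary_tab$residual) >= min_abs_residual &
                             summary_tab$leaf_volume >= min_leaf_volume, ]
```

Here the residual column equals the mean of \(r\) within the leaf, since it is the difference between the actual and expected averages.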
In the example, partitioning enables significant leaves to be found for each partition, although the full dataset does not yield leaves in the fitted tree. This illustrates the benefits of the partitioning technique.