Type: Package
Title: Sequence Generalization Through Similarity Network
Version: 1.1.0
Author: Giancarlo Vercellino
Maintainer: Giancarlo Vercellino <giancarlo.vercellino@gmail.com>
Description: Proposes an application for sequence prediction generalizing the similarity within the network of previous sequences.
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.1
Depends: R (≥ 3.6)
Imports: purrr (≥ 0.3.4), ggplot2 (≥ 3.3.5), readr (≥ 2.1.2), lubridate (≥ 1.7.10), imputeTS (≥ 3.2), fANCOVA (≥ 0.6-1), scales (≥ 1.1.1), tictoc (≥ 1.0.1), modeest (≥ 2.4.0), moments (≥ 0.14), greybox (≥ 1.0.1), philentropy (≥ 0.5.0), entropy (≥ 1.3.1), Rfast (≥ 2.0.6), narray (≥ 0.4.1.1), fastDummies (≥ 1.6.3)
URL: https://rpubs.com/giancarlo_vercellino/segen
NeedsCompilation: no
Packaged: 2022-08-15 19:16:56 UTC; gvercellino
Repository: CRAN
Date/Publication: 2022-08-15 19:30:02 UTC

segen

Description

Sequence Generalization Through Similarity Network

Usage

segen(
  df,
  seq_len = NULL,
  similarity = NULL,
  dist_method = NULL,
  rescale = NULL,
  smoother = FALSE,
  ci = 0.8,
  error_scale = "naive",
  error_benchmark = "naive",
  n_windows = 10,
  n_samp = 30,
  dates = NULL,
  seed = 42
)

Arguments

df

A data frame with time features in columns. The features may be all numeric or all categorical, but not a mix of both.

seq_len

Positive integer. Number of time steps in the forecasting sequence. Default: NULL (automatic selection between 2 and max limit).

similarity

Positive numeric. Degree of similarity between two sequences, based on quantile conversion of distance. Default: NULL (automatic selection between 0.01 for maximal difference and 0.99 for minimal difference).
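
One way to read "quantile conversion of distance" is sketched below in base R. This is an illustration under stated assumptions, not code taken from the package: each pairwise distance is ranked against all observed distances with an empirical CDF, and the rank is inverted so that small distances map to similarities near 1. The names `seqs`, `d`, and `sim` are hypothetical.

```r
# Illustrative sketch only; not the segen implementation.
set.seed(42)
x <- cumsum(rnorm(50))                  # toy random-walk series
seqs <- embed(x, 5)[, 5:1]              # 46 overlapping sequences of length 5, one per row
d <- as.vector(dist(seqs, method = "euclidean"))  # all pairwise distances
sim <- 1 - ecdf(d)(d)                   # empirical quantile of each distance, inverted
range(sim)                              # the largest distance maps to similarity 0
```

Under this reading, a threshold such as `similarity = 0.7` would keep only pairs whose distance falls in the closest 30% of all observed distances.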

dist_method

String. Method for calculating distance among sequences. Available options are: "euclidean", "manhattan", "maximum", "minkowski". Default: NULL (random search).
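
To make the four options concrete, the sketch below computes each distance between two short sequences with base R's `dist()`, which accepts the same method names. Whether segen calls `dist()` internally is not stated in this documentation, and the Minkowski power `p = 3` is an arbitrary choice for the example.

```r
# Illustrative sketch using stats::dist; not the segen implementation.
a <- c(1, 2, 3, 4)
b <- c(2, 4, 3, 1)
methods <- c("euclidean", "manhattan", "maximum", "minkowski")

# p is only used by the "minkowski" method and is ignored by the others
d <- sapply(methods, function(m) {
  as.numeric(dist(rbind(a, b), method = m, p = 3))
})
d
# euclidean = sqrt(14), manhattan = 6, maximum = 3, minkowski = 36^(1/3)
```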

rescale

Logical. Set to TRUE for min-max scaling of distances. Default: NULL (random search).
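
Min-max scaling maps a set of values onto the interval [0, 1]. A minimal sketch follows; the helper name `min_max` is hypothetical and the handling of constant input is an assumption, not documented package behavior.

```r
# Illustrative sketch only; not the segen implementation.
min_max <- function(d) {
  rng <- range(d, na.rm = TRUE)
  # Constant input has zero range; return zeros to avoid dividing by zero
  if (rng[1] == rng[2]) return(rep(0, length(d)))
  (d - rng[1]) / (rng[2] - rng[1])
}

min_max(c(2, 4, 6, 10))   # 0.00 0.25 0.50 1.00
```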

smoother

Logical. Set to TRUE for loess smoothing. Default: FALSE.
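
For readers unfamiliar with loess, the sketch below smooths a noisy series with base R's `loess()`. The package imports fANCOVA, so its own smoothing may select the span differently; the fixed `span = 0.3` here is an arbitrary assumption made for illustration.

```r
# Illustrative sketch using stats::loess; not the segen implementation.
set.seed(42)
idx <- 1:100
y <- sin(idx / 10) + rnorm(100, sd = 0.3)   # noisy toy signal

fit <- loess(y ~ idx, span = 0.3)           # span chosen arbitrarily for this sketch
y_smooth <- predict(fit)                    # fitted values at the original time steps

# Step-to-step roughness before and after smoothing
c(raw = sum(diff(y)^2), smooth = sum(diff(y_smooth)^2))
```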

ci

Confidence interval for prediction. Default: 0.8.

error_scale

String. Scale for the scaled error metrics (for continuous variables). Two options: "naive" (average of naive one-step absolute error for the historical series) or "deviation" (standard error of the historical series). Default: "naive".
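
The two scales can be sketched in base R as follows. This is a hedged illustration: the "deviation" option is read here as the sample standard deviation via `sd()`, which is an assumption, and the variable names are hypothetical.

```r
# Illustrative sketch only; not the segen implementation.
history   <- c(10, 12, 11, 15, 14)
actual    <- c(16, 15)
predicted <- c(15, 17)

naive_scale     <- mean(abs(diff(history)))  # mean one-step naive absolute error: 2
deviation_scale <- sd(history)               # assumed reading of "deviation"

mae <- mean(abs(actual - predicted))         # 1.5
mae / naive_scale                            # scaled error with the "naive" scale: 0.75
mae / deviation_scale                        # scaled error with the "deviation" scale
```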

error_benchmark

String. Benchmark for the relative error metrics (for continuous variables). Two options: "naive" (sequential extension of last value) or "average" (mean value of true sequence). Default: "naive".
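
A relative error compares the model's error with that of a simple benchmark forecast. The sketch below builds both benchmarks described above and uses mean absolute error as the metric; the choice of metric and all variable names are assumptions made for illustration.

```r
# Illustrative sketch only; not the segen implementation.
history   <- c(10, 12, 11, 15, 14)
actual    <- c(16, 15, 17)
predicted <- c(15, 16, 16)

mae <- function(a, p) mean(abs(a - p))

naive_bench <- rep(tail(history, 1), length(actual))  # last value carried forward
avg_bench   <- rep(mean(actual), length(actual))      # mean of the true sequence

rel_naive <- mae(actual, predicted) / mae(actual, naive_bench)  # 0.5
rel_avg   <- mae(actual, predicted) / mae(actual, avg_bench)    # 1.5
```

A value below 1 means the prediction beats the benchmark; a value above 1 means the benchmark does better.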

n_windows

Positive integer. Number of validation windows to test prediction error. Default: 10.

n_samp

Positive integer. Number of samples for random search. Default: 30.

dates

Date. Vector with dates for time features. Default: NULL.

seed

Positive integer. Random seed. Default: 42.
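
Several arguments default to NULL and are then chosen by random search, with `n_samp` controlling the number of candidates and `seed` making the draw reproducible. The sketch below shows what sampling such candidates could look like. It is an assumption about the general technique, not the package's code, and the upper bound of 30 for `seq_len` is hypothetical because the documented "max limit" is not specified.

```r
# Illustrative sketch of random search sampling; not the segen implementation.
set.seed(42)
n_samp <- 5
methods <- c("euclidean", "manhattan", "maximum", "minkowski")

grid <- data.frame(
  seq_len     = sample(2:30, n_samp, replace = TRUE),   # upper bound is hypothetical
  similarity  = runif(n_samp, min = 0.01, max = 0.99),
  dist_method = sample(methods, n_samp, replace = TRUE),
  rescale     = sample(c(TRUE, FALSE), n_samp, replace = TRUE)
)
grid   # one candidate configuration per row
```

Each row would then be evaluated on the validation windows, and the configuration with the lowest error kept.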

Value

This function returns a list including:

Author(s)

Giancarlo Vercellino <giancarlo.vercellino@gmail.com>

See Also

Useful links:

https://rpubs.com/giancarlo_vercellino/segen

Examples

segen(time_features[, 1, drop = FALSE], seq_len = 30, similarity = 0.7, n_windows = 3, n_samp = 1)



time features example: IBM and Microsoft Close Prices

Description

A data frame with daily prices for IBM and Microsoft since April 2020.

Usage

time_features

Format

A data frame with 2 columns and 1324 rows.

Source

finance.yahoo.com
