The STSMotifs package searches for motifs in spatial-time series. Its main purpose is to cope with large amounts of data, so the search is designed to be fast and efficient.
To find the motifs, the CSAMiningProcess is used.
The process is decomposed into several steps.
The functions of this package require several inputs. The quality of the output depends strongly on these parameters.
dataset: Data frame containing numeric values. Columns represent space and rows represent time.
#>      1    2    3     4     5     6    7     8    9   10
#> 1  737 1350  869   750  1138   758 1006  1095   99  -83
#> 2  283  565  504   317  1849   944  -80  -895 -936  906
#> 3 -118 -375 -564  -803   870   472 -922 -1009 -698  741
#> 4 -696 -844 -654 -1303  -474  -591 -262  1034 1012  376
#> 5 -251 -622  -14  -587 -1108 -1401  404  1545 1696  247
#> 6  645  -10   -4   411  -858 -1261 -574  -329 -367 -680
alpha: The size of the alphabet used by SAX to encode the numeric values into a string.
word: The length of the motif.
sb and tb: The spatial and temporal block sizes.
Part of the process is applied to blocks (subsets of the original dataset). With tb ("time slice", the number of rows in each block) and sb ("space slice", the number of columns in each block), the user specifies the size and shape of the blocks.
kappa: Threshold on the minimum number of spatial-time series in which each motif must occur.
sigma: Threshold on the minimum number of occurrences of each motif.
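To make the sb and tb parameters concrete, here is a minimal base-R sketch of how a dataset can be cut into blocks of tb rows by sb columns. The helper name split_blocks and the decision to drop rows or columns that do not fill a complete block are assumptions made for illustration; the package may handle block boundaries differently.

```r
# Hypothetical helper (not part of STSMotifs): cut a data frame into
# blocks of `tb` rows (time) by `sb` columns (space).
# Assumption: rows/columns that do not fill a complete block are dropped.
split_blocks <- function(dataset, tb, sb) {
  n_tb <- nrow(dataset) %/% tb   # number of complete blocks along time
  n_sb <- ncol(dataset) %/% sb   # number of complete blocks along space
  blocks <- list()
  for (i in seq_len(n_tb)) {
    for (j in seq_len(n_sb)) {
      rows <- ((i - 1) * tb + 1):(i * tb)
      cols <- ((j - 1) * sb + 1):(j * sb)
      blocks[[length(blocks) + 1]] <- dataset[rows, cols, drop = FALSE]
    }
  }
  blocks
}

# Toy dataset: 6 time steps (rows) x 10 spatial positions (columns)
toy <- as.data.frame(matrix(seq_len(60), nrow = 6, ncol = 10))
blocks <- split_blocks(toy, tb = 3, sb = 5)
length(blocks)     # number of blocks
dim(blocks[[1]])   # rows and columns of the first block
```

With 6 rows, 10 columns, tb = 3 and sb = 5, this yields 4 blocks of 3 rows by 5 columns each. Smaller blocks localize motifs more precisely but leave fewer subsequences per block to compare against the thresholds.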
The first step, performed by the NormSAX function, applies z-score normalization to the entire dataset. Right after normalization, the SAX indexing method is applied for a given alphabet size a.
See more at Normalization and SAX Indexing
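The normalization and encoding step can be sketched in base R as follows. This is an illustration only, not the package's NormSAX: the function name norm_sax_sketch is hypothetical, the breakpoints are the Gaussian quantiles of the standard SAX formulation (the package may compute them differently), and each value is encoded individually without piecewise aggregation.

```r
# Hypothetical helper (not the package's NormSAX): z-score normalization
# over the whole dataset, followed by SAX encoding with `a` letters.
norm_sax_sketch <- function(dataset, a) {
  m <- as.matrix(dataset)
  v <- as.vector(m)
  # z-score using the mean and standard deviation of ALL values
  z <- (v - mean(v)) / sd(v)
  # Assumption: standard SAX breakpoints, i.e. the quantiles that cut
  # N(0, 1) into `a` equiprobable regions
  breaks <- qnorm(seq_len(a - 1) / a)
  # findInterval() returns 0..(a - 1); add 1 to index into `letters`
  idx <- findInterval(z, breaks) + 1
  matrix(letters[idx], nrow = nrow(m), dimnames = dimnames(m))
}

toy <- data.frame(s1 = c(-2, -1, 0, 1, 2), s2 = c(2, 1, 0, -1, -2))
encoded <- norm_sax_sketch(toy, a = 3)
encoded
```

In this toy case the first series is encoded as a, a, b, c, c and the second as c, c, b, a, a. Because the mean and standard deviation are taken over the entire dataset, the same letter denotes the same value range in every column.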
In this step, the sb and tb parameters are used to create blocks from the original spatial-time dataset. All subsequences inside each block are combined into a single time series. Motifs found in this combined time series are then checked against the kappa and sigma thresholds. Finally, all occurrences of motifs from neighboring blocks are grouped.
See more at Search for Spatial-time Motifs
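The threshold check inside a single block can be illustrated with the base-R sketch below. It is not the package's search function: the name block_motifs_sketch is hypothetical, subsequences are pooled column by column with overlapping windows counted, and the grouping of occurrences across neighboring blocks is left out.

```r
# Hypothetical helper (not the package's search function): find candidate
# motifs of length `w` inside ONE SAX-encoded block (rows = time,
# columns = space) and keep those that pass the sigma and kappa thresholds.
block_motifs_sketch <- function(block, w, sigma, kappa) {
  n_windows <- nrow(block) - w + 1
  if (n_windows < 1) stop("block has fewer rows than the word length")
  words <- character(0)
  series <- integer(0)
  # Pool every length-w subsequence of every column into one collection
  for (j in seq_len(ncol(block))) {
    for (t in seq_len(n_windows)) {
      words <- c(words, paste(block[t:(t + w - 1), j], collapse = ""))
      series <- c(series, j)
    }
  }
  uniq <- sort(unique(words))
  # Total occurrences of each word, and number of distinct columns it hits
  occurrences <- vapply(uniq, function(u) sum(words == u), integer(1))
  n_series <- vapply(uniq, function(u) length(unique(series[words == u])),
                     integer(1))
  keep <- occurrences >= sigma & n_series >= kappa
  data.frame(word = uniq[keep],
             occurrences = unname(occurrences[keep]),
             series = unname(n_series[keep]))
}

# Toy encoded block: 5 time steps x 3 spatial series
block <- matrix(c("a", "b", "a", "b", "a",
                  "a", "b", "a", "c", "c",
                  "c", "c", "a", "b", "c"), nrow = 5)
res <- block_motifs_sketch(block, w = 2, sigma = 3, kappa = 2)
res
```

Here the word "ab" occurs 4 times across 3 series and "ba" occurs 3 times across 2 series, so both pass sigma = 3 and kappa = 2. The word "cc" appears in 2 series but only twice in total, so it is rejected by sigma.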
The last step, performed by the RankSTMotifs function, balances the distance among the occurrences of a motif against the information encoded in the motif itself and its number of occurrences. It explores all motifs and their occurrences.
There are three ways to visualize the result:
To see an example of the output: Output Example