The underlying assumptions of traditional autoregressive models are well known. The resulting complexity with these models leads to observations such as,
``We have found that choosing the wrong model or parameters can often yield poor results, and it is unlikely that even experienced analysts can choose the correct model and parameters efficiently given this array of choices.’’
NNS
simplifies the forecasting process. Below are some
examples demonstrating NNS.ARMA
and its
assumption free, minimal parameter forecasting
method.
NNS.ARMA
has the ability to fit a
linear regression to the relevant component series, yielding very fast
results. For our running example we will use the
AirPassengers
dataset loaded in base R.
We will forecast 44 periods h = 44
of
AirPassengers
using the first 100 observations
training.set = 100
, returning estimates of the final 44
observations. We will then test this against our validation set of
tail(AirPassengers,44)
.
Since this is monthly data, we will try a
seasonal.factor = 12
.
Below is the linear fit and associated root mean squared error (RMSE)
using method = "lin"
.
nns_lin = NNS.ARMA(AirPassengers,
h = 44,
training.set = 100,
method = "lin",
plot = TRUE,
seasonal.factor = 12,
seasonal.plot = FALSE)
## [1] 35.39965
Now we can try using a nonlinear regression on the relevant component
series using method = "nonlin"
.
We can test a series of seasonal.factors
and select the
best one to fit. The largest period to consider would be
0.5 * length(variable)
, since we need more than 2 points
for a regression! Remember, we are testing the first 100 observations of
AirPassengers
, not the full 144 observations.
seas = t(sapply(1 : 25, function(i) c(i, sqrt( mean( (NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "lin", seasonal.factor = i, plot=FALSE) - tail(AirPassengers, 44)) ^ 2) ) ) ) )
colnames(seas) = c("Period", "RMSE")
seas
## Period RMSE
## [1,] 1 75.67783
## [2,] 2 75.71250
## [3,] 3 75.87604
## [4,] 4 75.16563
## [5,] 5 76.07418
## [6,] 6 70.43185
## [7,] 7 77.98493
## [8,] 8 75.48997
## [9,] 9 79.16378
## [10,] 10 81.47260
## [11,] 11 106.56886
## [12,] 12 35.39965
## [13,] 13 90.98265
## [14,] 14 95.64979
## [15,] 15 82.05345
## [16,] 16 74.63052
## [17,] 17 87.54036
## [18,] 18 74.90881
## [19,] 19 96.96011
## [20,] 20 88.75015
## [21,] 21 100.21346
## [22,] 22 108.68674
## [23,] 23 85.06430
## [24,] 24 35.49018
## [25,] 25 75.16192
Now we know seasonal.factor = 12
is our best fit, we can
see if there’s any benefit from using a nonlinear regression.
Alternatively, we can define our best fit as the corresponding
seas$Period
entry of the minimum value in our
seas$RMSE
column.
Below you will notice the use of seasonal.factor = a
generates the same output.
nns = NNS.ARMA(AirPassengers,
h = 44,
training.set = 100,
method = "nonlin",
seasonal.factor = a,
plot = TRUE, seasonal.plot = FALSE)
## [1] 18.77889
Note: You may experience instances with monthly data
that report seasonal.factor
close to multiples of 3, 4, 6
or 12. For instance, if the reported
seasonal.factor = {37, 47, 71, 73}
use
(seasonal.factor = c(36, 48, 72))
by setting the
modulo
parameter in
NNS.seas(..., modulo = 12)
. The same
suggestion holds for daily data and multiples of 7, or any other time
series with logically inferred cyclical patterns. The nearest periods to
that modulo
will be in the expanded output.
## $all.periods
## Period Coefficient.of.Variation Variable.Coefficient.of.Variation
## 1: 48 0.4002249 0.4279947
## 2: 12 0.4059923 0.4279947
## 3: 60 0.4279947 0.4279947
## 4: 36 0.4279947 0.4279947
## 5: 24 0.4279947 0.4279947
##
## $best.period
## Period
## 48
##
## $periods
## [1] 48 12 60 36 24
seasonal.factor
NNS also offers a wrapper function
NNS.ARMA.optim()
to test a given vector of
seasonal.factor
and returns the optimized objective
function (in this case RMSE written as
obj.fn = expression( sqrt(mean((predicted - actual)^2)) )
)
and the corresponding periods, as well as the
NNS.ARMA
regression method used.
Alternatively, using external package objective functions work as well
such as
obj.fn = expression(Metrics::rmse(actual, predicted))
.
NNS.ARMA.optim()
will also test whether
to regress the underlying data first, shrink
the estimates
to their subset mean values, include a bias.shift
based on
its internal validation errors, and compare different
weights
of both linear and nonlinear estimates.
Given our monthly dataset, we will try multiple years by setting
seasonal.factor = seq(12, 60, 6)
every 6 months based on
our NNS.seas() insights above.
nns.optimal = NNS.ARMA.optim(AirPassengers,
training.set = 100,
seasonal.factor = seq(12, 60, 6),
obj.fn = expression( sqrt(mean((predicted - actual)^2)) ),
objective = "min",
pred.int = .95, plot = TRUE)
nns.optimal
[1] "CURRNET METHOD: lin"
[1] "COPY LATEST PARAMETERS DIRECTLY FOR NNS.ARMA() IF ERROR:"
[1] "NNS.ARMA(... method = 'lin' , seasonal.factor = c( 12 ) ...)"
[1] "CURRENT lin OBJECTIVE FUNCTION = 35.3996540135277"
[1] "BEST method = 'lin', seasonal.factor = c( 12 )"
[1] "BEST lin OBJECTIVE FUNCTION = 35.3996540135277"
[1] "CURRNET METHOD: nonlin"
[1] "COPY LATEST PARAMETERS DIRECTLY FOR NNS.ARMA() IF ERROR:"
[1] "NNS.ARMA(... method = 'nonlin' , seasonal.factor = c( 12 ) ...)"
[1] "CURRENT nonlin OBJECTIVE FUNCTION = 18.7788946994645"
[1] "BEST method = 'nonlin' PATH MEMBER = c( 12 )"
[1] "BEST nonlin OBJECTIVE FUNCTION = 18.7788946994645"
[1] "CURRNET METHOD: both"
[1] "COPY LATEST PARAMETERS DIRECTLY FOR NNS.ARMA() IF ERROR:"
[1] "NNS.ARMA(... method = 'both' , seasonal.factor = c( 12 ) ...)"
[1] "CURRENT both OBJECTIVE FUNCTION = 26.9404229568573"
[1] "BEST method = 'both' PATH MEMBER = c( 12 )"
[1] "BEST both OBJECTIVE FUNCTION = 26.9404229568573"
$periods
[1] 12
$weights
NULL
$obj.fn
[1] 18.77889
$method
[1] "nonlin"
$shrink
[1] FALSE
$nns.regress
[1] FALSE
$bias.shift
[1] 0
$errors
[1] -14.47605733 -21.86813452 -19.68680481 -32.74950538 -23.55099666 -17.40139233 -13.70469091 -6.24837561 -2.48949314 2.13237206 16.92258707 23.90974592 4.53920624
[14] -2.80578226 -8.66505386 -35.95512814 6.15047733 -3.05392579 4.47214630 18.85551669 2.32637841 0.08983381 -1.04028830 2.89582233 -24.22859693 -6.13584252
[27] -27.38320978 -53.70280267 -21.86253259 -23.90679863 -23.77369245 -22.36041401 -29.22056474 -26.31507238 12.93035387 -34.59586321 -47.79031322 -34.85559482 -62.85716413
[40] -64.07451099 -35.67072562 -50.91910417 -27.91240080 -22.72623962
$results
[1] 338.5400 405.6487 448.7971 437.6360 381.8544 325.4751 288.1899 326.6740 337.1004 319.5942 381.6468 371.7323 373.1716 452.1286 498.6929 485.0008 422.3997 357.9051
[19] 318.5071 360.0295 371.1142 350.8923 419.5388 408.2967 408.8936 499.6842 549.5283 533.5434 464.0692 391.2918 349.2869 394.1562 404.3549 381.6612 456.5031 443.9546
[37] 443.9982 546.2672 599.0085 581.0612 504.6510 423.7409 379.3403 427.2016
$lower.pred.int
[1] 287.2344 354.3431 397.4915 386.3304 330.5488 274.1695 236.8843 275.3684 285.7949 268.2886 330.3413 320.4267 321.8661 400.8231 447.3873 433.6952 371.0941 306.5995
[19] 267.2016 308.7239 319.8086 299.5867 368.2332 356.9912 357.5880 448.3786 498.2227 482.2379 412.7636 339.9862 297.9813 342.8507 353.0493 330.3556 405.1975 392.6490
[37] 392.6926 494.9616 547.7030 529.7556 453.3454 372.4353 328.0347 375.8961
$upper.pred.int
[1] 368.1156 435.2243 478.3727 467.2115 411.4300 355.0506 317.7655 356.2495 366.6760 349.1697 411.2224 401.3078 402.7472 481.7042 528.2685 514.5763 451.9753 387.4807
[19] 348.0827 389.6050 400.6897 380.4679 449.1144 437.8723 438.4692 529.2597 579.1039 563.1190 493.6448 420.8674 378.8625 423.7318 433.9305 411.2367 486.0786 473.5301
[37] 473.5738 575.8427 628.5841 610.6368 534.2265 453.3165 408.9159 456.7772
We can forecast another 50 periods out-of-sample
(h = 50
), by dropping the training.set
parameter while generating the 95% prediction intervals.
NNS.ARMA.optim(AirPassengers,
seasonal.factor = seq(12, 60, 6),
obj.fn = expression( sqrt(mean((predicted - actual)^2)) ),
objective = "min",
pred.int = .95, h = 50, plot = TRUE)
seasonal.factor = c(1, 2, ...)
We included the ability to use any number of specified seasonal periods simultaneously, weighted by their strength of seasonality. Computationally expensive when used with nonlinear regressions and large numbers of relevant periods.
weights
Instead of weighting by the seasonal.factor
strength of
seasonality, we offer the ability to weight each per any defined
compatible vector summing to 1.
Equal weighting would be weights = "equal"
.
pred.int
Provides the values for the specified prediction intervals within [0,1] for each forecasted point and plots the bootstrapped replicates for the forecasted points.
seasonal.factor = FALSE
We also included the ability to use all detected seasonal periods simultaneously, weighted by their strength of seasonality. Computationally expensive when used with nonlinear regressions and large numbers of relevant periods.
best.periods
This parameter restricts the number of detected seasonal periods to
use, again, weighted by their strength. To be used in conjunction with
seasonal.factor = FALSE
.
modulo
To be used in conjunction with seasonal.factor = FALSE
.
This parameter will ensure logical seasonal patterns (i.e.,
modulo = 7
for daily data) are included along with the
results.
mod.only
To be used in conjunction with
seasonal.factor = FALSE & modulo != NULL
. This
parameter will ensure empirical patterns are kept along with the logical
seasonal patterns.
dynamic = TRUE
This setting generates a new seasonal period(s) using the estimated
values as continuations of the variable, either with or without a
training.set
. Also computationally expensive due to the
recalculation of seasonal periods for each estimated value.
plot
, seasonal.plot
These are the plotting arguments, easily enabled or disabled with
TRUE
or FALSE
.
seasonal.plot = TRUE
will not plot without
plot = TRUE
. If a seasonal analysis is all that is desired,
NNS.seas
is the function specifically suited for that
task.
The extension to a generalized multivariate instance is provided in
the following documentation of the
NNS.VAR()
function:
If the user is so motivated, detailed arguments and proofs are provided within the following: