rtrim by example

Jeroen Pannekoek, Arco van Strien and Patrick Bogaart

2018-08-16

Introduction

As an example of the use of rtrim, counts of the Skylark (Alauda arvensis) will be analysed (the data are obtained from the Breeding Bird Monitoring Scheme in the Netherlands of Sovon and Statistics Netherlands). A first view of the overall structure of the data can be obtained from base R functions.

rm(list=ls())
library(rtrim)
#> Welcome to rtrim 2.0.5 Type ?`rtrim-package` to get started.
#>
#> Attaching package: 'rtrim'
#> The following object is masked from 'package:stats':
#>
#> heatmap
data(skylark2) # use extended version of the Skylark dataset
summary(skylark2)
#> site year count habitat deposition
#> site_01: 8 Min. :1984 Min. : 1.00 dunes:232 Min. :1.000
#> site_02: 8 1st Qu.:1986 1st Qu.: 2.00 heath:208 1st Qu.:2.000
#> site_03: 8 Median :1988 Median : 5.00 Median :3.000
#> site_04: 8 Mean :1988 Mean : 12.55 Mean :2.818
#> site_05: 8 3rd Qu.:1989 3rd Qu.: 11.00 3rd Qu.:3.000
#> site_06: 8 Max. :1991 Max. :131.00 Max. :4.000
#> (Other):392 NA's :238
#> weight
#> Min. : 1.000
#> 1st Qu.: 1.000
#> Median :10.000
#> Mean : 5.745
#> 3rd Qu.:10.000
#> Max. :10.000
#>

A more specific overview of the data can be obtained by running the rtrim command count_summary. This function expects columns named count, year and site. If one or more of the actual data columns have different names, these can be specified.

idx <- which(names(skylark2)=="year") # rename year->season
names(skylark2)[idx] <- "season"
count_summary(skylark2, year_col="season") # show that it works
#> Total number of sites 55
#> Sites without positive counts (0):
#> Number of observed zero counts 0
#> Number of observed positive counts 202
#> Total number of observed counts 202
#> Number of missing counts 238
#> Total number of counts 440
names(skylark2)[idx] <- "year" # revert to original name

In this case, we find that the Skylark dataset contains counts for 55 sites in 8 years (1984–1991). Of these 440 site-by-year combinations, 202 were observed and the other 238 were missing. Two covariates are included: Habitat, which distinguishes between Dunes and Heathland sites, and Deposition, which indicates the amount of acidic aerial deposition (this dataset was collected in the 1990s, when acidification was a prominent theme in ecological research).

Initial model estimation

To analyse these data with rtrim, we start with a model with time effects (model 3), ignoring the Habitat covariate. Model 3 is chosen because it makes no assumption about how the population changes over time: year effects are strictly independent of each other. A quick overview of the model results can be obtained by running summary() and plot(overall()).

z1 <- trim(count ~ site + year, data=skylark2, model=3, serialcor=TRUE, overdisp=TRUE)
summary(z1)
#> Call:
#> tools::buildVignettes(dir = ".", tangle = TRUE)
#>
#> Model : 3
#> Method : GEE (Convergence reached after 11 iterations)
#>
#> Coefficients:
#> time add se_add mul se_mul
#> 1 1 0.00000000 0.0000000 1.0000000 0.00000000
#> 2 2 -0.32017834 0.1054679 0.7260195 0.07657174
#> 3 3 -0.16867813 0.1054130 0.8447808 0.08905090
#> 4 4 -0.18965668 0.1083411 0.8272431 0.08962440
#> 5 5 -0.08241227 0.1070220 0.9208922 0.09855569
#> 6 6 0.02080919 0.1058650 1.0210272 0.10809105
#> 7 7 0.09969167 0.1082296 1.1048302 0.11957536
#> 8 8 0.15580476 0.1107841 1.1685980 0.12946210
#>
#> Overdispersion : 1.3672
#> Serial Correlation : 0.3024
#>
#> Goodness of fit:
#> Chi-square = 191.40, df=140, p=0.0026
#> Likelihood Ratio = 194.80, df=140, p=0.0015
#> AIC (up to a constant) = -85.20
plot(overall(z1))

Output from summary() includes:

The goodness-of-fit test (Likelihood Ratio) for this model amounts to 194.8, with 140 degrees of freedom and p=0.0015, well below 0.05, which implies that the model has to be rejected.
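
The reported p-value can be checked by hand. The sketch below is a minimal illustration in base R, assuming (as in the likelihood ratio comparison later in this document) that the statistic is referred to a Chi-squared distribution; the numbers are copied from the summary(z1) output, and the variable names are only illustrative.

```r
# Values copied from the summary(z1) output above
LR <- 194.80   # likelihood ratio statistic
df <- 140      # degrees of freedom

# Upper-tail probability of a Chi-squared distribution with df degrees of freedom
p <- pchisq(LR, df = df, lower.tail = FALSE)

# The model is rejected at the 5% level when p < 0.05
reject <- p < 0.05
print(round(p, 4))
print(reject)
```

The same call with the Chi-square statistic (191.40) checks the other goodness-of-fit line in the same way.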

Covariates

A possible improvement of the model for a better fit might be the inclusion of the Habitat covariate.

z2 <- trim(count ~ site + year + habitat, data=skylark2, model=3, serialcor=TRUE, overdisp=TRUE)
summary(z2)
#> Call:
#> tools::buildVignettes(dir = ".", tangle = TRUE)
#>
#> Model : 3
#> Method : GEE (Convergence reached after 12 iterations)
#>
#> Coefficients:
#> covar cat time add se_add mul se_mul
#> 1 baseline 0 1 0.0000000 0.0000000 1.0000000 0.0000000
#> 2 baseline 0 2 -0.2165232 0.1991433 0.8053138 0.1603728
#> 3 baseline 0 3 -0.3781379 0.2303445 0.6851360 0.1578173
#> 4 baseline 0 4 -0.4981991 0.2437072 0.6076240 0.1480823
#> 5 baseline 0 5 -0.7391541 0.2565409 0.4775177 0.1225028
#> 6 baseline 0 6 -0.5212781 0.2408642 0.5937612 0.1430158
#> 7 baseline 0 7 -0.6366106 0.2633641 0.5290827 0.1393414
#> 8 baseline 0 8 -0.7215137 0.2651985 0.4860160 0.1288907
#> 9 habitat 2 1 0.0000000 0.0000000 1.0000000 0.0000000
#> 10 habitat 2 2 -0.1444846 0.2323601 0.8654682 0.2011003
#> 11 habitat 2 3 0.2771464 0.2553396 1.3193596 0.3368847
#> 12 habitat 2 4 0.3865051 0.2681110 1.4718279 0.3946133
#> 13 habitat 2 5 0.7747286 0.2789551 2.1700031 0.6053334
#> 14 habitat 2 6 0.6449194 0.2642595 1.9058334 0.5036346
#> 15 habitat 2 7 0.8588036 0.2856443 2.3603350 0.6742164
#> 16 habitat 2 8 1.0307734 0.2882726 2.8032330 0.8080952
#>
#> Overdispersion : 1.1616
#> Serial Correlation : 0.2265
#>
#> Goodness of fit:
#> Chi-square = 154.50, df=133, p=0.0979
#> Likelihood Ratio = 159.64, df=133, p=0.0575
#> AIC (up to a constant) = -106.36

Now, the \(p\)-value of the likelihood ratio is (just slightly) above the classical threshold value of 0.05, and we decide to accept this model.

Model simplification

The advantage of Model 3 is, as argued above, the absence of any assumptions regarding the temporal trend. This, however, comes at a price: positive counts are required for all individual years to allow estimation of the model parameters. So, this model cannot be used for cases where one or more years are missing. Furthermore, the model is far from parsimonious. Even if the Skylark population follows a perfectly theoretical trend with constant population increase or decrease, each year is assigned its own growth parameters, even if these are identical to last year’s.

For both reasons it may be preferable to replace model 3 by model 2 (piecewise linear), especially because in one extreme case these models are equivalent. This is the case when all years are treated as change points, and each year the trend changes into a different one. Let’s first check this.

z3 <- trim(count ~ site + year + habitat, data=skylark2, model=2, changepoints="all",
serialcor=TRUE, overdisp=TRUE)
summary(z3)
#> Call:
#> tools::buildVignettes(dir = ".", tangle = TRUE)
#>
#> Model : 2
#> Method : GEE (Convergence reached after 11 iterations)
#>
#> Coefficients:
#> covar cat from upto add se_add mul se_mul
#> 1 baseline 0 1984 1985 -0.21652346 0.1991430 0.8053136 0.1603726
#> 2 baseline 0 1985 1986 -0.16161496 0.2207074 0.8507687 0.1877709
#> 3 baseline 0 1986 1987 -0.12006156 0.2194797 0.8868658 0.1946491
#> 4 baseline 0 1987 1988 -0.24095494 0.2260048 0.7858770 0.1776120
#> 5 baseline 0 1988 1989 0.21787599 0.2249234 1.2434329 0.2796772
#> 6 baseline 0 1989 1990 -0.11533261 0.2180151 0.8910697 0.1942667
#> 7 baseline 0 1990 1991 -0.08490346 0.2329769 0.9186010 0.2140128
#> 8 habitat 2 1984 1985 -0.14448463 0.2323599 0.8654682 0.2011001
#> 9 habitat 2 1985 1986 0.42163114 0.2480259 1.5244461 0.3781021
#> 10 habitat 2 1986 1987 0.10935908 0.2335696 1.1155628 0.2605616
#> 11 habitat 2 1987 1988 0.38822335 0.2389049 1.4743590 0.3522316
#> 12 habitat 2 1988 1989 -0.12980911 0.2366361 0.8782631 0.2078287
#> 13 habitat 2 1989 1990 0.21388409 0.2301313 1.2384791 0.2850127
#> 14 habitat 2 1990 1991 0.17197025 0.2458819 1.1876425 0.2920198
#>
#> Overdispersion : 1.1616
#> Serial Correlation : 0.2265
#>
#> Goodness of fit:
#> Chi-square = 154.50, df=133, p=0.0979
#> Likelihood Ratio = 159.64, df=133, p=0.0575
#> AIC (up to a constant) = -106.36

And indeed, this results in the same goodness of fit (although the parameter values are different, of course).
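
The relation between the two parameterizations can be made explicit: accumulating the year-to-year slopes of model 2 should give the time effects of model 3. The sketch below checks this for the baseline category, using coefficients copied from the summary(z2) and summary(z3) output; the variable names are illustrative only.

```r
# Baseline slopes (column 'add') copied from summary(z3),
# for the periods 1984-1985 up to 1990-1991
slopes_m2 <- c(-0.21652346, -0.16161496, -0.12006156, -0.24095494,
                0.21787599, -0.11533261, -0.08490346)

# Baseline time effects (column 'add') copied from summary(z2),
# for time points 2 to 8 (time point 1 is fixed at 0)
effects_m3 <- c(-0.2165232, -0.3781379, -0.4981991, -0.7391541,
                -0.5212781, -0.6366106, -0.7215137)

# Accumulating the yearly slopes gives the log-scale effect relative to 1984
cumulative <- cumsum(slopes_m2)

# The largest discrepancy is of the order of the rounding in the printed output
max_diff <- max(abs(cumulative - effects_m3))
print(round(cumulative, 4))
print(max_diff < 1e-5)
```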

The graphical display of the time-totals suggests that, after an initial decline in counts, the Skylark population recovers at an approximately constant rate. One could argue by eye whether this recovery starts in 1985 or in 1987, and to what extent the recovery rate is ‘constant’, or one can look at the model statistics for a more objective analysis. In this case, we look at the Wald statistics associated with the Habitat covariate, and the individual changepoints:

wald(z3)
#> Wald test for significance of covariates
#> Covariate W df p
#> habitat 21.54981 7 0.003036166
#>
#> Wald test for significance of changes in slope
#> Changepoint Wald_test df p
#> 1984 10.27482731 2 0.005872859
#> 1985 9.17718172 2 0.010167175
#> 1986 3.08043508 2 0.214334471
#> 1987 1.53703040 2 0.463701061
#> 1988 1.63613306 2 0.441284040
#> 1989 0.88658542 2 0.641919282
#> 1990 0.01468516 2 0.992684312

The first test shows that there is a significant (at the 5% level) effect of the Habitat covariate on slopes (or year indices), showing that the slopes (year indices) for Dunes are different from those for Heathland. The tests for the significance of changes in slopes show that the only significant changes are those at 1984, which means that the slope between 1984 and 1985 is different from zero, and at 1985, which means that the slope between 1985 and 1986 is different from the slope between 1984 and 1985. This suggests that it should be possible to describe these data with a model with fewer than the full set of seven changepoints. To investigate this possibility, the stepwise procedure for selection of changepoints can be used by including stepwise=TRUE in the call to trim():
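
As a cross-check on the p-values in the Wald tables, the sketch below recomputes a few of them in base R. It assumes that each Wald statistic is referred to a Chi-squared distribution with the listed degrees of freedom; the statistics are copied from the wald(z3) output and the variable names are illustrative.

```r
# Wald statistics and degrees of freedom copied from the wald(z3) output above
W_habitat  <- 21.54981
df_habitat <- 7
W_change   <- c("1984" = 10.27482731, "1985" = 9.17718172, "1986" = 3.08043508)
df_change  <- 2

# Upper-tail Chi-squared probabilities
p_habitat <- pchisq(W_habitat, df = df_habitat, lower.tail = FALSE)
p_change  <- pchisq(W_change, df = df_change, lower.tail = FALSE)

# For 2 degrees of freedom the upper tail has the closed form exp(-W / 2)
p_closed <- exp(-W_change / 2)

# Changepoints that are significant at the 5% level
significant <- p_change < 0.05
print(signif(p_habitat, 4))
print(signif(p_change, 4))
print(significant)
```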

z4 <- trim(count ~ site + year + habitat, data=skylark2, model=2, changepoints="all",
stepwise=TRUE, serialcor=TRUE, overdisp=TRUE)
summary(z4)
#> Call:
#> tools::buildVignettes(dir = ".", tangle = TRUE)
#>
#> Model : 2
#> Method : GEE (Convergence reached after 11 iterations)
#>
#> Coefficients:
#> covar cat from upto add se_add mul se_mul
#> 1 baseline 0 1984 1985 -0.26905602 0.18234931 0.7641004 0.13933319
#> 2 baseline 0 1985 1991 -0.07758256 0.04105089 0.9253506 0.03798646
#> 3 habitat 2 1984 1985 -0.02040236 0.20676977 0.9798044 0.20259392
#> 4 habitat 2 1985 1991 0.17488185 0.04372458 1.1911055 0.05208058
#>
#> Overdispersion : 1.1265
#> Serial Correlation : 0.2283
#>
#> Goodness of fit:
#> Chi-square = 161.09, df=143, p=0.1431
#> Likelihood Ratio = 160.76, df=143, p=0.1471
#> AIC (up to a constant) = -125.24
wald(z4)
#> Wald test for significance of covariates
#> Covariate W df p
#> habitat 18.50555 2 9.584531e-05
#>
#> Wald test for significance of changes in slope
#> Changepoint Wald_test df p
#> 1984 10.99440 2 0.0040982293
#> 1985 14.64858 2 0.0006593285

Not surprisingly, this results in a model with only two changepoints left, at 1984 and 1985.

The difference between the models of this run (z4) and the previous (z3) can be tested by comparing their Likelihood Ratios, see also Section 2.5.

gof(z3)
#> Goodness of fit:
#> Chi-square = 154.50, df=133, p=0.0979
#> Likelihood Ratio = 159.64, df=133, p=0.0575
#> AIC (up to a constant) = -106.36
LR3 <- gof(z3)$LR$LR # Get raw LR info for run 3
df3 <- gof(z3)$LR$df
gof(z4)
#> Goodness of fit:
#> Chi-square = 161.09, df=143, p=0.1431
#> Likelihood Ratio = 160.76, df=143, p=0.1471
#> AIC (up to a constant) = -125.24
LR4 <- gof(z4)$LR$LR # idem for run 4
df4 <- gof(z4)$LR$df
# Test the difference by using the fact that the difference of two LR measures is
# asymptotically Chi^2 distributed
LR <- abs(LR4 - LR3)
df <- abs(df4 - df3)
p <- 1 - pchisq(LR, df=df) # Use Chi-squared distribution
p
#> [1] 0.9997119

Since \(p \gg 0.05\), the \(H_0\) hypothesis that model z4 is a submodel of z3, in the sense that z4 can be obtained from z3 by setting some of the z3 parameters to 0, cannot be rejected at the \(\alpha=0.05\) level. In other words, both models are practically equivalent. Model z4, however, is the sparser of the two, as shown by its lower value of Akaike’s Information Criterion (-125.24 versus -106.36).

Concerning model z4, the Wald test for the significance of the effects of the covariate on the slope parameters shows that this effect is very significant (p=0.0001), and the Wald tests for the significance of changes in slope show that both changes (at 1984 and 1985) are, as expected, also very significant (p is 0.004 and 0.0007, respectively).

The slope (in the additive parameterization) for a site is the sum of the constant term and the effects for the covariate values for that site. The effect for the first category of a covariate is zero and omitted from the output of summary() and coefs(). Thus, sites with covariate value 1 (Dunes) have slope -0.269 between 1984 and 1985 and -0.078 from 1985 onwards. The corresponding multiplicative parameters show that for Dunes there is a sharp decrease (the multiplicative coefficient of 0.76 corresponds to a \((1-0.76)\times100=24\%\) decrease) between 1984 and 1985 and a much smaller annual decrease (0.93, equivalent to a 7% decrease) from 1985 to 1991. For sites with covariate value 2 (Heathland) the slope between 1984 and 1985 is \(-0.269 - 0.020 = -0.289\), corresponding to a multiplicative effect of \(0.764 \times 0.980 = 0.75\), which is only slightly different from the effect for Dunes for this time period. Apparently, the significant effect of the covariate is mainly determined by the trend from 1985 onwards. The parameters show indeed that Skylark populations increase in Heathland, while they decrease in Dunes. For Heathland, the slope from 1985 onwards is \(-0.078 + 0.175 = 0.097\) (additive) and 1.10 (multiplicative), corresponding to an annual increase of 10%.
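
The arithmetic in this paragraph can be reproduced in a few lines of base R. This is a minimal sketch using the additive coefficients copied from summary(z4); the variable names and the helper pct_change() are illustrative and not part of rtrim.

```r
# Additive slopes copied from summary(z4)
base_add <- c("1984-1985" = -0.26905602, "1985-1991" = -0.07758256)  # baseline (Dunes)
hab_add  <- c("1984-1985" = -0.02040236, "1985-1991" =  0.17488185)  # extra effect for Heathland

# Slope per habitat: baseline plus the covariate effect (zero for the first category)
dunes_add <- base_add
heath_add <- base_add + hab_add

# Multiplicative parameters are the exponentiated additive ones
dunes_mul <- exp(dunes_add)
heath_mul <- exp(heath_add)

# Percentage change per year implied by a multiplicative parameter
pct_change <- function(mul) (mul - 1) * 100

print(round(cbind(dunes_add, heath_add), 3))
print(round(cbind(dunes_mul, heath_mul), 3))
print(round(cbind(dunes = pct_change(dunes_mul), heath = pct_change(heath_mul)), 1))
```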

Indices and time-totals

Model-based and imputed overall indices (based on the time-totals for all sites) can be obtained from TRIM output by calling index() or totals(). By default, only imputed indices or time-totals are returned. Model-based indices or time-totals can be added by using the which="both" option.

index(z4, which="both")
#> [1] 532.3683 400.9042 421.4130 445.5449 473.5561 505.7377 542.4180 583.9665
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1849.77536 284.86859 227.32453 173.59124 122.5094 72.91878
#> [2,] 284.86859 597.32820 472.24298 348.12719 222.0154 90.81069
#> [3,] 227.32453 472.24298 389.84066 307.13642 222.0580 132.42158
#> [4,] 173.59124 348.12719 307.13642 265.80705 223.0117 177.55307
#> [5,] 122.50942 222.01541 222.05796 223.01172 224.7704 227.22227
#> [6,] 72.91878 90.81069 132.42158 177.55307 227.2223 282.53057
#> [7,] 23.64290 -48.75872 35.90095 128.14598 230.2475 344.67918
#> [8,] -26.52645 -200.18592 -70.00660 73.39846 233.7164 414.98598
#> [,7] [,8]
#> [1,] 23.64290 -26.52645
#> [2,] -48.75872 -200.18592
#> [3,] 35.90095 -70.00660
#> [4,] 128.14598 73.39846
#> [5,] 230.24755 233.71637
#> [6,] 344.67918 414.98598
#> [7,] 474.15268 621.65767
#> [8,] 621.65767 858.65907
#> [1] 0.000000000 0.004295043 0.004195321 0.004484101 0.005188383 0.006398122
#> [7] 0.008278455 0.011088202
#> [1] 529.9958 390.7452 440.1062 433.4965 469.5308 510.2804 544.2264 588.1948
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1856.14388 308.40588 213.25683 157.63913 111.3665 68.76949
#> [2,] 308.40588 693.10199 414.95348 283.85523 177.0672 73.53196
#> [3,] 213.25683 414.95348 567.91776 253.28453 145.0830 85.91709
#> [4,] 157.63913 283.85523 253.28453 518.87846 211.1701 124.40600
#> [5,] 111.36651 177.06721 145.08304 211.17007 527.8938 222.89738
#> [6,] 68.76949 73.53196 85.91709 124.40600 222.8974 593.29231
#> [7,] 28.56078 -30.25181 34.84584 94.18286 167.8898 301.38900
#> [8,] -17.29359 -161.35680 -28.70232 69.31299 177.1195 306.25548
#> [,7] [,8]
#> [1,] 28.56078 -17.29359
#> [2,] -30.25181 -161.35680
#> [3,] 34.84584 -28.70232
#> [4,] 94.18286 69.31299
#> [5,] 167.88976 177.11954
#> [6,] 301.38900 306.25548
#> [7,] 713.94573 522.06360
#> [8,] 522.06360 998.66973
#> [1] 0.000000000 0.004440315 0.005317491 0.005349910 0.006363061 0.007766191
#> [7] 0.009300430 0.011830835
#> time fitted se_fit imputed se_imp
#> 1 1984 1.0000000 0.00000000 1.0000000 0.00000000
#> 2 1985 0.7530579 0.06553658 0.7372610 0.06663569
#> 3 1986 0.7915817 0.06477129 0.8303956 0.07292113
#> 4 1987 0.8369110 0.06696343 0.8179245 0.07314308
#> 5 1988 0.8895273 0.07203043 0.8859143 0.07976879
#> 6 1989 0.9499771 0.07998826 0.9628009 0.08812600
#> 7 1990 1.0188773 0.09098601 1.0268504 0.09643874
#> 8 1991 1.0969220 0.10530053 1.1098103 0.10876964
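
An index is a time-total divided by the time-total of the base year. The sketch below illustrates this with the first numeric vector printed above, on the assumption that it holds the model-based time-totals for 1984–1991 (an interpretation of the extra printed output, not something stated by rtrim); the variable names are illustrative.

```r
# First vector printed by index(z4, which="both") above,
# assumed here to be the model-based time-totals for 1984-1991
tt_fitted <- c(532.3683, 400.9042, 421.4130, 445.5449,
               473.5561, 505.7377, 542.4180, 583.9665)
years <- 1984:1991

# Index: each time-total relative to the base year (here the first year)
base <- 1
index_fitted <- tt_fitted / tt_fitted[base]
names(index_fitted) <- years

print(round(index_fitted, 4))
```

Under this assumption the ratios agree with the fitted column of the index table to the printed precision.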

Indices can also be plotted:

plot(index(z4))

In this plot the solid red line connects the indices for the individual years. In this case, the first year, 1984, is chosen as base year. Standard errors for the indices are shown using a transparent band and white ‘error-bars’.

In the last trim run, habitat was used as a covariate. Indices for covariates can also be computed, by setting the covars flag:

index(z4, which="both", covars=TRUE)
#> [1] 532.3683 400.9042 421.4130 445.5449 473.5561 505.7377 542.4180 583.9665
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1849.77536 284.86859 227.32453 173.59124 122.5094 72.91878
#> [2,] 284.86859 597.32820 472.24298 348.12719 222.0154 90.81069
#> [3,] 227.32453 472.24298 389.84066 307.13642 222.0580 132.42158
#> [4,] 173.59124 348.12719 307.13642 265.80705 223.0117 177.55307
#> [5,] 122.50942 222.01541 222.05796 223.01172 224.7704 227.22227
#> [6,] 72.91878 90.81069 132.42158 177.55307 227.2223 282.53057
#> [7,] 23.64290 -48.75872 35.90095 128.14598 230.2475 344.67918
#> [8,] -26.52645 -200.18592 -70.00660 73.39846 233.7164 414.98598
#> [,7] [,8]
#> [1,] 23.64290 -26.52645
#> [2,] -48.75872 -200.18592
#> [3,] 35.90095 -70.00660
#> [4,] 128.14598 73.39846
#> [5,] 230.24755 233.71637
#> [6,] 344.67918 414.98598
#> [7,] 474.15268 621.65767
#> [8,] 621.65767 858.65907
#> [1] 0.000000000 0.004295043 0.004195321 0.004484101 0.005188383 0.006398122
#> [7] 0.008278455 0.011088202
#> [1] 529.9958 390.7452 440.1062 433.4965 469.5308 510.2804 544.2264 588.1948
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1856.14388 308.40588 213.25683 157.63913 111.3665 68.76949
#> [2,] 308.40588 693.10199 414.95348 283.85523 177.0672 73.53196
#> [3,] 213.25683 414.95348 567.91776 253.28453 145.0830 85.91709
#> [4,] 157.63913 283.85523 253.28453 518.87846 211.1701 124.40600
#> [5,] 111.36651 177.06721 145.08304 211.17007 527.8938 222.89738
#> [6,] 68.76949 73.53196 85.91709 124.40600 222.8974 593.29231
#> [7,] 28.56078 -30.25181 34.84584 94.18286 167.8898 301.38900
#> [8,] -17.29359 -161.35680 -28.70232 69.31299 177.1195 306.25548
#> [,7] [,8]
#> [1,] 28.56078 -17.29359
#> [2,] -30.25181 -161.35680
#> [3,] 34.84584 -28.70232
#> [4,] 94.18286 69.31299
#> [5,] 167.88976 177.11954
#> [6,] 301.38900 306.25548
#> [7,] 713.94573 522.06360
#> [8,] 522.06360 998.66973
#> [1] 0.000000000 0.004440315 0.005317491 0.005349910 0.006363061 0.007766191
#> [7] 0.009300430 0.011830835
#> [1] 151.41486 115.69616 107.05951 99.06759 91.67226 84.82898 78.49655
#> [8] 72.63683
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 680.14281 175.50973 125.487875 81.95618 44.22443 11.66928 -16.271866
#> [2,] 175.50973 316.20136 227.894769 151.01020 84.33445 26.77179 -22.666784
#> [3,] 125.48787 227.89477 170.325239 120.08082 76.38868 38.55058 5.935972
#> [4,] 81.95618 151.01020 120.080818 92.92765 69.15926 48.42161 30.394683
#> [5,] 44.22443 84.33445 76.388683 69.15926 62.58349 56.60409 51.168661
#> [6,] 11.66928 26.77179 38.550576 48.42161 56.60409 63.29511 68.671735
#> [7,] -16.27187 -22.66678 5.935972 30.39468 51.16866 68.67174 83.276483
#> [8,] -40.10646 -64.87343 -22.024136 14.78932 46.22930 72.89292 95.318085
#> [,8]
#> [1,] -40.10646
#> [2,] -64.87343
#> [3,] -22.02414
#> [4,] 14.78932
#> [5,] 46.22930
#> [6,] 72.89292
#> [7,] 95.31809
#> [8,] 113.98900
#> [1] 0.00000000 0.01941374 0.01452022 0.01207511 0.01126833 0.01150187
#> [7] 0.01234132 0.01347750
#> [1] 152.28937 118.65593 106.93849 98.20218 84.26744 88.55164 80.27117
#> [8] 73.41364
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 857.23016 212.44602 130.794830 81.26563 43.16220 11.23260
#> [2,] 212.44602 482.27910 248.589944 148.12893 79.87699 25.22234
#> [3,] 130.79483 248.58994 494.861038 186.77141 86.74620 38.15043
#> [4,] 81.26563 148.12893 186.771406 471.04114 146.35168 60.50051
#> [5,] 43.16220 79.87699 86.746196 146.35168 456.97537 139.70914
#> [6,] 11.23260 25.22234 38.150430 60.50051 139.70914 496.20007
#> [7,] -15.62127 -20.01882 6.118977 30.89685 62.97208 145.79499
#> [8,] -38.37287 -57.98168 -19.335263 13.44426 42.27935 75.08061
#> [,7] [,8]
#> [1,] -15.621267 -38.37287
#> [2,] -20.018821 -57.98168
#> [3,] 6.118977 -19.33526
#> [4,] 30.896849 13.44426
#> [5,] 62.972080 42.27935
#> [6,] 145.794986 75.08061
#> [7,] 487.058049 157.03849
#> [8,] 157.038485 471.51617
#> [1] 0.00000000 0.02895929 0.03164298 0.03116094 0.02896155 0.03332921
#> [7] 0.03198038 0.03051576
#> [1] 380.9534 285.2080 314.3535 346.4773 381.8839 420.9087 463.9214 511.3296
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1169.63255 109.35885 101.83666 91.63506 78.28499 61.24949
#> [2,] 109.35885 281.12683 244.34821 197.11700 137.68096 64.03890
#> [3,] 101.83666 244.34821 219.51542 187.05560 145.66928 93.87100
#> [4,] 91.63506 197.11700 187.05560 172.87940 153.85247 129.13145
#> [5,] 78.28499 137.68096 145.66928 153.85247 162.18695 170.61817
#> [6,] 61.24949 64.03890 93.87100 129.13145 170.61817 219.23546
#> [7,] 39.91476 -26.09193 29.96498 97.75129 179.07888 276.00744
#> [8,] 13.58001 -135.31249 -47.98247 58.60914 187.48706 342.09305
#> [,7] [,8]
#> [1,] 39.91476 13.58001
#> [2,] -26.09193 -135.31249
#> [3,] 29.96498 -47.98247
#> [4,] 97.75129 58.60914
#> [5,] 179.07888 187.48706
#> [6,] 276.00744 342.09305
#> [7,] 390.87620 526.33959
#> [8,] 526.33959 744.67007
#> [1] 0.000000000 0.005326182 0.005842317 0.006709398 0.008134942 0.010416741
#> [7] 0.013975775 0.019399925
#> [1] 377.7064 272.0892 333.1677 335.2943 385.2634 421.7288 463.9552 514.7811
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1241.52830 139.93996 92.707657 78.72737 68.76655 57.66502
#> [2,] 139.93996 404.75793 206.731056 144.92980 99.40524 48.82746
#> [3,] 92.70766 206.73106 420.360676 146.69804 77.41732 52.19100
#> [4,] 78.72737 144.92980 146.698042 442.16268 152.95139 84.44628
#> [5,] 68.76655 99.40524 77.417324 152.95139 477.60909 176.47169
#> [6,] 57.66502 48.82746 52.191003 84.44628 176.47169 570.73235
#> [7,] 44.20517 -10.13882 29.644153 67.50792 124.16732 246.03362
#> [8,] 21.08489 -103.35415 -9.192067 56.67347 138.48689 248.41973
#> [,7] [,8]
#> [1,] 44.20517 21.084885
#> [2,] -10.13882 -103.354152
#> [3,] 29.64415 -9.192067
#> [4,] 67.50792 56.673467
#> [5,] 124.16732 138.486890
#> [6,] 246.03362 248.419725
#> [7,] 646.15707 442.879761
#> [8,] 442.87976 905.396377
#> [1] 0.000000000 0.005940009 0.008571312 0.008977519 0.011418796 0.013947351
#> [7] 0.016898858 0.022108899
#> covariate category time fitted se_fit imputed se_imp
#> 1 Overall (none) 1984 1.0000000 0.00000000 1.0000000 0.00000000
#> 2 Overall (none) 1985 0.7530579 0.06553658 0.7372610 0.06663569
#> 3 Overall (none) 1986 0.7915817 0.06477129 0.8303956 0.07292113
#> 4 Overall (none) 1987 0.8369110 0.06696343 0.8179245 0.07314308
#> 5 Overall (none) 1988 0.8895273 0.07203043 0.8859143 0.07976879
#> 6 Overall (none) 1989 0.9499771 0.07998826 0.9628009 0.08812600
#> 7 Overall (none) 1990 1.0188773 0.09098601 1.0268504 0.09643874
#> 8 Overall (none) 1991 1.0969220 0.10530053 1.1098103 0.10876964
#> 9 habitat dunes 1984 1.0000000 0.00000000 1.0000000 0.00000000
#> 10 habitat dunes 1985 0.7641004 0.13933319 0.7791479 0.17017428
#> 11 habitat dunes 1986 0.7070608 0.12049989 0.7022059 0.17788473
#> 12 habitat dunes 1987 0.6542792 0.10988682 0.6448394 0.17652462
#> 13 habitat dunes 1988 0.6054377 0.10615237 0.5533376 0.17018093
#> 14 habitat dunes 1989 0.5602421 0.10724676 0.5814696 0.18256289
#> 15 habitat dunes 1990 0.5184204 0.11109150 0.5270963 0.17883059
#> 16 habitat dunes 1991 0.4797206 0.11609263 0.4820668 0.17468761
#> 17 habitat heath 1984 1.0000000 0.00000000 1.0000000 0.00000000
#> 18 habitat heath 1985 0.7486689 0.07298069 0.7203724 0.07707145
#> 19 habitat heath 1986 0.8251756 0.07643505 0.8820811 0.09258138
#> 20 habitat heath 1987 0.9095004 0.08191091 0.8877115 0.09474977
#> 21 habitat heath 1988 1.0024425 0.09019392 1.0200076 0.10685876
#> 22 habitat heath 1989 1.1048823 0.10206243 1.1165518 0.11809890
#> 23 habitat heath 1990 1.2177904 0.11821918 1.2283488 0.12999561
#> 24 habitat heath 1991 1.3422366 0.13928361 1.3629135 0.14869062

Indices are collected in a single dataframe, but can be easily separated by using e.g. subset().
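
As a small illustration of such a split, the sketch below rebuilds a few rows of the index table by hand (values copied from the output above, first and last year only) and separates them with subset(). The object name idx_tab is hypothetical, and the column types in the data frame returned by index() may differ from the plain character columns used here.

```r
# A hand-made excerpt of the index table shown above (1984 and 1991 only)
idx_tab <- data.frame(
  covariate = c("Overall", "Overall", "habitat", "habitat", "habitat", "habitat"),
  category  = c("(none)", "(none)", "dunes", "dunes", "heath", "heath"),
  time      = c(1984, 1991, 1984, 1991, 1984, 1991),
  fitted    = c(1.0000000, 1.0969220, 1.0000000, 0.4797206, 1.0000000, 1.3422366),
  se_fit    = c(0.00000000, 0.10530053, 0.00000000, 0.11609263, 0.00000000, 0.13928361)
)

# Overall indices only
overall <- subset(idx_tab, covariate == "Overall")

# Indices for a single covariate category
dunes <- subset(idx_tab, covariate == "habitat" & category == "dunes")

# The same for Heathland, keeping only the columns needed for plotting
heath <- subset(idx_tab, category == "heath", select = c(time, fitted, se_fit))

print(dunes)
print(heath)
```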

Again, indices for the covariate categories can be plotted without much effort:

plot(index(z4,which="fitted",covars=TRUE))

The model-based indices reflect the strong decrease from 1984 to 1985 and the smaller decrease from 1985 onwards for Dunes (habitat category 1) and the similar decrease from 1984 to 1985 and the increase from that year onwards for Heathland (category 2). The overall model-based indices are between the indices for Dunes and Heathland and show much less change over time than when Dunes and Heathland are treated separately. The imputed indices are very similar to the model-based indices, with the exception that the imputed index for 1986 is larger than the model-based index for that year.

Multiple covariates

One may try to extend the model further by also incorporating the second covariate ‘deposition’ in the model. This covariate is a measure for the amount of acidic aerial deposition. This time, the time-effects model (model 3) cannot be estimated due to lack of data in particular years:

check_observations(skylark2, model=3, covars=c("habitat","deposition"))
#> $errors
#> $errors$deposition
#> year deposition
#> 1 1990 1
#> 2 1991 1
#>
#>
#> $sufficient
#> [1] FALSE
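
The kind of problem reported here can be illustrated on a toy data set in base R: for every combination of year and covariate category, count the positive observations and list the combinations that have none. This is only a sketch of the idea, not the implementation of check_observations(); the data frame toy and all names in it are made up for the example.

```r
# Toy data: two deposition categories observed in three years
toy <- data.frame(
  year       = rep(1989:1991, each = 2),
  deposition = factor(rep(1:2, times = 3)),
  count      = c(3, 4, 0, 7, NA, 2)
)

# An observation is usable only if it is non-missing and positive
toy$positive <- !is.na(toy$count) & toy$count > 0

# Number of positive observations per year and category
n_pos <- aggregate(positive ~ year + deposition, data = toy, FUN = sum)

# Combinations without any positive observation
problems <- n_pos[n_pos$positive == 0, c("year", "deposition")]
print(problems)
```

In this toy example the combinations of deposition category 1 with 1990 and with 1991 are flagged, mirroring the structure of the $errors element above.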

The linear trend model with covariates can still be estimated.

z5 <- trim(count ~ site + year + habitat+deposition, data=skylark2, model=2,
serialcor=TRUE, overdisp=TRUE)
#> Warning: Serial correlation is very low (rho=0.162); consider disabling it.
summary(z5)
#> Call:
#> tools::buildVignettes(dir = ".", tangle = TRUE)
#>
#> Model : 2
#> Method : GEE (Convergence reached after 11 iterations)
#>
#> Coefficients:
#> covar cat from upto add se_add mul se_mul
#> 1 baseline 0 1984 1991 -0.0547486466 0.30356576 0.9467231 0.28739271
#> 2 habitat 2 1984 1991 0.1631245102 0.04752131 1.1771833 0.05594129
#> 3 deposition 2 1984 1991 -0.0399174384 0.30595598 0.9608688 0.29398355
#> 4 deposition 3 1984 1991 -0.0734776978 0.30615962 0.9291569 0.28447032
#> 5 deposition 4 1984 1991 -0.0002787989 0.30637534 0.9997212 0.30628993
#>
#> Overdispersion : 1.1557
#> Serial Correlation : 0.1620
#>
#> Goodness of fit:
#> Chi-square = 164.11, df=142, p=0.0987
#> Likelihood Ratio = 147.99, df=142, p=0.3482
#> AIC (up to a constant) = -136.01
wald(z5)
#> Wald test for significance of covariates
#> Covariate W df p
#> habitat 11.783157 1 0.0005976904
#> deposition 8.013505 3 0.0457334254
#>
#> Wald test for significance of changes in slope
#> Changepoint Wald_test df p
#> 1984 47.85308 5 3.805843e-09

Weighting

So far, the overall indices are the indices that correspond with the time totals summed over all sites. The next run shows the results when the sites in Dunes are given a weight of 10, as stored in the weight column of the dataset.

z6 <- trim(count ~ site + year + habitat, data=skylark2, model=2, changepoints="auto",
serialcor=TRUE, overdisp=TRUE, weights="weight")
idx = index(z6, "fitted", covars=TRUE)
#> [1] 1895.102 1442.170 1384.949 1337.153 1298.606 1269.198 1248.887 1237.698
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 69183.913 17660.332 12650.6241 8287.253 4500.728 1228.178 -1587.2719
#> [2,] 17660.332 31901.263 23033.8252 15298.137 8571.126 2741.218 -2292.7703
#> [3,] 12650.624 23033.825 17252.0394 12195.137 7784.538 3948.929 623.5622
#> [4,] 8287.253 15298.137 12195.1374 9465.644 7069.778 4971.293 3137.2196
#> [5,] 4500.728 8571.126 7784.5376 7069.778 6420.536 5831.027 5295.9450
#> [6,] 1228.178 2741.218 3948.9286 4971.293 5831.027 6548.746 7143.1810
#> [7,] -1587.272 -2292.770 623.5622 3137.220 5295.945 7143.181 8718.5245
#> [8,] -3997.066 -6622.656 -2250.3961 1537.541 4810.418 7631.386 10058.1481
#> [,8]
#> [1,] -3997.066
#> [2,] -6622.656
#> [3,] -2250.396
#> [4,] 1537.541
#> [5,] 4810.418
#> [6,] 7631.386
#> [7,] 10058.148
#> [8,] 12143.570
#> [1] 0.000000000 0.012554390 0.009943491 0.008969745 0.009115708 0.010005786
#> [7] 0.011376179 0.013051852
#> [1] 1514.1486 1156.9616 1070.5951 990.6759 916.7226 848.2898 784.9655
#> [8] 726.3683
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 68014.281 17550.973 12548.7875 8195.618 4422.443 1166.928 -1627.1866
#> [2,] 17550.973 31620.136 22789.4769 15101.020 8433.445 2677.179 -2266.6784
#> [3,] 12548.787 22789.477 17032.5239 12008.082 7638.868 3855.058 593.5972
#> [4,] 8195.618 15101.020 12008.0818 9292.765 6915.926 4842.161 3039.4683
#> [5,] 4422.443 8433.445 7638.8683 6915.926 6258.349 5660.409 5116.8661
#> [6,] 1166.928 2677.179 3855.0576 4842.161 5660.409 6329.511 6867.1735
#> [7,] -1627.187 -2266.678 593.5972 3039.468 5116.866 6867.174 8327.6483
#> [8,] -4010.646 -6487.343 -2202.4136 1478.932 4622.930 7289.292 9531.8085
#> [,8]
#> [1,] -4010.646
#> [2,] -6487.343
#> [3,] -2202.414
#> [4,] 1478.932
#> [5,] 4622.930
#> [6,] 7289.292
#> [7,] 9531.809
#> [8,] 11398.900
#> [1] 0.00000000 0.01941374 0.01452022 0.01207511 0.01126833 0.01150187
#> [7] 0.01234132 0.01347750
#> [1] 380.9534 285.2080 314.3535 346.4773 381.8839 420.9087 463.9214 511.3296
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1169.63255 109.35885 101.83666 91.63506 78.28499 61.24949
#> [2,] 109.35885 281.12683 244.34821 197.11700 137.68096 64.03890
#> [3,] 101.83666 244.34821 219.51542 187.05560 145.66928 93.87100
#> [4,] 91.63506 197.11700 187.05560 172.87940 153.85247 129.13145
#> [5,] 78.28499 137.68096 145.66928 153.85247 162.18695 170.61817
#> [6,] 61.24949 64.03890 93.87100 129.13145 170.61817 219.23546
#> [7,] 39.91476 -26.09193 29.96498 97.75129 179.07888 276.00744
#> [8,] 13.58001 -135.31249 -47.98247 58.60914 187.48706 342.09305
#> [,7] [,8]
#> [1,] 39.91476 13.58001
#> [2,] -26.09193 -135.31249
#> [3,] 29.96498 -47.98247
#> [4,] 97.75129 58.60914
#> [5,] 179.07888 187.48706
#> [6,] 276.00744 342.09305
#> [7,] 390.87620 526.33959
#> [8,] 526.33959 744.67007
#> [1] 0.000000000 0.005326182 0.005842317 0.006709398 0.008134942 0.010416741
#> [7] 0.013975775 0.019399925
plot(idx)

The separate indices for Dunes and Heathland remain similar, of course, but due to the weighting the overall index decreases from 1985 onwards.
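
The effect of the weights on the overall index can be sketched by hand. The example below assumes that the per-habitat vectors printed by index(z4, which="both", covars=TRUE) are the model-based time-totals for Dunes and Heathland, and that the weighted overall total is the weighted sum of the two; the function overall_index() is illustrative and not part of rtrim.

```r
# Model-based time-totals per habitat, copied from the printed output above
# (assumed interpretation of the extra vectors printed by index())
dunes <- c(151.41486, 115.69616, 107.05951, 99.06759,
           91.67226, 84.82898, 78.49655, 72.63683)
heath <- c(380.9534, 285.2080, 314.3535, 346.4773,
           381.8839, 420.9087, 463.9214, 511.3296)
years <- 1984:1991

# Overall index for given habitat weights, relative to the first year
overall_index <- function(w_dunes, w_heath = 1) {
  total <- w_dunes * dunes + w_heath * heath
  setNames(total / total[1], years)
}

unweighted <- overall_index(1)    # all sites count equally
weighted   <- overall_index(10)   # Dunes sites weighted by 10

print(round(unweighted, 3))
print(round(weighted, 3))
```

With a weight of 10 the declining Dunes series dominates the sum, which is why the weighted overall index keeps falling after 1985 while the unweighted one recovers.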