
Error estimation

2026-04-27

This document mainly presents the function surveysd::calc.stError(), which generates point estimates and standard errors for user-supplied estimation functions.

Prerequisites

In order to use a dataset with calc.stError(), several weight columns have to be present. Each weight column corresponds to a bootstrap sample. In the following examples, we will use the data from demo.eusilc() and attach the bootstrap weights using draw.bootstrap() and recalib(). Please refer to the documentation of those functions for more detail.

library(surveysd)

set.seed(1234)
eusilc <- demo.eusilc(prettyNames = TRUE)
dat_boot <- draw.bootstrap(eusilc, REP = 10, hid = "hid", weights = "pWeight",
                           strata = "region", period = "year")
dat_boot_calib <- recalib(dat_boot, conP.var = "gender", conH.var = "region",
                          epsP = 1e-2, epsH = 2.5e-2, verbose = FALSE)
dat_boot_calib[, onePerson := nrow(.SD) == 1, by = .(year, hid)]

## print part of the dataset
dat_boot_calib[1:5, .(year, povertyRisk, eqIncome, onePerson, pWeight, w1, w2, w3, w4, w5)]
year povertyRisk eqIncome onePerson pWeight w1 w2 w3 w4 w5
2010 FALSE 16090.69 FALSE 504.5696 0.4527064 0.4506841 0.4503020 1005.7613874 0.4433975
2010 FALSE 16090.69 FALSE 504.5696 0.4527064 0.4506841 0.4503020 1005.7613874 0.4433975
2010 FALSE 16090.69 FALSE 504.5696 0.4527064 0.4506841 0.4503020 1005.7613874 0.4433975
2010 FALSE 27076.24 FALSE 493.3824 994.9889029 989.8350341 0.4402716 0.4379729 974.3668536
2010 FALSE 27076.24 FALSE 493.3824 994.9889029 989.8350341 0.4402716 0.4379729 974.3668536

Estimator functions

The parameters fun and var in calc.stError() define the estimator to be used in the error analysis. There are two built-in estimator functions, weightedSum() and weightedRatio(), which can be used as follows.

povertyRate <- calc.stError(dat_boot_calib, var = "povertyRisk", fun = weightedRatio)
totalIncome <- calc.stError(dat_boot_calib, var = "eqIncome", fun = weightedSum)

These calls calculate the share of persons at risk of poverty (in percent) and the total income, respectively. By default, the results are calculated separately for each reference period.

povertyRate$Estimates
year n N estimate_type val_povertyRisk stE_povertyRisk
2010 14827 8182222 direct 14.44422 0.5609433
2011 14827 8182222 direct 14.77393 0.6989847
2012 14827 8182222 direct 15.04515 0.7569976
2013 14827 8182222 direct 14.89013 0.7659424
2014 14827 8182222 direct 15.14556 0.6412845
2015 14827 8182222 direct 15.53640 0.4747940
2016 14827 8182222 direct 15.08315 0.4980765
2017 14827 8182222 direct 15.42019 0.5669706
totalIncome$Estimates
year n N estimate_type val_eqIncome stE_eqIncome
2010 14827 8182222 direct 162750998071 614261392
2011 14827 8182222 direct 161926931417 1182577717
2012 14827 8182222 direct 162576509628 1567491258
2013 14827 8182222 direct 163199507862 1562464603
2014 14827 8182222 direct 163986275009 1189496473
2015 14827 8182222 direct 163416275447 1410046042
2016 14827 8182222 direct 162706205137 1277951911
2017 14827 8182222 direct 164314959107 1549546450

Columns that use the val_ prefix denote the point estimate belonging to the “main weight” of the dataset, which is pWeight in case of the dataset used here.

Columns with the stE_ prefix denote standard errors calculated from the bootstrap replicates. For each replicate, the estimator is applied with one of w1, w2, …, w10 in place of pWeight.

n denotes the number of observations for the year and N denotes the total weight of those persons.
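As a rough illustration of how the val_ and stE_ columns relate to the weight columns, the following base R sketch applies an estimator once with the main weight and once per replicate weight. It assumes that the standard error is taken as the standard deviation of the replicate estimates. The toy data and the helper toyWeightedRatio() are made up for illustration; this is not the package's internal code.

```r
# Toy data (made up): a logical indicator, a main weight and three replicate weights
x       <- c(TRUE, FALSE, FALSE, TRUE)
pWeight <- c(10, 20, 30, 40)
repW    <- cbind(w1 = c(0, 40, 30, 40),
                 w2 = c(20, 20, 0, 60),
                 w3 = c(10, 0, 60, 40))

# Hypothetical stand-in for a weighted ratio estimator (in percent)
toyWeightedRatio <- function(x, w) sum(w[x]) / sum(w) * 100

# Point estimate: estimator applied with the main weight
val <- toyWeightedRatio(x, pWeight)

# Replicate estimates: estimator applied with each replicate weight column
repEst <- apply(repW, 2, function(w) toyWeightedRatio(x, w))

# Assumed standard error: standard deviation across the replicate estimates
stE <- sd(repEst)

val
repEst
stE
```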

Custom estimators

In order to define a custom estimator function to be used in fun, the function needs to have at least two arguments like the example below.

## define custom estimator
myWeightedSum <- function(x, w) {
  sum(x*w)
}

## check if results are equal to the one using `surveysd::weightedSum()`
totalIncome2 <- calc.stError(dat_boot_calib, var = "eqIncome", fun = myWeightedSum)
all.equal(totalIncome$Estimates, totalIncome2$Estimates)
## [1] TRUE

The parameters x and w can be assumed to be vectors of equal length, where w is a numeric weight vector and x is the column defined in the var argument. The function is called once for each period (in this case year) and for each weight column (in this case pWeight, w1, w2, …, w10).
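The calling pattern described above can be sketched in base R as follows. The data frame toy and its columns are made up, and the nested sapply() is only one possible way to express "one call per period and per weight column"; it is not taken from the package.

```r
# Toy data (made up): two periods, a main weight and two replicate weights
toy <- data.frame(
  year    = c(2010, 2010, 2011, 2011),
  income  = c(100, 200, 300, 400),
  pWeight = c(1, 2, 3, 4),
  w1      = c(2, 1, 4, 3),
  w2      = c(1, 3, 2, 5)
)

# Custom estimator with the required (x, w) signature
myWeightedSum <- function(x, w) sum(x * w)

# One call per period (rows of the result) and per weight column (columns)
weightCols <- c("pWeight", "w1", "w2")
est <- sapply(weightCols, function(wc) {
  sapply(split(toy, toy$year), function(d) myWeightedSum(d$income, d[[wc]]))
})
est
```

Each cell of est corresponds to one call of the estimator; the pWeight column holds the point estimates and the remaining columns hold the replicate estimates.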

Custom estimators that take additional parameters can also be supplied. The parameter add.arg is used to set these additional arguments for the custom estimator.

## use add.arg-argument
fun <- function(x, w, b) {
  sum(x*w*b)
}
add.arg = list(b="onePerson")

err.est <- calc.stError(dat_boot_calib, var = "povertyRisk", fun = fun,
                        period.mean = 0, add.arg=add.arg)
err.est$Estimates
year n N estimate_type val_povertyRisk stE_povertyRisk
2010 14827 8182222 direct 273683.9 17771.24
2011 14827 8182222 direct 261883.6 18580.04
2012 14827 8182222 direct 243083.9 15626.57
2013 14827 8182222 direct 238004.4 17123.68
2014 14827 8182222 direct 218572.1 11110.00
2015 14827 8182222 direct 219984.1 15503.37
2016 14827 8182222 direct 201753.9 12807.93
2017 14827 8182222 direct 196881.2 13848.92
# compare with direct computation
compare.value <- dat_boot_calib[,fun(povertyRisk,pWeight,b=onePerson),
                                 by=c("year")]
all((compare.value$V1-err.est$Estimates$val_povertyRisk)==0)
## [1] TRUE

The above chunk computes the weighted number of persons at risk of poverty who live in single-person households.
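In the example above, add.arg maps the argument name b to the column name "onePerson". A minimal sketch of that idea in base R, using made-up data and an illustrative lookup via lapply() and do.call() (an assumption about the general mechanism, not the package's implementation):

```r
# Toy data (made up)
toy <- data.frame(
  povertyRisk = c(TRUE, FALSE, TRUE, TRUE),
  pWeight     = c(10, 20, 30, 40),
  onePerson   = c(TRUE, TRUE, FALSE, TRUE)
)

fun <- function(x, w, b) sum(x * w * b)
add.arg <- list(b = "onePerson")

# Resolve each column name in add.arg to the column itself (illustrative only)
extraArgs <- lapply(add.arg, function(col) toy[[col]])

# Call the estimator with x, w and the resolved additional arguments
val <- do.call(fun, c(list(x = toy$povertyRisk, w = toy$pWeight), extraArgs))
val
```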

Adjust variable depending on bootstrap weights

In our example, the variable povertyRisk is a boolean that is TRUE if the income is less than 60% of the weighted median income. It therefore depends directly on the original weight vector pWeight. To further reduce the estimated error, one should calculate the weighted median income \(medIncome_{w}\) for each bootstrap replicate weight \(w\) and then define \(povertyRisk_w\) as

\[ povertyRisk_w = \begin{cases} 1 & \text{if Income} < 0.6\cdot medIncome_{w}\\ 0 & \text{else} \end{cases} \]

The estimator can then be applied to the new variable \(povertyRisk_w\). This can be realized using a custom estimator function.

# custom estimator to first derive poverty threshold 
# and then estimate a weighted ratio
povmd <- function(x, w) {
 md <- laeken::weightedMedian(x, w)*0.6
 pmd60 <- x < md
 # weighted ratio is directly estimated inside the function
 return(sum(w[pmd60])/sum(w)*100)
}

err.est <- calc.stError(
  dat_boot_calib, var = "povertyRisk", fun = weightedRatio,
  fun.adjust.var = povmd, adjust.var = "eqIncome")
err.est$Estimates
year n N estimate_type val_povertyRisk stE_povertyRisk
2010 14827 8182222 direct 14.44422 0
2011 14827 8182222 direct 14.77393 0
2012 14827 8182222 direct 15.04515 0
2013 14827 8182222 direct 14.89013 0
2014 14827 8182222 direct 15.14556 0
2015 14827 8182222 direct 15.53640 0
2016 14827 8182222 direct 15.08315 0
2017 14827 8182222 direct 15.42019 0

The approach shown above is only valid if no grouping variables are supplied (parameter group = NULL). If grouping variables are supplied, the parameters fun.adjust.var and adjust.var should be used so that \(povertyRisk_w\) is first calculated for each period and then used within each grouping in group.

# using fun.adjust.var and adjust.var to estimate povmd60 indicator
# for each period and bootstrap weight before applying the weightedRatio
povmd2 <- function(x, w) {
 md <- laeken::weightedMedian(x, w)*0.6
 pmd60 <- x < md
 return(as.integer(pmd60))
}

# set adjust.var="eqIncome" so the income vector is used to estimate
# the povmd60 indicator for each bootstrap weight
# and the resulting indicators are passed to function weightedRatio
err.est <- calc.stError(
  dat_boot_calib, var = "povertyRisk", fun = weightedRatio, group = "gender",
  fun.adjust.var = povmd2, adjust.var = "eqIncome")
err.est$Estimates
year n N gender estimate_type val_povertyRisk stE_povertyRisk
2010 7267 3979572 male direct 12.02660 0.4692710
2010 7560 4202650 female direct 16.73351 0.4959012
2010 14827 8182222 NA direct 14.44422 0.4630224
2011 7267 3979572 male direct 12.81921 0.4980185
2011 7560 4202650 female direct 16.62488 0.5456008
2011 14827 8182222 NA direct 14.77393 0.4988443
2012 7267 3979572 male direct 13.76065 0.4686990
2012 7560 4202650 female direct 16.26147 0.5092447
2012 14827 8182222 NA direct 15.04515 0.4208072
2013 7267 3979572 male direct 13.88962 0.3565032
2013 7560 4202650 female direct 15.83754 0.6076965
2013 14827 8182222 NA direct 14.89013 0.4304874
2014 7267 3979572 male direct 14.50351 0.5182930
2014 7560 4202650 female direct 15.75353 0.5789351
2014 14827 8182222 NA direct 15.14556 0.5185337
2015 7267 3979572 male direct 15.12289 0.3915024
2015 7560 4202650 female direct 15.92796 0.5092921
2015 14827 8182222 NA direct 15.53640 0.4084178
2016 7267 3979572 male direct 14.57968 0.3757296
2016 7560 4202650 female direct 15.55989 0.3932173
2016 14827 8182222 NA direct 15.08315 0.3230416
2017 7267 3979572 male direct 14.94816 0.4823193
2017 7560 4202650 female direct 15.86717 0.5058455
2017 14827 8182222 NA direct 15.42019 0.4183360
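To see why the poverty indicator depends on the weight vector, the following base R sketch recomputes \(povertyRisk_w\) for two different weight vectors. The helper toyWeightedMedian() is a simplified, hypothetical weighted median (it is not laeken::weightedMedian() and may differ from it in how ties and interpolation are handled), and all numbers are made up.

```r
# Hypothetical helper: smallest value whose cumulative weight reaches half the total
toyWeightedMedian <- function(x, w) {
  ord  <- order(x)
  cumw <- cumsum(w[ord])
  x[ord][which(cumw >= sum(w) / 2)[1]]
}

income <- c(100, 200, 300, 400, 1000)  # made-up incomes
wMain  <- c(1, 1, 1, 1, 1)             # made-up main weight
wRep   <- c(0, 1, 2, 1, 3)             # made-up replicate weight

# Indicator recomputed for a given weight vector
povertyRiskW <- function(x, w) x < 0.6 * toyWeightedMedian(x, w)

indMain <- povertyRiskW(income, wMain)  # weighted median 300, threshold 180
indRep  <- povertyRiskW(income, wRep)   # weighted median 400, threshold 240

rbind(indMain, indRep)
```

The second person is below the threshold only under the replicate weight, which is the kind of variation that a fixed povertyRisk column cannot reflect.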

Multiple estimators

In case an estimator should be applied to several columns of the dataset, var can be set to a vector containing all necessary columns.

multipleRates <- calc.stError(dat_boot_calib, var = c("povertyRisk", "onePerson"), fun = weightedRatio)
multipleRates$Estimates
year n N estimate_type val_povertyRisk stE_povertyRisk val_onePerson stE_onePerson
2010 14827 8182222 direct 14.44422 0.5071988 14.85737 0.5071988
2011 14827 8182222 direct 14.77393 0.5259823 14.85737 0.5259823
2012 14827 8182222 direct 15.04515 0.5868207 14.85737 0.5868207
2013 14827 8182222 direct 14.89013 0.6133935 14.85737 0.6133935
2014 14827 8182222 direct 15.14556 0.5525538 14.85737 0.5525538
2015 14827 8182222 direct 15.53640 0.4767763 14.85737 0.4767763
2016 14827 8182222 direct 15.08315 0.4349778 14.85737 0.4349778
2017 14827 8182222 direct 15.42019 0.5500109 14.85737 0.5500109

Here we see the share of persons at risk of poverty and the share of persons living in one-person households.

Grouping

The group argument can be used to calculate estimators for different subsets of the data. This argument can take the grouping variable as a string that refers to a column name (usually a factor) in dat. If set, all estimators are split not only by the reference period but also by the grouping variable. For simplicity, only one reference period of the above data is used.

dat2 <- subset(dat_boot_calib, year == 2010)
for (att  in c("period", "weights", "b.rep"))
  attr(dat2, att) <- attr(dat_boot_calib, att)

To calculate the ratio of persons at risk of poverty for each federal state of Austria, group = "region" can be used.

povertyRates <- calc.stError(dat2, var = "povertyRisk", fun = weightedRatio, group = "region")
povertyRates$Estimates
year n N region estimate_type val_povertyRisk stE_povertyRisk
2010 549 260564 Burgenland direct 19.53984 2.7218963
2010 733 377355 Vorarlberg direct 16.53731 3.0656308
2010 924 535451 Salzburg direct 13.78734 2.0614915
2010 1078 563648 Carinthia direct 13.08627 1.1597025
2010 1317 701899 Tyrol direct 15.30819 1.9618469
2010 2295 1167045 Styria direct 14.37464 1.2861648
2010 2322 1598931 Vienna direct 17.23468 1.6218288
2010 2804 1555709 Lower Austria direct 13.84362 1.4420425
2010 2805 1421620 Upper Austria direct 10.88977 0.9927467
2010 14827 8182222 NA direct 14.44422 0.5609433

The last row with region = NA denotes the aggregate over all regions. Note that the columns N and n now show the weighted and unweighted number of persons in each region.
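The meaning of n and N per group can be illustrated with a small base R sketch on made-up data: n counts the rows in each group, and N sums the main weight in each group. The data frame and its values are hypothetical.

```r
# Toy data (made up): five persons in two regions
toy <- data.frame(
  region  = c("A", "A", "B", "B", "B"),
  pWeight = c(100, 150, 200, 250, 300)
)

n <- tapply(toy$pWeight, toy$region, length)  # unweighted number of persons
N <- tapply(toy$pWeight, toy$region, sum)     # weighted number of persons

rbind(n, N)
```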

Several grouping variables

If more than one grouping variable is used, there are several ways of calling calc.stError(), depending on whether combinations of grouping levels should be considered or not. We will use the variables gender and region as grouping variables and show three ways in which calc.stError() can be called.

Option 1: All regions and all genders

Calculate the point estimate and standard error for each region and each gender. The number of rows in the output is therefore

\[n_\text{periods}\cdot(n_\text{regions} + n_\text{genders} + 1) = 1\cdot(9 + 2 + 1) = 12.\]

The last row is again the estimate for the whole period.

povertyRates <- calc.stError(dat2, var = "povertyRisk", fun = weightedRatio, 
                             group = c("gender", "region"))
povertyRates$Estimates
year n N gender region estimate_type val_povertyRisk stE_povertyRisk
2010 549 260564 NA Burgenland direct 19.53984 2.7218963
2010 733 377355 NA Vorarlberg direct 16.53731 3.0656308
2010 924 535451 NA Salzburg direct 13.78734 2.0614915
2010 1078 563648 NA Carinthia direct 13.08627 1.1597025
2010 1317 701899 NA Tyrol direct 15.30819 1.9618469
2010 2295 1167045 NA Styria direct 14.37464 1.2861648
2010 2322 1598931 NA Vienna direct 17.23468 1.6218288
2010 2804 1555709 NA Lower Austria direct 13.84362 1.4420425
2010 2805 1421620 NA Upper Austria direct 10.88977 0.9927467
2010 7267 3979572 male NA direct 12.02660 0.5195006
2010 7560 4202650 female NA direct 16.73351 0.6415481
2010 14827 8182222 NA NA direct 14.44422 0.5609433

Option 2: All combinations of region and gender

Split the data by all combinations of the two grouping variables. This will result in a larger output-table of the size

\[n_\text{periods}\cdot(n_\text{regions} \cdot n_\text{genders} + 1) = 1\cdot(9\cdot2 + 1)= 19.\]

povertyRates <- calc.stError(dat2, var = "povertyRisk", fun = weightedRatio, 
                             group = list(c("gender", "region")))
povertyRates$Estimates
year n N gender region estimate_type val_povertyRisk stE_povertyRisk
2010 261 122741.8 male Burgenland direct 17.414524 2.7030521
2010 288 137822.2 female Burgenland direct 21.432598 3.1916375
2010 359 182732.9 male Vorarlberg direct 12.973259 2.4315502
2010 374 194622.1 female Vorarlberg direct 19.883637 3.7331749
2010 440 253143.7 male Salzburg direct 9.156964 1.6844796
2010 484 282307.3 female Salzburg direct 17.939382 2.5509609
2010 517 268581.4 male Carinthia direct 10.552148 0.9707809
2010 561 295066.6 female Carinthia direct 15.392924 1.5991221
2010 650 339566.5 male Tyrol direct 12.857542 2.0279255
2010 667 362332.5 female Tyrol direct 17.604861 2.1434317
2010 1128 571011.7 male Styria direct 11.671247 1.4441360
2010 1132 774405.4 male Vienna direct 15.590616 1.8286792
2010 1167 596033.3 female Styria direct 16.964539 1.4098774
2010 1190 824525.6 female Vienna direct 18.778813 1.9221010
2010 1363 684272.5 male Upper Austria direct 9.074690 1.0720767
2010 1387 772593.2 female Lower Austria direct 16.372949 1.7249656
2010 1417 783115.8 male Lower Austria direct 11.348283 1.3213404
2010 1442 737347.5 female Upper Austria direct 12.574205 1.0915720
2010 14827 8182222.0 NA NA direct 14.444218 0.5609433

Option 3: Combination of Option 1 and Option 2

In this case, the estimates and standard errors are calculated for

  • every gender,
  • every region and
  • every combination of region and gender.

The number of rows in the output is therefore

\[n_\text{periods}\cdot(n_\text{regions} \cdot n_\text{genders} + n_\text{regions} + n_\text{genders} + 1) = 1\cdot(9\cdot2 + 9 + 2 + 1) = 30.\]

povertyRates <- calc.stError(dat2, var = "povertyRisk", fun = weightedRatio, 
                             group = list("gender", "region", c("gender", "region")))
povertyRates$Estimates
year n N gender region estimate_type val_povertyRisk stE_povertyRisk
2010 261 122741.8 male Burgenland direct 17.414524 2.7030521
2010 288 137822.2 female Burgenland direct 21.432598 3.1916375
2010 359 182732.9 male Vorarlberg direct 12.973259 2.4315502
2010 374 194622.1 female Vorarlberg direct 19.883637 3.7331749
2010 440 253143.7 male Salzburg direct 9.156964 1.6844796
2010 484 282307.3 female Salzburg direct 17.939382 2.5509609
2010 517 268581.4 male Carinthia direct 10.552148 0.9707809
2010 549 260564.0 NA Burgenland direct 19.539836 2.7218963
2010 561 295066.6 female Carinthia direct 15.392924 1.5991221
2010 650 339566.5 male Tyrol direct 12.857542 2.0279255
2010 667 362332.5 female Tyrol direct 17.604861 2.1434317
2010 733 377355.0 NA Vorarlberg direct 16.537310 3.0656308
2010 924 535451.0 NA Salzburg direct 13.787343 2.0614915
2010 1078 563648.0 NA Carinthia direct 13.086268 1.1597025
2010 1128 571011.7 male Styria direct 11.671247 1.4441360
2010 1132 774405.4 male Vienna direct 15.590616 1.8286792
2010 1167 596033.3 female Styria direct 16.964539 1.4098774
2010 1190 824525.6 female Vienna direct 18.778813 1.9221010
2010 1317 701899.0 NA Tyrol direct 15.308191 1.9618469
2010 1363 684272.5 male Upper Austria direct 9.074690 1.0720767
2010 1387 772593.2 female Lower Austria direct 16.372949 1.7249656
2010 1417 783115.8 male Lower Austria direct 11.348283 1.3213404
2010 1442 737347.5 female Upper Austria direct 12.574205 1.0915720
2010 2295 1167045.0 NA Styria direct 14.374637 1.2861648
2010 2322 1598931.0 NA Vienna direct 17.234683 1.6218288
2010 2804 1555709.0 NA Lower Austria direct 13.843623 1.4420425
2010 2805 1421620.0 NA Upper Austria direct 10.889773 0.9927467
2010 7267 3979571.7 male NA direct 12.026600 0.5195006
2010 7560 4202650.3 female NA direct 16.733508 0.6415481
2010 14827 8182222.0 NA NA direct 14.444218 0.5609433
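The row counts of the three options can be checked with a few lines of arithmetic in R. The variable names are chosen for illustration only.

```r
nPeriods <- 1
nRegions <- 9
nGenders <- 2

# Option 1: each region, each gender, plus the overall estimate
rowsOption1 <- nPeriods * (nRegions + nGenders + 1)

# Option 2: each combination of region and gender, plus the overall estimate
rowsOption2 <- nPeriods * (nRegions * nGenders + 1)

# Option 3: combinations, each region, each gender, plus the overall estimate
rowsOption3 <- nPeriods * (nRegions * nGenders + nRegions + nGenders + 1)

c(option1 = rowsOption1, option2 = rowsOption2, option3 = rowsOption3)
```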

Group differences

If differences between groups need to be calculated, e.g. the difference of poverty rates between gender = "male" and gender = "female", the parameter group.diff can be used. With group.diff = TRUE, the differences and the standard errors of these differences are calculated for all variables defined in group.

povertyRates <- calc.stError(dat2, var = "povertyRisk", fun = weightedRatio, 
                             group = c("gender", "region"),
                             group.diff = TRUE)
povertyRates$Estimates
year n N gender region estimate_type val_povertyRisk stE_povertyRisk
2010 549.0 260564.0 NA Burgenland direct 19.5398365 2.7218963
2010 641.0 318959.5 NA Burgenland - Vorarlberg group difference 3.0025263 5.4881788
2010 733.0 377355.0 NA Vorarlberg direct 16.5373102 3.0656308
2010 736.5 398007.5 NA Burgenland - Salzburg group difference 5.7524933 3.6874303
2010 813.5 412106.0 NA Burgenland - Carinthia group difference 6.4535688 3.0054618
2010 828.5 456403.0 NA Salzburg - Vorarlberg group difference -2.7499670 3.9050220
2010 905.5 470501.5 NA Carinthia - Vorarlberg group difference -3.4510424 3.4828322
2010 924.0 535451.0 NA Salzburg direct 13.7873432 2.0614915
2010 933.0 481231.5 NA Burgenland - Tyrol group difference 4.2316460 4.2123973
2010 1001.0 549549.5 NA Carinthia - Salzburg group difference -0.7010755 2.3084690
2010 1025.0 539627.0 NA Tyrol - Vorarlberg group difference -1.2291197 2.5902696
2010 1078.0 563648.0 NA Carinthia direct 13.0862677 1.1597025
2010 1120.5 618675.0 NA Salzburg - Tyrol group difference -1.5208473 2.4433967
2010 1197.5 632773.5 NA Carinthia - Tyrol group difference -2.2219227 2.7653627
2010 1317.0 701899.0 NA Tyrol direct 15.3081905 1.9618469
2010 1422.0 713804.5 NA Burgenland - Styria group difference 5.1651992 3.0758909
2010 1435.5 929747.5 NA Burgenland - Vienna group difference 2.3051533 2.6059814
2010 1514.0 772200.0 NA Styria - Vorarlberg group difference -2.1626729 3.5226116
2010 1527.5 988143.0 NA Vienna - Vorarlberg group difference 0.6973730 3.8417900
2010 1609.5 851248.0 NA Salzburg - Styria group difference -0.5872941 2.0030771
2010 1623.0 1067191.0 NA Salzburg - Vienna group difference -3.4473400 2.9629261
2010 1676.5 908136.5 NA Burgenland - Lower Austria group difference 5.6962137 3.4323446
2010 1677.0 841092.0 NA Burgenland - Upper Austria group difference 8.6500631 2.3665108
2010 1686.5 865346.5 NA Carinthia - Styria group difference -1.2883695 1.4390397
2010 1700.0 1081289.5 NA Carinthia - Vienna group difference -4.1484155 1.7211996
2010 1768.5 966532.0 NA Lower Austria - Vorarlberg group difference -2.6936874 3.1615404
2010 1769.0 899487.5 NA Upper Austria - Vorarlberg group difference -5.6475368 3.7606424
2010 1806.0 934472.0 NA Styria - Tyrol group difference -0.9335532 2.0607422
2010 1819.5 1150415.0 NA Tyrol - Vienna group difference -1.9264927 3.0259264
2010 1864.0 1045580.0 NA Lower Austria - Salzburg group difference 0.0562796 2.7015379
2010 1864.5 978535.5 NA Salzburg - Upper Austria group difference 2.8975698 1.9815305
2010 1941.0 1059678.5 NA Carinthia - Lower Austria group difference -0.7573551 1.7797906
2010 1941.5 992634.0 NA Carinthia - Upper Austria group difference 2.1964944 1.0926653
2010 2060.5 1128804.0 NA Lower Austria - Tyrol group difference -1.4645677 2.3529819
2010 2061.0 1061759.5 NA Tyrol - Upper Austria group difference 4.4184171 2.5276920
2010 2295.0 1167045.0 NA Styria direct 14.3746373 1.2861648
2010 2308.5 1382988.0 NA Styria - Vienna group difference -2.8600459 1.5409430
2010 2322.0 1598931.0 NA Vienna direct 17.2346832 1.6218288
2010 2549.5 1361377.0 NA Lower Austria - Styria group difference -0.5310145 1.8244013
2010 2550.0 1294332.5 NA Styria - Upper Austria group difference 3.4848639 1.3963688
2010 2563.0 1577320.0 NA Lower Austria - Vienna group difference -3.3910604 2.3313196
2010 2563.5 1510275.5 NA Upper Austria - Vienna group difference -6.3449098 1.8332614
2010 2804.0 1555709.0 NA Lower Austria direct 13.8436228 1.4420425
2010 2804.5 1488664.5 NA Lower Austria - Upper Austria group difference 2.9538494 1.8165092
2010 2805.0 1421620.0 NA Upper Austria direct 10.8897734 0.9927467
2010 7267.0 3979571.7 male NA direct 12.0266000 0.5195006
2010 7413.5 4091111.0 male - female NA group difference -4.7069081 0.3293523
2010 7560.0 4202650.3 female NA direct 16.7335081 0.6415481
2010 14827.0 8182222.0 NA NA direct 14.4442182 0.5609433

The resulting output table contains 49 rows: 12 rows for the direct estimators

\[n_\text{periods}\cdot(n_\text{regions} + n_\text{genders} + 1) = 1\cdot(9 + 2 + 1) = 12,\]

and another 37 for the differences within the variables "gender" and "region" separately. The variable "gender" has 2 unique values (unique(dat2$gender)), resulting in 1 difference (gender = "male" minus gender = "female"), and the variable "region" has 9 unique values (unique(dat2$region)), resulting in

\[8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = \sum\limits_{i=1}^{9-1}i = 36\]

estimates. Thus the output contains 1 + 36 = 37 estimates with respect to group differences.
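The count of pairwise differences can be reproduced in base R with combn(), which enumerates all unordered pairs. The region and gender labels are copied from the output table above; the variable names are illustrative.

```r
regions <- c("Burgenland", "Vorarlberg", "Salzburg", "Carinthia", "Tyrol",
             "Styria", "Vienna", "Lower Austria", "Upper Austria")
genders <- c("male", "female")

# All unordered pairs within each grouping variable (one column per pair)
regionPairs <- combn(regions, 2)
genderPairs <- combn(genders, 2)

nDiff   <- ncol(regionPairs) + ncol(genderPairs)        # 36 + 1
nDirect <- 1 * (length(regions) + length(genders) + 1)  # 12 direct estimates

c(differences = nDiff, direct = nDirect, total = nDiff + nDirect)
```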

If a combination of grouping variables is used in group and group.diff = TRUE, differences between combinations are only calculated if exactly one of the grouping variables differs. For example, the differences between the following groups would be calculated:

  • gender = "female" & region = "Vienna" and gender = "male" & region = "Vienna",
  • gender = "female" & region = "Vienna" and gender = "female" & region = "Salzburg".

The difference between gender = "female" & region = "Vienna" and gender = "male" & region = "Salzburg", however, would not be calculated.

This leads to

\[2\cdot\left(\sum\limits_{i=1}^{9-1}i\right) + 9\cdot1 = 81\]

results with respect to the differences. The column estimate_type in the output distinguishes the direct estimates from the group differences, so the rows of each type can be counted.

povertyRates <- calc.stError(dat2, var = "povertyRisk", fun = weightedRatio, 
                             group = list(c("gender", "region")),
                             group.diff = TRUE)
povertyRates$Estimates[,.N,by=.(estimate_type)]
estimate_type N
direct 19
group difference 81
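The figure of 81 differences can be verified by brute force in base R: enumerate all pairs of gender and region combinations and keep those in which exactly one of the two variables differs. The region labels below are placeholders; only the number of levels matters.

```r
# All 2 * 9 = 18 combinations of the two grouping variables
combos <- expand.grid(gender = c("male", "female"),
                      region = paste0("region", 1:9),
                      stringsAsFactors = FALSE)

# All unordered pairs of combinations (one column of row indices per pair)
pairs <- combn(nrow(combos), 2)

# For each pair, count how many of the two variables differ
nDiffering <- apply(pairs, 2, function(idx) {
  a <- combos[idx[1], ]
  b <- combos[idx[2], ]
  (a$gender != b$gender) + (a$region != b$region)
})

# Pairs in which exactly one grouping variable differs
sum(nDiffering == 1)
```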

Differences between survey periods

Differences of estimates between periods can be calculated using the parameter period.diff. If not NULL, period.diff expects a character vector specifying the periods for which differences should be calculated. Each entry should have the form "period2 - period1".

povertyRates <- calc.stError(dat_boot_calib[year>2013], var = "povertyRisk", fun = weightedRatio, 
                             period.diff = c("2017 - 2016", "2016 - 2015", "2015 - 2014"))
povertyRates$Estimates
year n N estimate_type val_povertyRisk stE_povertyRisk
2014 14827 8182222 direct 15.1455601 0.6412845
2015 14827 8182222 direct 15.5364014 0.4747940
2015-2014 14827 8182222 period difference 0.3908413 0.3490865
2016 14827 8182222 direct 15.0831502 0.4980765
2016-2015 14827 8182222 period difference -0.4532512 0.1961776
2017 14827 8182222 direct 15.4201916 0.5669706
2017-2016 14827 8182222 period difference 0.3370414 0.3088384
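One way to think about the standard error of a period difference is sketched below in base R. It assumes that the difference is formed separately for each bootstrap replicate and that the standard error is the standard deviation of these replicate differences. This is an assumption for illustration, and all numbers are made up. When the replicate estimates of two periods move together, the standard error of the difference can be smaller than the standard error of either period.

```r
# Made-up replicate estimates for two periods (same replicate order in both)
rep2015 <- c(15.2, 15.9, 15.4, 15.6)
rep2016 <- c(14.8, 15.4, 14.9, 15.2)

# Made-up point estimates from the main weight
val2015 <- 15.5
val2016 <- 15.0

# Point estimate of the difference "2016 - 2015"
valDiff <- val2016 - val2015

# Replicate-wise differences and their standard deviation
repDiff <- rep2016 - rep2015
stEDiff <- sd(repDiff)

c(stE2015 = sd(rep2015), stE2016 = sd(rep2016), stEDiff = stEDiff)
```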

If additional grouping variables are supplied to calc.stError(), the differences across periods are also calculated for all variables in group.

povertyRates <- calc.stError(dat_boot_calib[year>2013], var = "povertyRisk", fun = weightedRatio, 
                             group = "gender",
                             period.diff = c("2017 - 2016", "2016 - 2015", "2015 - 2014"))
povertyRates$Estimates
year n N gender estimate_type val_povertyRisk stE_povertyRisk
2014 7267 3979572 male direct 14.5035068 0.5955424
2014 7560 4202650 female direct 15.7535328 0.7319585
2014 14827 8182222 NA direct 15.1455601 0.6412845
2015 7267 3979572 male direct 15.1228904 0.4063351
2015 7560 4202650 female direct 15.9279630 0.5993205
2015 14827 8182222 NA direct 15.5364014 0.4747940
2015-2014 7267 3979572 male period difference 0.6193836 0.4099067
2015-2014 7560 4202650 female period difference 0.1744301 0.3990323
2015-2014 14827 8182222 NA period difference 0.3908413 0.3490865
2016 7267 3979572 male direct 14.5796824 0.5492882
2016 7560 4202650 female direct 15.5598937 0.5301042
2016 14827 8182222 NA direct 15.0831502 0.4980765
2016-2015 7267 3979572 male period difference -0.5432080 0.2714584
2016-2015 7560 4202650 female period difference -0.3680693 0.2178523
2016-2015 14827 8182222 NA period difference -0.4532512 0.1961776
2017 7267 3979572 male direct 14.9481591 0.6366371
2017 7560 4202650 female direct 15.8671684 0.6059010
2017 14827 8182222 NA direct 15.4201916 0.5669706
2017-2016 7267 3979572 male period difference 0.3684767 0.3733624
2017-2016 7560 4202650 female period difference 0.3072748 0.3128755
2017-2016 14827 8182222 NA period difference 0.3370414 0.3088384

Averages across periods

With the parameter period.mean, averages across periods are calculated in addition to the direct estimates. The parameter accepts only odd integer values. The resulting table contains the direct estimates as well as rolling averages over period.mean consecutive periods.

povertyRates <- calc.stError(dat_boot_calib[year>2013], var = "povertyRisk", fun = weightedRatio, 
                             period.mean = 3)
povertyRates$Estimates
year n N estimate_type val_povertyRisk stE_povertyRisk
2014 14827 8182222 direct 15.14556 0.6412845
2014_2015_2016 14827 8182222 period average 15.25504 0.5104027
2015 14827 8182222 direct 15.53640 0.4747940
2015_2016_2017 14827 8182222 period average 15.34658 0.4860014
2016 14827 8182222 direct 15.08315 0.4980765
2017 14827 8182222 direct 15.42019 0.5669706
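The point estimates of the period averages in the table above agree with simple rolling means of the direct estimates. The following base R sketch recomputes them from the rounded values printed in the table; it covers only the point estimates, not the standard errors.

```r
# Direct estimates copied from the table above
est <- c("2014" = 15.14556, "2015" = 15.53640,
         "2016" = 15.08315, "2017" = 15.42019)
k <- 3  # window length, corresponds to period.mean = 3

# Start index of every window of k consecutive periods
starts <- seq_len(length(est) - k + 1)

rollMean <- sapply(starts, function(i) mean(est[i:(i + k - 1)]))
names(rollMean) <- sapply(starts, function(i) {
  paste(names(est)[i:(i + k - 1)], collapse = "_")
})

round(rollMean, 5)
```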

If, in addition, the parameters group and/or period.diff are specified, differences and groupings of the averages are also calculated.

povertyRates <- calc.stError(dat_boot_calib[year>2013], var = "povertyRisk", fun = weightedRatio, 
                             period.mean = 3, period.diff = "2016 - 2015",
                             group = "gender")
povertyRates$Estimates
year n N gender estimate_type val_povertyRisk stE_povertyRisk
2014 7267 3979572 male direct 14.5035068 0.5955424
2014 7560 4202650 female direct 15.7535328 0.7319585
2014 14827 8182222 NA direct 15.1455601 0.6412845
2014_2015_2016 7267 3979572 male period average 14.7353599 0.4719872
2014_2015_2016 7560 4202650 female period average 15.7471298 0.5886141
2014_2015_2016 14827 8182222 NA period average 15.2550372 0.5104027
2015 7267 3979572 male direct 15.1228904 0.4063351
2015 7560 4202650 female direct 15.9279630 0.5993205
2015 14827 8182222 NA direct 15.5364014 0.4747940
2015_2016_2017 7267 3979572 male period average 14.8835773 0.5005572
2015_2016_2017 7560 4202650 female period average 15.7850084 0.5445719
2015_2016_2017 14827 8182222 NA period average 15.3465811 0.4860014
2016 7267 3979572 male direct 14.5796824 0.5492882
2016 7560 4202650 female direct 15.5598937 0.5301042
2016 14827 8182222 NA direct 15.0831502 0.4980765
2016-2015 7267 3979572 male period difference -0.5432080 0.2714584
2016-2015 7560 4202650 female period difference -0.3680693 0.2178523
2016-2015 14827 8182222 NA period difference -0.4532512 0.1961776
2016-2015_mean 7267 3979572 male difference between period averages 0.1482174 0.1165588
2016-2015_mean 7560 4202650 female difference between period averages 0.0378785 0.1762159
2016-2015_mean 14827 8182222 NA difference between period averages 0.0915438 0.1250200
2017 7267 3979572 male direct 14.9481591 0.6366371
2017 7560 4202650 female direct 15.8671684 0.6059010
2017 14827 8182222 NA direct 15.4201916 0.5669706
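Judging from the numbers in the table, the rows labelled 2016-2015_mean correspond to the period average 2015_2016_2017 minus the period average 2014_2015_2016. A short check in base R, with the values copied from the table above (small deviations are due to rounding in the printed output):

```r
# Period averages copied from the table above (male, female, all)
avg_2014_2016 <- c(male = 14.7353599, female = 15.7471298, all = 15.2550372)
avg_2015_2017 <- c(male = 14.8835773, female = 15.7850084, all = 15.3465811)

# Difference between the two rolling averages
diffOfAverages <- avg_2015_2017 - avg_2014_2016

# Values reported in the rows "2016-2015_mean"
reported <- c(male = 0.1482174, female = 0.0378785, all = 0.0915438)

round(diffOfAverages, 6)
max(abs(diffOfAverages - reported))
```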
