Here is an example of a model in which X causes M and M causes Y. There is, in addition, unobservable confounding between X and Y. This is an example of a model in which you might use information on M to figure out whether X caused Y, making use of the “front door criterion.”
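For reference, the front door adjustment can be written in terms of observable quantities. This is the standard textbook statement, given here in notation that does not appear in the text above ($x'$ indexes values of X in the inner sum):

$$
P(Y = y \mid do(X = x)) = \sum_{m} P(M = m \mid X = x) \sum_{x'} P(Y = y \mid X = x', M = m)\, P(X = x')
$$

The outer sum uses the unconfounded effect of X on M; the inner sum recovers the effect of M on Y by adjusting for X.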
The DAG is defined using dagitty syntax like this:
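A model statement consistent with the description above might look as follows. This is a sketch that assumes the CausalQueries function `make_model`, with `<->` standing for unobserved confounding between two nodes:

```r
library(CausalQueries)

# X -> M -> Y, with unobserved confounding between X and Y
model <- make_model("X -> M -> Y <-> X")
```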
We might set priors thus:
You can plot the dag like this:
Updating is done like this:
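```r
library(CausalQueries)
library(dplyr)

# Let's imagine highly correlated data; here an effect of 0.9 at each step
data <- data.frame(X = rep(0:1, 2000)) |>
  mutate(
    M = rbinom(n(), 1, 0.05 + 0.9 * X),
    Y = rbinom(n(), 1, 0.05 + 0.9 * M)
  )

# Update the model on the simulated data (refresh = 0 suppresses sampler output)
model <- model |> update_model(data, refresh = 0)
```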
Finally you can calculate an estimand of interest like this:
```r
# Average treatment effect of X on Y, under priors and posteriors
query_model(
  model = model,
  using = c("priors", "posteriors"),
  query = "Y[X=1] - Y[X=0]"
) |>
  knitr::kable(digits = 2)
```
label | query | given | using | case_level | mean | sd | cred.low | cred.high |
---|---|---|---|---|---|---|---|---|
Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | priors | FALSE | 0.00 | 0.14 | -0.34 | 0.29 |
Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | posteriors | FALSE | 0.79 | 0.02 | 0.76 | 0.82 |
This uses the posterior distribution and the model to assess the average treatment effect estimand.
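As a rough cross-check, the front door adjustment can also be computed directly from empirical frequencies. The following is a minimal base R sketch, not part of CausalQueries: `front_door_ate` is a hypothetical helper, and the data are re-simulated with an arbitrary seed using the same probabilities as above.

```r
# Hypothetical helper (not part of CausalQueries): plug-in front door
# estimate of E[Y | do(X = 1)] - E[Y | do(X = 0)] for binary X, M, Y.
front_door_ate <- function(df) {
  p_x <- c(mean(df$X == 0), mean(df$X == 1))  # P(X = x') for x' = 0, 1

  # E[Y | do(X = x)] = sum_m P(m | x) * sum_x' E[Y | x', m] * P(x')
  y_do <- function(x) {
    total <- 0
    for (m in 0:1) {
      p_m_given_x <- mean(df$M[df$X == x] == m)
      inner <- 0
      for (xp in 0:1) {
        inner <- inner + mean(df$Y[df$X == xp & df$M == m]) * p_x[xp + 1]
      }
      total <- total + p_m_given_x * inner
    }
    total
  }

  y_do(1) - y_do(0)
}

set.seed(1)  # arbitrary seed, chosen only for reproducibility
n <- 4000
X <- rep(0:1, n / 2)
M <- rbinom(n, 1, 0.05 + 0.9 * X)
Y <- rbinom(n, 1, 0.05 + 0.9 * M)
sim <- data.frame(X = X, M = M, Y = Y)

front_door_ate(sim)
```

Under this data-generating process the true effect is 0.9 × 0.9 = 0.81, so the plug-in estimate should land in the neighborhood of the posterior mean reported in the table.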
Let’s compare now with the case where you do not have data on M:
```r
# Update using only X and Y (M unobserved), then query the same estimand
model |>
  update_model(data |> dplyr::select(X, Y), refresh = 0) |>
  query_model(
    using = c("priors", "posteriors"),
    query = "Y[X=1] - Y[X=0]"
  ) |>
  knitr::kable(digits = 2)
```
label | query | given | using | case_level | mean | sd | cred.low | cred.high |
---|---|---|---|---|---|---|---|---|
Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | priors | FALSE | 0.0 | 0.14 | -0.34 | 0.34 |
Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | posteriors | FALSE | 0.1 | 0.17 | -0.03 | 0.60 |
Here we update much less and are (relatively) much less certain in our beliefs, precisely because we are aware of the confounded relationship between X and Y but lack the data on M that we could use to address it.
Say X, M, and Y were perfectly correlated. Would the average treatment effect be identified?
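One way to start on this question is to look at what perfectly correlated data can say about the step from M to Y. The base R sketch below is illustrative only. It builds a dataset with X = M = Y in every row and shows that the cells with X different from M are empty, so a frequency-based front door adjustment, which needs P(Y | X, M) for every combination of X and M, cannot be evaluated. It does not show how the posterior from `update_model` would behave.

```r
# Perfectly correlated binary data: X = M = Y in every row
perfect <- data.frame(X = rep(0:1, 2000))
perfect$M <- perfect$X
perfect$Y <- perfect$M

# Cross-tabulate X and M: the off-diagonal cells are empty
counts <- table(X = perfect$X, M = perfect$M)
counts

# E[Y | X = 0, M = 1] has no supporting observations, so its
# empirical value is NaN (the mean of an empty vector)
y_x0_m1 <- mean(perfect$Y[perfect$X == 0 & perfect$M == 1])
y_x0_m1
```

Within each level of X there is no variation in M, so the data alone carry no information on how Y responds to M once X is held fixed.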