
Getting the Most out of DAGassist Using Parameters

Introduction

DAGassist() is designed to be simple to use, and most of its features are available through a call with just two arguments:

DAGassist(
  dag = your_dag_model,
  formula = your_regression_call
)

However, DAGassist() includes several parameters for more specific applications. This vignette explains how to use those parameters to get the most out of DAGassist().

Setup

library(DAGassist)
library(dagitty)
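
The examples in this vignette refer to a DAG called dag_model and a data frame called df, neither of which is defined in the text. The following is a minimal sketch of objects that would match the variable roles shown in the reports below (X exposure, Y outcome, Z confounder, M mediator, C collider, A and B other). The edge list, the coefficients, the seed, and the region levels are all assumptions made for illustration, so the numbers you get will not reproduce the tables printed in this vignette.

```r
library(dagitty)

# Hypothetical DAG (assumed, not taken from the package):
# Z confounds X and Y, M mediates X -> Y, C is a collider of X and Y,
# A affects only Y, and B affects only X.
dag_model <- dagitty("dag {
  X [exposure]
  Y [outcome]
  Z -> X
  Z -> Y
  X -> M
  M -> Y
  X -> Y
  X -> C
  Y -> C
  A -> Y
  B -> X
}")

# Simulated data that follows the assumed DAG (coefficients are arbitrary).
set.seed(123)
n <- 1000
Z <- rnorm(n)
A <- rnorm(n)
B <- rnorm(n)
X <- 0.8 * Z + 0.5 * B + rnorm(n)
M <- 0.7 * X + rnorm(n)
Y <- 0.5 * X + 0.6 * M + 0.4 * Z + 0.3 * A + rnorm(n)
C <- 0.5 * X + 0.5 * Y + rnorm(n)

# A factor that is deliberately not a node in the DAG.
region <- factor(sample(c("East", "North", "South", "West"), n, replace = TRUE))

df <- data.frame(X, Y, Z, M, C, A, B, region)
```

Because region is not a node in this DAG, it is a useful stand-in for fixed effects or other terms that sit outside the causal model.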

formula arguments

The formula argument accepts either a bare formula, with data, exposure, and outcome supplied as separate arguments, or a full regression call.

# bare formula
DAGassist(
  dag = dag_model,
  formula = Y ~ X + C,
  data = df,
  exposure = "X",
  outcome = "Y"
)

# regression call
DAGassist(
  dag = dag_model,
  formula = lm(Y ~ X + C, data=df)
)

The two calls above print identical output.
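
To see why a regression call can stand in for a formula plus a data argument, it helps to look at what an unevaluated call contains. The base R sketch below is not DAGassist's implementation, only an illustration of the idea: the call already carries the formula, the data name, and the fitting function, and it can be refit with a different right-hand side. The toy df here is an assumption made so the snippet runs on its own.

```r
# Toy data, assumed purely for illustration.
set.seed(1)
df <- data.frame(X = rnorm(50), C = rnorm(50))
df$Y <- df$X + rnorm(50)

# An unevaluated regression call, as a user might write it.
call_expr <- quote(lm(Y ~ X + C, data = df))

# Name the arguments according to lm()'s signature.
matched <- match.call(definition = stats::lm, call = call_expr)

fml <- eval(matched$formula)
all.vars(fml)           # "Y" "X" "C"
deparse(matched$data)   # "df"
deparse(matched[[1]])   # "lm"

# Swap in a formula without C and refit with the same engine and data.
matched$formula <- update(fml, . ~ . - C)
refit <- eval(matched)
names(coef(refit))      # "(Intercept)" "X"
```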

imply arguments

To restrict DAGassist to the variables that appear in your formula, use imply = FALSE.

DAGassist(
  dag = dag_model,
  formula = lm(Y~X+C, data = df),
  imply = FALSE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role       X  Y  conf  med  col  IO  dMed  dCol
#> X         exposure   x                                   
#> Y         outcome       x                                
#> C         collider                    x    x             
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {}
#> Canonical controls: {}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#> 
#> Model comparison:
#> 
#> +---+----------+-----------+-----------+
#> |   | Original | Minimal 1 | Canonical |
#> +===+==========+===========+===========+
#> | X | 0.908*** | 1.415***  | 1.415***  |
#> +---+----------+-----------+-----------+
#> |   | (0.030)  | (0.021)   | (0.021)   |
#> +---+----------+-----------+-----------+
#> | C | 0.475*** |           |           |
#> +---+----------+-----------+-----------+
#> |   | (0.022)  |           |           |
#> +===+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01,  |
#> | *** p < 0.001                        |
#> +===+==========+===========+===========+

To have DAGassist consider every relationship in your DAG, including variables that are absent from your formula, use imply = TRUE.

DAGassist(
  dag = dag_model,
  formula = lm(Y~X+C, data = df),
  imply = TRUE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role        X  Y  conf  med  col  IO  dMed  dCol
#> X         exposure    x                                   
#> Y         outcome        x                      x         
#> Z         confounder        x                             
#> M         mediator                x                       
#> C         collider                     x    x   x         
#> A         other                                           
#> B         other                                           
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {Z}
#> Canonical controls: {A, B, Z}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#>   minimal 1 : Y ~ X + Z
#>   canonical: Y ~ X + A + B + Z
#> 
#> Note: DAGassist added variables not in your formula, based on the
#> relationships in your DAG, to block back-door paths
#> between X and Y.
#>   - Minimal 1 added: {Z}
#>   - Canonical added: {A, B, Z}
#> 
#> Model comparison:
#> 
#> +---+----------+-----------+-----------+
#> |   | Original | Minimal 1 | Canonical |
#> +===+==========+===========+===========+
#> | X | 0.908*** | 1.256***  | 1.256***  |
#> +---+----------+-----------+-----------+
#> |   | (0.030)  | (0.027)   | (0.026)   |
#> +---+----------+-----------+-----------+
#> | C | 0.475*** |           |           |
#> +---+----------+-----------+-----------+
#> |   | (0.022)  |           |           |
#> +---+----------+-----------+-----------+
#> | Z |          | 0.311***  | 0.309***  |
#> +---+----------+-----------+-----------+
#> |   |          | (0.034)   | (0.033)   |
#> +---+----------+-----------+-----------+
#> | A |          |           | 0.187***  |
#> +---+----------+-----------+-----------+
#> |   |          |           | (0.026)   |
#> +---+----------+-----------+-----------+
#> | B |          |           | -0.057*   |
#> +---+----------+-----------+-----------+
#> |   |          |           | (0.026)   |
#> +===+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01,  |
#> | *** p < 0.001                        |
#> +===+==========+===========+===========+

DAGassist reports which variables it added. The default is imply = FALSE.
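
The minimal and canonical control sets in the reports above correspond to standard adjustment-set concepts that can be explored directly with dagitty. The sketch below uses a hypothetical DAG, assumed for illustration and chosen to be consistent with the roles table, and calls dagitty::adjustmentSets() and dagitty::isAdjustmentSet(). It does not claim to show how DAGassist computes its sets internally.

```r
library(dagitty)

# Hypothetical DAG (an assumption, consistent with the roles shown above).
g <- dagitty("dag {
  Z -> X
  Z -> Y
  X -> M
  M -> Y
  X -> Y
  X -> C
  Y -> C
  A -> Y
  B -> X
}")

# Smallest set(s) that block every back-door path from X to Y.
minimal <- adjustmentSets(g, exposure = "X", outcome = "Y", type = "minimal")

# The canonical set: all admissible ancestors of X or Y.
canonical <- adjustmentSets(g, exposure = "X", outcome = "Y", type = "canonical")

print(minimal)
print(canonical)

# Z alone is a valid adjustment set; the collider C is not.
isAdjustmentSet(g, "Z", exposure = "X", outcome = "Y")  # TRUE
isAdjustmentSet(g, "C", exposure = "X", outcome = "Y")  # FALSE
```

Under this assumed DAG the minimal set is {Z} and the canonical set is {A, B, Z}, which is why a formula that controls only for C is flagged: C is a descendant of both X and Y, so conditioning on it opens a non-causal path rather than closing one.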

omit_factors and omit_intercept arguments

By default, DAGassist omits factor and intercept rows from the model comparison table. Set omit_factors = FALSE and omit_intercept = FALSE to show them. Factor terms that are not nodes in your DAG are not evaluated by DAGassist and are not added to the minimal or canonical models.

DAGassist(
  dag = dag_model,
  formula = fixest::feols(
    Y ~ X + C + i(region),  
    data = df),
  omit_factors = FALSE,
  omit_intercept = FALSE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role       X  Y  conf  med  col  IO  dMed  dCol
#> X         exposure   x                                   
#> Y         outcome       x                                
#> C         collider                    x    x             
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {}
#> Canonical controls: {}
#> 
#> Note: The following regressors, which are included in the below models, were not evaluated by DAGassist because they are not nodes in the DAG:
#>   {i(region)}
#> 
#> Formulas:
#>   original:  Y ~ X + C + i(region)
#> 
#> Model comparison:
#> 
#> +----------------+----------+-----------+-----------+
#> |                | Original | Minimal 1 | Canonical |
#> +================+==========+===========+===========+
#> | (Intercept)    | 0.060    | -0.011    | -0.011    |
#> +----------------+----------+-----------+-----------+
#> |                | (0.049)  | (0.027)   | (0.027)   |
#> +----------------+----------+-----------+-----------+
#> | X              | 0.908*** | 1.415***  | 1.415***  |
#> +----------------+----------+-----------+-----------+
#> |                | (0.030)  | (0.021)   | (0.021)   |
#> +----------------+----------+-----------+-----------+
#> | C              | 0.474*** |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.022)  |           |           |
#> +----------------+----------+-----------+-----------+
#> | region = North | -0.030   |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.069)  |           |           |
#> +----------------+----------+-----------+-----------+
#> | region = South | -0.085   |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.069)  |           |           |
#> +----------------+----------+-----------+-----------+
#> | region = West  | -0.167*  |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.069)  |           |           |
#> +================+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, *** p <       |
#> | 0.001                                             |
#> +================+==========+===========+===========+
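
The note in the report above says that i(region) was not evaluated because it is not a node in the DAG. A simple way to anticipate which regressors will fall into that category is to compare the right-hand-side terms of your formula with the node names of your DAG. The base R sketch below does this with a hand-written vector of node names, which is an assumption standing in for names(dag_model); it is a pre-flight check you could run yourself, not a description of DAGassist's internals.

```r
# Formula as written in the fixest example; terms() parses it symbolically,
# so i() does not need to be defined for this to run.
fml <- Y ~ X + C + i(region)

# Node names of the DAG (assumed here; with dagitty, use names(dag_model)).
dag_nodes <- c("X", "Y", "Z", "M", "C", "A", "B")

# Right-hand-side terms exactly as they appear in the formula.
rhs_terms <- attr(terms(fml), "term.labels")
rhs_terms                                 # "X" "C" "i(region)"

# Terms with no matching DAG node.
not_in_dag <- setdiff(rhs_terms, dag_nodes)
not_in_dag                                # "i(region)"
```

If a term listed in not_in_dag matters causally, the remedy is to add the underlying variable to the DAG as a node.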

labels arguments

Pass a named list to labels to replace variable names with readable labels in the roles table and the model comparison table.

labs <- list(
  X = "Exposure",
  C = "Collider"
)

DAGassist(
  dag = dag_model,
  formula = lm(
    Y ~ X + C, data = df),
  labels = labs
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role       X  Y  conf  med  col  IO  dMed  dCol
#> Exposure  exposure   x                                   
#> Y         outcome       x                                
#> Collider  collider                    x    x             
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {}
#> Canonical controls: {}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#> 
#> Model comparison:
#> 
#> +----------+----------+-----------+-----------+
#> |          | Original | Minimal 1 | Canonical |
#> +==========+==========+===========+===========+
#> | Exposure | 0.908*** | 1.415***  | 1.415***  |
#> +----------+----------+-----------+-----------+
#> |          | (0.030)  | (0.021)   | (0.021)   |
#> +----------+----------+-----------+-----------+
#> | Collider | 0.475*** |           |           |
#> +----------+----------+-----------+-----------+
#> |          | (0.022)  |           |           |
#> +==========+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, *** p   |
#> | < 0.001                                     |
#> +==========+==========+===========+===========+

Note that the labels parameter follows the coef_rename logic of modelsummary(), so an incomplete label list does not throw an error; variables without a label keep their original names.

DAGassist(
  dag = dag_model,
  formula = lm(
    Y ~ X + C, data = df),
  labels = labs,
  imply = TRUE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role        X  Y  conf  med  col  IO  dMed  dCol
#> Exposure  exposure    x                                   
#> Y         outcome        x                      x         
#> Z         confounder        x                             
#> M         mediator                x                       
#> Collider  collider                     x    x   x         
#> A         other                                           
#> B         other                                           
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {Z}
#> Canonical controls: {A, B, Z}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#>   minimal 1 : Y ~ X + Z
#>   canonical: Y ~ X + A + B + Z
#> 
#> Note: DAGassist added variables not in your formula, based on the
#> relationships in your DAG, to block back-door paths
#> between X and Y.
#>   - Minimal 1 added: {Z}
#>   - Canonical added: {A, B, Z}
#> 
#> Model comparison:
#> 
#> +----------+----------+-----------+-----------+
#> |          | Original | Minimal 1 | Canonical |
#> +==========+==========+===========+===========+
#> | Exposure | 0.908*** | 1.256***  | 1.256***  |
#> +----------+----------+-----------+-----------+
#> |          | (0.030)  | (0.027)   | (0.026)   |
#> +----------+----------+-----------+-----------+
#> | Collider | 0.475*** |           |           |
#> +----------+----------+-----------+-----------+
#> |          | (0.022)  |           |           |
#> +----------+----------+-----------+-----------+
#> | Z        |          | 0.311***  | 0.309***  |
#> +----------+----------+-----------+-----------+
#> |          |          | (0.034)   | (0.033)   |
#> +----------+----------+-----------+-----------+
#> | A        |          |           | 0.187***  |
#> +----------+----------+-----------+-----------+
#> |          |          |           | (0.026)   |
#> +----------+----------+-----------+-----------+
#> | B        |          |           | -0.057*   |
#> +----------+----------+-----------+-----------+
#> |          |          |           | (0.026)   |
#> +==========+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, *** p   |
#> | < 0.001                                     |
#> +==========+==========+===========+===========+
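
The partial-labeling behavior seen above, where X and C are renamed while Z, A, and B keep their names, can be mimicked in a few lines of base R. The helper rename_terms() below is hypothetical and written only for illustration; it is neither a DAGassist nor a modelsummary function.

```r
# Hypothetical helper: replace names that have a label, leave the rest alone.
rename_terms <- function(term_names, labels) {
  labels <- unlist(labels)
  has_label <- term_names %in% names(labels)
  term_names[has_label] <- unname(labels[term_names[has_label]])
  term_names
}

labs <- list(
  X = "Exposure",
  C = "Collider"
)

renamed <- rename_terms(c("X", "C", "Z", "A", "B"), labs)
renamed  # "Exposure" "Collider" "Z" "A" "B"
```

Because unmatched names pass through unchanged, you can label only the variables you care about and extend the list later.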
