🛸 The Directed Prediction Index (DPI).
The Directed Prediction Index (DPI) is a quasi-causal inference method for cross-sectional data designed to quantify the relative endogeneity (relative dependence) of outcome (Y) versus predictor (X) variables in regression models.
Bruce H. W. S. Bao 包寒吴霜
## Method 1: Install from CRAN
install.packages("DPI")
## Method 2: Install from GitHub
install.packages("devtools")
devtools::install_github("psychbruce/DPI", force=TRUE)
Define \(\text{DPI}\) as the product of \(\text{Direction}\) (relative direction) and \(\text{Strength}\) (absolute strength) of the expected \(X \rightarrow Y\) relationship:
\[ \begin{aligned} \text{DPI}_{X \rightarrow Y} & = \text{Direction}_{X \rightarrow Y} \cdot \text{Strength}_{XY} \\ & = \text{Delta}(R^2) \cdot \text{Sigmoid}(\frac{p}{\alpha}) \\ & = \left( R_{Y \sim X + Covs}^2 - R_{X \sim Y + Covs}^2 \right) \cdot \left( 1 - \tanh \frac{p_{XY|Covs}}{2\alpha} \right) \\ & \in (-1, 1) \end{aligned} \]
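The definition above can be written as a few lines of base R. This is only a minimal sketch of the formula: `dpi_score` and its argument names are hypothetical and are not exported by the DPI package, and the two \(R^2\) values and the partial p-value are assumed to have been computed already.

```r
# Sketch of the DPI formula (hypothetical helper, not the package's DPI()).
# r2_y:  R^2 of the model Y ~ X + Covs
# r2_x:  R^2 of the model X ~ Y + Covs
# p:     p-value of the partial X-Y relationship given Covs
# alpha: significance level used to scale the p-value
dpi_score <- function(r2_y, r2_x, p, alpha = 0.05) {
  direction <- r2_y - r2_x                 # Delta(R^2), in (-1, 1)
  strength  <- 1 - tanh(p / (2 * alpha))   # Sigmoid(p / alpha), in (0, 1)
  direction * strength
}

# Illustrative inputs only (not real data): Delta(R^2) = 0.10, p = 0.01
dpi_score(r2_y = 0.30, r2_x = 0.20, p = 0.01)   # roughly 0.090
```

With these illustrative inputs the strength term is about 0.900, so the score is about 90% of the raw \(\text{Delta}(R^2)\). Swapping `r2_y` and `r2_x` flips the sign.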
In econometrics and broader social sciences, an exogenous variable is assumed to have a directed (causal or quasi-causal) influence on an endogenous variable (\(ExoVar \rightarrow EndoVar\)). By quantifying the relative endogeneity of outcome versus predictor variables in multiple linear regression models, the DPI can suggest a plausible (admissible) direction of influence (i.e., \(\text{DPI}_{X \rightarrow Y} > 0 \text{: } X \rightarrow Y\)) after controlling for a sufficient number of possible confounders and simulated random covariates.
\[ \begin{aligned} \text{Direction}_{X \rightarrow Y} & = \text{Endogeneity}(Y) - \text{Endogeneity}(X) \\ & = R_{Y \sim X + Covs}^2 - R_{X \sim Y + Covs}^2 \\ & = \text{Delta}(R^2) \\ & \in (-1, 1) \end{aligned} \]
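As a rough illustration of the direction score, the two \(R^2\) values can be obtained from a pair of ordinary `lm()` fits. The data below are simulated for the example, and the variable names (`x`, `y`, `c1`) and effect sizes are assumptions. The sketch uses a single observed covariate and no simulated random covariates, so it is not the full procedure carried out by `DPI()`.

```r
set.seed(1)
n  <- 500
c1 <- rnorm(n)                        # one observed covariate (hypothetical)
x  <- 0.5 * c1 + rnorm(n)             # predictor
y  <- 0.4 * x + 0.6 * c1 + rnorm(n)   # outcome depends on x and c1
dat <- data.frame(x, y, c1)

# Endogeneity(Y): how well Y is predicted by X plus covariates
r2_y <- summary(lm(y ~ x + c1, data = dat))$r.squared
# Endogeneity(X): how well X is predicted by Y plus covariates
r2_x <- summary(lm(x ~ y + c1, data = dat))$r.squared

direction <- r2_y - r2_x              # Delta(R^2)
direction
```

A positive value here would mean that Y is the more dependent (more endogenous) of the two variables in this variable set.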
Here \(Covs\) includes the simulated random covariates, whose number is set by `k.cov` in the `DPI()` function. A higher \(R^2\) indicates higher dependence (i.e., higher endogeneity) in a given variable set.

\[ \begin{aligned} \text{Sigmoid}(\frac{p}{\alpha}) & = 2 \left[ 1 - \text{sigmoid}(\frac{p_{XY|Covs}}{\alpha}) \right] \\ & = 1 - \tanh \frac{p_{XY|Covs}}{2\alpha} \\ & \in (0, 1) \end{aligned} \]
\[ \begin{aligned} \text{sigmoid}(x) & = \frac{1}{1 + e^{-x}} \\ & = \frac{\tanh(\frac{x}{2}) + 1}{2}, & \in (0, 1) \\ \tanh(x) & = \frac{e^x - e^{-x}}{e^x + e^{-x}} \\ & = 1 - \frac{2}{1 + e^{2x}} \\ & = \frac{2}{1 + e^{-2x}} - 1 \\ & = 2 \cdot \text{sigmoid}(2x) - 1, & \in (-1, 1) \\ \text{Sigmoid}(\frac{p}{\alpha}) & = 2 \left[ 1 - \text{sigmoid}(\frac{p}{\alpha}) \right] \\ & = 1 - \tanh \frac{p}{2\alpha}. & \in (0, 1) \end{aligned} \]
\(p\) | \(\text{Sigmoid}(\frac{p}{\alpha})\) with \(\alpha = 0.05\) |
---|---|
(~0) | (~1) |
0.0001 | 0.999 |
0.001 | 0.990 |
0.01 | 0.900 |
0.02 | 0.803 |
0.03 | 0.709 |
0.04 | 0.620 |
0.05 (\(\frac{p}{\alpha}\) = 1) | 0.538 |
0.10 | 0.238 |
0.20 | 0.036 |
0.50 | 0.00009 |
0.80 | 0.0000002 |
1 | 0.000000004 |
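The values in the table can be reproduced with base R. The sketch below evaluates both forms of the strength term given above, the \(\tanh\) form and the logistic form (via `plogis()` from the `stats` package), and checks that they agree. The variable names are chosen for this example only.

```r
alpha <- 0.05
p <- c(0.0001, 0.001, 0.01, 0.02, 0.03, 0.04, 0.05,
       0.10, 0.20, 0.50, 0.80, 1)

# Form 1: 1 - tanh(p / (2 * alpha))
strength <- 1 - tanh(p / (2 * alpha))
# Form 2: 2 * [1 - sigmoid(p / alpha)], where plogis() is the logistic function
strength_alt <- 2 * (1 - plogis(p / alpha))

all.equal(strength, strength_alt)   # the two forms agree numerically
data.frame(p = p, Sigmoid = signif(strength, 3))
```

The penalty is mild for \(p < \alpha\) and steep beyond it: at \(p = \alpha\) the strength is about 0.538, and by \(p = 0.20\) it has dropped to about 0.036.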
Simulate `n.sim` random samples, with `k.cov` (unobservable) random covariate(s) in each simulated sample, to test the statistical significance of `DPI()`.
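The simulation step can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the random covariates are drawn as independent standard-normal noise, the helper `one_dpi` is hypothetical, and summarising the simulated scores by their mean and a 95% percentile interval is just one possible choice. The names `n.sim` and `k.cov` mirror the arguments mentioned in the text.

```r
set.seed(42)
n     <- 300
n.sim <- 200    # number of simulated samples
k.cov <- 1      # number of random covariates per simulated sample
alpha <- 0.05

# Example data (simulated for illustration only)
x <- rnorm(n)
y <- 0.3 * x + rnorm(n)

one_dpi <- function(x, y, k.cov, alpha) {
  # Assumption: random covariates are independent N(0, 1) noise
  covs  <- matrix(rnorm(length(x) * k.cov), ncol = k.cov)
  fit_y <- summary(lm(y ~ x + covs))
  fit_x <- summary(lm(x ~ y + covs))
  # p-value of the partial X-Y relationship given the covariates
  p <- fit_y$coefficients["x", "Pr(>|t|)"]
  (fit_y$r.squared - fit_x$r.squared) * (1 - tanh(p / (2 * alpha)))
}

# One DPI value per simulated sample
dpi_sim <- replicate(n.sim, one_dpi(x, y, k.cov, alpha))

# One possible summary: mean and 95% percentile interval
c(mean = mean(dpi_sim), quantile(dpi_sim, c(0.025, 0.975)))
```

Each simulated sample draws a fresh set of random covariates, so the spread of `dpi_sim` reflects how sensitive the score is to unobserved covariates of this kind.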