This vignette demonstrates the complete causaldef workflow, from data specification through to policy decision-making. We show how Le Cam deficiency theory translates abstract statistical concepts into actionable clinical insights.
The workflow consists of four stages: specify the causal problem, estimate deficiency across adjustment methods, diagnose the causal assumptions, and decide using regret bounds.
We begin with the gene_perturbation dataset, which simulates a CRISPR knockout experiment. This illustrates the core workflow for continuous outcomes.
library(causaldef)
library(ggplot2)
data(gene_perturbation)
str(gene_perturbation)
#> 'data.frame': 500 obs. of 6 variables:
#> $ sample_id : Factor w/ 500 levels "S1","S10","S100",..: 1 112 223 334 445 457 468 479 490 2 ...
#> $ batch : Factor w/ 4 levels "1","2","3","4": 3 4 4 2 2 2 4 3 4 4 ...
#> $ library_size : num 948877 984900 868332 1052105 880271 ...
#> $ knockout_status : Factor w/ 2 levels "Control","Knockout": 2 1 1 1 1 2 2 1 2 1 ...
#> $ target_expression: num 8.64 11.54 12.32 12.94 8.29 ...
#> $ housekeeping_gene: num 6.59 9.42 8.57 9.22 6.38 ...

Variables:

- knockout_status: Treatment (Control vs. Knockout)
- target_expression: Primary outcome (gene expression level)
- housekeeping_gene: Negative control outcome (shouldn't be affected by knockout)
- batch, library_size: Technical confounders

Causal Structure:
[Batch, Library Size]
|
v
[Knockout] -----> [Target Expression]
\
\---X--> [Housekeeping Gene] (no causal effect)
The housekeeping gene is affected by the same technical variations but NOT by the knockout, making it an ideal negative control.
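The negative-control logic can be illustrated with a small base-R simulation (hypothetical data, independent of the gene_perturbation dataset): a shared batch effect drives both genes, but the knockout shifts only the target. Any knockout-housekeeping association in the unadjusted analysis therefore signals confounding, and it should vanish once the batch effect is adjusted for.

```r
set.seed(42)
n <- 1000
batch_effect <- rnorm(n)
# Confounded assignment: knockout probability depends on the batch effect
knockout <- rbinom(n, 1, plogis(batch_effect))
target       <- 10 - 1.7 * knockout + batch_effect + rnorm(n, sd = 0.5)
housekeeping <- 8 + batch_effect + rnorm(n, sd = 0.5)  # no knockout term

# Unadjusted: spurious knockout-housekeeping association (confounding signal)
naive_nc <- coef(lm(housekeeping ~ knockout))["knockout"]
# Adjusted for the confounder: association shrinks toward zero
adj_nc <- coef(lm(housekeeping ~ knockout + batch_effect))["knockout"]
```

Here `naive_nc` is far from zero while `adj_nc` is approximately zero, which is exactly the pattern the negative-control diagnostic screens for.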
spec_gene <- causal_spec(
data = gene_perturbation,
treatment = "knockout_status",
outcome = "target_expression",
covariates = c("batch", "library_size"),
negative_control = "housekeeping_gene",
estimand = "ATE",
outcome_type = "continuous"
)
#> ✔ Created causal specification: n=500, 2 covariate(s)
print(spec_gene)
#>
#> -- Causal Specification --------------------------------------------------
#>
#> * Treatment: knockout_status ( binary )
#> * Outcome: target_expression ( continuous )
#> * Covariates: batch, library_size
#> * Sample size: 500
#> * Estimand: ATE
#> * Negative control: housekeeping_gene

We compare three adjustment strategies:
deficiency_gene <- estimate_deficiency(
spec_gene,
methods = c("unadjusted", "iptw", "aipw"),
n_boot = 100 # Use more for production (e.g., 1000)
)
#> ℹ Inferred treatment value: Knockout
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw
#> ℹ Estimating deficiency: aipw
print(deficiency_gene)
#>
#> -- Deficiency Proxy Estimates (PS-TV) ------
#>
#> Method Delta SE CI Quality
#> unadjusted 0.1424 0.0428 [0.1241, 0.2803] Insufficient (Red)
#> iptw 0.0368 0.0140 [0.0234, 0.0712] Excellent (Green)
#> aipw 0.0368 0.0122 [0.0259, 0.0707] Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#>
#> Best method: iptw (delta = 0.0368 )

Interpretation: the unadjusted comparison leaves substantial imbalance (delta ≈ 0.142, flagged "Insufficient"), while IPTW and AIPW both reduce the propensity-score TV proxy to about 0.037, rated "Excellent".
The negative control diagnostic tests whether our adjustment removes ALL confounding, not just the measured confounders.
nc_test <- nc_diagnostic(
spec_gene,
method = "iptw",
alpha = 0.05,
n_boot = 100
)
#> ℹ Using kappa = 1 (conservative). Consider domain-specific estimation or sensitivity analysis via kappa_range.
#> ✔ No evidence against causal assumptions (p = 0.10891 )
print(nc_test)
#>
#> -- Negative Control Diagnostic ----------------------------------------
#>
#> * screening statistic (weighted corr): 0.0821
#> * delta_NC (association proxy): 0.0821
#> * delta bound (under kappa alignment): 0.0821 (kappa = 1 )
#> * screening p-value: 0.10891
#> * screening method: weighted_permutation_correlation
#>
#> RESULT: NOT REJECTED. This is a screening result, not proof that confounding is absent.
#> NOTE: Your effect estimate must exceed the Noise Floor (delta_bound) to be meaningful.

Decision Logic:
| Result | Interpretation | Action |
|---|---|---|
| falsified = FALSE | No evidence of residual confounding | Proceed with analysis |
| falsified = TRUE | Residual confounding detected | Add covariates or acknowledge limitations |
Suppose we're deciding whether to pursue this gene target for drug development. The utility is measured on a scale where:

- 0 = no promise (no effect on expression)
- 10 = maximum promise (strong effect)
bounds_gene <- policy_regret_bound(
deficiency_gene,
utility_range = c(0, 10)
)
#> Warning: Multiple fitted methods are available but `method` was not specified.
#> ℹ Using the smallest available delta across methods is optimistic after model
#> selection.
#> ℹ For a pre-specified decision bound, call `policy_regret_bound()` with `method
#> = '<chosen method>'`.
#> ℹ Transfer penalty: 0.3676 (delta = 0.0368)
print(bounds_gene)
#>
#> -- Policy Regret Bounds -------------------------------------------------
#>
#> * Deficiency delta: 0.0368
#> * Delta mode: point
#> * Delta method: iptw
#> * Delta selection: minimum across fitted methods
#> * Utility range: [0, 10]
#> * Transfer penalty: 0.3676 (additive regret upper bound)
#> * Minimax floor: 0.1838 (worst-case lower bound)
#>
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#> Note: minimum-across-methods selection is optimistic after model selection.
#>
#> Interpretation: Transfer penalty is 3.7 % of utility range given delta

Regret Bounds:

policy_regret_bound() reports the transfer penalty (an additive upper bound on the extra regret incurred by acting on observational rather than experimental evidence) and the minimax floor (a worst-case lower bound on that regret).
Decision Rule:

- If transfer_penalty < acceptable risk threshold → Proceed with confidence
- If transfer_penalty > acceptable risk threshold → Seek more evidence
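This decision rule can be sketched as a small base-R helper. The threshold is analyst-chosen; the 0.5-unit value below is illustrative, not a package default.

```r
# Hypothetical decision helper: compare the transfer penalty against an
# analyst-specified acceptable-risk threshold (in utility units)
decide_policy <- function(transfer_penalty, threshold) {
  if (transfer_penalty < threshold) {
    "Proceed with confidence"
  } else {
    "Seek more evidence"
  }
}

decide_policy(0.368, threshold = 0.5)  # gene example: penalty below threshold
#> [1] "Proceed with confidence"
```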
Finally, we estimate the causal effect using the best-performing method:
effect_gene <- estimate_effect(
deficiency_gene,
target_method = "iptw"
)
print(effect_gene)
#>
#> -- Causal Effect Estimate ----------------------
#> Method: iptw
#> Type: ATE
#> Contrast: Knockout vs Control
#> Estimate: -1.7085

Complete Report:
cat(sprintf(
"
## Gene Perturbation Analysis Report
**Treatment Effect (IPTW-adjusted):** %.2f log2 expression units
**Deficiency (δ):** %.3f
**Negative Control Test:** %s (p = %.3f)
**Transfer Penalty:** %.3f on [0, 10] scale
**Minimax Safety Floor:** %.3f on [0, 10] scale
**Conclusion:** %s
",
effect_gene$estimate,
deficiency_gene$estimates["iptw"],
ifelse(nc_test$falsified, "FAILED", "PASSED"),
nc_test$p_value,
bounds_gene$transfer_penalty,
bounds_gene$minimax_floor,
ifelse(nc_test$falsified,
"Residual confounding detected. Interpret with caution.",
ifelse(bounds_gene$transfer_penalty < 0.5,
"Strong evidence supporting treatment effect.",
"Moderate evidence; consider additional validation."
)
)
))

Treatment Effect (IPTW-adjusted): -1.71 log2 expression units
Deficiency (δ): 0.037
Negative Control Test: PASSED (p = 0.109)
Transfer Penalty: 0.368 on [0, 10] scale
Minimax Safety Floor: 0.184 on [0, 10] scale
Conclusion: Strong evidence supporting treatment effect.
Next, we analyze the hct_outcomes dataset, which mimics a retrospective registry study comparing conditioning regimens in HCT.
data(hct_outcomes)
str(hct_outcomes)
#> 'data.frame': 800 obs. of 8 variables:
#> $ id : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ age : num 57 50 59 65 54 48 55 49 46 58 ...
#> $ disease_status : Factor w/ 3 levels "Advanced","Early",..: 2 2 3 2 1 2 2 1 2 2 ...
#> $ kps : num 89 85 97 84 96 64 88 73 82 100 ...
#> $ donor_type : Factor w/ 3 levels "HLA-Matched",..: 1 2 2 3 3 2 2 1 1 2 ...
#> $ conditioning_intensity: Factor w/ 2 levels "Myeloablative",..: 1 2 1 1 2 1 1 1 1 1 ...
#> $ time_to_event : num 14.31 11.05 20.23 2.9 2.22 ...
#> $ event_status : Factor w/ 3 levels "Censored","Death",..: 3 3 3 3 3 2 3 3 3 3 ...
# Summarize key variables
summary(hct_outcomes[, c("age", "kps", "time_to_event")])
#> age kps time_to_event
#> Min. :15.00 Min. : 60.00 Min. : 0.000
#> 1st Qu.:42.00 1st Qu.: 77.00 1st Qu.: 3.305
#> Median :50.00 Median : 85.00 Median : 8.220
#> Mean :49.96 Mean : 84.26 Mean :12.605
#> 3rd Qu.:58.00 3rd Qu.: 92.00 3rd Qu.:17.905
#> Max. :89.00 Max. :100.00 Max. :86.080
table(hct_outcomes$conditioning_intensity, hct_outcomes$event_status)
#>
#> Censored Death Relapse
#> Myeloablative 117 119 486
#> Reduced 6 12 60

Clinical Context:
The key challenge is confounding by indication: doctors assign treatment based on patient status, making naive comparisons biased.
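Confounding by indication can be made concrete with a small base-R simulation (hypothetical data, unrelated to hct_outcomes): even when the true treatment effect is exactly zero, steering sicker patients toward reduced-intensity conditioning makes the naive group comparison look as if the treatment shortens survival.

```r
set.seed(1)
n <- 5000
frailty <- rnorm(n)  # latent disease severity
# Sicker patients are more likely to receive reduced-intensity conditioning
reduced <- rbinom(n, 1, plogis(1.5 * frailty))
# True treatment effect is zero; survival depends only on frailty
surv_months <- pmax(0, 24 - 6 * frailty + rnorm(n, sd = 2))
# Naive difference in means attributes the frailty gap to the treatment
naive_diff <- mean(surv_months[reduced == 1]) - mean(surv_months[reduced == 0])
naive_diff  # several months "worse" despite a null effect
```

Adjustment methods such as IPTW aim to remove exactly this kind of bias by reweighting the groups to have comparable severity profiles.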
The executable survival-analysis code in this section requires R >= 4.0.
# Create binary event indicator
hct_outcomes$event <- ifelse(hct_outcomes$event_status != "Censored", 1, 0)
spec_hct <- causal_spec_survival(
data = hct_outcomes,
treatment = "conditioning_intensity",
time = "time_to_event",
event = "event",
covariates = c("age", "disease_status", "kps", "donor_type"),
estimand = "RMST",
horizon = 24 # 24-month restricted mean survival time
)
print(spec_hct)

deficiency_hct <- estimate_deficiency(
spec_hct,
methods = c("unadjusted", "iptw"),
n_boot = 50 # Use more for production
)
print(deficiency_hct)

Clinical Interpretation:

The deficiency tells us how much our observational evidence differs from what an RCT would provide.
Beyond point estimates, we can map a sensitivity analysis showing how deficiency varies with hypothetical unmeasured confounding:
frontier <- confounding_frontier(
spec_hct,
alpha_range = c(-2, 2), # Confounding path: U → Treatment
gamma_range = c(-2, 2), # Confounding path: U → Outcome
grid_size = 30
)
print(frontier)
plot(frontier)

Reading the Frontier Map:
If an unmeasured confounder would need extreme strength (beyond observed benchmarks) to substantially increase δ, conclusions are robust.
# Utility = months of survival (horizon = 24)
bounds_hct <- policy_regret_bound(
deficiency_hct,
utility_range = c(0, 24)
)
print(bounds_hct)

Clinical Regret Bounds:
If δ = 0.05 with M = 24 months:

- Transfer penalty = 24 × 0.05 = 1.2 months
- Minimax safety floor = 0.5 × 24 × 0.05 = 0.6 months
This means even with perfect decision-making, the observational evidence can inflate regret by up to about 1.2 months on the 0–24 month utility scale, and there is an unavoidable worst-case floor of about 0.6 months when \(\delta>0\).
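The arithmetic above follows the plug-in bound pattern seen in the earlier output (transfer penalty = M × δ, minimax floor = half of that) and can be checked directly in base R with the hypothetical values δ = 0.05 and M = 24:

```r
delta <- 0.05  # hypothetical deficiency
M     <- 24    # utility range in months (24-month RMST horizon)

transfer_penalty <- M * delta        # additive regret upper bound
minimax_floor    <- 0.5 * M * delta  # worst-case lower bound

c(transfer_penalty, minimax_floor)   # 1.2 and 0.6 months
```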
delta_iptw <- deficiency_hct$estimates["iptw"]
transfer_penalty <- bounds_hct$transfer_penalty
minimax_floor <- bounds_hct$minimax_floor
# Estimate the RMST difference with the best-performing method
effect_hct <- estimate_effect(deficiency_hct, target_method = "iptw")
rmst_diff <- effect_hct$estimate
# Decision logic
if (delta_iptw < 0.05) {
evidence_quality <- "HIGH (approximates RCT)"
} else if (delta_iptw < 0.15) {
evidence_quality <- "MODERATE (acceptable with caveats)"
} else {
evidence_quality <- "LOW (substantial bias risk)"
}
# Benefit-to-risk ratio
if (!is.na(rmst_diff) && !is.na(transfer_penalty) && transfer_penalty > 0) {
benefit_to_risk <- abs(rmst_diff) / transfer_penalty
recommendation <- ifelse(benefit_to_risk > 2,
"Recommend treatment with stronger effect",
"Evidence inconclusive; consider shared decision-making"
)
} else {
benefit_to_risk <- NA
recommendation <- "Unable to calculate benefit-to-risk ratio"
}
cat(sprintf(
"
## HCT Treatment Decision Report
**RMST Difference (IPTW):** %.2f months (%s favored)
**Deficiency:** %.3f
**Evidence Quality:** %s
**Transfer Penalty:** %.2f months
**Minimax Safety Floor:** %.2f months
**Benefit-to-Risk Ratio:** %.1f:1
**Recommendation:** %s
### Clinical Translation
The observational evidence suggests %s conditioning provides approximately
%.1f additional months of relapse-free survival within the first 2 years.
However, the transfer penalty is %.1f months and the minimax safety floor is %.1f months
on the 0--24 month utility scale. Clinicians should weigh this against individual patient factors.
",
abs(rmst_diff),
ifelse(rmst_diff > 0, "Myeloablative", "Reduced"),
delta_iptw,
evidence_quality,
transfer_penalty,
minimax_floor,
benefit_to_risk,
recommendation,
ifelse(rmst_diff > 0, "myeloablative", "reduced-intensity"),
abs(rmst_diff),
transfer_penalty,
minimax_floor
))

| Study | δ (IPTW) | Transfer Penalty | Evidence Quality |
|---|---|---|---|
| Gene Perturbation | Low (~0.04) | Very Low | Excellent |
| HCT Outcomes | Moderate (~0.05-0.15) | 1-4 months | Acceptable with caveats |
┌─────────────────────────────────────────────────────────────────┐
│ SPECIFY: causal_spec() / causal_spec_survival() │
│ ↓ Define treatment, outcome, covariates, NC │
├─────────────────────────────────────────────────────────────────┤
│ ESTIMATE: estimate_deficiency() │
│ ↓ Compare unadjusted, IPTW, AIPW, TMLE, etc. │
│ ↓ Select method with lowest δ │
├─────────────────────────────────────────────────────────────────┤
│ DIAGNOSE: nc_diagnostic() + confounding_frontier() │
│ ↓ Test whether assumptions are falsified │
│ ↓ Map sensitivity to unmeasured confounding │
├─────────────────────────────────────────────────────────────────┤
│ DECIDE: policy_regret_bound() + estimate_effect() │
│ ↓ Compute transfer penalty / minimax floor │
│ ↓ Report effect with uncertainty qualification │
└─────────────────────────────────────────────────────────────────┘
Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison: A Framework for Identification, Transportability, and Policy Learning. DOI: 10.5281/zenodo.18367347
Le Cam, L., & Yang, G. L. (2000). Asymptotics in Statistics: Some Basic Concepts. Springer.
VanderWeele, T. J., & Ding, P. (2017). Sensitivity Analysis in Observational Research: Introducing the E-value. Annals of Internal Medicine.