--- title: "Sensitivity analysis for external validity" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{external-validity} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(rmarkdown.html_vignette.check_title = FALSE) ``` In the following vignette, we will walk through a basic example of how to conduct a sensitivity analysis for external validity using `senseweight`. We will use a subset of the JTPA data. ```{r example, warning=FALSE, message = FALSE} library(senseweight) library(ggplot2) # Load in JTPA data: data(jtpa_women) ``` ```{r} # Summarize sites jtpa_women |> dplyr::group_by(site) |> dplyr::summarize( length(prevearn), dplyr::across( c(prevearn, age, married, hrwage, black, hispanic, hsorged, yrs_educ), mean ) ) ``` Assume researchers are interested in generalizing the results from the site of Omaha, Nebraska to the other 15 experimental sites: ```{r estimate} site_name <- "NE" df_site <- jtpa_women[which(jtpa_women$site == site_name), ] df_else <- jtpa_women[which(jtpa_women$site != site_name), ] # Estimate unweighted estimator: model_dim <- estimatr::lm_robust(Y ~ T, data = df_site) PATE <- coef(lm(Y ~ T, data = df_else))[2] DiM <- coef(model_dim)[2] # Generate weights using observed covariates: df_all <- jtpa_women df_all$S <- ifelse(jtpa_women$site == "NE", 1, 0) model_ps <- WeightIt::weightit( (1 - S) ~ . - site - T - Y, data = df_all, method = "ebal", estimand = "ATT" ) weights <- model_ps$weights[df_all$S == 1] # Estimate IPW model: model_ipw <- estimatr::lm_robust(Y ~ T, data = df_site, weights = weights) ipw <- coef(model_ipw)[2] # Estimate bound for var(tau): m <- sqrt(stats::var(df_site$Y[df_site$T == 1]) / stats::var(df_site$Y[df_site$T == 0])) # Since m > 1: vartau <- stats::var(df_site$Y[df_site$T == 1]) - stats::var(df_site$Y[df_site$T == 0]) ``` # Sensitivity Summary Measures ## Robustness value We can generate the sensitivity summary measures using the `summarize_sensitivity` function: ```{r summaries} summarize_sensitivity( weights = weights, Y = df_site$Y, Z = df_site$T, sigma2 = vartau, estimand = "PATE" ) ``` The `summarize_sensitivity` function defaults to evaluating the robustness value at `q=1`, indicating a robustness value, relative to a bias equal to the point estimate. Researchers can specify different values for `q` in the function. In the generalization setting, researchers can modify the `sigma2` bound and posit their own values for a plausible bound (given substantive justification). With no specification, `sigma2` will be automatically calculated to be bound by `var(Y(1)) + var(Y(0))`. ```{r} RV = robustness_value(estimate = ipw, b_star = 0, sigma2 = vartau, weights = weights) print(RV) ``` ## Benchmarking ```{r benchmarking} # Select weighting variables: weighting_vars = names(df_all)[which(!names(df_all) %in% c("site", "S", "Y", "T"))] # Run bechmarking: df_benchmark <- run_benchmarking( weighting_vars = weighting_vars, data = df_all[, -1], treatment = "T", outcome = "Y", selection = "S", estimate = ipw, RV = RV, sigma2 = vartau, estimand = "PATE" ) print(df_benchmark) ``` ## Bias Contour Plot ```{r contour_plot, warning=FALSE, fig.width=6.5, fig.height=5} contour_plot( var(weights), vartau, ipw, df_benchmark, benchmark = TRUE, shade = TRUE, shade_var = c("age", "prevearn"), label_size = 4 ) ```