Introduction

The marginal structural model (MSM), introduced by Robins et al. (2000) and Hernán et al. (2000), provides a principled approach to adjusting for time-dependent confounding in treatment switching. It uses inverse probability of treatment weighting (IPTW) to create a pseudo-population in which treatment switching is independent of measured covariates. Unlike the inverse probability of censoring weighting (IPCW) method, the MSM does not censor data after treatment switching, and it can be used to estimate the effect of subsequent therapy in addition to the effect of randomized treatment. Provided that prognostic covariates for treatment switching and survival are collected at baseline and at regular intervals, the MSM can be more powerful than IPCW for estimating the effect of randomized treatment.

Estimation of weights

The switching model can be implemented using pooled logistic regression.

Visit-Specific Intercepts: These can be modeled using a natural cubic spline with specified degrees of freedom, denoted as \(\nu\) (corresponding to the ns_df parameter of the msm function). The boundary and internal knots can be based on the range and percentiles of observed treatment switching times, respectively. Here, \(\nu\) equals the number of internal knots plus one; a value of \(\nu=0\) indicates a common intercept, while \(\nu=1\) leads to a linear effect of visit. By default, \(\nu=3\), which implies two internal knots for the cubic spline.
Stratification Factors: If more than one stratification factor exists, they can be included in the logistic model either as main effects only (strata_main_effect_only = TRUE) or as all possible combinations of the levels of the stratification factors (strata_main_effect_only = FALSE).
Handling Nonconvergence: If the pooled logistic regression model encounters issues such as complete or quasi-complete separation, Firth’s bias-reducing penalized likelihood method (firth = TRUE) can be applied alongside intercept correction (flic = TRUE) to yield more accurate predicted probabilities.
Inverse Probability of Treatment Weighting: Let \(A_{i,j}\) denote the indicator of alternative therapy for subject \(i\) in treatment cycle \(j\). We assume that once a patient switches to an alternative therapy, the patient remains on it. Let \(X_i\) denote the baseline covariates for subject \(i\) and \(L_{i,j}\) denote the time-dependent covariates for subject \(i\) at the start of cycle \(j\); an overbar denotes the history through the indicated cycle. The pooled logistic regression model for treatment switching uses the full data for patients who did not switch treatment and, for patients who switched, the partial data up to and including the treatment cycle in which the switch occurred. Because the model is fit only to cycles in which the patient has not yet switched, it estimates \[P(A_j = a_{i,j} | A_{j-1} = 0, \bar{L}_{j} = \bar{l}_{i,j}, X = x_i)\] where \(x_i\), \(\bar{l}_{i,j}\), and \(a_{i,j}\) denote the observed values of the baseline covariates, the time-dependent covariate history through cycle \(j\), and the alternative therapy status for cycle \(j\) of subject \(i\), respectively. Natural cubic splines can be used to model visit-specific intercepts for the pooled logistic regression model. The unstabilized weights for subject \(i\) for cycle \(t\) are given by \[w_i(t) = \prod_{j=1}^{t} \frac{1}{P(A_j = a_{i,j} | \bar{A}_{j-1} = \bar{a}_{i,j-1}, \bar{L}_{j} = \bar{l}_{i,j}, X = x_i)}\] where, by assumption, \(P(A_j = 1 | A_{j-1} = 1) = 1\), so each cycle after the switch contributes a factor of 1 and the weights do not vary after treatment switching.
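The weight construction described above can be sketched in a few lines of base R. This is a minimal illustration on hypothetical toy data, not the internals of the msm function: the data frame toy, its columns, and the simulation coefficients are all assumptions made for the example, and the spline uses the default knot placement of splines::ns on the cycle index rather than knots based on observed switching times.

```r
library(splines)  # ns() builds the natural cubic spline basis

set.seed(1)

# Hypothetical toy data (all names are illustrative): one row per subject-cycle.
n <- 200
n_cycles <- 6
toy <- expand.grid(cycle = 1:n_cycles, id = 1:n)
toy$bprog <- rep(rbinom(n, 1, 0.5), each = n_cycles)  # baseline covariate X
toy$L <- rbinom(nrow(toy), 1, 0.3)                    # time-dependent covariate
toy$A <- rbinom(nrow(toy), 1, plogis(-3 + 0.5 * toy$bprog + 1.5 * toy$L))

# Keep each subject's rows up to and including the first switch cycle.
n_switch <- ave(toy$A, toy$id, FUN = function(a) cumsum(cumsum(a)))
at_risk <- toy[n_switch <= 1, ]

# Pooled logistic regression with a natural cubic spline (3 df) in cycle.
fit_den <- glm(A ~ bprog + L + ns(cycle, df = 3),
               family = binomial, data = at_risk)

# Probability of the observed switching status in each at-risk cycle.
p_switch <- fitted(fit_den)
p_obs <- ifelse(at_risk$A == 1, p_switch, 1 - p_switch)

# Unstabilized weight: cumulative product of inverse probabilities by subject.
at_risk$w_unstab <- ave(1 / p_obs, at_risk$id, FUN = cumprod)
summary(at_risk$w_unstab)
```

Cycles after the switch are dropped here because they would each contribute a factor of 1, so a subject's last computed weight carries forward unchanged.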

Stabilized Weights

To address the high variability in “unstabilized” weights, researchers often opt for “stabilized” weights. The switching model for stabilized weights involves a model for the numerator of the weight in addition to the model for the denominator. Typically, the numerator model mirrors the denominator model but includes only prognostic baseline variables. Any prognostic baseline variables included in the numerator must also be part of the weighted outcome model.
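In the notation used for the unstabilized weights, the usual form of the stabilized weight replaces the numerator 1 with the probability from a switching model that omits the time-dependent covariates. The symbol \(sw_i(t)\) is introduced here for illustration only: \[sw_i(t) = \prod_{j=1}^{t} \frac{P(A_j = a_{i,j} | \bar{A}_{j-1} = \bar{a}_{i,j-1}, X = x_i)}{P(A_j = a_{i,j} | \bar{A}_{j-1} = \bar{a}_{i,j-1}, \bar{L}_{j} = \bar{l}_{i,j}, X = x_i)}\] As with the unstabilized weights, cycles after the switch contribute a factor of 1 to both the numerator and the denominator.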

Truncation of Weights

You can specify weight truncation using the trunc and trunc_upper_only parameters of the msm function. The trunc parameter is a number between 0 and 0.5 that represents the fraction of the weight distribution to truncate. The trunc_upper_only parameter is a flag that indicates whether to truncate weights only from the upper end of the weight distribution.
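One common reading of fractional truncation is to cap the weights at empirical quantiles of their distribution. The sketch below illustrates that reading under stated assumptions: truncate_weights is a hypothetical helper written for this example, not a function of the package, and the exact rule applied by the package may differ.

```r
# Hypothetical helper: cap weights at the (1 - trunc) quantile and, unless
# upper_only is TRUE, also raise weights below the trunc quantile to it.
truncate_weights <- function(w, trunc = 0.05, upper_only = TRUE) {
  stopifnot(trunc >= 0, trunc < 0.5)
  upper <- quantile(w, 1 - trunc, names = FALSE)
  lower <- if (upper_only) -Inf else quantile(w, trunc, names = FALSE)
  pmin(pmax(w, lower), upper)
}

set.seed(2)
w <- rlnorm(1000, meanlog = 0, sdlog = 1)  # illustrative right-skewed weights
w_upper <- truncate_weights(w, trunc = 0.05, upper_only = TRUE)
w_both  <- truncate_weights(w, trunc = 0.05, upper_only = FALSE)
range(w); range(w_upper); range(w_both)
```

Truncating only the upper tail targets the few very large weights that dominate the variance, while leaving small weights untouched.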

Estimation of Hazard Ratio

Once weights have been obtained for each event time in the data set, we fit a (potentially stratified) Cox proportional hazards model to this weighted data set to obtain an estimate of the hazard ratio. The Cox model includes baseline covariates, randomized treatment, and time-dependent alternative therapy. An interaction between randomized and alternative treatment can be included when patients from both treatment arms can switch to alternative therapy. The confidence interval for the hazard ratio can be derived by bootstrapping the weight estimation and the subsequent model-fitting process.
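The outcome step can be sketched with the survival package as a weighted Cox model on counting-process data. Everything in this example is an assumption made for illustration: the toy data, the column names, and in particular the weights, which are random placeholders standing in for the weights estimated from the switching model. It is not the code used inside msm.

```r
library(survival)

set.seed(3)
n <- 300

# Hypothetical baseline data: randomized arm and one prognostic covariate.
base <- data.frame(id = 1:n,
                   treated = rbinom(n, 1, 0.5),
                   bprog = rbinom(n, 1, 0.5))
etime <- rexp(n, 0.01 * exp(-0.5 * base$treated + 0.3 * base$bprog))
stime <- ifelse(base$treated == 0, rexp(n, 0.01), Inf)  # control arm only
sw <- stime < etime                                      # switched before event

# Counting-process rows: one interval before the switch, one after it.
pre <- data.frame(base, tstart = 0, tstop = pmin(stime, etime),
                  event = as.integer(!sw), crossed = 0)
post <- data.frame(base[sw, ], tstart = stime[sw], tstop = etime[sw],
                   event = 1L, crossed = 1)
long <- rbind(pre, post)

# Placeholder weights; in practice these come from the switching model.
long$w <- runif(nrow(long), 0.5, 2)

# Weighted Cox model with randomized treatment, baseline covariate, and
# time-dependent alternative therapy; cluster(id) requests robust variance.
fit <- coxph(Surv(tstart, tstop, event) ~ treated + bprog + crossed +
               cluster(id), data = long, weights = w)
exp(coef(fit)["treated"])  # hazard ratio for randomized treatment
```

The robust variance from this fit does not account for the estimation of the weights, which is one reason for bootstrapping the whole procedure.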

Example for Pooled Logistic Regression Switching Model

First, we simulate the data.

library(trtswitch)  # provides tssim() and msm()

sim1 <- tssim(
  tdxo = 0, coxo = 0, p_R = 0.5, p_X_1 = 0.3, p_X_0 = 0.3, 
  rate_T = 0.002, beta1 = -0.5, beta2 = 0.3, 
  gamma0 = 0.3, gamma1 = -0.9, gamma2 = 0.7, gamma3 = 1.1, gamma4 = -0.8,
  zeta0 = -3.5, zeta1 = 0.5, zeta2 = 0.2, zeta3 = -0.4, 
  alpha0 = 0.5, alpha1 = 0.5, alpha2 = 0.4, 
  theta1_1 = -0.4, theta1_0 = -0.4, theta2 = 0.2,
  rate_C = 0.0000855, followup = 20, days = 30,
  n = 500, NSim = 100, seed = 314159)

Because the observations are collected at regular intervals, we can use a pooled logistic regression switching model. In this model, bprog is the prognostic baseline variable and L is the time-dependent covariate. In addition, ns1, ns2, and ns3 are the three basis terms of the natural cubic spline with \(\nu=3\) degrees of freedom for the visit-specific intercepts.

fit1 <- msm(
  sim1[[1]], id = "id", tstart = "tstart", 
  tstop = "tstop", event = "Y", treat = "trtrand", 
  swtrt = "xo", swtrt_time = "xotime", base_cov = "bprog", 
  numerator = "bprog", denominator = c("bprog", "L"), 
  ns_df = 3, swtrt_control_only = TRUE, boot = FALSE)

The fits for the denominator and numerator switching models are as follows.

# denominator switching model fit
fit1$fit_switch[[1]]$fit_den$parest[, c("param", "beta", "sebeta", "z")]
#>         param        beta    sebeta          z
#> 1 (Intercept) -4.58906421 0.5539967 -8.2835581
#> 2       bprog  0.60139329 0.2703681  2.2243498
#> 3           L  1.59547085 0.5253828  3.0367776
#> 4         ns1  0.04454811 0.6632449  0.0671669
#> 5         ns2 -1.44772063 0.7399064 -1.9566268
#> 6         ns3 -1.37103666 0.7794941 -1.7588801


# numerator switching model fit
fit1$fit_switch[[1]]$fit_num$parest[, c("param", "beta", "sebeta", "z")]
#>         param       beta    sebeta           z
#> 1 (Intercept) -3.3752627 0.3019737 -11.1773386
#> 2       bprog  0.8026209 0.2671525   3.0043551
#> 3         ns1  0.1628824 0.6645797   0.2450909
#> 4         ns2 -1.2821252 0.7300037  -1.7563271
#> 5         ns3 -1.3249197 0.7733622  -1.7131944

Since treatment switching is not allowed in the experimental arm, the weights are equal to 1 for all subjects in the experimental group. The unstabilized and stabilized weights for the control group are plotted below.

library(ggplot2)  # plotting
library(dplyr)    # %>% and filter()

# unstabilized weights
ggplot(fit1$data_outcome %>% filter(trtrand == 0), 
       aes(x = unstabilized_weight)) + 
  geom_density(fill="#77bd89", color="#1f6e34", alpha=0.8) +
  scale_x_continuous("unstabilized weights")


# stabilized weights
ggplot(fit1$data_outcome %>% filter(trtrand == 0), 
       aes(x = stabilized_weight)) + 
  geom_density(fill="#77bd89", color="#1f6e34", alpha=0.8) +
  scale_x_continuous("stabilized weights")

Now we fit a weighted outcome Cox model and compare the treatment hazard ratio estimate with the one reported by msm.

fit1$fit_outcome$parest[, c("param", "beta", "sebeta", "z")]
#>     param       beta    sebeta         z
#> 1 treated -0.7842312 0.1155526 -6.786789
#> 2   bprog  0.2528728 0.1235804  2.046222
#> 3 crossed -0.7063899 0.1853693 -3.810716
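The treated coefficient is a log hazard ratio, so exponentiating it gives the hazard ratio for randomized treatment. The sketch below does this with the values printed above. The Wald-type interval is added purely for illustration and is an assumption of this sketch; it is not the bootstrap interval described earlier and ignores the uncertainty in the estimated weights.

```r
# Values copied from the "treated" row of the outcome model output above.
beta <- -0.7842312
se <- 0.1155526

hr <- exp(beta)                                   # hazard ratio
ci <- exp(beta + c(-1, 1) * qnorm(0.975) * se)    # illustrative Wald 95% CI
c(hr = hr, lower = ci[1], upper = ci[2])
```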