trafficCAR Model Diagnostics and Checking

This vignette explains how to interpret the diagnostic tools provided by trafficCAR. These diagnostics are designed to answer three questions:

  1. Are there systematic discrepancies between the model and the observed data?
  2. Is there remaining spatial structure that the model has failed to explain?
  3. Is the fitted model capable of reproducing key features of the data?

The diagnostics are intentionally simple and global. They are meant to flag problems early, not to replace detailed model criticism.


Residual diagnostics

The residuals() method for a traffic_fit object provides three types of residuals:

Raw residuals reflect overall lack of fit. Unstructured residuals are particularly important: they represent the portion of the data that should be approximately independent if the spatial model is adequate.

Typical usage:

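# Raw residuals: reflect overall lack of fit
r_raw <- residuals(fit, type = "raw")
# Unstructured residuals: should be approximately independent
# if the spatial model is adequate
r_un  <- residuals(fit, type = "unstructured")
summary(r_raw)
summary(r_un)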

Interpretation guidelines:


Moran’s I on residuals

Spatial autocorrelation in residuals is assessed using Moran’s I via moran_residuals().

moran_residuals(fit, type = "unstructured", method = "permutation")

Interpretation depends on the residual type:

Permutation-based p-values should be interpreted as global diagnostics. A small p-value for unstructured residuals is a strong indication of model misspecification (e.g., missing covariates or inappropriate neighborhood structure).

If residual variance is zero, Moran’s I is undefined and returned as NA. This typically occurs in saturated or near-saturated models.
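
To make the statistic concrete, the following base R sketch computes Moran’s I and a permutation p-value by hand. It is an illustration only and not the code behind moran_residuals(): the helper names moran_i() and moran_perm(), the binary adjacency weights, the one-sided p-value, and the toy data are all assumptions made for this example. The package may use different weights or a different p-value convention.

```r
# Moran's I for a numeric vector x and a symmetric weight matrix W
# with a zero diagonal (hypothetical helper, not part of trafficCAR).
moran_i <- function(x, W) {
  z <- x - mean(x)
  denom <- sum(z^2)
  if (denom == 0) return(NA_real_)  # zero variance: statistic is undefined
  n <- length(x)
  (n / sum(W)) * sum(W * outer(z, z)) / denom
}

# Permutation test: shuffling values across locations breaks any
# spatial structure while keeping the marginal distribution fixed.
moran_perm <- function(x, W, nsim = 999) {
  obs <- moran_i(x, W)
  if (is.na(obs)) return(list(I = NA_real_, p_value = NA_real_))
  sims <- replicate(nsim, moran_i(sample(x), W))
  # One-sided p-value for positive autocorrelation; the observed
  # value is counted as one of the permutations.
  p <- (1 + sum(sims >= obs)) / (nsim + 1)
  list(I = obs, p_value = p)
}

# Toy example: 10 segments on a line, each adjacent to its neighbours.
n <- 10
W <- matrix(0, n, n)
W[cbind(1:(n - 1), 2:n)] <- 1
W <- W + t(W)

set.seed(1)
res_smooth <- moran_perm(as.numeric(1:n), W)  # strongly structured values
res_const  <- moran_perm(rep(0, n), W)        # zero variance, so NA
```

In this toy setting the smoothly increasing values give a clearly positive Moran’s I with a small permutation p-value, while the constant vector returns NA, mirroring the zero-variance case described above.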


Posterior predictive checks

Posterior predictive checks (PPCs) compare observed summary statistics to their distribution under replicated data generated from the fitted model.

ppc <- ppc_summary(fit, stats = c("mean", "var", "tail"))
print(ppc)

The following statistics are reported:

Each statistic is accompanied by a posterior predictive p-value:

\[ \text{p-value} = P\bigl(T(y^{\mathrm{rep}}) \ge T(y) \mid y\bigr) \]

Interpretation guidelines:

PPCs are not formal hypothesis tests. They are descriptive tools intended to highlight discrepancies between the model and the data.
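
The posterior predictive p-value defined above is usually estimated as the share of replicated data sets whose statistic is at least as large as the observed one. The base R sketch below shows that calculation on simulated data. It does not reproduce ppc_summary(): the helper ppc_pvalue(), the matrix layout (one replicated data set per row), the use of the 95% quantile as the "tail" statistic, and the simulated values are all assumptions made for illustration.

```r
# Monte Carlo estimate of P(T(y_rep) >= T(y) | y).
# y:     observed data (numeric vector of length n)
# y_rep: replicated data, one posterior draw per row (draws x n matrix)
# stat:  function mapping a numeric vector to a single number
ppc_pvalue <- function(y, y_rep, stat) {
  t_obs <- stat(y)
  t_rep <- apply(y_rep, 1, stat)  # one statistic per replicated data set
  mean(t_rep >= t_obs)
}

set.seed(42)
n <- 50
n_draws <- 1000
y <- rnorm(n, mean = 30, sd = 5)  # simulated "observed" values

# Hypothetical replicates from a model that matches the mean
# but understates the spread of the data.
y_rep <- matrix(rnorm(n * n_draws, mean = 30, sd = 2), nrow = n_draws)

stats <- list(
  mean = mean,
  var  = var,
  tail = function(v) quantile(v, 0.95, names = FALSE)  # assumed tail definition
)
p_vals <- sapply(stats, function(f) ppc_pvalue(y, y_rep, f))
```

Because the simulated replicates have far less spread than the simulated observations, the p-value for the variance is close to 0, which is the kind of discrepancy a PPC is meant to surface. Values near 0 or 1 both indicate that the replicated statistic rarely resembles the observed one.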


Practical workflow

A recommended diagnostic workflow is:

  1. Inspect raw and unstructured residual summaries.
  2. Compute Moran’s I on unstructured residuals.
  3. Run posterior predictive checks on means and variances.

Consistent signals across these diagnostics strengthen the case for or against model adequacy, although they do not replace more detailed model criticism.


Limitations

The diagnostics provided here are intentionally conservative:

These tools are best viewed as a first line of model checking rather than a complete diagnostic framework.