---
title: "mfrmr Workflow"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{mfrmr Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
is_cran_check <- !isTRUE(as.logical(Sys.getenv("NOT_CRAN", "false")))
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  eval = !is_cran_check
)
```

This vignette outlines a reproducible workflow for:

- loading packaged simulation data
- fitting an MFRM with flexible facets
- running diagnostics and residual PCA
- generating APA and visual summary outputs
- moving from fitted models into design simulation and fixed-calibration
  prediction

For a plot-first companion guide, see the separate
`mfrmr-visual-diagnostics` vignette.

For speed-sensitive work, a useful pattern is:

- start with `method = "JML"` or `quad_points = 7`
- use `diagnose_mfrm(..., residual_pca = "none")` for the first pass
- reuse the same diagnostics object in downstream reports and plots

## MML and Diagnostic Modes

`mfrmr` treats `MML` and `JML` differently on purpose.

- `MML` integrates over the person distribution with Gauss-Hermite quadrature.
- The current `MML` branch optimizes the quadrature-based marginal
  log-likelihood directly; it is not an EM implementation.
- `JML` is often useful for quick exploratory passes, but `MML` remains the
  preferred route for final estimation and fixed-calibration follow-up.

For `RSM` and `PCM`, diagnostics now expose two distinct evidence paths:

- `diagnostic_mode = "legacy"` keeps the residual/EAP-based stack.
- `diagnostic_mode = "marginal_fit"` adds the strict latent-integrated screen.
- `diagnostic_mode = "both"` is the safest default when you want to inspect
  both views side by side.

The strict marginal branch is still screening-oriented in the current release.
Use `summary(diag)$diagnostic_basis` to separate the legacy residual evidence
from the strict marginal evidence rather than pooling them into one decision.

## Load Data

```{r load-data}
library(mfrmr)

list_mfrmr_data()

data("ej2021_study1", package = "mfrmr")
head(ej2021_study1)

study1_alt <- load_mfrmr_data("study1")
identical(names(ej2021_study1), names(study1_alt))
```

## Minimal Runnable Example

Start with the packaged `example_core` dataset. It is intentionally compact,
category-complete, and generated from a single latent trait plus facet main
effects so that help-page examples stay fast without relying on undersized toy
data. The same object is also available via
`data("mfrmr_example_core", package = "mfrmr")`:

```{r toy-setup}
data("mfrmr_example_core", package = "mfrmr")
toy <- mfrmr_example_core

fit_toy <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "JML",
  model = "RSM",
  maxit = 30
)
diag_toy <- diagnose_mfrm(fit_toy, residual_pca = "none")

summary(fit_toy)$overview
summary(diag_toy)$overview
names(plot(fit_toy, draw = FALSE))
```

## Diagnostics and Reporting

```{r diagnostics-reporting}
t4_toy <- unexpected_response_table(
  fit_toy,
  diagnostics = diag_toy,
  abs_z_min = 1.5,
  prob_max = 0.4,
  top_n = 10
)
t12_toy <- fair_average_table(fit_toy, diagnostics = diag_toy)
t13_toy <- bias_interaction_report(
  estimate_bias(fit_toy, diag_toy,
                facet_a = "Rater", facet_b = "Criterion",
                max_iter = 2),
  top_n = 10
)

class(summary(t4_toy))
class(summary(t12_toy))
class(summary(t13_toy))

names(plot(t4_toy, draw = FALSE))
names(plot(t12_toy, draw = FALSE))
names(plot(t13_toy, draw = FALSE))

chk_toy <- reporting_checklist(fit_toy, diagnostics = diag_toy)
subset(
  chk_toy$checklist,
  Section == "Visual Displays",
  c("Item", "DraftReady", "NextAction")
)
```

## Fit and Diagnose with Full Data

For a realistic analysis, use the packaged Study 1 dataset:

```{r fit-full}
fit <- fit_mfrm(
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)

diag <- diagnose_mfrm(
  fit,
  residual_pca = "none"
)

summary(fit)
summary(diag)
```

If you need residual-structure evidence for a final report, you can add residual
PCA after the initial diagnostic pass. Treat this as an exploratory screen, not
as a standalone unidimensionality test or as a DIMTEST/UNIDIM substitute. In
MFRM reporting, a cautious claim should combine global residual fit,
element-level fit, residual PCA, and local-dependence screens, for example:
"evidence consistent with essential unidimensionality under the specified facet
structure."

```{r fit-full-pca}
diag_pca <- diagnose_mfrm(
  fit,
  residual_pca = "both",
  pca_max_factors = 6
)

summary(diag_pca)
```

## Strict Diagnostics for RSM and PCM

For `RSM` and `PCM`, the package can now keep the legacy residual path and the
strict marginal path side by side:

```{r strict-rsm-pcm}
fit_rsm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7,
  maxit = 30
)
diag_rsm_strict <- diagnose_mfrm(
  fit_rsm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

fit_pcm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "PCM",
  step_facet = "Criterion",
  quad_points = 7,
  maxit = 30
)
diag_pcm_strict <- diagnose_mfrm(
  fit_pcm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

summary(diag_rsm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]
summary(diag_pcm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]
```

When you want a compact simulation-based screening check for the strict branch,
use `evaluate_mfrm_diagnostic_screening()` on a small design:

```{r strict-screening}
screen_rsm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "local_dependence"),
  model = "RSM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)
screen_pcm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "step_structure_misspecification"),
  model = "PCM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)

screen_rsm$performance_summary[, c("Scenario", "EvaluationUse", "LegacyAnyFlagRate", "StrictAnyFlagRate")]
screen_pcm$performance_summary[, c("Scenario", "EvaluationUse", "LegacySensitivityProxy", "StrictSensitivityProxy", "DeltaStrictMinusLegacyFlagRate")]
```

The same strict branch is now reflected in the reporting router:

```{r strict-reporting-route}
chk_rsm_strict <- reporting_checklist(fit_rsm_strict, diagnostics = diag_rsm_strict)
subset(
  chk_rsm_strict$checklist,
  Section == "Visual Displays" &
    Item %in% c("QC / facet dashboard", "Strict marginal visuals", "Precision / information curves"),
  c("Item", "Available", "DraftReady", "NextAction")
)
```

## Residual PCA and Reporting

```{r residual-pca}
pca <- analyze_residual_pca(diag_pca, mode = "both")
plot_residual_pca(pca, mode = "overall", plot_type = "scree")
```

```{r bias-apa}
data("mfrmr_example_bias", package = "mfrmr")
bias_df <- mfrmr_example_bias
fit_bias <- fit_mfrm(
  bias_df,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)
diag_bias <- diagnose_mfrm(fit_bias, residual_pca = "none")
bias <- estimate_bias(fit_bias, diag_bias, facet_a = "Rater", facet_b = "Criterion")
fixed <- build_fixed_reports(bias)
apa <- build_apa_outputs(fit_bias, diag_bias, bias_results = bias)

mfrm_threshold_profiles()
vis <- build_visual_summaries(fit_bias, diag_bias, threshold_profile = "standard")
vis$warning_map$residual_pca_overall
```

The same `example_bias` dataset also carries a `Group` variable so DIF-oriented
examples can show a non-null pattern instead of a fully clean result. It can be
loaded either with `load_mfrmr_data("example_bias")` or
`data("mfrmr_example_bias", package = "mfrmr")`.

## Human-Readable Reporting API

```{r reporting-api}
spec <- specifications_report(fit, title = "Study run")
data_qc <- data_quality_report(
  fit,
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score"
)
iter <- estimation_iteration_report(fit, max_iter = 8)
subset_rep <- subset_connectivity_report(fit, diagnostics = diag)
facet_stats <- facet_statistics_report(fit, diagnostics = diag)
cat_structure <- category_structure_report(fit, diagnostics = diag)
cat_curves <- category_curves_report(fit, theta_points = 101)
bias_rep <- bias_interaction_report(bias, top_n = 20)
plot_bias_interaction(bias_rep, plot = "scatter")
```

## Design Simulation and Prediction

The package also supports a separate simulation/prediction layer. The key
distinction is:

- `evaluate_mfrm_recovery()` checks whether known generating parameters are
  recovered under a stated simulation design. It is the first simulation check
  to run when you are validating a model specification or a planned design.
- `evaluate_mfrm_design()` and `predict_mfrm_population()` are
  design-level helpers that summarize expected operating characteristics under
  an explicit simulation specification.
- `predict_mfrm_units()` and `sample_mfrm_plausible_values()` score future or
  partially observed persons under a fixed `MML` calibration.

```{r design-prediction}
sim_spec <- build_mfrm_sim_spec(
  n_person = 30,
  n_rater = 4,
  n_criterion = 4,
  raters_per_person = 2,
  assignment = "rotating"
)

recovery <- suppressWarnings(
  evaluate_mfrm_recovery(
    sim_spec = sim_spec,
    reps = 2,
    maxit = 30,
    seed = 2
  )
)

summary(recovery)$recovery_summary[, c("ParameterType", "Facet", "RMSE", "Bias")]
plot(recovery, type = "summary", metric = "rmse", draw = FALSE)$data$plot_table

recovery_review <- assess_mfrm_recovery(
  recovery,
  min_reps = 2,
  min_se_available = NULL,
  max_mcse_rmse_ratio = NULL,
  max_rmse = c(facet = 1, step = 1, default = 1.5),
  max_abs_bias = c(default = 0.75)
)

summary(recovery_review)$checklist[, c("Section", "Item", "Status")]
status_plot <- plot(recovery_review, type = "status", draw = FALSE)
status_plot$data$section_status
status_plot$data$reading_order

metric_plot <- plot(recovery_review, type = "metrics", metric = "rmse", draw = FALSE)
metric_plot$data$plot_table
metric_plot$data$guidance

recovery_bundle <- build_summary_table_bundle(
  recovery_review,
  appendix_preset = "recommended"
)
recovery_bundle$table_index[, c("Table", "Rows", "Role")]

# For a release-scale check, use the optional validation protocol:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(tier = "core", output_dir = "recovery-validation")
# summary(validation)
# validation_summary <- mfrmr_summarize_recovery_validation(validation)
# validation_summary$topline_release_decision
# validation_summary$release_decision_table
# validation_summary$domain_decision_table
# The top-line table is the release-recovery conclusion. It uses recovery
# metrics, convergence, and Monte Carlo precision as the primary evidence.
# The domain table keeps uncertainty status separate, so OverallStatus =
# "review" is not read as a recovery-metric failure by itself.

pred_pop <- predict_mfrm_population(
  sim_spec = sim_spec,
  reps = 2,
  maxit = 30,
  seed = 1
)

summary(pred_pop)$forecast[, c("Facet", "MeanSeparation", "McseSeparation")]

keep_people <- unique(toy$Person)[1:18]
toy_mml <- suppressWarnings(
  fit_mfrm(
    toy[toy$Person %in% keep_people, , drop = FALSE],
    person = "Person",
    facets = c("Rater", "Criterion"),
    score = "Score",
    method = "MML",
    quad_points = 5,
    maxit = 30
  )
)

new_units <- data.frame(
  Person = c("NEW01", "NEW01"),
  Rater = unique(toy$Rater)[1],
  Criterion = unique(toy$Criterion)[1:2],
  Score = c(2, 3)
)

pred_units <- predict_mfrm_units(toy_mml, new_units, n_draws = 0)
pv_units <- sample_mfrm_plausible_values(toy_mml, new_units, n_draws = 2, seed = 1)

summary(pred_units)$estimates[, c("Person", "Estimate", "Lower", "Upper")]
summary(pv_units)$draw_summary[, c("Person", "Draws", "MeanValue")]
```

For a report or appendix handoff, pass the recovery objects through the same
summary-table export route used by the rest of the package:

```{r recovery-export, eval = FALSE}
export_summary_appendix(
  list(recovery = recovery, recovery_review = recovery_review),
  output_dir = tempdir(),
  prefix = "mfrmr_recovery_appendix",
  preset = "recommended",
  include_html = FALSE,
  overwrite = TRUE
)
```

For a quick smoke test, `reps = 2` or another very small value is useful only to
check that the data-generating setup and refit path run. For a study report,
increase `reps`, keep the ADEMP-style metadata in the exported tables, and set
substantive RMSE/Bias thresholds so that `assess_mfrm_recovery()` can mark those
rows as `ok`, `review`, or `concern` rather than `not_assessed`.