--- title: "mfrmr Workflow" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{mfrmr Workflow} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} is_cran_check <- !isTRUE(as.logical(Sys.getenv("NOT_CRAN", "false"))) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5, eval = !is_cran_check ) ``` This vignette outlines a reproducible workflow for: - loading packaged simulation data - fitting an MFRM with flexible facets - running diagnostics and residual PCA - generating APA and visual summary outputs - moving from fitted models into design simulation and fixed-calibration prediction For a plot-first companion guide, see the separate `mfrmr-visual-diagnostics` vignette. For speed-sensitive work, a useful pattern is: - start with `method = "JML"` or `quad_points = 7` - use `diagnose_mfrm(..., residual_pca = "none")` for the first pass - reuse the same diagnostics object in downstream reports and plots ## MML and Diagnostic Modes `mfrmr` treats `MML` and `JML` differently on purpose. - `MML` integrates over the person distribution with Gauss-Hermite quadrature. - The current `MML` branch optimizes the quadrature-based marginal log-likelihood directly; it is not an EM implementation. - `JML` is often useful for quick exploratory passes, but `MML` remains the preferred route for final estimation and fixed-calibration follow-up. For `RSM` and `PCM`, diagnostics now expose two distinct evidence paths: - `diagnostic_mode = "legacy"` keeps the residual/EAP-based stack. - `diagnostic_mode = "marginal_fit"` adds the strict latent-integrated screen. - `diagnostic_mode = "both"` is the safest default when you want to inspect both views side by side. The strict marginal branch is still screening-oriented in the current release. Use `summary(diag)$diagnostic_basis` to separate the legacy residual evidence from the strict marginal evidence rather than pooling them into one decision. ## Load Data ```{r load-data} library(mfrmr) list_mfrmr_data() data("ej2021_study1", package = "mfrmr") head(ej2021_study1) study1_alt <- load_mfrmr_data("study1") identical(names(ej2021_study1), names(study1_alt)) ``` ## Minimal Runnable Example Start with the packaged `example_core` dataset. It is intentionally compact, category-complete, and generated from a single latent trait plus facet main effects so that help-page examples stay fast without relying on undersized toy data. The same object is also available via `data("mfrmr_example_core", package = "mfrmr")`: ```{r toy-setup} data("mfrmr_example_core", package = "mfrmr") toy <- mfrmr_example_core fit_toy <- fit_mfrm( data = toy, person = "Person", facets = c("Rater", "Criterion"), score = "Score", method = "JML", model = "RSM", maxit = 30 ) diag_toy <- diagnose_mfrm(fit_toy, residual_pca = "none") summary(fit_toy)$overview summary(diag_toy)$overview names(plot(fit_toy, draw = FALSE)) ``` ## Diagnostics and Reporting ```{r diagnostics-reporting} t4_toy <- unexpected_response_table( fit_toy, diagnostics = diag_toy, abs_z_min = 1.5, prob_max = 0.4, top_n = 10 ) t12_toy <- fair_average_table(fit_toy, diagnostics = diag_toy) t13_toy <- bias_interaction_report( estimate_bias(fit_toy, diag_toy, facet_a = "Rater", facet_b = "Criterion", max_iter = 2), top_n = 10 ) class(summary(t4_toy)) class(summary(t12_toy)) class(summary(t13_toy)) names(plot(t4_toy, draw = FALSE)) names(plot(t12_toy, draw = FALSE)) names(plot(t13_toy, draw = FALSE)) chk_toy <- reporting_checklist(fit_toy, diagnostics = diag_toy) subset( chk_toy$checklist, Section == "Visual Displays", c("Item", "DraftReady", "NextAction") ) ``` ## Fit and Diagnose with Full Data For a realistic analysis, use the packaged Study 1 dataset: ```{r fit-full} fit <- fit_mfrm( data = ej2021_study1, person = "Person", facets = c("Rater", "Criterion"), score = "Score", method = "MML", model = "RSM", quad_points = 7 ) diag <- diagnose_mfrm( fit, residual_pca = "none" ) summary(fit) summary(diag) ``` If you need residual-structure evidence for a final report, you can add residual PCA after the initial diagnostic pass. Treat this as an exploratory screen, not as a standalone unidimensionality test or as a DIMTEST/UNIDIM substitute. In MFRM reporting, a cautious claim should combine global residual fit, element-level fit, residual PCA, and local-dependence screens, for example: "evidence consistent with essential unidimensionality under the specified facet structure." ```{r fit-full-pca} diag_pca <- diagnose_mfrm( fit, residual_pca = "both", pca_max_factors = 6 ) summary(diag_pca) ``` ## Strict Diagnostics for RSM and PCM For `RSM` and `PCM`, the package can now keep the legacy residual path and the strict marginal path side by side: ```{r strict-rsm-pcm} fit_rsm_strict <- fit_mfrm( data = toy, person = "Person", facets = c("Rater", "Criterion"), score = "Score", method = "MML", model = "RSM", quad_points = 7, maxit = 30 ) diag_rsm_strict <- diagnose_mfrm( fit_rsm_strict, diagnostic_mode = "both", residual_pca = "none" ) fit_pcm_strict <- fit_mfrm( data = toy, person = "Person", facets = c("Rater", "Criterion"), score = "Score", method = "MML", model = "PCM", step_facet = "Criterion", quad_points = 7, maxit = 30 ) diag_pcm_strict <- diagnose_mfrm( fit_pcm_strict, diagnostic_mode = "both", residual_pca = "none" ) summary(diag_rsm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")] summary(diag_pcm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")] ``` When you want a compact simulation-based screening check for the strict branch, use `evaluate_mfrm_diagnostic_screening()` on a small design: ```{r strict-screening} screen_rsm <- evaluate_mfrm_diagnostic_screening( design = list(person = 18, rater = 3, criterion = 3, assignment = 3), reps = 1, scenarios = c("well_specified", "local_dependence"), model = "RSM", maxit = 30, quad_points = 7, seed = 123 ) screen_pcm <- evaluate_mfrm_diagnostic_screening( design = list(person = 18, rater = 3, criterion = 3, assignment = 3), reps = 1, scenarios = c("well_specified", "step_structure_misspecification"), model = "PCM", maxit = 30, quad_points = 7, seed = 123 ) screen_rsm$performance_summary[, c("Scenario", "EvaluationUse", "LegacyAnyFlagRate", "StrictAnyFlagRate")] screen_pcm$performance_summary[, c("Scenario", "EvaluationUse", "LegacySensitivityProxy", "StrictSensitivityProxy", "DeltaStrictMinusLegacyFlagRate")] ``` The same strict branch is now reflected in the reporting router: ```{r strict-reporting-route} chk_rsm_strict <- reporting_checklist(fit_rsm_strict, diagnostics = diag_rsm_strict) subset( chk_rsm_strict$checklist, Section == "Visual Displays" & Item %in% c("QC / facet dashboard", "Strict marginal visuals", "Precision / information curves"), c("Item", "Available", "DraftReady", "NextAction") ) ``` ## Residual PCA and Reporting ```{r residual-pca} pca <- analyze_residual_pca(diag_pca, mode = "both") plot_residual_pca(pca, mode = "overall", plot_type = "scree") ``` ```{r bias-apa} data("mfrmr_example_bias", package = "mfrmr") bias_df <- mfrmr_example_bias fit_bias <- fit_mfrm( bias_df, person = "Person", facets = c("Rater", "Criterion"), score = "Score", method = "MML", model = "RSM", quad_points = 7 ) diag_bias <- diagnose_mfrm(fit_bias, residual_pca = "none") bias <- estimate_bias(fit_bias, diag_bias, facet_a = "Rater", facet_b = "Criterion") fixed <- build_fixed_reports(bias) apa <- build_apa_outputs(fit_bias, diag_bias, bias_results = bias) mfrm_threshold_profiles() vis <- build_visual_summaries(fit_bias, diag_bias, threshold_profile = "standard") vis$warning_map$residual_pca_overall ``` The same `example_bias` dataset also carries a `Group` variable so DIF-oriented examples can show a non-null pattern instead of a fully clean result. It can be loaded either with `load_mfrmr_data("example_bias")` or `data("mfrmr_example_bias", package = "mfrmr")`. ## Human-Readable Reporting API ```{r reporting-api} spec <- specifications_report(fit, title = "Study run") data_qc <- data_quality_report( fit, data = ej2021_study1, person = "Person", facets = c("Rater", "Criterion"), score = "Score" ) iter <- estimation_iteration_report(fit, max_iter = 8) subset_rep <- subset_connectivity_report(fit, diagnostics = diag) facet_stats <- facet_statistics_report(fit, diagnostics = diag) cat_structure <- category_structure_report(fit, diagnostics = diag) cat_curves <- category_curves_report(fit, theta_points = 101) bias_rep <- bias_interaction_report(bias, top_n = 20) plot_bias_interaction(bias_rep, plot = "scatter") ``` ## Design Simulation and Prediction The package also supports a separate simulation/prediction layer. The key distinction is: - `evaluate_mfrm_recovery()` checks whether known generating parameters are recovered under a stated simulation design. It is the first simulation check to run when you are validating a model specification or a planned design. - `evaluate_mfrm_design()` and `predict_mfrm_population()` are design-level helpers that summarize expected operating characteristics under an explicit simulation specification. - `predict_mfrm_units()` and `sample_mfrm_plausible_values()` score future or partially observed persons under a fixed `MML` calibration. ```{r design-prediction} sim_spec <- build_mfrm_sim_spec( n_person = 30, n_rater = 4, n_criterion = 4, raters_per_person = 2, assignment = "rotating" ) recovery <- suppressWarnings( evaluate_mfrm_recovery( sim_spec = sim_spec, reps = 2, maxit = 30, seed = 2 ) ) summary(recovery)$recovery_summary[, c("ParameterType", "Facet", "RMSE", "Bias")] plot(recovery, type = "summary", metric = "rmse", draw = FALSE)$data$plot_table recovery_review <- assess_mfrm_recovery( recovery, min_reps = 2, min_se_available = NULL, max_mcse_rmse_ratio = NULL, max_rmse = c(facet = 1, step = 1, default = 1.5), max_abs_bias = c(default = 0.75) ) summary(recovery_review)$checklist[, c("Section", "Item", "Status")] status_plot <- plot(recovery_review, type = "status", draw = FALSE) status_plot$data$section_status status_plot$data$reading_order metric_plot <- plot(recovery_review, type = "metrics", metric = "rmse", draw = FALSE) metric_plot$data$plot_table metric_plot$data$guidance recovery_bundle <- build_summary_table_bundle( recovery_review, appendix_preset = "recommended" ) recovery_bundle$table_index[, c("Table", "Rows", "Role")] # For a release-scale check, use the optional validation protocol: # source(system.file("validation", "recovery-validation.R", package = "mfrmr")) # validation <- mfrmr_run_recovery_validation(tier = "core", output_dir = "recovery-validation") # summary(validation) # validation_summary <- mfrmr_summarize_recovery_validation(validation) # validation_summary$topline_release_decision # validation_summary$release_decision_table # validation_summary$domain_decision_table # The top-line table is the release-recovery conclusion. It uses recovery # metrics, convergence, and Monte Carlo precision as the primary evidence. # The domain table keeps uncertainty status separate, so OverallStatus = # "review" is not read as a recovery-metric failure by itself. pred_pop <- predict_mfrm_population( sim_spec = sim_spec, reps = 2, maxit = 30, seed = 1 ) summary(pred_pop)$forecast[, c("Facet", "MeanSeparation", "McseSeparation")] keep_people <- unique(toy$Person)[1:18] toy_mml <- suppressWarnings( fit_mfrm( toy[toy$Person %in% keep_people, , drop = FALSE], person = "Person", facets = c("Rater", "Criterion"), score = "Score", method = "MML", quad_points = 5, maxit = 30 ) ) new_units <- data.frame( Person = c("NEW01", "NEW01"), Rater = unique(toy$Rater)[1], Criterion = unique(toy$Criterion)[1:2], Score = c(2, 3) ) pred_units <- predict_mfrm_units(toy_mml, new_units, n_draws = 0) pv_units <- sample_mfrm_plausible_values(toy_mml, new_units, n_draws = 2, seed = 1) summary(pred_units)$estimates[, c("Person", "Estimate", "Lower", "Upper")] summary(pv_units)$draw_summary[, c("Person", "Draws", "MeanValue")] ``` For a report or appendix handoff, pass the recovery objects through the same summary-table export route used by the rest of the package: ```{r recovery-export, eval = FALSE} export_summary_appendix( list(recovery = recovery, recovery_review = recovery_review), output_dir = tempdir(), prefix = "mfrmr_recovery_appendix", preset = "recommended", include_html = FALSE, overwrite = TRUE ) ``` For a quick smoke test, `reps = 2` or another very small value is useful only to check that the data-generating setup and refit path run. For a study report, increase `reps`, keep the ADEMP-style metadata in the exported tables, and set substantive RMSE/Bias thresholds so that `assess_mfrm_recovery()` can mark those rows as `ok`, `review`, or `concern` rather than `not_assessed`.