---
title: "Models to Fit to Square Tables"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Models to Fit to Square Tables}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(ordinalTables)
```

# Tests for Square Tables

This vignette looks at the related set of models that can be fit to a square table. Specifically, it examines the models of symmetry, marginal homogeneity, and versions of quasi-symmetry. The material parallels Chapter 11 of Agresti (1984). Measures based on minimizing the Minimum Discriminant Information Statistic are discussed in the vignette "Analysis of the Minimum Discriminant Information Statistic".

## Data

The data on the visual acuity of women working at the Royal Ordnance factory, vision_data, will be used.

```{r vision}
vision_data
```

## Symmetry

The test of symmetry is Bowker_symmetry().

```{r bowker}
bowker <- Bowker_symmetry(vision_data)
```

This yields a significant X^2 of `r bowker$chisq` on `r bowker$df` degrees of freedom, so the hypothesis of symmetry is rejected.

## Marginal Homogeneity

There are two tests of marginal homogeneity, Stuart_marginal_homogeneity() and Bhapkar_marginal_homogeneity().

```{r marginal}
stuart <- Stuart_marginal_homogeneity(vision_data)
bhapkar <- Bhapkar_marginal_homogeneity(vision_data)
```

The two tests yield similar X^2 values for these data, `r stuart$chisq` and `r bhapkar$chisq` on `r bhapkar$df` degrees of freedom, so the hypothesis of marginal homogeneity is rejected. See the related vignette "Checking of Margins are (Stochastically) Ordered".

## Quasi-Symmetry

Quasi-symmetry can be fit as a general log-linear model by specifying the correct design matrix X.
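
The log-linear route can be sketched with base R alone. The chunk below is a minimal illustration, not the method used inside ordinalTables: it assumes a small made-up 3 x 3 table (not vision_data) and lets glm() build the design matrix from a row factor, a column factor, and a hypothetical "pair" factor that gives cells (i, j) and (j, i) the same level.

```{r quasi_loglinear_sketch}
# Illustrative 3 x 3 table of counts (made-up numbers, not vision_data).
counts <- matrix(c(50, 12,  8,
                   20, 60, 15,
                    5, 10, 40),
                 nrow = 3, byrow = TRUE)

# Row and column index of every cell, in the same column-major order
# that as.vector() uses for the counts.
i <- as.vector(row(counts))
j <- as.vector(col(counts))

cells <- data.frame(
  n    = as.vector(counts),
  row  = factor(i),
  col  = factor(j),
  # Cells (i, j) and (j, i) share one level, so the association
  # term is symmetric by construction.
  pair = factor(paste(pmin(i, j), pmax(i, j), sep = "-"))
)

# Quasi-symmetry: row effects + column effects + symmetric association.
# Some coefficients are aliased (reported as NA); this is expected.
qs_fit <- glm(n ~ row + col + pair, family = poisson, data = cells)

# The residual deviance is the likelihood ratio G^2; for an r x r table
# the residual degrees of freedom are (r - 1)(r - 2) / 2.
c(G2 = deviance(qs_fit), df = df.residual(qs_fit))
```

Calling model.matrix(qs_fit) shows the design matrix X that this formula implies.
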
There is also a direct test of the hypothesis, Bhapkar_quasi_symmetry().

```{r quasi}
quasi <- Bhapkar_quasi_symmetry(vision_data)
```

This is non-significant, `r quasi$chisq` on `r quasi$df` degrees of freedom, with a p-value of `r pchisq(quasi$chisq, quasi$df, lower.tail=FALSE)`.

## Variations of Quasi-Symmetry

The basic quasi-symmetry model can be described as modeling a cell frequency p(ij) as a function of a row effect p(i), a column effect q(j), and a symmetry effect d(ij), where d(ij) = d(ji) for all i and j. Then p(ij) = p(i)q(j)d(ij). This is a simple log-linear model and can be fit that way.

Several authors have suggested specializations of the basic quasi-symmetry model. McCullagh (1978) gives four: quasi-symmetry, conditional symmetry, palindromic symmetry, and generalized palindromic symmetry. The last two models get their name from the property that the categories cannot be arbitrarily rearranged and still have the model hold. Instead, the only change in the order of categories that maintains the model is a complete reversal of the categories, where, for example, 1 -> 4, 2 -> 3, 3 -> 2, and 4 -> 1.

Fitting each of McCullagh's models to the vision data is straightforward.

```{r quasi2}
conditional <- McCullagh_conditional_symmetry(vision_data)
quasi2 <- McCullagh_quasi_symmetry(vision_data)
palindrome <- McCullagh_palindromic_symmetry(vision_data)
gen_palindrome <- McCullagh_generalized_palindromic_symmetry(vision_data)
```

For conditional symmetry, the Pearson X^2 is `r conditional$chisq` and the likelihood ratio G^2 is `r conditional$g_squared`, both on `r conditional$df` degrees of freedom. The asymmetry parameter is `r conditional$theta`.

For quasi-symmetry, the Pearson X^2 is `r quasi2$chisq` on `r quasi2$df` degrees of freedom. The vector of asymmetry parameters is `r quasi2$alpha`, but alpha[1] is constrained to be 1.0.

Palindromic symmetry yields a vector of asymmetry parameters alpha as well as a general asymmetry parameter delta.
The model takes a few seconds to run (this will improve in subsequent releases). The alpha vector is `r palindrome$alpha` (recall alpha[1] is constrained to be 1.0), and the asymmetry parameter delta is estimated to be `r palindrome$delta` with a standard error of `r palindrome$sigma_delta`, for a z-score of `r palindrome$delta / palindrome$sigma_delta`. The overall fit of the model is good, `r palindrome$chisq` on `r palindrome$df` degrees of freedom.

Generalized palindromic symmetry has the same basic structure as palindromic symmetry, but there is a vector of delta parameters instead of just one. The fit of the model is good (`r gen_palindrome$chisq` on `r gen_palindrome$df` degrees of freedom). The alpha vector is `r gen_palindrome$alpha` and the delta vector of asymmetry parameters is `r gen_palindrome$delta_vec`.

Goodman proposed a different set of constraints, this time on the diagonals that parallel the main diagonal (e.g., m_ij where |i - j| = k). The Goodman_diagonals_parameter_symmetry() model specifies that the cells on each such diagonal all deviate from symmetry by the same amount.

```{r goodman}
diagonal <- Goodman_diagonals_parameter_symmetry(vision_data)

# TRUE marks the delta parameters constrained to be equal (here the last two).
equal <- c(FALSE, TRUE, TRUE)
constrained_diagonal <- Goodman_constrained_diagonals_parameter_symmetry(vision_data, equal)

# TRUE marks the delta parameters held fixed at the values given in delta;
# the remaining entries of delta serve as starting values.
fixed <- c(FALSE, TRUE, FALSE)
delta <- c(1.0, 1.0, 1.0)
fixed_diagonal <- Goodman_fixed_parameter(vision_data, delta, fixed)
```

The original diagonals parameter model has one parameter per diagonal (r - 1 for an r x r table). These parameters are in the delta vector, `r diagonal$delta`. The basic model fits the data well, with X^2 = `r diagonal$omnibus_chisq` (omnibus_chisq) on `r diagonal$omnibus_df` (omnibus_df) degrees of freedom. A fit measure is also returned for the model that constrains all delta parameters to be equal, in equality_chisq and equality_df.

It is possible to constrain a subset of the delta parameters to be equal using Goodman_constrained_diagonals_parameter_symmetry().
This takes two arguments, the data matrix and a logical vector indicating whether the corresponding parameter is part of the equality-constrained set. The example constrains the last two parameters to be equal. The estimate of the common parameter, `r constrained_diagonal$delta_pooled`, is available in the delta_pooled member of the result. Looking at the members that start with "pooled", the X^2 of `r constrained_diagonal$pooled_chisq` on `r constrained_diagonal$pooled_df` degrees of freedom indicates that this model fits very well too. The equality test here is somewhat misleading: it tests the hypothesis that the remaining deltas and the pooled delta are all equal.

The other option with the diagonals symmetry model is to fix certain delta parameters at specified values. A parameter is fixed by specifying TRUE in the corresponding element of the logical vector fixed; FALSE should be specified for the parameters to be estimated. The values are specified in the vector delta. Values for the non-constrained parameters serve as starting values. They should be positive, and 1.0 seems to work well. The function is Goodman_fixed_parameter(), which takes the data matrix, the delta values, and the fixed/free logical vector.

When this model is fit to the vision data, constraining delta[2] = 1.0, the fit is still excellent at X^2 = `r fixed_diagonal$chisq` on `r fixed_diagonal$df` degrees of freedom. Examining the delta vector `r fixed_diagonal$delta` shows that element 2 was indeed held constant at its value of 1.0.

Agresti proposed a simplified version of the diagonals parameter symmetry model using a single delta parameter. For the diagonal |i - j| = k, the diagonal parameter is delta^k.

```{r agresti_diagonal}
agresti <- Agresti_simple_diagonals_parameter_quasi_symmetry(vision_data)
```

The model returns an acceptably low X^2 (`r agresti$chisq`) on `r agresti$df` degrees of freedom.
The test of the parameter beta (`r agresti$beta`, with a standard error of `r agresti$se`) is significant, z = `r agresti$z`. Finally, the estimate of the diagonal parameter delta is `r agresti$delta`.
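
The way a single delta determines every diagonal can be tabulated in a line or two of base R. The sketch below assumes a hypothetical value of delta (1.2, chosen only for illustration and not an estimate from vision_data) and a 4 x 4 table; it simply evaluates delta^k for each diagonal k = |i - j|.

```{r agresti_delta_sketch}
# Hypothetical single diagonal parameter (illustrative value only,
# not an estimate from vision_data).
delta_hat <- 1.2

# Number of categories, assuming a 4 x 4 table.
r <- 4

# Implied parameter for each off-diagonal band k = |i - j|, k = 1, ..., r - 1.
k <- seq_len(r - 1)
implied <- delta_hat^k
names(implied) <- paste0("k=", k)
implied
```

With one parameter instead of r - 1, the implied diagonal parameters grow (or shrink) geometrically as the diagonal moves away from the main diagonal.
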