---
title: "Analysis of the Minimum Discriminant Information Statistic"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Analysis of the Minimum Discriminant Information Statistic}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(ordinalTables)
```

# Analysis of the Minimum Discriminant Information Statistic (MDIS)

## Data

For this vignette we will again use the vision data (vision_data).

```{r vision_data}
vision_data
```

Row sums = `r rowSums(vision_data)`

Column sums = `r colSums(vision_data)`

## The Minimum Discriminant Information Statistic

For background on the measure, refer to Ireland, Ku, and Kullback (1969). Symmetry and marginal homogeneity of an r × r contingency table, Journal of the American Statistical Association, 64(328), 1323-1341. In that paper, Ireland et al. give procedures for fitting three relevant models: marginal homogeneity, symmetry, and quasi-symmetry. They also show how the MDIS can be broken down for nested tests of these models.

Like chi-square, MDIS is minimized when fitting models to a table. Similarly, it can be compared to a central chi-square distribution when assessing the quality of the fit. Finally, the measure is additive, which means that the fit of a restrictive model can be broken down into components for less restrictive models. For example, symmetry is equivalent to marginal homogeneity together with quasi-symmetry. Because of the additive nature of MDIS, MDIS(symmetry) = MDIS(marginal homogeneity) + MDIS(quasi-symmetry) (see below).

One unique aspect of the Ireland et al. paper is their observation that the diagonal entries of the table are not relevant to the comparison of models, and so they give algorithms both for analyzing the full table and for analyzing "truncated" tables, which exclude the diagonal entries.
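To make the statistic concrete, the following is a minimal base-R sketch of the MDIS for the symmetry model, which requires cells n[i, j] and n[j, i] to have equal expected counts. It assumes the fitted counts are proportional to the geometric mean of each pair of mirror-image cells, and that the statistic is twice the sum of fitted counts times the log ratio of fitted to observed counts. The table `x` and the function `mdis_symmetry_sketch()` are hypothetical names made up for illustration; this is not the package's implementation, and its results may differ in detail from Ireland_symmetry().

```r
# Hypothetical 3 x 3 table of counts (made up for illustration; not vision_data)
x <- matrix(c(20, 10,  5,
               6, 30,  8,
               3,  4, 25), nrow = 3, byrow = TRUE)

# Sketch only: MDIS for the symmetry model under the assumptions stated above.
mdis_symmetry_sketch <- function(x, truncated = FALSE) {
  if (truncated) diag(x) <- 0            # drop the diagonal cells from the analysis
  n <- sum(x)                            # total count being analyzed
  g <- sqrt(x * t(x))                    # geometric mean of x[i, j] and x[j, i]
  x_star <- n * g / sum(g)               # symmetric fitted counts with total n
  keep <- x_star > 0                     # cells fitted as zero contribute nothing
  mdis <- 2 * sum(x_star[keep] * log(x_star[keep] / x[keep]))
  r <- nrow(x)
  list(mdis = mdis, df = r * (r - 1) / 2, x_star = x_star)
}

full <- mdis_symmetry_sketch(x)
trunc_fit <- mdis_symmetry_sketch(x, truncated = TRUE)

# Upper-tail chi-square p-values for the two fits
pchisq(full$mdis, full$df, lower.tail = FALSE)
pchisq(trunc_fit$mdis, trunc_fit$df, lower.tail = FALSE)
```

In this sketch the fitted table is symmetric by construction and keeps the observed total, so the statistic measures only how far the observed counts are from their symmetrized version.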
Each of the functions Ireland_symmetry(), Ireland_marginal_homogeneity(), and Ireland_quasi_symmetry() takes an optional logical parameter "truncated", which if TRUE excludes the diagonal cells from the analysis and from the computation of the fit measure. The default is truncated = FALSE, which includes the diagonal.

## Symmetry

Symmetry is the strongest model for square tables. It implies both marginal homogeneity and the weaker quasi-symmetry. Symmetry says that the cells on either side of the diagonal (n[i, j] and n[j, i]) are equal except for sampling variation. From the MDIS perspective, the function to fit this model is Ireland_symmetry().

```{r ireland_symmetry}
symmetry <- Ireland_symmetry(vision_data)
```

For the vision data, this model gives an MDIS value of `r symmetry$mdis` on `r symmetry$df` degrees of freedom, with an associated p-value of `r pchisq(symmetry$mdis, symmetry$df, lower.tail=FALSE)`. So symmetry is rejected for the vision data.

A related test for symmetry is Bowker_symmetry().

```{r bowker}
bowker <- Bowker_symmetry(vision_data)
```

This gives a similar result, with a Pearson X^2 of `r bowker$chisq` on `r bowker$df` degrees of freedom.

## Marginal Homogeneity

Testing hypotheses about the margins is covered in detail in the vignette "Checking Whether the Margins are Ordered." For the purpose of this vignette, the marginal homogeneity model is fit using the function Ireland_marginal_homogeneity().

```{r marginal_homogeneity}
marginal_homogeneity <- Ireland_marginal_homogeneity(vision_data)
```

The MDIS for the marginal homogeneity model is `r marginal_homogeneity$mdis` on `r marginal_homogeneity$df` degrees of freedom, with a p-value of `r pchisq(marginal_homogeneity$mdis, marginal_homogeneity$df, lower.tail=FALSE)`, leading us to reject marginal homogeneity for the vision data.

## Quasi-symmetry

Quasi-symmetry is defined in terms of an underlying d(ij) which is symmetric, d(ij) = d(ji). Then quasi-symmetry states (p.
1325) that the cell probability p(ij) is a function of d(ij) multiplied by a row effect p(i) and a column effect q(j): p(ij) = p(i)q(j)d(ij). Quasi-symmetry and its variations are examined in more detail in the vignette "Models to Fit to Square Tables".

Because MDIS(symmetry) = MDIS(marginal homogeneity) + MDIS(quasi-symmetry), the MDIS for quasi-symmetry can be computed by subtraction: MDIS(symmetry) - MDIS(marginal homogeneity). This is computed using Ireland_quasi_symmetry():

```{r quasi1}
quasi1 <- Ireland_quasi_symmetry(vision_data)
```

which yields an MDIS value of `r quasi1$mdis` on `r quasi1$df` degrees of freedom.

If there is a desire to fit the quasi-symmetry model directly (for example, to see predicted values), Ireland_quasi_symmetry_model() can be used:

```{r quasi2}
quasi2 <- Ireland_quasi_symmetry_model(vision_data)
```

which gives a nearly identical MDIS of `r quasi2$mdis` on `r quasi2$df` degrees of freedom. This is not significant, p = `r pchisq(quasi2$mdis, quasi2$df, lower.tail=FALSE)`, and so the hypothesis of quasi-symmetry is not rejected for the vision data.

Finally, we note that fitting the models with the option truncated = TRUE yields nearly identical results:

```{r truncate}
truncated_sym <- Ireland_symmetry(vision_data, truncated=TRUE)
truncated_mh <- Ireland_marginal_homogeneity(vision_data, truncated=TRUE)
truncated_qs <- Ireland_quasi_symmetry_model(vision_data, truncated=TRUE)
```

with MDIS values of `r truncated_sym$mdis`, `r truncated_mh$mdis`, and `r truncated_qs$mdis`, leading to the same conclusions. Note that when truncated = TRUE, the diagonal cells of the estimated frequencies x_star and the estimated proportions p_star are returned as 0.0.
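The subtraction used for quasi-symmetry applies to the degrees of freedom as well as to the statistic. The base-R sketch below illustrates that bookkeeping, assuming the standard full-table degrees of freedom for an r x r table. The helper names, the table size `r <- 4`, and the two MDIS values are hypothetical placeholders chosen for illustration; they are not output from the package or results for the vision data.

```r
# Standard degrees of freedom for an r x r table (full table, diagonal included)
df_symmetry <- function(r) r * (r - 1) / 2
df_marginal_homogeneity <- function(r) r - 1
df_quasi_symmetry <- function(r) (r - 1) * (r - 2) / 2

r <- 4  # assumed table size, for illustration only

# The degrees of freedom decompose the same way the MDIS does
stopifnot(df_symmetry(r) == df_marginal_homogeneity(r) + df_quasi_symmetry(r))

# Upper-tail chi-square p-value for an MDIS value
mdis_p_value <- function(mdis, df) pchisq(mdis, df, lower.tail = FALSE)

# Hypothetical MDIS values (placeholders, not results from the vision data)
mdis_sym <- 12.0
mdis_mh <- 9.5

# Quasi-symmetry component obtained by subtraction, with its p-value
mdis_qs <- mdis_sym - mdis_mh
p_qs <- mdis_p_value(mdis_qs, df_quasi_symmetry(r))
p_qs
```

With real output, the same arithmetic can be applied to the mdis and df elements returned by the fitting functions shown above.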