\documentclass[article,shortnames,nojss]{jss} %\VignetteIndexEntry{isocir: An R Package for Isotonic Inference for Circular Data.} %\VignetteDepends{combinat} %\VignetteDepends{circular} %\VignetteKeyWords{} %\VignettePackage{isocir} \usepackage{rotating , booktabs , latexsym , amsmath , graphicx,amssymb,amsfonts,mathrsfs,thumbpdf, subfigure, listings, slashbox, multirow} \usepackage[latin1]{inputenc} %\usepackage[nogin]{Sweave} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \SweaveOpts{engine=R, echo=T, eps=FALSE} \author{Sandra Barrag\'{a}n\\Universidad de Valladolid\And Miguel A. Fern\'{a}ndez\\Universidad de Valladolid\AND Cristina Rueda\\Universidad de Valladolid\And Shyamal Das Peddada\\ National Institute of \\Environmental Health Sciences} \title{\pkg{isocir}: An \proglang{R} Package for Constrained Inference using Isotonic Regression for Circular Data, with an Application to Cell Biology.} \Plainauthor{Sandra Barrag\'{a}n, Miguel A. Fern\'{a}ndez, Cristina Rueda, Shyamal Das Peddada} %% comma-separated \Plaintitle{isocir: An R Package for Constrained Inference using Isotonic Regression for Circular Data, with an Application to Cell Biology.} %% without formatting \Shorttitle{\pkg{isocir}: An \proglang{R} Package for Isotonic Inference for Circular Data.} %% a short title (if necessary) % an abstract and keywords \Abstract{ In many applications one may be interested in drawing inferences regarding the order of a collection of points on a unit circle. Due to the underlying geometry of the circle standard constrained inference procedures developed for Euclidean space data are not applicable. Recently, statistical inference for parameters under such order constraints on a unit circle was discussed in \citet{Rue09,Fer11}. In this paper we introduce an \proglang{R} package called \pkg{isocir} which provides a set of functions that can be used for analyzing angular data subject to order constraints on a unit circle. Since this work is motivated by applications in cell biology, we illustrate the proposed package using a relevant cell cycle data.\\} \Keywords{Circular Data, Isotropic Order, \emph{CIRE}, Conditional Test, R package \pkg{isocir}, \proglang{R}} \Plainkeywords{Isotonic estimation, circular data, isotropic order, conditional test, R package isocir} %% without formatting %% The address of (at least) one author should be given %% in the following format: \Address{ Sandra Barrag\'{a}n, Miguel A. Fern\'{a}ndez, Cristina Rueda\\ Departamento de Estad\'{\i}stica e Investigaci\'{o}n Operativa\\ Instituto de Matem\'{a}ticas (IMUVA)\\ Universidad de Valladolid\\ Valladolid, Spain\\ E-mail: \email{sandraba@eio.uva.es},\\ \email{miguelaf@eio.uva.es},\\ \email{crueda@eio.uva.es}\\ URL: \url{http://www.eio.uva.es/~sandra/}\\ \url{http://www.eio.uva.es/~miguel/}\\ \url{http://www.eio.uva.es/~cristina/}\\ Shyamal D. Peddada\\ Biostatistics Branch \\ National Institute of Environmental Health Sciences\\ Research Triangle Park\\ NC 27709, USA\\ E-mail: \email{peddada@niehs.nih.gov},\\ URL: \url{http://www.niehs.nih.gov/research/atniehs/labs/bb/staff/peddada/index.cfm}\\ } %% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document} <>= library("isocir") @ \section[Introduction]{Introduction} Circular data arise in a wide range of contexts, such as in geography, cell biology, circadian biology, endocrinology, ornithology, etc (cf \citet{Zar99} or \citet{Mar08}). This work is motivated by applications in cell cycle biology where one may want to draw inferences regarding angular parameters subject to order constraints on a unit circle. A cell cycle among eukaryotes follows a well-coordinated process where cells go through cycle four phases of distinct biological functions, namely, G1, S, G2 and M phase (see Figure~\ref{diagramcycle}).\\ \begin{figure}[htb] \begin{minipage}[b]{0.39\linewidth} \centering \includegraphics[width=80pt]{./esqbasico.pdf}\\ \caption{Phases of a cell cycle } \label{diagramcycle} \end{minipage} \begin{minipage}[b]{0.6\linewidth} Genes participating in the cell division cycle are often called cell cycle genes. A cell cycle gene has a periodic expression with its peak expression occurring just before its biological function (\citet{Jen06}). Since the periodic expression of a cell cycle gene can be mapped onto a unit circle, the angle corresponding to its peak expression is known as the phase angle of the gene (see \citet{Liu04} and Figure~\ref{phaseangle}).\\ \end{minipage} \end{figure} \begin{figure}[htb] \begin{minipage}[b]{0.66\linewidth} \vspace{-0.6cm} Since cell cycle is fundamental to the growth and development of an organism, cell biologists have been interested in understanding various aspects of cell cycle that are evolutionarily conserved. For instance, they would like to identify genes whose relative order of peak expressions is evolutionarily conserved. In order to solve such problems \citet{Rue09} introduced an order restriction on the unit circle, called cir- \vspace{-0.45cm} \end{minipage} \begin{minipage}[b]{0.33\linewidth} \vspace{-0.6cm} \centering \includegraphics[width=85pt]{./phi.pdf}\\ \caption{Phase angle ($\phi$)} \label{phaseangle} \vspace{-0.4cm} \end{minipage} \end{figure} cular order and extended the notion of isotonic regression estimator to circular parameter space by defining the circular isotonic regression estimator (CIRE). Using CIRE, \citet{Fer11} developed a formal statistical theory and methodology for testing whether the circular order of peak expression of a subset of cell cycle genes is conserved across multiple species. These statistical methods may have numerous other applications apart from cell cycle. With the increase in human survival rates, there is considerable interest in understanding neurological diseases related to aging such as the Alzheimer's disease (AD) and the Parkinson's disease (PD). An important aspect of such neurological diseases is the disruption of circadian clock and genes participating in it. Researchers are interested in testing differences in the phases of expression of circadian genes in normal and AD patients (\citet{Cer11}). Methodology discussed in this paper can be used for analyzing such data. Other areas of applications include; the study of migratory patterns and directions of birds (\citet{Coc04}), the changes in the wind directions (\citet{Bow00}), directional fluctuations in the atmosphere (\citet{vDo00}), psychology (studies of mental maps or monitoring data (\citet{Kib07}), the orientation of ridges in fingerprints or magnetic maps (\citet{Bol03}). Motivated by the wide range of applications and the non-existence of a user friendly software, in Section~\ref{sec3} we introduce the \pkg{isocir} package programmed in \proglang{R} environment, \cite{Rcore04}, which can be downloaded from \url{http://cran.r-project.org/web/packages/isocir/}. The package provides functions which can be used for drawing inferences regarding the order of a collection of points on a unit circle. In Section~\ref{sec2} we describe the statistical problem and the methodology of \citet{Rue09} and \citet{Fer11}. The \pkg{isocir} package is illustrated in Section~\ref{sec4} using the motivating cell cycle gene expression data. Some concluding remarks are provided in Section~\ref{sec5}. \clearpage \section[Circular models]{Angular Parameters under Order Constraints} \label{sec2} \subsection[Description Orders]{Circular Order Restriction} \label{orders} Let $\overline{\theta}=(\overline{\theta}_{1},\overline{\theta}_{2},\ldots,\overline{\theta}_{q})$ where each $\overline{\theta}_{i}$, $i=1, 2, \dots, q$ is the sample circular mean of a random sample of size $n_{i}$ from a population with unknown circular mean $\phi_{i}$. All angles are defined in a counter clockwise direction relative to a given pole. The mean resultant lengths for each population are denoted as $r_{1}, \ldots, r_{q}$ (see \citet{Mar00} for the definition of circular mean and mean resultant length of a set of angles). Then the problem of interest is to draw statistical inferences on $\phi_i$, $i=1, 2, \ldots, q$, subject to the constraint that the angles $\phi_1, \phi_2, \ldots, \phi_q$ are in a counter clockwise order on a unit circle. Thus $\phi_1$ is ``followed" by $\phi_2$ which is ``followed" by $\phi_3,\ldots,\phi_q$ is ``followed" by $\phi_1$. More precisely, we shall denote this simple circular order among angular parameters as follows: \begin{equation}\label{Csco} C_{sco}=\{\phi=(\phi_1,\phi_2,\ldots,\phi_q)\in [0,2\pi]^q:\phi_{1}\preceq\phi_{2}\preceq\cdots\preceq\phi_{q}\preceq\phi_{1}\}. \end{equation} It is important to note that the order among the angular parameters is invariant under changes in location of the pole (the initial point of the circle). Unlike the Euclidean space, points on a unit circle wrap around. That is, starting at the pole, by traveling $2 \pi$ radians around the circumference of the circle one would return to pole. For this reason the circular order among points on a unit circle is preserved even if the location of the pole is shifted. This is why \citet{Rue09} and \citet{Fer11} refer to the circular order $C_{sco}$ as isotropic order. As in this paper we will also consider more general circular order restrictions, from now on we will refer to $C_{sco}$ as simple circular order. As a consequence of the geometry, a circle can never be linearized and hence methods developed for Euclidean space data are not applicable to circular data. The problem is even more challenging when the angular parameters are constrained by an order around the circle, such as $C_{sco}$. General methodology for circular data, when there are no constraints on the angular parameters, can be found in the book \citet{Mar00}, among others. Constrained inference for circular data is rather recent (\citet{Rue09} and \citet{Fer11}). As noted in \citet{Rue09}, standard Euclidean space methods such as the pool adjacent violators algorithm (PAVA) used for computing isotonic regression (see \citet{Rob88} for details) cannot be applied to circular data. For example, when a cell biologist is investigating a large number of cell cycle genes, it may be difficult to ascertain the circular order among all cell cycle genes under consideration. However, based on the underlying biology, the investigator may a priori know the circular order among groups of genes but not the order among genes within each group. In such situations a partial circular order, $C_{pco}$ as defined below, can be used. \begin{equation} \label{Cpco} C_{pco}=\left\{ \phi\in [0,2\pi]^q:\left\{ \begin{array}{c} \phi_{1} \\ \phi_{2} \\ \vdots \\ \phi_{l_{1}} \\ \end{array} \right\} \preceq \left\{ \begin{array}{c} \phi_{l_{1}+1} \\ \phi_{l_{1}+2} \\ \vdots \\ \phi_{l_{1}+l_{2}} \\ \end{array} \right\}\preceq \cdots \preceq \left\{ \begin{array}{c} \phi_{l_{1}+\ldots+l_{L-1}+1} \\ \phi_{l_{1}+\ldots+l_{L-1}+2}\\ \vdots \\ \phi_{l_{1}+\ldots+l_{L}} \\ \end{array} \right\}\preceq \left\{ \begin{array}{c} \phi_{1} \\ \phi_{2} \\ \vdots \\ \phi_{l_{1}} \\ \end{array} \right\}\right\} \text{.} \end{equation} In this case we have $L$ sets of parameters with $l_j$ angular parameters in set $j$ and $q = \sum_{j=1}^L l_j$. Order among the parameters within a set is not known but every parameter in a given set is ``followed" by every parameter in the next set. \subsection[CIRE and test]{Estimation and Testing under Circular Order Restrictions} \label{CIRE} Analogous to PAVA for Euclidean data, \citet{Rue09} derived a circular isotonic regression estimator (CIRE) for estimating angular parameters $(\phi_1, \phi_2, \ldots, \phi_q)$ subject to a circular order. The CIRE of ${\bf \phi } = (\phi_1, \phi_2, \ldots, \phi_q)$, under the constraint $(\phi_1, \phi_2, \ldots, \phi_q)' \in C_{sco}$, is given by: \begin{equation} \label{CIRE} \widetilde{\phi }=\arg \min_{\phi \in C_{sco}} SCE(\phi,\overline{\theta})\text{,} \end{equation} where $ SCE(\phi,{\overline{\theta}})$, defined below, is the Sum of Circular Errors, a circle analog to the Sum of Squared Errors (SSE) used for Euclidean data; \begin{equation} \label{SCE} SCE(\phi, \overline{\theta})=\sum_{i=1}^{q} r_{i} (1-cos(\phi_{i}-\overline{\theta}_{i})), \end{equation} where $r_i$ are the mean resultant lengths. The CIRE is implemented in the function \code{CIRE} of the package \pkg{isocir}. Just as the normal distribution is commonly used for the Euclidean space data, the von-Mises distribution is widely used for describing angular data on a unit circle. Accordingly, for $i=1, 2, \dots, q$, throughout this paper we assume that $\overline{\theta}_i$ are independently distributed according to von-Mises distribution, denoted as $M(\phi_i, \kappa)$ where $\phi_i$ is the modal direction and $\kappa$ is the concentration parameter (see \citet{Mar00}). Under such an assumption, \citet{Fer11} developed a conditional test for testing the following hypotheses: \begin{description} \item[$\hspace{1cm}H_{0}: $ ] $\phi \in C_{sco}$ \vspace{-0.2cm} \item[$\hspace{1cm}H_{1}: $ ] $H_{0}$ is not true. \end{description} The conditional test statistic is given by $ T^*=\frac{2 \hat{\kappa} SCE(\widetilde\phi , \overline{\theta})}{q}$ , where $\hat\kappa$ is the estimator of $\kappa$, $\widetilde{\phi }$ is the CIRE computed under $H_{0}$. The estimate $\widetilde\phi$ determines a partition of $\wp=\{1,\ldots,I\}$ into sets of coordinates on which $\widetilde\phi$ is constant. These sets are called level sets. The rejection region for the conditional $\alpha$-level test is given by: \begin{equation*} \text{Reject }H_{0} \text{ if } T^* \geq c(m) \text{, } \end{equation*} where $m$ is the number of level sets for $\widetilde\phi $ and, for large values of $\kappa$, the approximate critical value $c(m)$ is chosen so that: \begin{equation}\label{prob2} pr(F _{q-m,q-1} \geq c(m))=\frac{% \alpha }{1-1/(q-1)!}\text{,} \end{equation} where $F _{q-m,q-1}$ represents the central $F$ random variable with $(q-m, q-1)$ degrees of freedom. The above test statistic is proportional to a chi-square test when $\kappa$ is known. For details one may refer to \citet{Fer11}. The above methodology can be modified to test \begin{description} \item[$\hspace{1cm}H_{0}: $ ] $\phi \in C_{pco}$ \vspace{-0.2cm} \item[$\hspace{1cm}H_{1}: $ ] $H_{0}$ is not true \end{description} by replacing $1/(q-1)!$ by $(l_1 ! l_2 ! \cdots l_L!)/(q-1)!$.\\ The simulation study performed in \citet{Fer11} suggests that the power of this test is quite reasonable. Notice that, for a given data set, the p-value obtained by using the above methodology may serve as a useful goodness of fit criterion when comparing two or more plausible circular orders among a set of angular parameters. Larger values may suggest that the estimations are closer to the presumed circular order. Thus the statistical methodology developed in \citet{Fer11} can be used not only for testing relative order among the parameters. It can be also useful for selecting a ``best fitting" circular order among several circular order candidates. These tests are implemented in the function \code{cond.test} in the \proglang{R} package \pkg{isocir} introduced in the next Section. \section[isocir]{Package \pkg{isocir}} \label{sec3} \label{sec3} We start this section by giving some background on \proglang{R} packages for isotonic regression and analysis of circular data. We then describe the structure of our package \pkg{isocir} (\citet{Bar13}) and illustrate it by some examples. \subsection[Background]{Related Packages} As isotonic regression is a well-known and widely used technique there are many packages in \proglang{R} for performing isotonic regression, such as: \begin{itemize} \item \textbf{\pkg{isotone}} (\cite{isotone}): Active set and generalized PAVA for isotone optimization. \item \textbf{\pkg{Iso}} (\cite{Iso}): Functions to perform isotonic regression. \item \textbf{\pkg{ordMonReg}} (\cite{ordMonReg}): Compute least squares estimates of one bounded or two ordered isotonic regression curves. \end{itemize} Similarly, there are several packages in \proglang{R} for analyzing circular data, such as: \begin{itemize} \item \textbf{\pkg{CircStats}} (\cite{CircStats}): The implementations of the Circular Statistics from ``Topics in circular Statistics" \citet{Jam01}. It is an R port from the \proglang{S-plus} library with the same name. \item \textbf{\pkg{circular}} (\cite{circular}): This package expands in several ways the \pkg{CircStats} package. \end{itemize} Since none of the existing packages for circular data are applicable for analyzing circular data under constraints, in this article we introduce the software package ``\textbf{iso}tonic inference for \textbf{cir}cular data", with the acronym \pkg{isocir}, for analyzing circular data under constraints. Our package depends on \pkg{circular} (see \cite{circular}) and \pkg{combinat} (see \cite{combinat}). These packages should be installed in the computer before loading \pkg{isocir}. \subsection[Package structure]{Package Structure} For the convenience of the reader, we summarize all the functions, arguments and descriptions of our package \pkg{isocir} in Table 1.\\ \begin{table}[h] \centering \renewcommand{\arraystretch}{1.1} \begin{tabular}{|l|l|l|} \hline \textbf{ Functions} & \textbf{ Arguments} &\textbf{ Description}\\ \hline \code{sce} & \code{(arg1, arg2, meanrl)}& Sum of Circular Errors\\ \hline \code{mrl} & \code{(data)} & Mean Resultant Length\\ \hline \code{CIRE} & \code{(data, groups, circular)} & Circular Isotonic Regression Estimator\\ \hline \code{cond.test} & \code{(data, groups, kappa)} & Conditional Test\\ \hline \hline \code{isocir} & \code{(cirmeans, SCE, CIRE, pvalue, kappa)} & Creates an Object of class isocir\\ \hline \code{is.isocir} & \code{(x)} & Checks an Object\\ \hline \code{print.isocir} & \code{(x,decCIRE,decpvalue,deckappa,...)} & Prints an object of class isocir\\ \hline \code{plot.isocir} & \code{(x, option, ...)} & Plots an object of class isocir\\ \hline \end{tabular} \caption{Summary of the components of the package \pkg{isocir}} \label{functions} \end{table} In the following we describe each function of the software in detail. \begin{itemize} \item \textbf{Functions \code{sce()} and \code{mrl()}} \end{itemize} The auxiliary function \code{sce} computes the sum of circular errors between a given $q$-dimensional vector (denoted by \code{arg1}) and one or more $q$-dimensional vectors (denoted by \code{arg2}). The function \code{mrl} computes the mean resultant length for the input \code{data}. \begin{itemize} \item \textbf{Function \code{CIRE()}} \end{itemize} Using the methodology developed in \citet{Rue09}, this \proglang{R} function computes the CIRE for a given circular order (\ref{Csco}) or a partial order (\ref{Cpco}). The arguments of this function are summarized in Table~\ref{arguments_CIRE}. \begin{table}[h] \centering \renewcommand{\arraystretch}{1.1} \begin{tabular}{| l | l |} \hline \emph{Arguments} & \emph{Values} \\ \hline \textbf{\code{data}} & vector or matrix with the data\\ \hline \textbf{\code{groups}} & the groups of the order\\ \hline \textbf{\code{circular}} & \code{=TRUE}(by default) / \code{=FALSE}\\ \hline \end{tabular} \caption{Arguments of the \code{CIRE} function} \label{arguments_CIRE} \end{table} The input variable \code{data} is a matrix where each column contains the vector of unconstrained angular means corresponding to each replication. If there is only one replication then data is a vector. The position $i$ in the vector \code{groups} contains the group number to which the parameter $\phi_i$ belongs to. The logical argument \code{circular} sets whether the order is wrapped around the circle, i.e. circular order (\code{circular=TRUE}) or not, i.e. simple order (\code{circular=FALSE}). For example, the simple order cone in the circle $C_{so}=\{\phi=(\phi_1,\phi_2,\ldots,\phi_q)\in [0,2\pi]^q:0\leq\phi_{1}\leq\phi_{2}\leq\cdots\leq\phi_{q}\leq 2\pi\}$ would be a non circular order. The output of this function is an object of class \code{isocir} (explained later) containing the circular isotonic regression estimator ($\widetilde{\phi}$), the unrestricted circular means ($\overline{\theta}$) and the corresponding sum of circular error ($SCE(\widetilde{\phi},\overline{\theta})$). \begin{itemize} \item \textbf{Function \code{cond.test()}} \end{itemize} This function performs the conditional test and computes the corresponding p-value for the following hypotheses: \begin{description} \item[$\hspace{1cm}H_{0}: $ ] The angles $\phi_{1},\ldots,\phi_q$ follow a (simple or partial) circular order. \vspace{-0.2cm} \item[$\hspace{1cm}H_{1}: $ ] $H_{0}$ is not true. \end{description} The arguments of this function appear in Table \ref{arguments_cond.test} and are explained below. \begin{table}[h] \centering \renewcommand{\arraystretch}{1.1} \begin{tabular}{| l | c | c|} \hline \emph{Arguments} & \emph{$\kappa$ known} & \emph{$\kappa$ unknown} \\ \hline \textbf{\code{data}} & numeric vector & matrix (as many columns as replications)\\ \hline \textbf{\code{groups}} & \multicolumn{2}{c|}{numeric vector with the groups of the order to be tested}\\ \hline \textbf{\code{kappa}} & positive numeric value & (NULL)\\ \hline \textbf{\code{biasCorrect}} & (NULL) & \code{=TRUE}(by default) / \code{=FALSE}\\ \hline \end{tabular} \caption{Arguments of the \code{cond.test} function} \label{arguments_cond.test} \end{table} Arguments \code{data} and \code{groups} are same as those in function \code{CIRE} although in this function \code{groups} is the order to be tested instead of the known order. The argument \code{kappa} is needed only when \code{data} is a vector. If there are no replications in \code{data}, the value of $\kappa$ must be set by the user. Even when there are replicated data, if the user knows the value of $\kappa$, it may be introduced and it will be taken into consideration to perfom the conditional test. When $\kappa$ is unknown and there are replicated data, the parameter is internally estimated by maximum likelihood and $\hat{\kappa}$ is shown in the output. The \code{biasCorrect} is related to the estimation of $\kappa$. If \code{biasCorrect=TRUE} the bias correction appearing in \cite{Mar00} p. 87 is performed in the estimation of $\kappa$. The output of this function is an object of class \code{isocir} (explained below) with all the results from the conditional test: the CIRE ($\widetilde{\phi}$), the unrestricted circular means ($\overline{\theta}$), the SCE ($SCE(\widetilde{\phi},\overline{\theta})$), the kappa value (estimated or introduced) and the p-value of the conditional test. \begin{itemize} \item \textbf{Class \code{isocir}} \end{itemize} Finally we describe the \code{isocir} function. This function creates the S3 objects of class \code{isocir} which is a list with the following elements: \begin{description} \item \code{$cirmeans} is a list with the unrestricted circular means. Notice that when the argument \code{data} is a vector, these values match exactly with the input. However, if there are replicated data, the argument \code{data} is a matrix and \code{$cirmeans} contains the corresponding unrestricted circular means ($\overline{\theta}_1,\ldots,\overline{\theta}_q$). \item \code{$SCE} is the value of the Sum of Circular Errors ($SCE(\widetilde{\phi},\overline{\theta})$). \item \code{$CIRE} is a list with the Circular Isotonic Regression Estimator ($\widetilde{\phi}$) obtained under the order defined by the \code{groups} argument. \item \code{$kappa} the value of kappa (either set by the user or estimated). \item \code{$pvalue} the p-value of the conditional test obtained from the function \code{cond.test}. \end{description} These objects of class \code{isocir} are the output of the functions \code{CIRE} and \code{cond.test}. The last two elements of the list (\code{$kappa} and \code{$pvalue}) are \code{NULL} if the object comes from the function \code{CIRE}. Otherwise, if the object comes from the function \code{cond.test} not only there are the results of the conditional test (\code{$kappa} and \code{$pvalue}) but also an attribute called \code{estkappa} will inform (or rather remind) the user if the value in \code{$kappa} has been internally estimated or introduced as a known input. Some S3 methods have also been defined for the class \code{isocir}: \begin{itemize} \item \code{isocir(cirmeans = NULL, SCE = NULL, CIRE = NULL, pvalue = NULL, kappa = NULL)}: This function creates an object of class \code{isocir}. \item \code{is.isocir(x)}: This function checks whether the object \code{x} is of class \code{isocir}. \item \code{print.isocir(x, decCIRE, decpvalue, deckappa, ...)}: This S3 method is used to print an object \code{x} of class \code{isocir}. The number of decimal places can be chosen. \item \code{plot.isocir(x, option = c("CIRE", "cirmeans"), ...)}: This S3 method is used to plot an object \code{x} of class \code{isocir}. The argument \code{option} gives the user the option to plot the points of the Circular Isotonic Regression Estimator (by default) or the unrestricted circular means. \end{itemize} \subsection[Examples]{Examples} \label{Examples} In this section we provide examples to illustrate \pkg{isocir}.\\ \large{ \textbf{Example 1}} Suppose the observed angular means of eight populations are given by: $\overline{\theta}_{1}=0.025; \hspace{0.1cm}\overline{\theta}_{2}=1.475;\hspace{0.1cm}\overline{\theta}_{3}=3.274;$\\ $\overline{\theta}_{4}=5.518; \hspace{0.1cm}\overline{\theta}_{5}=2.859;$\\ $\overline{\theta}_{6}=5.387;$\\ $\overline{\theta}_{7}=4.179; \hspace{0.1cm}\overline{\theta}_{8}=1.962$.\\ We illustrate \pkg{isocir} for estimating the 8 population angular parameters under the following partial circular order constraint: \begin{equation*}\label{order1} \left\{ \begin{array}{c} \phi_{1} \\ \phi_{2} \\ \phi_{3} \\ \end{array}\right \} \preceq \left\{ \begin{array}{c} \phi_{4} \\ \phi_{5} \\ \end{array}\right\} \preceq \{ \phi_{6} \}\preceq \left\{ \begin{array}{c} \phi_{7} \\ \phi_{8} \\ \end{array}\right\}\preceq \left\{ \begin{array}{c} \phi_{1} \\ \phi_{2} \\ \phi_{3} \\ \end{array}\right\} \end{equation*} These data are a set of random circular data called \code{cirdata} in our package and they can be used by calling as below: % CHUNK 2 <>= data(cirdata) cirdata @ Since in this example, there are no replications, we provide data in a vector format. The groups of the order are defined as follows: % CHUNK 3 <>= orderGroups <- c(1, 1, 1, 2, 2, 3, 4, 4) @ Thus we obtain CIRE using the function \code{CIRE} as follows: % CHUNK 4 <>= example1CIRE <- CIRE(cirdata, groups = orderGroups, circular = TRUE) @ The output is saved in \code{example1CIRE} and the printed output is as follows: % CHUNK 5 <>= example1CIRE @ Thus the constrained estimates satisfy the required order as follows: \begin{equation*}\label{order2} \left\{ \begin{array}{c} \widetilde{\phi}_{1}=0.993 \\ \widetilde{\phi}_{2}=1.475 \\ \widetilde{\phi}_{3}=3.066 \\ \end{array}\right\} \preceq \left\{ \begin{array}{c} \widetilde{\phi}_{4}=5.056 \\ \widetilde{\phi}_{5}=3.066 \\ \end{array}\right\} \preceq \{ \widetilde{\phi}_{6}=5.056 \}\preceq \left\{ \begin{array}{c} \widetilde{\phi}_{7}=5.056 \\ \widetilde{\phi}_{8}=0.993 \\ \end{array}\right\} \end{equation*} where $\widetilde{\phi}$ is the Circular Isotonic Regression Estimator of $\phi$. Results may be displayed graphically by setting \code{plot(example1CIRE)}. When done so, a plot with the points of the CIRE is produced. To see the plot for the unrestricted estimates the argument \code{option="cirmeans"} can be used (i.e. \code{plot(example1CIRE, option = "cirmeans")}).\\ \large \textbf{Example 2 ($\kappa$ unknown (replications needed))} Using the data in our package called \code{datareplic} we demonstrate the use of the function \code{cond.test} when $\kappa$ is unknown. As remarked earlier, when $\kappa$ is unknown we need replicate data to estimate $\kappa$. The file \code{datareplic} is a matrix where each column contains the values of a replication and each row the angles observed at each population mean. In this example we have 8 populations and hypotheses regarding the corresponding 8 parameters are as follows: \begin{description} \item[$\hspace{1cm}H_{0}: $ ] $ \phi_{1}\preceq \phi_{2} \preceq \phi_{3} \preceq \phi_{4} \preceq \phi_{5}\preceq \phi_{6}\preceq \phi_{7}\preceq \phi_{8}\preceq \phi_{1}$ \vspace{-0.2cm} \item[$\hspace{1cm}H_{1}: $ ] $H_{0}$ is not true. \end{description} We take the data from the package and set the groups of the order in the argument \code{groups}. % CHUNK 6 <>= data(datareplic) orderGroups2 <- c(1:8) @ Since replicate data are available, we do not include the argument \code{kappa} as we want the function to estimate it. Moreover, we correct the bias in the estimation of $\kappa$, so we set \code{biasCorrect=TRUE}. Thus we have the following code: % CHUNK 7 <>= example2test <- cond.test(datareplic,groups=orderGroups2,biasCorrect=TRUE) example2test @ The result is the p-value defined in (\ref{prob2}). Since the \code{p-value = 0.0034} we reject the null hypothesis and conclude that the parameters do not satisfy the specified circular order. If the user is interested in printing the unrestricted circular means $\overline{\theta}$ then the following command is used: \code{example2test$cirmeans}. The result is a list that is saved in the same format as CIRE. Since each group in the circular order has a single element, it is convenient to use the vector format. Hence we have: % CHUNK 8 <>= round(unlist(example2test$cirmeans), digits = 3) @ \section[Applications]{Application to Analysis of the Cell Cycle Gene Expression Data} \label{sec4} As noted in the Introduction there has been considerable discussion in the literature on the conservation of various aspects of cell cycle genes (\citet{Fer11}), particularly between two yeast species, namely, \emph{S. Cerevisiae} (budding yeast) and \emph{S. Pombe} (fission yeast). Using the 10 published budding yeast data sets (\citet{Rus04}, \citet{Ol05}, \citet{Pen05}), we illustrate the \pkg{isocir} package to test the null hypothesis that 16 fission yeast genes, namely, \emph{ssb1}, \emph{cdc22}, \emph{msh6}, \emph{psm3}, \emph{rad21}, \emph{cig2}, \emph{mik1}, \emph{h3.3}, \emph{hhf1}, \emph{hht3}, \emph{hta2}, \emph{htb1}, \emph{fkh2}, \emph{chs2}, \emph{sid2} and \emph{slp1} satisfy the same circular order as their budding yeast orthologs (RFA1, RNR1, MSH6, SMC3, MCD1, CLN2, SWE1, HHT2, HHF1, HHT1, HTA2, HTB2, FKH1, CHS2, DBF2 and CDC20) whose circular order is obtained from cyclebase (\url{http:://www.cyclebase.org}) and published literature. Thus we test the following hypothesis: \begin{equation}\label{hypo} \begin{array}{l} H_{0}: \phi_{ssb1}\preceq\phi_{cdc22}\preceq\phi_{msh6}\preceq\phi_{psm3}\preceq\phi_{rad21}\preceq\phi_{cig2}\preceq\phi_{mik1}\preceq\phi_{h3.3}\preceq\\ \hspace{0.7cm}\preceq\phi_{hhf1}\preceq\phi_{hht3}\preceq\phi_{hta2}\preceq\phi_{htb1}\preceq\phi_{fkh2}\preceq\phi_{chs2}\preceq\phi_{sid2}\preceq\phi_{slp1}\preceq\phi_{ssb1}\\ \\\vspace{-0.2cm} H_{1}: H_{0} \textrm{ is not true}. \end{array} \end{equation} For each of the 10 experimental data sets, the unconstrained estimates of the phase angles of the above 16 fission yeast genes appearing in Table \ref{data} were obtained using the Random Periods Model (\citet{Liu04}). The \proglang{R} code for that software can be obtained from \url{http://www.niehs.nih.gov/research/atniehs/labs/bb/staff/peddada/index.cfm}. Notice that there are no replicated data here, since the experiments were not performed under the same experimental conditions. It appears that the 10 experiments were not synchronized (i.e. cells were probably not arrested at the same point in the cell cycle). For this reason, from Table~\ref{SCEyCIRE} it appears that there is a large variability in the estimates of phase angles of each of the 16 genes. Even though there may be large variability in the estimated values, our interest is in the relative order of phase angles among the 16 genes which does not rely on the location of the pole and hence does not rely on the synchronization. As there are no replicated data, we have a single observation for each of the 16 fission yeast genes in each experiment and, therefore, the values in Table \ref{data} play the role of the unrestricted circular mean in each experiment. Consequently we suppose that, \begin{equation*} \overline{\theta}_{ij}\sim^{ind}M(\phi_{ij}, \kappa_j), i=1, 2, \ldots, 16, j=1, 2, \ldots, 10, \end{equation*} where $\overline{\theta}_{ij}$ is the unrestricted circular mean of the gene $i$ in the experiment $j$. Since the 10 experiments may not considered as replications of each other, we performed a separate test for each experiment. Moreover, as explained in \citet{Fer11} we assume that the concentration parameter $\kappa_j$ depends on the experiment but not on the gene. The reason for this is that out of the two sources of uncertainty, one specific to the gene and another one due to the experiment (and therefore common to all genes within the experiment), the former source maybe considered negligible relative to the latter as the number of time points used in each time course experiment is fairly large for any specific gene. The $\kappa_j$ values considered for this example are obtained using \citet{Fer11}. The procedure used for the computation of these values comes from an analysis of variance type methodology. Under the assumptions made before, the model for the circular means is: \begin{equation*} \phi_{ij}=\mu+\alpha_i+\beta_j, \end{equation*} where $\alpha_i$ is the gene effect and $\beta_j$ is the experiment effect. The proposed model is analogous to the standard two-way analysis of variance model and is fully detailed in the supplementary material of \citet{Fer11}. For each of the 10 experiments, we test the hypothesis (\ref{hypo}) using the function \code{cond.test} that is in our software. The following code gives the p-values for each experiment. Results are summarized in Table~\ref{SCEyCIRE}. % CHUNK 9 <>= data("cirgenes") kappas <- c(2.64773, 3.24742, 2.15936, 4.15314, 4.54357, 29.07610, 6.51408, 14.19445, 5.66920, 11.12889) allresults <- list() resultIsoCIRE <- matrix(ncol = ncol(cirgenes), nrow = nrow(cirgenes)) SCEs <- vector(mode = "numeric", length = nrow(cirgenes)) pvalues <- vector(mode = "numeric", length = nrow(cirgenes)) for (i in 1 : nrow(cirgenes)) { k <- kappas[i] genes <- as.numeric(cirgenes[i, !is.na(cirgenes[i, ]) ]) allresults[[i]] <- cond.test(genes, kappa = k) resultIsoCIRE[ i, !is.na(cirgenes[i, ]) ] <- unlist(allresults[[i]]$CIRE) SCEs[i] <- allresults[[i]]$SCE pvalues[i] <- allresults[[i]]$pvalue } @ From the p-values in Table~\ref{SCEyCIRE}, we see that the null hypothesis cannot be rejected in any of the 10 experiments even at a level of significance as high as 0.20. Therefore, it seems plausible that the peak expressions of these 16 genes in \emph{S. Pombe} (fission yeast) follow the same order as their \emph{S. Cerevisiae} (budding yeast) orthologs. \section[Conclusions]{Conclusions} \label{sec5} In this paper the \proglang{R} package \pkg{isocir} has been presented. This package provides useful tools for drawing inferences from circular data under order restrictions. There are two main functions (\code{CIRE} and \code{cond.test}). The first one computes the CIRE, the circular version of the widely known isotonic regression in $R^q$. The second one is designed for testing circular hypotheses using a conditional test. We have also created the class \code{isocir} in order to properly save all the results. Although we illustrated the proposed methodology using an example from cell biology, the proposed software can be applied to a wide range of contexts. For example, biologists working on circadian clocks may be interested in the testing for the conservation of circular order among circadian genes between two tissues (e.g., \citet{Liu06}). We would like to emphasize that the field of constrained inference on a unit circle is in its infancy and is wide open for new developments both in methods as well as applications. As observed in the introduction, such constrained inference problems arise naturally in many applications. Therefore we expect the software described in this paper to be widely used by researchers working in such areas. \section*{Acknowledgments} SB's, MAF's and CR's research was partially supported by Spanish MCI grant MTM2009-11161. SB's work has been partially financed by Junta de Castilla y León, Consejería de Educación and the European Social Fund within the P.O. Castilla y León 2007-2013 programme. SDP's research [in part] was supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Z01 ES101744-04). We thank Dr. Leping Li, Keith Shockley, the anonymous reviewers, the associate editor and the editor for several useful comments which improved the presentation of this manuscript. \bibliography{library1} % \bibliographystyle{jss} \clearpage \begin{sidewaystable}[h] \centering \renewcommand{\arraystretch}{1.75} \small{ \scalebox {0.9 }[1.1 ]{ \begin{tabular}{| l | c | c | c | c | c | c | c | c | c | c | c | c | c | c | c | c |} \addlinespace \hline &\multicolumn{16}{|c|}{Unrestricted Circular Means}\\ \hline \multirow{2}{*}{\backslashbox{\emph{\textbf{Experiments}}}{\emph{\textbf{Genes}}}}& \textbf{ssb1} & \textbf{cdc22}& \textbf{msh6 } & \textbf{ psm3 } & \textbf{rad21}& \textbf{ cig2 }& \textbf{mik1} & \textbf{ h3.3} & \textbf{hhf1} & \textbf{ hht3} & \textbf{ hta2 }& \textbf{ htb1 }& \textbf{fkh2} & \textbf{chs2} &\textbf{sid2}& \textbf{slp1}\\ \cline{2-17} & $\overline{\theta}_{1}$ & $\overline{\theta}_{2}$ & $\overline{\theta}_{3}$ & $\overline{\theta}_{4}$ & $\overline{\theta}_{5}$ & $\overline{\theta}_{6}$ & $\overline{\theta}_{7}$ & $\overline{\theta}_{8}$ & $\overline{\theta}_{9}$ & $\overline{\theta}_{10} $ & $\overline{\theta}_{11}$ & $\overline{\theta}_{12}$ & $\overline{\theta}_{13}$ & $\overline{\theta}_{14}$ & $\overline{\theta}_{15}$ & $\overline{\theta}_{16}$\\ \hline \textbf{1.Oliva cdc} & 0.202 & 0.218 & 6.262 & 5.765 & 0.893 & 5.612 & 6.257 & 1.178 & 0.912 & 1.200 & 0.971 & 1.289 & 5.298 & 5.597 & 4.252 & 5.209 \\ \hline \textbf{2.Oliva elut1} & 2.940 & 3.262 & 2.811 & 2.848 & 1.603 & 2.382 & 1.709 & 4.689 & 4.355 & 4.717 & 4.419 & 4.397 & 1.600 & 1.819 & 1.751 & 2.519 \\ \hline \textbf{3.Oliva elut2} & 0.440 & 0.447 & 5.258 & 6.206 & 4.381 & 5.458 & 6.045 & 1.541 & 0.727 & 6.115 & 0.352 & 0.687 & 3.935 & 3.970 & 5.836 & 5.896 \\ \hline \textbf{4.Peng cdc} & 3.328 & 3.565 & 3.387 & 2.806 & 3.194 & 3.260 & 3.026 & 4.779 & 4.694 & 4.755 & 4.816 & 4.675 & 2.685 & 2.770 & 2.885 & 2.422 \\ \hline \textbf{5.Peng elut} & 3.333 & 3.912 & 3.894 & 3.444 & 3.648 & 3.970 & 4.296 & 5.189 & 5.060 & 5.144 & 5.216 & 5.243 & 3.338 & 3.607 & 3.082 & 3.185 \\ \hline \textbf{6.Rust cdc1} & 1.965 & 2.151 & 2.034 & 2.029 & 1.742 & 2.073 & 1.730 & 3.130 & 2.993 & 3.085 & 3.063 & 2.873 & 1.281 & 1.179 & 1.906 & 1.237 \\ \hline \textbf{7.Rust cdc2} & 1.809 & 2.208 & 1.415 & 1.351 & 1.964 & 1.940 & 1.978 & 3.745 & 3.584 & 3.670 & 3.479 & 3.590 & 1.383 & 1.456 & 1.064 & 1.396 \\ \hline \textbf{8.Rust elut1} & --- & 1.457 & 1.288 & --- & 1.530 & 1.374 & 1.379 & 2.420 & 2.279 & 2.409 & 2.311 & 2.245 & 1.010 & 1.146 & --- & 1.091 \\ \hline \textbf{9.Rust elut2} & 2.214 & 1.786 & 1.730 & 1.987 & 1.878 & 1.882 & 3.071 & 2.704 & 2.788 & 2.814 & 2.909 & 2.740 & 1.352 & 1.441 & 1.275 & 1.420 \\ \hline \textbf{10.Rust elut3} & 2.340 & 2.702 & 2.704 & 2.526 & 2.979 & 2.319 & 2.284 & 3.773 & 3.567 & 3.636 & 3.465 & 3.431 & 1.981 & 1.716 & 2.523 & 2.118 \\ \hline \end{tabular} }} \label{data} \caption{Initial S. Pombe phase angle data for each experiment} \end{sidewaystable} \clearpage \begin{sidewaystable}[h] \centering \renewcommand{\arraystretch}{1.75} \small{ \scalebox {0.86 }[1.1 ]{ \begin{tabular}{| l | c | c | c | c | c | c | c | c | c | c | c | c | c | c | c | c | c |c|} \addlinespace \hline & \multicolumn{16}{|c|}{CIRE under Circular Order}&SCE&p-value\\ \hline\backslashbox{\emph{\textbf{Experiments}}}{\emph{\textbf{Genes}}} & $\widetilde{\phi}_{1}$ & $\widetilde{\phi}_{2}$ & $\widetilde{\phi}_{3}$ & $\widetilde{\phi}_{4}$ & $\widetilde{\phi}_{5}$ & $\widetilde{\phi}_{6}$ & $\widetilde{\phi}_{7}$ & $\widetilde{\phi}_{8}$ & $\widetilde{\phi}_{9}$ & $\widetilde{\phi}_{10} $ & $\widetilde{\phi}_{11}$ & $\widetilde{\phi}_{12}$ & $\widetilde{\phi}_{13}$ & $\widetilde{\phi}_{14}$ & $\widetilde{\phi}_{15}$ & $\widetilde{\phi}_{16}$& &\\ \hline \textbf{1.Oliva cdc}& 6.257 & 6.257 & 6.257 & 6.257 & 0.054 & 0.054 & 0.054 & 1.045 & 1.045 & 1.085 & 1.085 & 1.289 & 5.069 & 5.069 & 5.069 & 5.209 & \textcolor[rgb]{0.00,0.59,0.00}{1.270} & \textcolor[rgb]{1.00,0.00,0.00} { 0.6658}\\ \hline \textbf{2.Oliva elut1}& 2.526 & 2.526 & 2.526 & 2.526 & 2.526 & 2.526 & 2.526 & 4.515 & 4.515 & 4.515 & 4.515 & 4.515 & 1.600 & 1.785 & 1.785 & 2.519 & \textcolor[rgb]{0.00,0.59,0.00}{1.218} & \textcolor[rgb]{1.00,0.00,0.00} { 0.7214}\\ \hline \textbf{3.Oliva elut2}& 5.849 & 5.849 & 5.849 & 5.849 & 5.849 & 5.849 & 6.045 & 0.598 & 0.598 & 0.598 & 0.598 & 0.687 & 3.935 & 3.970 & 5.836 & 5.849& \textcolor[rgb]{0.00,0.59,0.00}{2.660} & \textcolor[rgb]{1.00,0.00,0.00} { 0.2437}\\ \hline \textbf{4.Peng cdc}& 3.225 & 3.225 & 3.225 & 3.225 & 3.225 & 3.225 & 3.225 & 4.736 & 4.736 & 4.749 & 4.749 & 4.749 & 2.685 & 2.693 & 2.693 & 2.693 & \textcolor[rgb]{0.00,0.59,0.00}{0.248} & \textcolor[rgb]{1.00,0.00,0.00} { 0.9983}\\ \hline \textbf{5.Peng elut}& 3.333 & 3.725 & 3.725 & 3.725 & 3.725 & 3.970 & 4.296 & 5.124 & 5.124 & 5.144 & 5.216 & 5.243 & 3.302 & 3.302 & 3.302 & 3.302 & \textcolor[rgb]{0.00,0.59,0.00}{0.156} & \textcolor[rgb]{1.00,0.00,0.00} { 0.9850}\\ \hline\textbf{6.Rust cdc1}& 1.961 & 1.961 & 1.961 & 1.961 & 1.961 & 1.961 & 1.961 & 3.029 & 3.029 & 3.029 & 3.029 & 3.029 & 1.230 & 1.230 & 1.571 & 1.571 & \textcolor[rgb]{0.00,0.59,0.00}{0.213} & \textcolor[rgb]{1.00,0.00,0.00} { 0.4142}\\ \hline \textbf{7.Rust cdc2}& 1.693 & 1.693 & 1.693 & 1.693 & 1.952 & 1.952 & 1.978 & 3.614 & 3.614 & 3.614 & 3.614 & 3.614 & 1.301 & 1.301 & 1.301 & 1.396 & \textcolor[rgb]{0.00,0.59,0.00}{0.296} & \textcolor[rgb]{1.00,0.00,0.00} { 0.9536}\\ \hline \textbf{8.Rust elut1 }& --- & 1.373 & 1.373 &--- & 1.427 & 1.427 & 1.427 & 2.333 & 2.333 & 2.333 & 2.333 & 2.333 & 1.010 & 1.118 & --- & 1.118 & \textcolor[rgb]{0.00,0.59,0.00}{0.028} & \textcolor[rgb]{1.00,0.00,0.00} { 0.9992}\\ \hline \textbf{9.Rust elut2 }& 1.909 & 1.909 & 1.909 & 1.916 & 1.916 & 1.916 & 2.837 & 2.837 & 2.837 & 2.837 & 2.837 & 2.837 & 1.352 & 1.358 & 1.358 & 1.420& \textcolor[rgb]{0.00,0.59,0.00}{0.125} & \textcolor[rgb]{1.00,0.00,0.00} { 0.9992}\\ \hline \textbf{10.Rust elut3}& 2.340 & 2.585 & 2.585 & 2.585 & 2.585 & 2.585 & 2.585 & 3.574 & 3.574 & 3.574 & 3.574 & 3.574 & 1.849 & 1.849 & 2.321 & 2.321 & \textcolor[rgb]{0.00,0.59,0.00}{0.269} & \textcolor[rgb]{1.00,0.00,0.00} { 0.8748} \\ \hline \end{tabular} } } \label{SCEyCIRE} \caption{CIRE, SCE and p-values for each experiment} \end{sidewaystable} \clearpage \end{document}