---
title: "QuantilePeer: An R package for Simulating and Estimating Quantile Peer Effect Models"
author: "Aristide Houndetoungan"
date: "`r Sys.Date()`"
output:
  pdf_document:
    citation_package: natbib
    number_sections: yes
    latex_engine: pdflatex
    toc: true
bibliography: ["References.bib"]
biblio-style: "apalike"
link-citations: true
urlcolor: blue
vignette: >
  %\VignetteEncoding{UTF-8}
  %\VignetteIndexEntry{QuantilePeer package: Examples and Applications}
  %\VignetteEngine{knitr::rmarkdown}
header-includes:
  - \usepackage[utf8]{inputenc}
  - \usepackage[T1]{fontenc}
  - \usepackage{xcolor}
  - \usepackage{mathpazo} # Palatino font
  - \renewcommand{\normalsize}{\fontsize{11.5}{16}\selectfont}
  - \usepackage[x11names]{xcolor}
  - \usepackage[english]{babel}
  - \usepackage{indentfirst}
  - \setlength{\parindent}{1.5em}
  - \setlength{\parskip}{0.5em}
---

\definecolor{shadecolor}{rgb}{0.97, 0.97, 0.95}

```{r setup, include=FALSE}
rmarkdown::find_pandoc(version = '2.9.2.1')
knitr::opts_chunk$set(echo = TRUE, eval = TRUE)
# Print chunk output in a small verbatim environment
knitr::knit_hooks$set(output = function(x, options) {
  paste0("\\begingroup\\small\n\\begin{verbatim}\n", x, "\\end{verbatim}\n\\endgroup\n")
})
```

\newpage

# Introduction

\noindent This vignette provides a quick introduction to the **QuantilePeer** package. The package is designed to estimate quantile peer effect models, allowing researchers to assess peer effects at multiple quantiles of the peer outcome distribution \citep[][]{houndetoungan2025quantile}. This modeling approach is more flexible than the standard linear-in-means (LIM) model, which assumes that all peers exert the same influence on individuals. The quantile peer effect model allows peer influence to vary across different quantiles of the outcome distribution, making it a useful framework for testing the assumptions underlying the LIM model. The package also provides routines for estimating peer effect models with a constant elasticity of substitution (CES) social norm \citep{boucher2024toward}.
The CES-based approach includes a substitution parameter that allows peers with higher or lower outcomes to exert more influence on individual outcomes.

This documentation is organized as follows. Section \ref{sec:model} provides a brief overview of quantile peer effect models. Section \ref{sec:quant} illustrates how to simulate and estimate these models. Section \ref{sec:ces} presents, simulates, and estimates CES-based peer effect models. Section \ref{sec:other} discusses additional model specifications that can also be estimated using this package.

To cite **QuantilePeer**, please run \textcolor{DarkRed}{\texttt{citation("QuantilePeer")}} in \textbf{R}. This will display the citation information, including the Bib\TeX{} entry for \LaTeX{} users. Please also cite the paper associated with the package \citep[][]{houndetoungan2025quantile}.

# A brief description of the model \label{sec:model}

\noindent Let $\mathcal{N}$ be a set of $n$ agents indexed by $i \in \{1, \ldots, n\}$. Agents are connected through a network that is characterized by an adjacency matrix $\mathbf{G} = [g_{ij}]$ of dimension $n \times n$, where $g_{ij} = 1$ if agent $j$ is a friend of agent $i$, and $g_{ij} = 0$ otherwise. In weighted networks, $g_{ij}$ can be a nonnegative variable (not necessarily binary) that measures the intensity of the outgoing link from $i$ to $j$. The model can also accommodate such networks. Note that the network generally consists of many independent subnets (e.g., schools). Let $\mathcal{T}$ be a set of quantile levels.
The reduced-form specification of the quantile peer effect model is given by:
\begin{equation}
y_i = \sum_{\tau \in \mathcal{T}} \lambda_{\tau} q_{\tau,i}(\mathbf{y}_{-i}) + \boldsymbol{x}_i^{\prime}\boldsymbol \beta + \varepsilon_i,
\end{equation}
where $\mathbf{y}_{-i} = (y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n)^{\prime}$ is the vector of outcomes for individuals other than $i$, and $q_{\tau,i}(\mathbf{y}_{-i})$ denotes the sample $\tau$-quantile of the outcomes of $i$'s peers. The term $\varepsilon_i$ is an idiosyncratic error term, $\lambda_{\tau}$ captures the effect of the $\tau$-quantile of peer outcomes on $y_i$, and $\boldsymbol \beta$ captures the effect of the exogenous variables $\boldsymbol{x}_i$ on $y_i$.

\cite{hyndman1996sample} distinguish nine types of quantiles. The results developed in the paper hold for all these types. However, both the simulations and empirical analysis use \textit{Type 7}, which relies on linear interpolation when the quantile level does not correspond exactly to a peer's rank.\footnote{For example, when an agent has only two friends, the sample median of peer outcomes is simply the average of the two friends’ outcomes. The first decile is a weighted average of the two outcomes, where the friend with the lower outcome receives a weight of 0.9.} If the network matrix is weighted, a sample \emph{weighted quantile} can be used, where the outcome for friend $j$ of individual $i$ is weighted by $g_{ij}$.

One issue in linear peer effect models is that individual preferences with conformity and spillover (complementarity or substitution) lead to the same reduced form \citep[see][]{boucher2016some}. However, it is possible to disentangle both types of preferences using isolated individuals \citep{boucher2024toward}. Isolated individuals are those who have no friends, although they may or may not be considered friends by others.
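
As an illustration of the definitions above, the following base R sketch computes Type 7 sample quantiles of peer outcomes on a toy four-agent network and flags isolated agents. The objects `G_toy`, `y_toy`, and `tau_toy` are made up for this example, and the loop only mirrors the definition of $q_{\tau,i}(\mathbf{y}_{-i})$; it is not meant to reproduce the package's internal implementation.

```{r}
# Toy directed network with four agents; row i lists the friends of agent i
G_toy <- matrix(c(0, 1, 1, 0,
                  1, 0, 0, 0,
                  0, 0, 0, 0,
                  1, 1, 1, 0), nrow = 4, byrow = TRUE)
y_toy   <- c(2, 6, 3, 5)   # outcomes (illustrative values)
tau_toy <- c(0.1, 0.5)     # first decile and median

# Agent 3 names no friends, so it is isolated even though agents 1 and 4 name it
iso_toy <- rowSums(G_toy) == 0

# Type 7 sample quantiles of peer outcomes (NA for isolated agents)
q_toy <- t(sapply(seq_len(nrow(G_toy)), function(i) {
  peers <- which(G_toy[i, ] > 0)
  if (length(peers) == 0) return(rep(NA_real_, length(tau_toy)))
  unname(quantile(y_toy[peers], probs = tau_toy, type = 7))
}))
colnames(q_toy) <- paste0("tau=", tau_toy)
q_toy
```

Agent 1 has two friends with outcomes 3 and 6, so its first decile is $0.9 \times 3 + 0.1 \times 6 = 3.3$ and its median is 4.5, in line with the two-friend example given in the footnote.
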
The structural specification of the model differs between isolated and non-isolated individuals, allowing for the separate identification of peer effects arising from spillovers and conformity. For isolated $i$, the specification is similar to a standard linear-in-means (LIM) model without social interactions, given by:
\begin{equation}\label{eq:yiso}
y_i = \boldsymbol{x}_i^{\prime}\boldsymbol \beta + \varepsilon_i.
\end{equation}
If $i$ is non-isolated, the specification is given by:
\begin{equation}\label{eq:yniso}
y_i = \sum_{\tau \in \mathcal{T}} \lambda_{\tau} q_{\tau,i}(\mathbf{y}_{-i}) + (1 - \lambda_2)\boldsymbol{x}_i^{\prime}\boldsymbol \beta + \varepsilon_i,
\end{equation}
where $\lambda_2$ determines whether preferences exhibit conformity or anti-conformity. Specifically, let $\lambda:= \sum_{\tau \in \mathcal{T}} \lambda_{\tau}$ denote the total peer effects at all quantile levels. These total effects can be decomposed as $\lambda = \lambda_1 + \lambda_2$, where $\lambda_1$ captures total spillover effects and $\lambda_2$ captures total conformity effects.

As in \citet{boucher2024toward}, when $\lambda_1 > 0$, preferences exhibit complementarity; when $\lambda_1 < 0$, preferences exhibit substitution. In contrast, when $\lambda_2 > 0$, preferences are conformist. Anti-conformity may also arise when $\lambda_2 < 0$. If peer effects are solely due to spillovers, then $\lambda_1 \ne 0$ and $\lambda_2 = 0$. Conversely, if peer effects arise only from conformity, then $\lambda_1 = 0$ and $\lambda_2 \ne 0$. The quantile peer effect model allows for the decomposition of total peer effects at different quantile levels, measured by the parameters $\lambda_{\tau}$.

To identify $\lambda_1$ and $\lambda_2$, it is important to observe a sufficient number of isolated individuals in the network. This enables the identification of $\boldsymbol \beta$ in Equation \eqref{eq:yiso}, which in turn makes it possible to identify $\lambda_2$ in Equation \eqref{eq:yniso}.
The underlying assumption here is that the $\boldsymbol{\beta}$ parameter is the same in both equations. Alternatively, assuming that a subset of the components of $\boldsymbol{\beta}$ are the same is sufficient for the identification of $\lambda_2$. When the network contains no isolated nodes, the $\lambda_{\tau}$’s can still be identified, but it is not possible to determine whether they reflect conformity or spillover effects.

# Estimating quantile peer effects \label{sec:quant}

\noindent The key functions discussed in this section include:
\begin{enumerate}
  \item \texttt{qpeer.sim}: simulates data from models with quantile peer effects;
  \item \texttt{qpeer.inst}: computes instruments for models with quantile peer effects;
  \item \texttt{qpeer}, \texttt{linpeer}, and \texttt{genpeer}: estimate peer effect models (with quantile peer effects, the standard LIM social norm, and user-defined social norms, respectively);
  \item \texttt{qpeer.test}: performs specification tests.
\end{enumerate}
Most of these functions return classed objects, with associated \texttt{summary} and \texttt{print} methods.

## Data simulation

\noindent Throughout this documentation, I use simulated data. To begin, I create a network matrix $\mathbf{G}$ and two exogenous variables, $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$. Importantly, I include some isolated nodes to ensure the identification of the structural model.

```{r}
library(QuantilePeer)
set.seed(123)          # Set seed for reproducibility
ngr  <- 50             # Number of subnets
nvec <- rep(30, ngr)   # Size of subnets
n    <- sum(nvec)

# Network matrix
G <- lapply(1:ngr, function(z) {
  Gz <- matrix(rbinom(nvec[z]^2, 1, 0.3), nvec[z], nvec[z])
  diag(Gz) <- 0
  # Adding isolated nodes (important for the structural model)
  niso <- sample(0:nvec[z], 1, prob = (nvec[z] + 1):1 / sum((nvec[z] + 1):1))
  if (niso > 0) {
    Gz[sample(1:nvec[z], niso), ] <- 0
  }
  Gz
})

X <- cbind(rnorm(n), rpois(n, 2)); colnames(X) <- c("X1", "X2")
```

Using the network matrix and the exogenous variables, I can now generate the dependent variables.
I consider the quantile levels $(0,~1/3,~2/3,~1)$, which define four quantiles, and two types of dependent variables. The first is generated from the reduced-form model (without a conformity parameter), while the second is based on the structural model. The following code assigns values to the model parameters and simulates the corresponding dependent variables.

```{r}
tau       <- seq(0, 1, 1/3)            # quantile levels
lambdatau <- c(0.1, 0.25, 0.2, 0.15)   # lambda_tau
lambda2   <- 0.2                       # lambda_2
beta      <- c(2, -0.5, 1)

# First dependent variable (reduced form without conformity)
y1 <- qpeer.sim(formula = ~ X, Glist = G, tau = tau, lambda = lambdatau,
                beta = beta, structural = FALSE, epsilon = rnorm(n, 0, 0.4))
y1 <- y1$y # qpeer.sim returns a list of several objects, including y

# Second dependent variable (structural form with conformity)
y2 <- qpeer.sim(formula = ~ X, Glist = G, tau = tau, lambda = c(lambda2, lambdatau),
                beta = beta, structural = TRUE, epsilon = rnorm(n, 0, 0.4))
y2 <- y2$y
```

Note that we can also include contextual variables, such as averages of $\boldsymbol x$ among peers, as additional exogenous variables.

## Instruments \label{sec:quant:instruments}

\noindent I propose two instrument sets for quantile peer outcomes. The first type of instruments ($\mathbf{Z}_1$) is the set of quantiles of $\boldsymbol{x}$ among peers. The second type ($\mathbf{Z}_2$) also consists of quantiles of $\boldsymbol{x}$ among peers, but with a key distinction that the values of $\boldsymbol{x}$ are ordered using the values of the peers' dependent variable (see the detailed discussion in the paper). The second type of instruments can yield more efficient estimators but may also be endogenous. Their validity can be tested using a procedure similar to that of \cite{hausman1978specification}. It is also possible to combine $\mathbf{Z}_1$ and $\mathbf{Z}_2$ to strengthen the instruments.

Instruments can be computed using \texttt{qpeer.inst}.
The type of instruments is specified through the \texttt{formula} argument. If \texttt{formula} is defined without a dependent variable (i.e., an expression of the form \texttt{\textasciitilde{} X1 + X2 + \ldots}), then the first type of instruments is computed. In contrast, if \texttt{formula} includes a dependent variable (i.e., an expression of the form \texttt{y \textasciitilde{} X1 + X2 + \ldots}), then the second type of instruments is computed.

For the first type of instruments, it is important to use a finer subdivision of quantile levels than $\tau$. By including many quantile levels of the characteristics $\boldsymbol{x}$, one obtains a comprehensive representation of the distribution of peer characteristics $\boldsymbol{x}$, which helps effectively approximate their outcomes and the quantiles of their outcomes. Using a finer subdivision of quantile levels is not important for the second type of instruments.

```{r}
# First instrument set (finer subdivision of quantile levels)
Z1 <- qpeer.inst(formula = ~ X, Glist = G, tau = seq(0, 1, 0.1),
                 max.distance = 2, checkrank = TRUE)
Z1 <- Z1$instruments # qpeer.inst returns a list of several objects

# Second instrument set: y1 is used to order X
Z21 <- qpeer.inst(formula = y1 ~ X, Glist = G, tau = tau,
                  max.distance = 2, checkrank = TRUE)
qy1 <- Z21$qy        # quantiles of y among peers
Z21 <- Z21$instruments

# Second instrument set: y2 is used to order X
Z22 <- qpeer.inst(formula = y2 ~ X, Glist = G, tau = tau,
                  max.distance = 2, checkrank = TRUE)
qy2 <- Z22$qy        # quantiles of y among peers
Z22 <- Z22$instruments
```

As in the standard linear model, one can use the quantiles of $\boldsymbol{x}$ among direct friends and long-distance friends (such as friends of friends) to strengthen the instruments. Setting \texttt{max.distance = 2} means that the quantiles of $\boldsymbol{x}$ are computed among both direct friends and friends of friends.
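
To make the notion of friends of friends concrete, here is a small base R sketch on a made-up chain network. It assumes that friends of friends are the agents reachable in exactly two steps, excluding the agent itself; how \texttt{qpeer.inst} treats agents who are both direct friends and friends of friends is not covered here. The names `G_chain`, `x_chain`, and `peer_x` are hypothetical and are not part of the package.

```{r}
# Toy network: 1 -> 2 -> 3 -> 4 (row i lists the friends of agent i)
G_chain <- matrix(0, 4, 4)
G_chain[1, 2] <- G_chain[2, 3] <- G_chain[3, 4] <- 1
x_chain <- c(10, 20, 30, 40)

# Agents reachable in exactly two steps, excluding the agent itself
G2_chain <- (G_chain %*% G_chain > 0) * 1
diag(G2_chain) <- 0

# Median of x among the peers defined by Gmat (NA when there are no peers)
peer_x <- function(Gmat, x) {
  sapply(seq_len(nrow(Gmat)), function(i) {
    p <- which(Gmat[i, ] > 0)
    if (length(p) == 0) NA_real_ else unname(quantile(x[p], probs = 0.5, type = 7))
  })
}
cbind(direct = peer_x(G_chain, x_chain), distance2 = peer_x(G2_chain, x_chain))
```

Agents 3 and 4 have no friends of friends in this network, so the distance-two column is missing for them, while it adds new information for agents 1 and 2.
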
The \texttt{checkrank} argument ensures that the resulting instrument set is a full-rank matrix by removing columns that are linear combinations of others.

## Estimation

\noindent Quantile peer effects are estimated using the Generalized Method of Moments (GMM). Estimates can be obtained using the \texttt{qpeer} function. I begin with the reduced-form specification, using both types of instruments for each dependent variable.

```{r}
# y1 with type 1 instruments
QtR1 <- qpeer(formula = y1 ~ X, excluded.instruments = ~ Z1, Glist = G, tau = tau)
summary(QtR1, diagnostic = TRUE)

# y1 with type 1 and type 2 instruments (X ordered by y1)
QtR2 <- qpeer(formula = y1 ~ X, excluded.instruments = ~ Z1 + Z21, Glist = G, tau = tau)
summary(QtR2)

# y2 with type 1 instruments
QtR3 <- qpeer(formula = y2 ~ X, excluded.instruments = ~ Z1, Glist = G, tau = tau)
summary(QtR3)

# y2 with type 1 and type 2 instruments (X ordered by y2)
QtR4 <- qpeer(formula = y2 ~ X, excluded.instruments = ~ Z1 + Z22, Glist = G, tau = tau)
summary(QtR4)
```

Note that the \texttt{formula} argument does not include the quantiles of peer outcomes, but only the outcome and the exogenous variables. The quantiles are computed internally by the function based on the levels specified in the \texttt{tau} argument. Additionally, the \texttt{excluded.instruments} argument should not contain any instruments that are already included as explanatory variables.

The output of the \texttt{qpeer} function is an object of class \texttt{qpeer}, to which the \texttt{summary} method can be applied. An important argument of the \texttt{summary} method is \texttt{diagnostics}, a logical value indicating whether diagnostic tests for the instrumental variable regression should be performed. These tests include an F-test and the rank test of \citet{kleibergen2006generalized} for weak instruments, the Wu–Hausman test for endogeneity, and Hansen’s J-test for overidentifying restrictions (when the number of instruments exceeds the number of endogenous regressors). The diagnostic test results are displayed for the first estimation shown above.
The reduced-form estimation for \texttt{y2} is likely to be inconsistent, since the reduced-form model assumes that preferences exhibit either spillover or conformity, but not both, whereas \texttt{y2} is generated from a structural model that includes both effects. The following code replicates the estimations using the structural specification.

```{r}
QtS1 <- qpeer(formula = y1 ~ X, excluded.instruments = ~ Z1, Glist = G,
              tau = tau, structural = TRUE)
summary(QtS1, diagnostic = TRUE)

QtS2 <- qpeer(formula = y1 ~ X, excluded.instruments = ~ Z1 + Z21, Glist = G,
              tau = tau, structural = TRUE)
summary(QtS2)

QtS3 <- qpeer(formula = y2 ~ X, excluded.instruments = ~ Z1, Glist = G,
              tau = tau, structural = TRUE)
summary(QtS3)

QtS4 <- qpeer(formula = y2 ~ X, excluded.instruments = ~ Z1 + Z22, Glist = G,
              tau = tau, structural = TRUE)
summary(QtS4)
```

In the new results, the estimates appear reliable. For the dependent variable \texttt{y1}, the conformity parameter is not significant, as the data were simulated with spillovers only (no conformity).

The \texttt{qpeer} function offers several useful options, including the ability to change the type of GMM estimator, control for subnet fixed effects, and account for heteroskedasticity. The GMM estimator type is specified using the \texttt{estimator} argument. The default value, \texttt{"IV"}, corresponds to the standard instrumental variables (IV) estimator. It is also possible to use the GMM estimator with the identity matrix as the weighting matrix (\texttt{estimator = "gmm.identity"}) or with the optimal GMM weighting matrix (\texttt{estimator = "gmm.optimal"}). Jackknife IV estimators (type 1 and type 2) can be obtained by setting \texttt{estimator} to \texttt{"JIVE"} and \texttt{"JIVE2"}, respectively. Jackknife estimators can be especially useful when the number of instruments is large, as they help reduce the bias associated with the standard IV estimator \citep[see][]{mikusheva2022inference}.
The \texttt{fixed.effects} argument specifies how to control for subnet fixed effects. The default value is \texttt{FALSE} or \texttt{"no"}, indicating that no fixed effects are included. Two levels of subnet fixed effects are supported: a single fixed effect per subnet (\texttt{fixed.effects = "join"}) and separate fixed effects per subnet for isolated and non-isolated individuals (\texttt{fixed.effects = "separate"}).\footnote{As discussed by \citet{houndetoungan2024identifying}, including two fixed effects per subnetwork may also be necessary to identify peer effects in unobserved effort, particularly when the dependent variable $y$ is a proxy for that effort (e.g., academic effort and grade point average).} For the structural specification, fixed effects must be specified separately for each type; that is, there are necessarily two fixed effects per subnet.

The \texttt{HAC} argument specifies the assumed covariance structure of the errors. By default, homoskedasticity is assumed (\texttt{HAC = "iid"}). To allow for heteroskedasticity at the individual level, set \texttt{HAC = "hetero"}. To account for heteroskedasticity and within-subnet correlation, where errors may be correlated among individuals in the same subnet, set \texttt{HAC = "cluster"}.

Below are examples of model specifications that incorporate these additional options.

```{r}
QtR5 <- qpeer(formula = y1 ~ X, excluded.instruments = ~ Z1, Glist = G, tau = tau,
              structural = FALSE, estimator = "gmm.optimal", HAC = "cluster",
              fixed.effects = "separate")
summary(QtR5)

QtS5 <- qpeer(formula = y2 ~ X, excluded.instruments = ~ Z1, Glist = G, tau = tau,
              structural = TRUE, estimator = "gmm.optimal", HAC = "cluster",
              fixed.effects = "separate")
summary(QtS5)
```

## Specification tests \label{sec:quant:test}

\noindent Several specification tests have been discussed above.
These include weak instrument, endogeneity, and overidentification tests, which can be performed by setting \texttt{diagnostic} to \texttt{TRUE} in the \texttt{summary} method. In this section, I discuss monotonicity tests for quantile peer effects, the validity of type 2 instruments (see the discussion of the two types of instruments in Section \ref{sec:quant:instruments}), and an encompassing test to choose between competing sets of quantile levels.

These tests can be performed using the function \texttt{qpeer.test}. The argument \texttt{which} indicates the type of test. For instance, \texttt{which} can take the values \texttt{"uniform"}, \texttt{"increasing"}, and \texttt{"decreasing"} to test whether the $\lambda_{\tau}$'s are uniform, increasing, or decreasing. The uniform test is based on a standard Wald test for the equality of all $\lambda_{\tau}$'s. The increasing and decreasing tests are based on \cite{kodde1986wald}. Here is an example:

```{r}
qpeer.test(QtR1, which = "uniform")
qpeer.test(QtS5, which = "decreasing")
```

The monotonicity tests can be useful for selecting a more parsimonious model, which involves fewer parameters and can therefore be estimated more precisely. For instance, if the null hypothesis that the $\lambda_{\tau}$'s are uniform is not rejected, then a standard LIM peer effect model can be used instead of a quantile model. Moreover, if the $\lambda_{\tau}$'s are monotonic (increasing or decreasing), the constant elasticity of substitution (CES)-based model can be used (see Section \ref{sec:ces}).

To test the endogeneity of $\mathbf{Z}_2$, the argument \texttt{which} can be set to \texttt{"wald"} or \texttt{"sargan"}. The former is a Wald-style test that compares the estimates obtained using $\mathbf{Z}_1$ and $\mathbf{Z}_2$. If the null hypothesis that both estimates are equal is not rejected, then $\mathbf{Z}_2$ is considered exogenous.
The latter is an overidentification-style test that assesses the validity of the additional information in $\mathbf{Z}_2$ that is not captured by $\mathbf{Z}_1$. These tests can be used even if $\mathbf{Z}_2$ does not nest $\mathbf{Z}_1$. The following examples illustrate the test:

```{r}
qpeer.test(QtR1, QtR2, which = "wald")
qpeer.test(QtS3, QtS4, which = "sargan")
```

If the estimations based on the type 2 instrument set are not rejected, then the user can report these estimates, as the type 2 instrument set is likely stronger.

Choosing suitable quantile levels is important. The `qpeer.test` function offers an encompassing test to choose between two competing specifications. Assume that the quantile peer effect model is estimated with two sets of quantile levels, $\mathcal{T}_1$ and $\mathcal{T}_2$. By setting `which` to `"encompassing"`, one can test whether one model performs worse; that is, whether it fails to replicate the features captured by the other. The `model1` and `model2` arguments are used to specify both models estimated using the `qpeer` function. The null hypothesis is that `model1` is not worse.

```{r}
# Estimating QtS6 with a misspecified tau
QtS6 <- qpeer(formula = y2 ~ X, excluded.instruments = ~ Z1, Glist = G,
              tau = c(0, 1), structural = TRUE, estimator = "gmm.optimal",
              HAC = "cluster", fixed.effects = "separate")
qpeer.test(model1 = QtS6, model2 = QtS5, which = "encompassing")
```

In the preceding code, the null hypothesis is that the specification which includes only the minimum and the maximum as quantile levels does not perform worse. The hypothesis is rejected, which is expected since the data are simulated with more quantile levels. Consequently, the model with only two quantile levels is not a good choice.
In the following code, the null hypothesis is that the model with four quantile levels is not worse than the competing model with two quantile levels:

```{r}
qpeer.test(model1 = QtS5, model2 = QtS6, which = "encompassing")
```

Here, the null hypothesis is not rejected. However, this does not directly imply that the model with four quantile levels is well specified. The model should also be compared with alternative specifications that include, for example, five or six quantile levels. Indeed, encompassing tests only compare two models and do not imply that one of them is correctly specified. Of course, in this example, we know that the model with four quantile levels is well specified because we know the data-generating process. Therefore, the model with four quantile levels cannot be rejected, in general, against other well-specified competing models.

# CES-based peer effect models \label{sec:ces}

\noindent This section introduces peer effect models with a CES social norm \citep[see][]{boucher2024toward}. These models include a substitution parameter that determines whether peers with high or low outcomes have a greater influence.

## A brief description

\noindent \cite{boucher2024toward} present a flexible social norm using the CES function. They replace the average outcome among peers in the standard LIM model by the CES function. Specifically, the model is given by:
\begin{equation} \label{eq:ces}
y_i = \lambda\left(\sum_{j \ne i} g_{ij}y_j^{\rho}\right)^{\frac{1}{\rho}} + \boldsymbol{x}_i^{\prime}\boldsymbol \beta + \varepsilon_i,
\end{equation}
where $\lambda$ is the peer effect parameter, $\rho$ is the substitution parameter, and the network matrix $\mathbf{G}$ is row-normalized. This model nests the standard LIM model. When $\rho = 1$, Equation \eqref{eq:ces} becomes the standard LIM model. If $\rho > 1$, peers with high outcomes are more important in explaining $y_i$. In contrast, if $\rho < 1$, peers with low outcomes are more important. To see why, observe that the CES function is strictly convex when $\rho > 1$ and strictly concave when $\rho < 1$.
Therefore, peer effects are either uniform, strictly increasing, or strictly decreasing in peer outcomes. This model does not accommodate situations where only peers with high and low outcomes matter while those with moderate outcomes are not influential, and vice versa. When $\rho \to +\infty$, only the peer with the highest outcome matters; when $\rho \to -\infty$, only the peer with the lowest outcome matters. These limiting cases correspond to models where the average peer outcome is replaced by the maximum or minimum peer outcome \citep[see][]{tao2014social, tatsi2015endogenous}.

As in Section \ref{sec:model}, CES-based models can describe preferences that exhibit both spillover and conformity effects. See, for example, Equations \eqref{eq:yiso} and \eqref{eq:yniso}. The specification of the outcome for isolated agents is similar to Equation \eqref{eq:yiso}, and that for non-isolated agents is similar to Equation \eqref{eq:yniso}, with the CES function replacing the quantile functions. Disentangling conformity from spillover effects requires the presence of isolated agents in the network.

## Data simulation

\noindent The function \texttt{cespeer.sim} can be used to simulate the CES-based peer effect model. This is useful, for instance, for conducting counterfactual analyses involving changes in the intercept (e.g., providing subsidies or introducing a tax), in the network structure, or in the exogenous characteristics $\boldsymbol{x}$.

The following code defines the sample size, simulates the network matrix and the matrix of explanatory variables $\mathbf{X}$, assigns values to the model parameters, and simulates the outcomes. As in Section \ref{sec:quant}, I simulate two outcomes: one generated from the reduced-form model (with only spillovers) and the other from the structural model. One important limitation of the CES specification is that it does not accommodate zero or negative outcomes.
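
The role of $\rho$ and the positivity requirement can be checked numerically with a short base R sketch. The helper `ces_norm` and the objects `y_peers` and `g_peers` are hypothetical names introduced for this illustration; they are not functions or objects of the package, and the weights are assumed to sum to one.

```{r}
# CES norm of peer outcomes y with weights g that sum to one
ces_norm <- function(y, g, rho) sum(g * y^rho)^(1 / rho)

y_peers <- c(1, 2, 4)        # strictly positive peer outcomes
g_peers <- rep(1 / 3, 3)     # row-normalized weights

rho_grid <- c(-50, -3, 1, 3, 50)
norms <- sapply(rho_grid, function(r) ces_norm(y_peers, g_peers, r))
names(norms) <- paste0("rho=", rho_grid)
round(norms, 3)

# With a negative outcome, the inner sum can become negative, and a negative
# base raised to a fractional power is NaN in R
ces_norm(c(-1, 2, 4), g_peers, -3)
```

With $\rho = 1$ the norm equals the average peer outcome; it moves toward the lowest peer outcome as $\rho$ decreases and toward the highest as $\rho$ increases, while the last call returns `NaN`.
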
To simulate data from this model, $\boldsymbol{x}_i^{\prime}\boldsymbol{\beta} + \varepsilon_i$ must be strictly positive for all $i$. Consequently, I increase the intercept in $\boldsymbol{\beta}$ to ensure this condition is satisfied.

```{r}
ngr  <- 50             # Number of subnets
nvec <- rep(30, ngr)   # Size of subnets
n    <- sum(nvec)

# Network matrix
G <- lapply(1:ngr, function(z) {
  Gz <- matrix(rbinom(nvec[z]^2, 1, 0.3), nvec[z], nvec[z])
  diag(Gz) <- 0
  # Adding isolated nodes (important for the structural model)
  niso <- sample(0:nvec[z], 1, prob = (nvec[z] + 1):1 / sum((nvec[z] + 1):1))
  if (niso > 0) {
    Gz[sample(1:nvec[z], niso), ] <- 0
  }
  # Row-normalize: rows of non-isolated agents sum to one
  rs <- rowSums(Gz); rs[rs == 0] <- 1
  Gz / rs
})

X <- cbind(rnorm(n), rpois(n, 2)); colnames(X) <- c("X1", "X2")

lambda  <- 0.55
lambda2 <- 0.2
beta    <- c(2.5, -0.5, 1)
rho     <- -3

# First dependent variable (reduced form without conformity)
y1 <- cespeer.sim(formula = ~ X, Glist = G, rho = rho, lambda = lambda,
                  beta = beta, structural = FALSE, epsilon = rnorm(n, 0, 0.4))
y1 <- y1$y # cespeer.sim returns a list of several objects, including y

# Second dependent variable (structural form with conformity)
y2 <- cespeer.sim(formula = ~ X, Glist = G, rho = rho, lambda = c(lambda2, lambda),
                  beta = beta, structural = TRUE, epsilon = rnorm(n, 0, 0.4))
y2 <- y2$y
```

## Estimation

\noindent The model can be estimated using the function \texttt{cespeer}. As suggested by \cite{boucher2024toward}, the instrument for the social norm can be constructed using an exogenous prediction of the outcome. This prediction can be obtained from a simple linear regression of the outcome on the exogenous characteristics, as shown in the following code:

```{r}
yhat1 <- fitted(lm(y1 ~ X))
yhat2 <- fitted(lm(y2 ~ X))
```

Using the function \texttt{cespeer}, the exogenous prediction can be specified via the \texttt{instrument} argument.
The remaining arguments are similar to those of the \texttt{qpeer} function, except that the quantile level vector \texttt{tau} is not required. Additionally, the function includes a \texttt{fixed.effects} argument to indicate the type of fixed effects, and a \texttt{HAC} argument to account for heteroskedasticity and within-group correlation in the error terms.

```{r}
cesR1 <- cespeer(formula = y1 ~ X, instrument = ~ yhat1, Glist = G, structural = FALSE)
summary(cesR1)

cesS1 <- cespeer(formula = y2 ~ X, instrument = ~ yhat2, Glist = G, structural = TRUE)
summary(cesS1)
```

The estimates appear reasonable, including the estimate of the CES parameter $\rho$. The output of the \texttt{summary} function includes a test of whether the $\rho$ parameter is equal to one; that is, whether the standard LIM model can be used instead of the CES specification. As expected, the null hypothesis is rejected, since the data were simulated by setting $\rho = -3$.

Since $\rho < 1$, peers with low outcomes play a more important role in explaining agent outcomes. For instance, if the user were to estimate a quantile peer effect model on the same data, the function \texttt{qpeer.test} can be used to test whether the $\lambda_{\tau}$'s are decreasing:

```{r}
Z <- qpeer.inst(formula = ~ X, Glist = G, tau = seq(0, 1, 0.1),
                max.distance = 2, checkrank = TRUE)$instruments
QtR1 <- qpeer(formula = y1 ~ X, excluded.instruments = ~ Z, Glist = G,
              tau = seq(0, 1, 1/3))
qpeer.test(QtR1, which = "decreasing")
```

The null hypothesis is not rejected, suggesting that the CES model can be used in this case to analyze the outcome. The CES model involves fewer parameters and thus provides more precise estimates than the quantile model.

# Other specifications \label{sec:other}

\noindent I discuss two important functions in this section for estimating other specifications of peer effect models. The first function is \texttt{linpeer}, which estimates the standard LIM model.
The second function is \texttt{genpeer}, which allows the user to define their own social norm.

For the standard LIM model, instruments can be $\mathbf{GX}$ when the model does not include contextual effects, or $\mathbf{G}^2\mathbf{X}$ when it does \citep[see][]{bramoulle2009identification}. It is also possible to include higher powers of $\mathbf{G}$ to strengthen the instruments \citep[see][]{houndetoungan2024inference}. To compute $\mathbf{G}^k\mathbf{X}$ for $k \geq 1$, I use the function \texttt{peer.avg} from the **PartialNetwork** package \citep{Boucher2024PartialNetwork}. In the following code, I use both $\mathbf{GX}$ and $\mathbf{G}^2\mathbf{X}$ as instruments.

```{r}
library(PartialNetwork)
GX  <- peer.avg(G, X)
GGX <- peer.avg(G, GX)
Ot1 <- linpeer(formula = y2 ~ X, excluded.instruments = ~ GX + GGX, Glist = G,
               structural = TRUE, estimator = "gmm.optimal",
               fixed.effects = "separate", HAC = "cluster")
summary(Ot1, diagnostic = TRUE)
```

Furthermore, the package provides the generic function \texttt{genpeer}, which allows users to define their own endogenous variables. This option is useful for estimating, for example, the effects of quantiles among girl and boy peers separately, combining average peer effects with quantile peer effects in the same model, or specifying other custom social norms. The \texttt{endogenous.variables} argument must be specified as a formula that includes the desired endogenous variables. The example below illustrates how to estimate peer effects using both the average outcome of peers and the minimum and maximum peer outcomes.
```{r}
Gy2 <- peer.avg(G, y2)
qy2 <- qpeer.inst(formula = y2 ~ 1, Glist = G, tau = c(0, 1))$qy # min and max
# The endogenous variables include the average, min, and max of peer outcomes
Ot2 <- genpeer(formula = y2 ~ X, excluded.instruments = ~ Z + GX + GGX,
               endogenous.variables = ~ Gy2 + qy2, Glist = G, structural = TRUE,
               estimator = "gmm.optimal", fixed.effects = "separate",
               HAC = "cluster")
summary(Ot2, diagnostic = TRUE)
```

# Conclusion

\noindent Thank you for reading this documentation. If you encounter any issues, please report them via the [Issues](https://github.com/ahoundetoungan/QuantilePeer/issues) page on GitHub.

If you use **QuantilePeer** in your research or publications, please cite it. You can run \textcolor{DarkRed}{\texttt{citation("QuantilePeer")}} in **R** to obtain the citation information, including the Bib\TeX{} entry for \LaTeX{} users. Please also cite the paper associated with the package \citep[][]{houndetoungan2025quantile}.

\newpage