---
title: "BulkSignalR: Inference of ligand-receptor interactions from bulk data or spatial transcriptomics"
author:
  - name: Jean-Philippe Villemin
    affiliation:
    - Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France
    email: jean-philippe.villemin@inserm.fr
  - name: Jacques Colinge
    affiliation:
    - Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France
    email: jacques.colinge@inserm.fr
date: "`r format(Sys.Date(), '%m/%d/%Y')`"
abstract: >
  BulkSignalR is used to infer ligand-receptor (L-R) interactions from bulk
  expression (transcriptomics/proteomics) data, or spatial
  transcriptomics. Potential L-R interactions are taken from the
  LR*db* database, which is included in our other package SingleCellSignalR,
  available from Bioconductor. Inferences rely on a statistical model linking
  potential L-R interactions with biological pathways from Reactome or
  biological processes from GO. A number of visualization and data summary
  functions are provided to help navigate the predicted interactions.

  BulkSignalR package version: `r packageVersion("BulkSignalR")`
output:
  rmarkdown::html_vignette:
      self_contained: true
      toc: true
      toc_depth: 4
      highlight: pygments
      fig_height: 3
      fig_width: 3
      fig_caption: no
      code_folding: show
package: BulkSignalR
vignette: >
  %\VignetteIndexEntry{BulkSignalR-Main}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::knit_hooks$set(optipng = knitr::hook_optipng)
```

```{r load-libs, message = FALSE, warning = FALSE, results = FALSE}
library(BulkSignalR)
library(igraph)
library(dplyr)
library(STexampleData)
```

# Introduction

## What is it for?

`BulkSignalR` is a Bioconductor library that enables the inference of L-R
interactions from bulk expression data sets, *i.e.*, from transcriptomics
(RNA-seq or microarrays) or expression proteomics.

`BulkSignalR` also applies to spatial transcriptomics such as
10x Genomics Visium™, and a set of
functions dedicated to spatial data has been added to better support
this particular use of the library.

## Starting point

A variety of data sources can be used with `BulkSignalR`,
*e.g.*, expression proteomics or RNA sequencing. Such data are typically
represented as a matrix of numbers giving the abundance of molecules
(gene transcripts or proteins) across samples. This matrix may or may not
have been normalized prior to the use of `BulkSignalR`.

In either case, this matrix is the input data for `BulkSignalR` analysis.

Genes/proteins must be represented as rows of the
expression matrix and samples as columns. HUGO gene symbols
(even for proteins) must be used to ensure matching with LR*db*,
Reactome, and GOBP contents.

You can also work directly with an object from the `SummarizedExperiment` or
`SpatialExperiment` Bioconductor classes.

## How does it work?

As represented in the figure below, only a few steps are required in order to
perform a `BulkSignalR` analysis.
Three S4 objects will be sequentially constructed:

* **BSRDataModel**, denoted here as `bsrdm`
* **BSRInference**, denoted here as `bsrinf`
* **BSRSignature**, denoted here as `bsrsig`

\

**BSRDataModel** comprises the expression data matrix and the
parameters of the statistical model learned from this matrix.

**BSRInference** provides various lists that contain
the ligands, receptors, downstream pathways as well as the target genes,
their correlations and statistical significance
for all the L-R interactions.

**BSRSignature** contains gene signatures associated
with the triples (ligand, receptor, downstream pathway)
stored in a `BSRInference` object. These signatures comprise
the ligand and the receptor, but also the pathway
target genes that were used in the statistical model. Gene signatures
are meant to report the L-R interaction as a global phenomenon integrating
its downstream effect. Signatures can subsequently be scored with
a dedicated function, which gives the user a numerical value
representing the activity of the L-R interactions (with downstream
consequences) across the samples. Signature scores are returned
as a matrix (one row *per* L-R interaction and one column *per* sample).

Because certain receptors occur in multiple pathways,
and because some ligands may bind several receptors, or *vice versa*,
BSRInference objects may contain redundant data depending on how
the user wants to look at them. We therefore provide a range of
reduction operators meant to obtain reduced BSRInference objects
(see below).

\

Furthermore, we provide several handy functions to explore the data
through different plots (heatmaps, alluvial plots, chord diagrams or networks).

`BulkSignalR` library functions have many parameters that can be
changed by the user to fit specific needs (see the Reference Manual for details).

## Parallel mode settings

Users can reduce compute time by using several processors in parallel.
```{r parallel, message=FALSE, warning=FALSE}
library(doParallel)
n.proc <- 1
cl <- makeCluster(n.proc)
registerDoParallel(cl)

# To add at the end of your script:
# stopCluster(cl)
```

**Notes:**
For operating systems that can fork, such as UNIX-like systems,
it might be preferable to use the library `doMC`, which is faster (less
overhead). This is transparent to `BulkSignalR`.

Statistical model parameter learning involves the generation of
randomized expression matrices. If you need to reproduce results exactly,
you might want to use `set.seed()`. In parallel mode, the `iseed`
parameter of `clusterSetRNGStream` must be used instead.

\

# First Example

## Loading the data

Here, we load salivary duct carcinoma (SDC) bulk transcriptomes
integrated as `sdc` in `BulkSignalR`.

```{r loading, eval=TRUE}
data(sdc)
head(sdc)
```

## Building a BSRDataModel object

The constructor `BSRDataModel` creates the first object from the SDC data above.

```{r BSRDataModel, eval=TRUE,cache=FALSE}
bsrdm <- BSRDataModel(counts = sdc)
bsrdm
```

`learnParameters` updates a BSRDataModel object with the parameters
necessary for `BulkSignalR` statistical modeling.

```{r learnParameters,eval=TRUE,warning=FALSE,fig.dim = c(7,3)}
bsrdm <- learnParameters(bsrdm, quick=TRUE)
bsrdm
```

## Building a BSRInference object

From the previous object `bsrdm`, you can generate inferences by calling its
method `BSRInference`. The returned BSRInference object contains all the
inferred L-R interactions with their associated pathways and corrected
p-values.

From there, you can already access L-R interactions using `LRinter(bsrinf)`,
which returns a summary table.

```{r BSRInference,eval=TRUE}
# We use a subset of the reference to speed up
# inference in the context of the vignette.
subset <- c("REACTOME_BASIGIN_INTERACTIONS",
    "REACTOME_SYNDECAN_INTERACTIONS",
    "REACTOME_ECM_PROTEOGLYCANS",
    "REACTOME_CELL_JUNCTION_ORGANIZATION")

reactSubset <- BulkSignalR:::.SignalR$BulkSignalR_Reactome[
    BulkSignalR:::.SignalR$BulkSignalR_Reactome$`Reactome name` %in% subset,]

resetPathways(dataframe = reactSubset,
    resourceName = "Reactome")

bsrinf <- BSRInference(bsrdm,
    min.cor = 0.3,
    reference="REACTOME")

LRinter.dataframe <- LRinter(bsrinf)

head(LRinter.dataframe[
    order(LRinter.dataframe$qval),
    c("L", "R", "LR.corr", "pw.name", "qval")])
```

Finally, you can filter out non-significant L-R interactions and order them
by Q-value before saving them to a file, for instance.

```{r BSRInferenceTfile, eval=FALSE}
write.table(LRinter.dataframe[order(LRinter.dataframe$qval), ],
    "./sdc_LR.tsv",
    row.names = FALSE,
    sep = "\t",
    quote = FALSE
)
```

## Reduction strategies

The output of `BSRInference` is exhaustive and can thus contain
redundancy, both because of the redundancy present in the reference databases
(Reactome, KEGG, GOBP) and because of multilateral interactions in LR*db*.
To alleviate this issue, we propose several strategies.

### Reducing a BSRInference object to pathways

With `reduceToPathway`, all the L-R interactions with their receptors included
in a certain pathway are aggregated to only report the downstream pathway
once. For a given pathway, the reported P-values and target genes are those
of the best (smallest P-value) L-R interaction that was part of the
aggregation. Nothing is recomputed; we simply merge data.

```{r ReduceToPathway, eval=TRUE}
bsrinf.redP <- reduceToPathway(bsrinf)
```

### Reducing a BSRInference object to the best pathways

With `reduceToBestPathway`, a BSRInference object is reduced to only report
one pathway per L-R interaction. The pathway with the
smallest P-value is selected.
The same pathway might still occur multiple times, due to
different L-R interactions that all have their receptor in this pathway.
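
As a toy illustration of this selection rule only, the following base R sketch keeps the smallest P-value per L-R pair. It is not the package's internal code: the `triples` data frame and all of its values are invented for the example, and only the column names (`L`, `R`, `pw.name`, `pval`) mimic those returned by `LRinter`.

```r
# Hypothetical (ligand, receptor, pathway) triples with made-up P-values
triples <- data.frame(
    L = c("A", "A", "B", "B", "B"),
    R = c("X", "X", "Y", "Y", "Z"),
    pw.name = c("pw1", "pw2", "pw1", "pw3", "pw1"),
    pval = c(0.020, 0.001, 0.004, 0.040, 0.010),
    stringsAsFactors = FALSE
)

# Order by P-value, then keep the first (best) row of each L-R pair
ordered <- triples[order(triples$pval), ]
best <- ordered[!duplicated(ordered[, c("L", "R")]), ]

# Sort by ligand and receptor for display
best <- best[order(best$L, best$R), ]
rownames(best) <- NULL
best
```

In this toy table, each L-R pair is left with a single pathway, while `pw1` is still reported twice because two different pairs both have it as their best pathway.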
```{r ReduceToBestPathway, eval=TRUE}
bsrinf.redBP <- reduceToBestPathway(bsrinf)
```

### Reducing to ligands or receptors

As already mentioned, several ligands might bind to a single receptor (or
several shared receptors) and the converse might also happen. Two
reduction operators enable users to aggregate either all the ligands of
the same receptor or all the receptors bound by the same ligand:

```{r ReduceToLigand,eval=TRUE}
bsrinf.L <- reduceToLigand(bsrinf)
bsrinf.R <- reduceToReceptor(bsrinf)
```

### Combined reductions

Combinations are possible.

For instance, users can apply the `reduceToPathway` and `reduceToBestPathway`
functions sequentially to maximize the reduction effect. If the exact
same sets of aggregated ligands and receptors obtained with `reduceToPathway`
were associated with several pathways, the pathway with the best P-value
would be kept by `reduceToBestPathway`.

```{r doubleReduction, eval=TRUE}
bsrinf.redP <- reduceToPathway(bsrinf)
bsrinf.redPBP <- reduceToBestPathway(bsrinf.redP)
```

## Building a BSRSignature object

Gene signatures for a given, potentially reduced BSRInference object
are generated by the `BSRSignature` constructor, which returns a BSRSignature
object.

To follow the activity of L-R interactions across the samples of
a data set, `scoreLRGeneSignatures` computes a score for each gene signature
in each sample. Then, heatmaps can be generated to represent differences,
*e.g.*, using the built-in utility function `simpleHeatmap`.

Hereafter, we show different workflows of reductions combined with gene
signature scoring and display.
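
To make the shape of a signature score matrix concrete, here is a minimal base R sketch that scores toy signatures as the mean z-score of their member genes. This is a simplification chosen for illustration, not the formula used by `scoreLRGeneSignatures` (which may, for instance, weight ligands, receptors, and targets differently); the expression matrix, gene names, and signatures below are all invented.

```r
# Toy expression matrix: genes in rows, samples in columns (random values)
set.seed(1)
expr <- matrix(rexp(6 * 4, rate = 0.1), nrow = 6,
    dimnames = list(paste0("g", 1:6), paste0("s", 1:4)))

# Hypothetical signatures: ligand, receptor and target genes of two interactions
signatures <- list(
    "{g1} / {g2}" = c("g1", "g2", "g3", "g4"),
    "{g5} / {g6}" = c("g5", "g6", "g3")
)

# Z-score each gene across samples (scale() works on columns, hence t())
z <- t(scale(t(log1p(expr))))

# One row per signature, one column per sample: mean z-score of member genes
scores <- t(vapply(signatures,
    function(genes) colMeans(z[genes, , drop = FALSE]),
    numeric(ncol(z))))
round(scores, 2)
```

The result has one row *per* signature and one column *per* sample, which is the layout expected by heatmap functions; positive values mark samples where the signature genes are, on average, above their mean expression.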
### Scoring by ligand-receptor

```{r scoringLR,eval=TRUE,fig.dim = c(5,4)}
bsrsig.redBP <- BSRSignature(bsrinf.redBP, qval.thres=0.001)

scoresLR <- scoreLRGeneSignatures(bsrdm, bsrsig.redBP,
    name.by.pathway=FALSE
)

simpleHeatmap(scoresLR[1:20, ],
    hcl.palette="Cividis",
    pointsize=8)
```

### Scoring by pathway

```{r scoringPathway,eval=TRUE,fig.dim = c(7,3)}
bsrsig.redPBP <- BSRSignature(bsrinf.redPBP, qval.thres=0.01)

scoresPathway <- scoreLRGeneSignatures(bsrdm, bsrsig.redPBP,
    name.by.pathway=TRUE
)

simpleHeatmap(scoresPathway,
    hcl.palette="Blue-Red 2",
    pointsize=8)
```

## Other visualization utilities

### Heatmap of ligand-receptor-target genes expression

After computing gene signature scores, we may wish to look at the
expression of the genes involved in a given signature. For instance,
we can display three heatmaps corresponding to the scaled (z-scores)
expression of ligands (pink), receptors (green), and target genes (blue).

On top of each individual heatmap, the whole signature score from
`scoreLRGeneSignatures` is reported for reference.

```{r heatmapMulti,eval=TRUE,fig.dim = c(7,10)}
pathway1 <- pathways(bsrsig.redPBP)[1]
signatureHeatmaps(
    pathway = pathway1,
    bsrdm = bsrdm,
    heights = c(3,2,15),
    bsrsig = bsrsig.redPBP
)
```

### AlluvialPlot

`alluvialPlot` is a function that enables users to represent the different
interactions between ligands, receptors, and pathways stored in a
BSRInference object.

It is possible to filter by ligand, receptor, or pathway to limit
output complexity. This is achieved by specifying a keyword
in the chosen category. A threshold on L-R interaction Q-values can be
applied in addition.

```{r AlluvialPlot,eval=TRUE,fig.dim = c(8,3)}
alluvialPlot(bsrinf,
    keywords = c("LAMC1"),
    type = "L",
    qval.thres = 0.01
)
```

### BubblePlot

`bubblePlotPathwaysLR` is a handy way to visualize the strengths of several
L-R interactions in relation to their receptor downstream pathways.
A vector of pathways of interest can be provided to limit the complexity
of the plot.

```{r BubblePlot,eval=TRUE,fig.dim = c(8,4)}
pathways <- LRinter(bsrinf)[1,c("pw.name")]
bubblePlotPathwaysLR(bsrinf,
    pathways = pathways,
    qval.thres = 0.001,
    color = "red",
    pointsize = 8
)
```

### Chord diagram

`chordDiagramLR` is a function that enables users to feature the different
L-R interactions involved in a specific pathway.

L-R correlation strengths are drawn using a yellow color scale.
Ligands are in grey, whereas receptors are in green.
You can also highlight one specific interaction in red by passing the values
of an L-R pair as follows: `ligand="FYN", receptor="SPN"`.

```{r Chordiagram,eval=TRUE,fig.dim = c(6,4.5)}
chordDiagramLR(bsrinf,
    pw.id.filter = "R-HSA-210991",
    limit = 20,
    ligand="FYN",
    receptor="SPN"
)
```

\

# Network Analysis

Since `BulkSignalR` relies on intracellular networks to estimate the
statistical significance of (ligand, receptor, pathway) triples, links from
receptors to target genes are naturally accessible. Different functions
enable users to exploit this graphical data for plotting or further data
analysis.

Furthermore, networks can be exported to text files and graphML objects
to be further explored with Cytoscape (www.cytoscape.org),
yEd (www.yworks.com), or similar software tools.

```{r network1, eval=TRUE}
# Generate a ligand-receptor network and export it in .graphML
# for Cytoscape or similar tools
gLR <- getLRNetwork(bsrinf.redBP, qval.thres = 1e-3)

# save to file
# write.graph(gLR,file="SDC-LR-network.graphml",format="graphml")

# As an alternative to Cytoscape, you can play with igraph package functions.
plot(gLR,
    edge.arrow.size = 0.1,
    vertex.label.color = "black",
    vertex.label.family = "Helvetica",
    vertex.label.cex = 0.1
)
```

```{r network2, eval=TRUE}
# You can apply other functions.

# Community detection
u.gLR <- as_undirected(gLR) # most algorithms work for undirected graphs only
comm <- cluster_edge_betweenness(u.gLR)
# plot(comm,u.gLR,
#     vertex.label.color="black",
#     vertex.label.family="Helvetica",
#     vertex.label.cex=0.1)

# Cohesive blocks
cb <- cohesive_blocks(u.gLR)
plot(cb, u.gLR,
    vertex.label.color = "black",
    vertex.label.family = "Helvetica",
    vertex.label.cex = 0.1,
    edge.color = "black"
)
```

```{r network3, eval=FALSE,warning=FALSE}
# For the next steps, we just share the code below, but graph generation
# functions are commented to lighten the vignette.

# Generate a ligand-receptor network complemented with intra-cellular,
# receptor downstream pathways [computations are a bit longer here]
#
# You can save to a file for Cytoscape or plot with igraph.

gLRintra <- getLRIntracellNetwork(bsrinf.redBP, qval.thres = 1e-3)

lay <- layout_with_kk(gLRintra)
# plot(gLRintra,
#     layout=lay,
#     edge.arrow.size=0.1,
#     vertex.label.color="black",
#     vertex.label.family="Helvetica",
#     vertex.label.cex=0.1)

# Reduce complexity by focusing on strongly targeted pathways
pairs <- LRinter(bsrinf.redBP)
top <- unique(pairs[pairs$pval < 1e-3, c("pw.id", "pw.name")])
top
gLRintra.res <- getLRIntracellNetwork(bsrinf.redBP,
    qval.thres = 0.01,
    restrict.pw = top$pw.id
)
lay <- layout_with_fr(gLRintra.res)

# plot(gLRintra.res,
#     layout=lay,
#     edge.arrow.size=0.1,
#     vertex.label.color="black",
#     vertex.label.family="Helvetica",
#     vertex.label.cex=0.4)
```

\

# Non-human data

In order to process data from non-human organisms,
users only need to specify a few additional parameters; all the
other steps of the analysis remain unchanged.

By default, `BulkSignalR` works with *Homo sapiens*.
We implemented a strategy using ortholog genes (mapped by the
`orthogene` Bioconductor package) in `BulkSignalR` directly.

The function `findOrthoGenes` creates a correspondence table between
human and another organism.
`convertToHuman` then converts an initial expression matrix to
a *Homo sapiens* equivalent.

When calling `BSRDataModel`, the user only needs to pass this transformed
matrix, the actual non-human organism name, and the correspondence table.
Then, L-R interaction inference is performed as for human data.
Finally, users can switch back to gene names relative to the original organism
via `resetToInitialOrganism`.
The rest of the workflow is executed as usual for
computing gene signatures and visualizing.

\

```{r mouse,eval=TRUE,warning=FALSE}
data(bodyMap.mouse)

ortholog.dict <- findOrthoGenes(
    from_organism = "mmusculus",
    from_values = rownames(bodyMap.mouse)
)

matrix.expression.human <- convertToHuman(
    counts = bodyMap.mouse,
    dictionary = ortholog.dict
)

bsrdm <- BSRDataModel(
    counts = matrix.expression.human,
    species = "mmusculus",
    conversion.dict = ortholog.dict
)

bsrdm <- learnParameters(bsrdm,quick=TRUE)

bsrinf <- BSRInference(bsrdm,reference="REACTOME")

bsrinf <- resetToInitialOrganism(bsrinf, conversion.dict = ortholog.dict)

# For example, if you want to explore L-R interactions
# you can proceed as shown above for a human dataset.

bsrinf.redBP <- reduceToBestPathway(bsrinf)
bsrsig.redBP <- BSRSignature(bsrinf.redBP, qval.thres=0.001)
scoresLR <- scoreLRGeneSignatures(bsrdm, bsrsig.redBP,
    name.by.pathway=FALSE)
head(LRinter(bsrinf.redBP))
#simpleHeatmap(scoresLR, column.names=TRUE,
#    pointsize=8)
```

\

# Spatial Transcriptomics

`BulkSignalR` analysis can be applied to spatial transcriptomics (ST) at
medium resolution, such as 10x Genomics
Visium™ or Nanostring GeoMx™. L-R interactions that occur
significantly in a tissue are retrieved. Additional functions
have been introduced to facilitate the visualization and analysis of the
results in a spatial context.

To account for the rather limited dynamic range of ST data and for dropouts,
it is usually recommended to relax specific parameters controlling
the training of the statistical model.
Namely, the minimum correlation can be set to -1 during
training and the minimum number of target genes in a pathway should be
reduced to 2 instead of the default of 4. Also, the thresholds
on L-R interaction Q-values should be raised to 1% or 5% instead of 0.1%.

A key spatial plot function is `spatialPlot`, which enables visualizing
L-R interaction gene signature scores at their spatial coordinates with
a potential reference plot (raw tissue image or user-defined areas)
on the side.

When we published the `BulkSignalR` paper, we provided example scripts to apply
the library to spatial data represented in tabular text files
[BulkSignalR github companion](https://github.com/jcolinge/BulkSignalR_companion).

It is also possible to work with an object of the `SpatialExperiment`
Bioconductor class.

In addition, the `VisiumIO` package allows users to readily
import Visium data from the 10X Space Ranger pipeline and
retrieve a `SpatialExperiment` object.

\

```{r spatial1,eval=TRUE,message=FALSE}
# load data =================================================

# We re-initialize the environment variable of Reactome
# to have all the pathways (we reduced it to a subset above to speed up
# example computations)
reactSubset <- getResource(resourceName = "Reactome",
    cache = TRUE)

resetPathways(dataframe = reactSubset,
    resourceName = "Reactome")

# A few preprocessing steps to subset a SpatialExperiment object
# from the STexampleData package ==================================

spe <- Visium_humanDLPFC()
set.seed(123)

speSubset <- spe[, colData(spe)$ground_truth%in%c("Layer1","Layer2")]

idx <- sample(ncol(speSubset), 10)
speSubset <- speSubset[, idx]

my.image.as.raster <- SpatialExperiment::imgRaster(speSubset,
    sample_id = imgData(spe)$sample_id[1], image_id = "lowres")

colData(speSubset)$idSpatial <- paste(colData(speSubset)[[4]],
    colData(speSubset)[[5]],sep = "x")

annotation <- colData(speSubset)
```

```{r spatial2,eval=TRUE,warning=FALSE}
# prepare data =================================================

bsrdm <- BSRDataModel(speSubset,
    min.count = 1,
    prop = 0.01,
    method = "TC",
    symbol.col = 2,
    x.col = 4,
    y.col = 5,
    barcodeID.col = 1)

bsrdm <- learnParameters(bsrdm,
    quick = TRUE,
    min.positive = 2,
    verbose = TRUE)

bsrinf <- BSRInference(bsrdm, min.cor = -1,reference="REACTOME")

# spatial analysis ============================================

bsrinf.red <- reduceToBestPathway(bsrinf)
pairs.red <- LRinter(bsrinf.red)

thres <- 0.01
min.corr <- 0.01
pairs.red <- pairs.red[pairs.red$qval < thres & pairs.red$LR.corr > min.corr,]

head(pairs.red[
    order(pairs.red$qval),
    c("L", "R", "LR.corr", "pw.name", "qval")])

s.red <- BSRSignature(bsrinf.red, qval.thres=thres)
scores.red <- scoreLRGeneSignatures(bsrdm,s.red)

head(scores.red)
```

From here, one can start exploring the ST data through different
plots.

**Note:** As we work on a much reduced data set to speed up the vignette
generation, only a few points are displayed in each plot. Actual plots
are a lot more informative.

```{r spatialPlot3,eval=TRUE,fig.dim = c(6,4.5)}
# Visualization ============================================

# plot one specific interaction

# we have to follow the syntax with {}
# to be compatible with the reduction functions
inter <- "{SLIT2} / {GPC1}"

# with synthetic image reference (no raster image provided)
spatialPlot(scores.red[inter, ], annotation, inter,
    ref.plot = TRUE, ref.plot.only = FALSE,
    image.raster = NULL, dot.size = 1,
    label.col = "ground_truth"
)

# or with raw tissue reference
spatialPlot(scores.red[inter, ], annotation, inter,
    ref.plot = TRUE, ref.plot.only = FALSE,
    image.raster = my.image.as.raster, dot.size = 1,
    label.col = "ground_truth"
)
```

You can dissect one interaction to separately visualize the expression of the
ligand and of the receptor involved in a specific L-R interaction.
```{r spatialPlot4,eval=TRUE,fig.dim = c(6,4.5)}
separatedLRPlot(scores.red, "SLIT2", "GPC1",
    ncounts(bsrdm),
    annotation,
    label.col = "ground_truth")
```

```{r spatialPlot5,eval=TRUE}
# generate a visual index in a pdf file directly
spatialIndexPlot(scores.red, annotation,
    label.col = "ground_truth",
    out.file="spatialIndexPlot")
```

\

Finally, we provide a function to assess the statistical associations of L-R
interaction signature scores with user-defined areas of the sample.
Based on these associations, a further visualization function
can represent the latter in the form of a heatmap.

\

```{r spatialPlot6,eval=TRUE,fig.dim = c(6,4.5)}
# statistical association with tissue areas based on correlations
# For display purpose, we only use a subset here
assoc.bsr.corr <- spatialAssociation(scores.red[c(1:17), ],
    annotation, label.col = "ground_truth",test = "Spearman")

head(assoc.bsr.corr)

spatialAssociationPlot(assoc.bsr.corr)
```

We also provide 2D-projections (see the `spatialDiversityPlot` function)
to assess the diversity among L-R interaction spatial distributions
over an entire data set. Other functions such as `generateSpatialPlots`
can output multiple individual spatial plots in a graphic file directly.

\

Note that we describe additional use cases in the
[BulkSignalR github companion](https://github.com/jcolinge/BulkSignalR_companion).

\

See the reference manual for all the details.

```{r inferCells,eval=FALSE,include=FALSE}
## Additional functions

# This is not part of the main workflow for analyzing L-R interactions but
# we offer convenience functions for inferring cell types. However
# we recommend using other software packages specifically dedicated to this
# purpose.

# Inferring cell types
data(sdc, package = "BulkSignalR")
bsrdm <- BSRDataModel(counts = sdc)
bsrdm <- learnParameters(bsrdm)
bsrinf <- BSRInference(bsrdm)

# Common TME cell type signatures
data(immune.signatures, package = "BulkSignalR")
unique(immune.signatures$signature)
immune.signatures <- immune.signatures[immune.signatures$signature %in% c(
    "B cells", "Dentritic cells", "Macrophages",
    "NK cells", "T cells", "T regulatory cells"
), ]
data("tme.signatures", package = "BulkSignalR")
signatures <- rbind(immune.signatures,
    tme.signatures[tme.signatures$signature %in%
        c("Endothelial cells", "Fibroblasts"), ])
tme.scores <- scoreSignatures(bsrdm, signatures)

# assign cell types to interactions
lr2ct <- assignCellTypesToInteractions(bsrdm, bsrinf, tme.scores)
head(lr2ct)

# cellular network computation and plot
g.table <- cellularNetworkTable(lr2ct)

gCN <- cellularNetwork(g.table)

plot(gCN, edge.width = 5 * E(gCN)$score)

gSummary <- summarizedCellularNetwork(g.table)
plot(gSummary, edge.width = 1 + 30 * E(gSummary)$score)

# relationship with partial EMT---
# Should be tested on HNSCC data instead of SDC!!

# find the ligands
data(p.EMT, package = "BulkSignalR")
gs <- p.EMT$gene
triggers <- relateToGeneSet(bsrinf, gs)
triggers <- triggers[triggers$n.genes > 1, ] # at least 2 target genes in the gs
ligands.in.gs <- intersect(triggers$L, gs)
triggers <- triggers[!(triggers$L %in% ligands.in.gs), ]
ligands <- unique(triggers$L)

# link to cell types
cf <- cellTypeFrequency(triggers, lr2ct, min.n.genes = 2)
missing <- setdiff(rownames(tme.scores), names(cf$s))
cf$s[missing] <- 0
cf$t[missing] <- 0

op <- par(mar = c(2, 10, 2, 2))
barplot(cf$s, col = "lightgray", horiz = TRUE, las = 2)
par(op)

# random selections based on random gene sets
qval.thres <- 1e-3
inter <- LRinter(bsrinf)
tg <- tgGenes(bsrinf)
tcor <- tgCorr(bsrinf)
good <- inter$qval <= qval.thres
inter <- inter[good, ]
tg <- tg[good]
tcor <- tcor[good]
all.targets <- unique(unlist(tg))
r.cf <- list()
for (k in 1:100) { # should be 1000 or more
    r.gs <- sample(all.targets, length(intersect(gs, all.targets)))
    r.triggers <- relateToGeneSet(bsrinf, r.gs, qval.thres = qval.thres)
    r.triggers <- r.triggers[r.triggers$n.genes > 1, ]
    r.ligands.in.gs <- intersect(r.triggers$L, r.gs)
    r.triggers <- r.triggers[!(r.triggers$L %in% r.ligands.in.gs), ]
    r <- cellTypeFrequency(r.triggers, lr2ct, min.n.genes = 2)
    missing <- setdiff(rownames(tme.scores), names(r$s))
    r$s[missing] <- 0
    r$t[missing] <- 0
    o <- order(names(r$t))
    r$s <- r$s[o]
    r$t <- r$t[o]
    r.cf <- c(r.cf, list(r))
}
r.m.s <- foreach(i = seq_len(length(r.cf)), .combine = rbind) %do% {
    r.cf[[i]]$s
}

# plot results
op <- par(mar = c(2, 10, 2, 2))
boxplot(r.m.s, col = "lightgray", horizontal = TRUE, las = 2)
pts <- data.frame(x = as.numeric(cf$s[colnames(r.m.s)]), cty = colnames(r.m.s))
stripchart(x ~ cty, data = pts, add = TRUE, pch = 19, col = "red")
par(op)
for (cty in rownames(tme.scores)) {
    cat(cty, ": P=", sum(r.m.s[, cty] >= cf$s[cty]) / nrow(r.m.s),
        "\n", sep = "")
}
```

\

# Acknowledgements

We thank Guillaume Tosato for his help with the figures and
Gauthier Gadouas for testing the software on different platforms.

\

Thank you for reading this guide and for using `BulkSignalR`.

\

# Session Information

```{r session-info}
sessionInfo()
```