--- title: "BulkSignalR : Inference of ligand-receptor interactions from bulk data or spatial transcriptomics" author: - name: Jean-Philippe Villemin affiliation: - Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France email: jean-philippe.villemin@inserm.fr - name: Jacques Colinge affiliation: - Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France email: jacques.colinge@inserm.fr date: "`r format(Sys.Date(), '%m/%d/%Y')`" abstract: >
BulkSignalR is used to infer ligand-receptor (L-R) interactions from bulk
expression (transcriptomics/proteomics) data, or spatial
transcriptomics. Potential L-R interactions are taken from the
LR*db* database, which is included in our other package SingleCellSignalR,
available from
Bioconductor. Inferences rely on a statistical model linking potential
L-R interactions with biological pathways from Reactome or biological
processes from GO.
A number of visualization and data summary functions are proposed to
help navigating the predicted interactions.
BulkSignalR package version: `r packageVersion("BulkSignalR")`
output:
rmarkdown::html_vignette:
self_contained: true
toc: true
toc_depth: 4
highlight: pygments
fig_height: 3
fig_width: 3
fig_caption: no
code_folding: show
package: BulkSignalR
vignette: >
%\VignetteIndexEntry{BulkSignalR-Main}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::knit_hooks$set(optipng = knitr::hook_optipng)
```
```{r load-libs, message = FALSE, warning = FALSE, results = FALSE}
library(BulkSignalR)
library(igraph)
library(dplyr)
library(STexampleData)
```
# Introduction
## What is it for?
`BulkSignalR` is a Bioconductor library that enables the inference of L-R
interactions from bulk expression data sets, *i.e.*, from transcriptomics
(RNA-seq or microarrays) or expression proteomics.
`BulkSignalR` also applies to spatial transcriptomics such as
10x Genomics VISIUM:TM:, and a set of
functions dedicated to spatial data has been added to better support
this particular use of the library.
## Starting point
There is a variety of potential data sources that can be used with `BulkSignalR`,
*e.g.*, expression proteomics or RNA sequencing. Such data are typically
represented as a matrix of numbers representing the abundance of molecules
(gene transcripts or proteins) across samples. This matrix may have been
normalized already or not prior the use of `BulkSignalR`.
In both cases, the latter matrix is the input data for `BulkSignalR` analysis.
It is mandatory that genes/proteins are represented as rows of the
expression matrix and the samples as columns. HUGO gene symbols
(even for proteins) must be used to ensure matching LR*db*,
Reactome, and GOBP contents.
You can also work with an object from the `SummarizedExperiment` or
`SpatialExperiment` Bioconductor classes directly.
## How does it work?
As represented in the figure below, only a few steps are required in order to
perform a `BulkSignalR` analysis. Three S4 objects will be sequentially constructed:
* **BSRDataModel**, denoted here as `bsrdm`
* **BSRInference**, denoted here as `bsrinf`
* **BSRSignature**, denoted here as `bsrsig`
\
**BSRDataModel** comprises the expression data matrix and the
parameters of the statistical model learned from this matrix.
**BSRInference** provides various lists that contain the ligands, receptors,
downstream pathways as well as the target genes, their correlations and statistical significance for all the L-R interactions.
**BSRSignature** contains gene signatures associated with the triples
(ligand, receptor, downstream pathway)
stored in a `BSRInference` object. Those signatures are comprised
of the ligand and the receptor obviously, but also the pathway
target genes that were used in the statistical model. Gene signatures
are meant to report the L-R interaction as a global phenomenon integrating
its downstream effect. Indeed, signatures can be subsequently scored with a
dedicated function allowing the user to obtain a numerical value representing
the activity of the L-R interactions (with downstream consequences)
across the samples. Signature scores are returned as a matrix (one row *per*
L-R interaction and one column *per* sample).
Because of the occurrence of certain receptors in multiple pathways,
and also because some ligands may bind several receptors, or *vice versa*,
BSRInference objects may contain redundant data depending on how
the user wants to look at them. We therefore provide a range of
reduction operators meant to obtain reduced BSRInference objects
(see below).
\
Furthermore, we provide several handy functions to explore the data
through different plots (heatmaps, alluvial plots, chord diagrams or networks).
`BulkSignalR` library functions have many parameters that can be
changed by the user to fit specific needs (see Reference Manual for details).
## Parallel mode settings
Users can reduce compute time
by using several processors in parallel.
```{r parallel, message=FALSE, warning=FALSE}
library(doParallel)
n.proc <- 1
cl <- makeCluster(n.proc)
registerDoParallel(cl)
# To add at the end of your script:
# stopCluster(cl)
```
**Notes:**
For operating systems that can fork such as the UNIX-like systems,
it might be preferable to use the library `doMC` that is faster (less
overhead). This is transparent to `BulkSignalR`.
In case you need to reproduce results exactly and since statistical
model parameter learning involves the generation of randomized expression
matrices, you might want to use `set.seed()`. In a parallel mode,
`iseed` that is a parameter of `clusterSetRNGStream` must be used.
\
# First Example
## Loading the data
Here, we load salivary duct carcinoma (SDC) bulk transcriptomes integrated as
`sdc` in `BulkSignalR`.
```{r loading, eval=TRUE}
data(sdc)
head(sdc)
```
## Building a BSRDataModel object
The constructor `BSRDataModel` creates the first object from SDC data above.
```{r BSRDataModel, eval=TRUE,cache=FALSE}
bsrdm <- BSRDataModel(counts = sdc)
bsrdm
```
`learnParameters` updates a BSRDataModel object with the parameters
necessary for `BulkSignalR` statistical modeling.
```{r learnParameters,eval=TRUE,warning=FALSE,fig.dim = c(7,3)}
bsrdm <- learnParameters(bsrdm, quick=TRUE)
bsrdm
```
## Building a BSRInference object
From the previous object `bsrdm`, you can generate inferences by calling its
method
`BSRInference`. The returned BSRInference object,
contains all the inferred L-R interactions
with their associated pathways and corrected p-values.
From there, you can already access L-R interactions using `LRinter(bsrinf)`,
which returns a summary table.
```{r BSRInference,eval=TRUE}
# We use a subset of the reference to speed up
# inference in the context of the vignette.
subset <- c("REACTOME_BASIGIN_INTERACTIONS",
"REACTOME_SYNDECAN_INTERACTIONS",
"REACTOME_ECM_PROTEOGLYCANS",
"REACTOME_CELL_JUNCTION_ORGANIZATION")
reactSubset <- BulkSignalR:::.SignalR$BulkSignalR_Reactome[
BulkSignalR:::.SignalR$BulkSignalR_Reactome$`Reactome name` %in% subset,]
resetPathways(dataframe = reactSubset,
resourceName = "Reactome")
bsrinf <- BSRInference(bsrdm,
min.cor = 0.3,
reference="REACTOME")
LRinter.dataframe <- LRinter(bsrinf)
head(LRinter.dataframe[
order(LRinter.dataframe$qval),
c("L", "R", "LR.corr", "pw.name", "qval")])
```
You can finally filter out non-significant L-R interactions and order them
by Q-values before saving them into a file for instance.
```{r BSRInferenceTfile, eval=FALSE}
write.table(LRinter.dataframe[order(LRinter.dataframe$qval), ],
"./sdc_LR.tsv",
row.names = FALSE,
sep = "\t",
quote = FALSE
)
```
## Reduction strategies
The output of `BSRInference` is exhaustive and can thus contain
redundancy due to the redundancy present in the reference databases (Reactome,
KEGG, GOBP) and multilateral interactions in LR*db*.
To alleviate this issue, we propose several strategies.
### Reducing a BSRInference object to pathways
With `reduceToPathway`, all the L-R interactions with their receptors included
in a certain pathway are aggregated to only report the downstream pathway
once. For a given pathway, the reported P-values and target genes are those
of best (smallest P-value) L-R interaction that was part of the aggregation.
Nothing is recomputed, we simply merge data.
```{r ReduceToPathway, eval=TRUE}
bsrinf.redP <- reduceToPathway(bsrinf)
```
### Reducing a BSRInference object to the best pathways
With ` reduceToBestPathway`, a BSRInference object is reduced to only report
one pathway per L-R interaction. The pathway with the smallest P-value
is selected.
A same pathways might occur multiple times with due different L-R interactions
that all have their receptor in this pathway.
```{r ReduceToBestPathway, eval=TRUE}
bsrinf.redBP <- reduceToBestPathway(bsrinf)
```
### Reducing to ligands or receptors
As already mentioned, several ligands might bind to single receptor (or
several shared receptors) and the converse might happen also. Two
reduction operators enable users to either aggregate all the ligands
of a same receptor or all the receptors bound by a same ligand:
```{r ReduceToLigand,eval=TRUE}
bsrinf.L <- reduceToLigand(bsrinf)
bsrinf.R <- reduceToReceptor(bsrinf)
```
### Combined reductions
Combinations are possible.
For instance, users can apply the `reduceToPathway` and `reduceToBestPathway`
functions sequentially to maximize the reduction effect. In case the exact
same sets of aggregated ligands and receptors obtained with `reduceToPathway`
was associated with several pathways, the pathway with the best P-value
would be kept by `reduceToBestPathway`.
```{r doubleReduction, eval=TRUE}
bsrinf.redP <- reduceToPathway(bsrinf)
bsrinf.redPBP <- reduceToBestPathway(bsrinf.redP)
```
## Building a BSRSignature object
Gene signatures for a given, potentially reduced BSRInference object
are generated by the `BSRSignature` constructor, which returns a BSRSignature
object.
To follow the activity of L-R interactions across the samples of
a data set, `scoreLRGeneSignatures` computes a score for each gene signature
in each sample.
Then, heatmaps can be generated to represent differences, *e.g.*, using
the built-in utility function `simpleHeatmap`.
Hereafter, we show different workflows of reductions combined with gene
signature scoring and display.
### Scoring by ligand-receptor
```{r scoringLR,eval=TRUE,fig.dim = c(5,4)}
bsrsig.redBP <- BSRSignature(bsrinf.redBP, qval.thres=0.001)
scoresLR <- scoreLRGeneSignatures(bsrdm, bsrsig.redBP,
name.by.pathway=FALSE
)
simpleHeatmap(scoresLR[1:20, ],
hcl.palette="Cividis",
pointsize=8)
```
### Scoring by pathway
```{r scoringPathway,eval=TRUE,fig.dim = c(7,3)}
bsrsig.redPBP <- BSRSignature(bsrinf.redPBP, qval.thres=0.01)
scoresPathway <- scoreLRGeneSignatures(bsrdm, bsrsig.redPBP,
name.by.pathway=TRUE
)
simpleHeatmap(scoresPathway,
hcl.palette="Blue-Red 2",
pointsize=8)
```
## Other visualization utilities
### Heatmap of ligand-receptor-target genes expression
After computing gene signatures score, we may wish to look at the
expression of the genes involved in that signature. For instance,
we can display three heatmaps corresponding to the scaled (z-scores)
expression of ligands (pink), receptors (green), and target genes (blue).
On the top of each individual heatmap, the whole signature score from
`scoreLRGeneSignatures` is reported for reference.
```{r heatmapMulti,eval=TRUE,fig.dim = c(7,10)}
pathway1 <- pathways(bsrsig.redPBP)[1]
signatureHeatmaps(
pathway = pathway1,
bsrdm = bsrdm,
heights = c(3,2,15),
bsrsig = bsrsig.redPBP
)
```
### AlluvialPlot
`alluvial.plot` is a function that enable users to represent the different
interactions between ligands, receptors, and pathways stored in a
BSRInference object.
Obviously, it is possible to filter by ligand, receptor, or pathway to limit
output complexity. This
is achieved by specifying a key word in the chosen category. A threshold on
L-R interaction Q-values can be applied in addition.
```{r AlluvialPlot,eval=TRUE,fig.dim = c(8,3)}
alluvialPlot(bsrinf,
keywords = c("LAMC1"),
type = "L",
qval.thres = 0.01
)
```
### BubblePlot
`bubblePlotPathwaysLR` is a handy way to visualize the strengths of several
L-R interactions in relation with their receptor downstream pathways.
A vector of pathways of interest can be provided to limit the complexity
of the plot.
```{r BubblePlot,eval=TRUE,fig.dim = c(8,4)}
pathways <- LRinter(bsrinf)[1,c("pw.name")]
bubblePlotPathwaysLR(bsrinf,
pathways = pathways,
qval.thres = 0.001,
color = "red",
pointsize = 8
)
```
### Chordiagram
`chord.diagram.LR` is a function that enables users to feature the different
L-R interactions involved in a specific pathway.
L-R correlations strengths are drawn using a yellow color-scale.
Ligands are in grey, whereas receptors are in green.
You can also highlight in red one specific interaction by passing values
of a L-R pair as follows `ligand="FYN", receptor="SPN"`.
```{r Chordiagram,eval=TRUE,fig.dim = c(6,4.5)}
chordDiagramLR(bsrinf,
pw.id.filter = "R-HSA-210991",
limit = 20,
ligand="FYN",
receptor="SPN"
)
```
\
# Network Analysis
Since `BulkSignalR` relies on intracellular networks to estimate the statistical
significance of (ligand, receptor, pathway) triples, links from receptors to
target genes are naturally accessible. Different functions enable users to
exploit this graphical data for plotting or further data analysis.
Furthermore, networks can be exported in text files and graphML objects
to be further explored with Cytoscape (www.cytoscape.org),
yEd (www.yworks.com), or similar software tools.
```{r network1, eval=TRUE}
# Generate a ligand-receptor network and export it in .graphML
# for Cytoscape or similar tools
gLR <- getLRNetwork(bsrinf.redBP, qval.thres = 1e-3)
# save to file
# write.graph(gLR,file="SDC-LR-network.graphml",format="graphml")
# As an alternative to Cytoscape, you can play with igraph package functions.
plot(gLR,
edge.arrow.size = 0.1,
vertex.label.color = "black",
vertex.label.family = "Helvetica",
vertex.label.cex = 0.1
)
```
```{r network2, eval=TRUE}
# You can apply other functions.
# Community detection
u.gLR <- as_undirected(gLR) # most algorithms work for undirected graphs only
comm <- cluster_edge_betweenness(u.gLR)
# plot(comm,u.gLR,
# vertex.label.color="black",
# vertex.label.family="Helvetica",
# vertex.label.cex=0.1)
# Cohesive blocks
cb <- cohesive_blocks(u.gLR)
plot(cb, u.gLR,
vertex.label.color = "black",
vertex.label.family = "Helvetica",
vertex.label.cex = 0.1,
edge.color = "black"
)
```
```{r network3, eval=FALSE,warning=FALSE}
# For the next steps, we just share the code below, but graph generation
# functions are commented to lighten the vignette.
# Generate a ligand-receptor network complemented with intra-cellular,
# receptor downstream pathways [computations are a bit longer here]
#
# You can save to a file for cystoscape or plot with igraph.
gLRintra <- getLRIntracellNetwork(bsrinf.redBP, qval.thres = 1e-3)
lay <- layout_with_kk(gLRintra)
# plot(gLRintra,
# layout=lay,
# edge.arrow.size=0.1,
# vertex.label.color="black",
# vertex.label.family="Helvetica",
# vertex.label.cex=0.1)
# Reduce complexity by focusing on strongly targeted pathways
pairs <- LRinter(bsrinf.redBP)
top <- unique(pairs[pairs$pval < 1e-3, c("pw.id", "pw.name")])
top
gLRintra.res <- getLRIntracellNetwork(bsrinf.redBP,
qval.thres = 0.01,
restrict.pw = top$pw.id
)
lay <- layout_with_fr(gLRintra.res)
# plot(gLRintra.res,
# layout=lay,
# edge.arrow.size=0.1,
# vertex.label.color="black",
# vertex.label.family="Helvetica",
# vertex.label.cex=0.4)
```
\
# Non-human data
In order to process data from non-human organisms,
users only need to specify a few additional parameters and all the
other steps of the analysis remain unchanged.
By default, `BulksignalR` works with *Homo sapiens*.
We implemented a strategy using ortholog genes (mapped by the
`orthogene` BioConductor package) in `BulkSignalR` directly.
The function `findOrthoGenes` creates a correspondence table between
human and another organism.
`convertToHuman` then converts an initial expression matrix to
a *Homo sapiens* equivalent.
When calling `BSRDataModel`, the user only needs to pass this transformed
matrix, the actual non-human organism name, and the correspondence table.
Then, L-R interaction inference is performed as for human data.
Finally, users can switch back to gene names relative to the original organism
via `resetToInitialOrganism`.
The rest of the workflow is executed as usual for
computing gene signatures and visualizing.
\
```{r mouse,eval=TRUE,warning=FALSE}
data(bodyMap.mouse)
ortholog.dict <- findOrthoGenes(
from_organism = "mmusculus",
from_values = rownames(bodyMap.mouse)
)
matrix.expression.human <- convertToHuman(
counts = bodyMap.mouse,
dictionary = ortholog.dict
)
bsrdm <- BSRDataModel(
counts = matrix.expression.human,
species = "mmusculus",
conversion.dict = ortholog.dict
)
bsrdm <- learnParameters(bsrdm,quick=TRUE)
bsrinf <- BSRInference(bsrdm,reference="REACTOME")
bsrinf <- resetToInitialOrganism(bsrinf, conversion.dict = ortholog.dict)
# For example, if you want to explore L-R interactions
# you can proceed as shown above for a human dataset.
bsrinf.redBP <- reduceToBestPathway(bsrinf)
bsrsig.redBP <- BSRSignature(bsrinf.redBP, qval.thres=0.001)
scoresLR <- scoreLRGeneSignatures(bsrdm, bsrsig.redBP, name.by.pathway=FALSE)
head(LRinter(bsrinf.redBP))
#simpleHeatmap(scoresLR, column.names=TRUE,
# pointsize=8)
```
\
# Spatial Transcriptomics
`BulkSignalR` analysis can be applied to spatial transcriptomics (ST) at medium
resolution such as with 10x Genomics Visium:TM: or Nanostring GeoMx:TM:.
L-R interactions that display significant occurrence in a tissue are
retrieved. Additional functions
have been introduced to facilitate the visualization and analysis of the
results in a spatial context.
To account for the rather limited dynamics of ST data and dropouts, it is
usually recommended to release specific parameters controlling the
training of the statistical model. Namely, minimum correlation can be set
at -1 during training and the minimum number of target genes in a
pathway should be reduced to 2 instead of the default at 4.
Also, the thresholds on L-R interaction Q-values should
be raised at 1% or 5% instead of 0.1%.
A key spatial plot function is `spatialPlot` that enables visualizing
L-R interaction gene signature scores at their spatial coordinates with a
potential reference plot (raw tissue image or user-defined areas) on the side.
As we published `BulkSignalR` paper, we provided example scripts to apply
the library to spatial data represented in tabular text files
[BulkSignalR github companion](
https://github.com/jcolinge/BulkSignalR_companion).
It is also possible to work with an object of the `SpatialExperiment`
Bioconductor class. In addition, the `VisiumIO` package allows users
to readily import Visium data from the 10X Space Ranger pipeline and retrieve
a `SpatialExperiment object`.
\
```{r spatial1,eval=TRUE,message=FALSE}
# load data =================================================
# We re-initialize the environment variable of Reactome
# to have all the pathways (we reduced it to a subset above to fasten
# example computations)
reactSubset <- getResource(resourceName = "Reactome",
cache = TRUE)
resetPathways(dataframe = reactSubset,
resourceName = "Reactome")
# Few steps of pre-process to subset a spatialExperiment object
# from STexampleData package ==================================
spe <- Visium_humanDLPFC()
set.seed(123)
speSubset <- spe[, colData(spe)$ground_truth%in%c("Layer1","Layer2")]
idx <- sample(ncol(speSubset), 10)
speSubset <- speSubset[, idx]
my.image.as.raster <- SpatialExperiment::imgRaster(speSubset,
sample_id = imgData(spe)$sample_id[1], image_id = "lowres")
colData(speSubset)$idSpatial <- paste(colData(speSubset)[[4]],
colData(speSubset)[[5]],sep = "x")
annotation <- colData(speSubset)
```
```{r spatial2,eval=TRUE,warning=FALSE}
# prepare data =================================================
bsrdm <- BSRDataModel(speSubset,
min.count = 1,
prop = 0.01,
method = "TC",
symbol.col = 2,
x.col = 4,
y.col = 5,
barcodeID.col = 1)
bsrdm <- learnParameters(bsrdm,
quick = TRUE,
min.positive = 2,
verbose = TRUE)
bsrinf <- BSRInference(bsrdm, min.cor = -1,reference="REACTOME")
# spatial analysis ============================================
bsrinf.red <- reduceToBestPathway(bsrinf)
pairs.red <- LRinter(bsrinf.red)
thres <- 0.01
min.corr <- 0.01
pairs.red <- pairs.red[pairs.red$qval < thres & pairs.red$LR.corr > min.corr,]
head(pairs.red[
order(pairs.red$qval),
c("L", "R", "LR.corr", "pw.name", "qval")])
s.red <- BSRSignature(bsrinf.red, qval.thres=thres)
scores.red <- scoreLRGeneSignatures(bsrdm,s.red)
head(scores.red)
```
From here, one can start exploring the ST data
through different plots.
**Note:** As we work on a much reduced data set to fasten
the vignette generation, only a few point are
displayed in each plot. Actual plots are a lot more informative.
```{r spatialPlot3,eval=TRUE,fig.dim = c(6,4.5)}
# Visualization ============================================
# plot one specific interaction
# we have to follow the syntax with {}
# to be compatible with the reduction functions
inter <- "{SLIT2} / {GPC1}"
# with raw tissue reference
spatialPlot(scores.red[inter, ], annotation, inter,
ref.plot = TRUE, ref.plot.only = FALSE,
image.raster = NULL, dot.size = 1,
label.col = "ground_truth"
)
# or with synthetic image reference
spatialPlot(scores.red[inter, ], annotation, inter,
ref.plot = TRUE, ref.plot.only = FALSE,
image.raster = my.image.as.raster, dot.size = 1,
label.col = "ground_truth"
)
```
You can dissect one interaction to separately visualize the expression of the
ligand and the receptor involved in a specific L-R interaction.
```{r spatialPlot4,eval=TRUE,fig.dim = c(6,4.5)}
separatedLRPlot(scores.red, "SLIT2", "GPC1",
ncounts(bsrdm),
annotation,
label.col = "ground_truth")
```
```{r spatialPlot5,eval=TRUE}
# generate a visual index in a pdf file directly
spatialIndexPlot(scores.red, annotation,
label.col = "ground_truth",
out.file="spatialIndexPlot")
```
\
Finally, we provide a function to assess the statistical associations of
L-R interaction signature scores with user-defined areas of the sample.
Based on these associations, a further visualization function can represent the
latter in the form of a heatmap.
\
```{r spatialPlot6,eval=TRUE,fig.dim = c(6,4.5)}
# statistical association with tissue areas based on correlations
# For display purpose, we only use a subset here
assoc.bsr.corr <- spatialAssociation(scores.red[c(1:17), ],
annotation, label.col = "ground_truth",test = "Spearman")
head(assoc.bsr.corr)
spatialAssociationPlot(assoc.bsr.corr)
```
We also provide 2D-projections (see the `spatialDiversityPlot` function) to
assess the diversity among L-R interaction spatial distributions over
an entire data set. Other functions such as `generateSpatialPlots` can
output multiple individual spatial plots in a graphic file directly.
\
Note that we describe additional use cases in the
[BulkSignalR github companion](
https://github.com/jcolinge/BulkSignalR_companion).
\
See the reference manual for all the details.
```{r inferCells,eval=FALSE,include=FALSE}
## Additional functions
# This is not part of the main workflow for analyzing L-R interactions but
# we offer convenience functions for inferring cell types. However
# we recommend using other software packages specifically dedicated to this
# purpose.
# Inferring cell types
data(sdc, package = "BulkSignalR")
bsrdm <- BSRDataModel(counts = sdc)
bsrdm <- learnParameters(bsrdm)
bsrinf <- BSRInference(bsrdm)
# Common TME cell type signatures
data(immune.signatures, package = "BulkSignalR")
unique(immune.signatures$signature)
immune.signatures <- immune.signatures[immune.signatures$signature %in% c(
"B cells", "Dentritic cells", "Macrophages",
"NK cells", "T cells", "T regulatory cells"
), ]
data("tme.signatures", package = "BulkSignalR")
signatures <- rbind(immune.signatures,
tme.signatures[tme.signatures$signature %in%
c("Endothelial cells", "Fibroblasts"), ])
tme.scores <- scoreSignatures(bsrdm, signatures)
# assign cell types to interactions
lr2ct <- assignCellTypesToInteractions(bsrdm, bsrinf, tme.scores)
head(lr2ct)
# cellular network computation and plot
g.table <- cellularNetworkTable(lr2ct)
gCN <- cellularNetwork(g.table)
plot(gCN, edge.width = 5 * E(gCN)$score)
gSummary <- summarizedCellularNetwork(g.table)
plot(gSummary, edge.width = 1 + 30 * E(gSummary)$score)
# relationship with partial EMT---
# Should be tested HNSCC data instead of SDC!!
# find the ligands
data(p.EMT, package = "BulkSignalR")
gs <- p.EMT$gene
triggers <- relateToGeneSet(bsrinf, gs)
triggers <- triggers[triggers$n.genes > 1, ] # at least 2 target genes in the gs
ligands.in.gs <- intersect(triggers$L, gs)
triggers <- triggers[!(triggers$L %in% ligands.in.gs), ]
ligands <- unique(triggers$L)
# link to cell types
cf <- cellTypeFrequency(triggers, lr2ct, min.n.genes = 2)
missing <- setdiff(rownames(tme.scores), names(cf$s))
cf$s[missing] <- 0
cf$t[missing] <- 0
op <- par(mar = c(2, 10, 2, 2))
barplot(cf$s, col = "lightgray", horiz = T, las = 2)
par(op)
# random selections based on random gene sets
qval.thres <- 1e-3
inter <- LRinter(bsrinf)
tg <- tgGenes(bsrinf)
tcor <- tgCorr(bsrinf)
good <- inter$qval <= qval.thres
inter <- inter[good, ]
tg <- tg[good]
tcor <- tcor[good]
all.targets <- unique(unlist(tg))
r.cf <- list()
for (k in 1:100) { # should 1000 or more
r.gs <- sample(all.targets, length(intersect(gs, all.targets)))
r.triggers <- relateToGeneSet(bsrinf, r.gs, qval.thres = qval.thres)
r.triggers <- r.triggers[r.triggers$n.genes > 1, ]
r.ligands.in.gs <- intersect(r.triggers$L, r.gs)
r.triggers <- r.triggers[!(r.triggers$L %in% r.ligands.in.gs), ]
r <- cellTypeFrequency(r.triggers, lr2ct, min.n.genes = 2)
missing <- setdiff(rownames(tme.scores), names(r$s))
r$s[missing] <- 0
r$t[missing] <- 0
o <- order(names(r$t))
r$s <- r$s[o]
r$t <- r$t[o]
r.cf <- c(r.cf, list(r))
}
r.m.s <- foreach(i = seq_len(length(r.cf)), .combine = rbind) %do% {
r.cf[[i]]$s
}
# plot results
op <- par(mar = c(2, 10, 2, 2))
boxplot(r.m.s, col = "lightgray", horizontal = T, las = 2)
pts <- data.frame(x = as.numeric(cf$s[colnames(r.m.s)]), cty = colnames(r.m.s))
stripchart(x ~ cty, data = pts, add = TRUE, pch = 19, col = "red")
par(op)
for (cty in rownames(tme.scores)) {
cat(cty, ": P=", sum(r.m.s[, cty] >= cf$s[cty]) / nrow(r.m.s),
"\n", sep = "")
}
```
\
# Acknowledgements
We thank Guillaume Tosato for his help with the figures and
Gauthier Gadouas for testing the software on different platforms.
\
Thank you for reading this guide and for using `BulkSignalR`.
\
# Session Information
```{r session-info}
sessionInfo()
```