iSEE
DuoClustering2018 1.2.0
In this vignette we describe how to generate a SingleCellExperiment object combining observed values and clustering results for a data set from the DuoClustering2018 package, and how the resulting object can be explored and visualized with the iSEE package (Rue-Albrecht et al. 2018).
suppressPackageStartupMessages({
  library(SingleCellExperiment)
  library(DuoClustering2018)
  library(dplyr)
  library(tidyr)
})
## snapshotDate(): 2019-04-29
The different ways of retrieving a data set from the package are described in the plot_performance vignette. Here, we will load a data set using the shortcut function provided in the package.
dat <- sce_filteredExpr10_Koh()
## snapshotDate(): 2019-04-29
## see ?DuoClustering2018 and browseVignettes('DuoClustering2018') for documentation
## downloading 0 resources
## loading from cache
## 'EH1501 : 1501'
For this data set, we also load a set of clustering results obtained using different clustering methods.
res <- clustering_summary_filteredExpr10_Koh_v2()
## snapshotDate(): 2019-04-29
## see ?DuoClustering2018 and browseVignettes('DuoClustering2018') for documentation
## downloading 0 resources
## loading from cache
## 'EH1620 : 1620'
We add the cluster labels from one run, for several imposed numbers of clusters (k = 3, 5 and 9), to the column annotation of the data set. Each combination of method and k becomes one column.
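# Keep run 1 and three values of k; for methods with a resolution
# parameter, keep only the first resolution within each (method, k) group
res <- res %>%
  dplyr::filter(run == 1 & k %in% c(3, 5, 9)) %>%
  dplyr::group_by(method, k) %>%
  dplyr::filter(is.na(resolution) | resolution == resolution[1]) %>%
  dplyr::ungroup() %>%
  # Reshape to one row per cell and one column per method_k combination
  tidyr::unite(col = method_k, method, k, sep = "_", remove = TRUE) %>%
  dplyr::select(cell, method_k, cluster) %>%
  tidyr::spread(key = method_k, value = cluster)

# Join the cluster labels to the existing cell annotation by cell ID
colData(dat) <- DataFrame(
  as.data.frame(colData(dat)) %>%
    dplyr::left_join(res, by = c("Run" = "cell"))
)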
head(colData(dat))
## DataFrame with 6 rows and 54 columns
## Run LibraryName phenoid libsize.drop feature.drop
## <character> <character> <character> <logical> <logical>
## 1 SRR3952323 H7hESC H7hESC FALSE FALSE
## 2 SRR3952325 H7hESC H7hESC FALSE FALSE
## 3 SRR3952326 H7hESC H7hESC FALSE FALSE
## 4 SRR3952327 H7hESC H7hESC FALSE FALSE
## 5 SRR3952328 H7hESC H7hESC FALSE FALSE
## 6 SRR3952329 H7hESC H7hESC FALSE FALSE
## total_features log10_total_features total_counts log10_total_counts
## <integer> <numeric> <numeric> <numeric>
## 1 4895 3.6898414091375 2248411.34571372 6.35187596127693
## 2 4887 3.6891311972345 2271617.36890415 6.35633537184513
## 3 4888 3.68922003726384 584682.409664363 5.76692077094787
## 4 4879 3.68841982200271 3191809.60023222 6.50403711271802
## 5 4873 3.68788552484871 2190384.61049403 6.34052057774775
## 6 4893 3.68966396501577 2187288.93804626 6.33990635512159
## pct_counts_top_50_features pct_counts_top_100_features
## <numeric> <numeric>
## 1 18.2789645082187 25.9753898639458
## 2 24.6725290693842 32.222803367377
## 3 22.7328390182813 30.2059881954046
## 4 20.8673775614106 29.003904032128
## 5 21.2879231261916 29.4236885040328
## 6 20.593115356144 27.7401057678724
## pct_counts_top_200_features pct_counts_top_500_features is_cell_control
## <numeric> <numeric> <logical>
## 1 35.5376157218203 52.410940848381 FALSE
## 2 41.5473580607458 57.9692329081111 FALSE
## 3 39.4313075652416 55.2858170919008 FALSE
## 4 38.7855579579296 56.0208594644678 FALSE
## 5 39.3076832898896 56.6409750805386 FALSE
## 6 36.7818664500694 52.7546829572243 FALSE
## CIDR_3 CIDR_5 CIDR_9 FlowSOM_3 FlowSOM_5 FlowSOM_9
## <character> <character> <character> <character> <character> <character>
## 1 1 1 1 2 2 4
## 2 1 1 1 2 2 4
## 3 1 1 1 2 2 4
## 4 1 1 1 2 2 4
## 5 1 1 1 2 2 4
## 6 1 1 1 2 2 4
## PCAHC_3 PCAHC_5 PCAHC_9 PCAKmeans_3 PCAKmeans_5 PCAKmeans_9
## <character> <character> <character> <character> <character> <character>
## 1 1 1 1 3 1 4
## 2 1 1 1 3 1 4
## 3 1 1 1 3 1 4
## 4 1 1 1 3 1 4
## 5 1 1 1 3 1 4
## 6 1 1 1 3 1 4
## RaceID2_3 RaceID2_5 RaceID2_9 RtsneKmeans_3 RtsneKmeans_5
## <character> <character> <character> <character> <character>
## 1 1 1 1 1 1
## 2 2 2 2 1 1
## 3 2 2 2 1 1
## 4 1 1 1 1 1
## 5 1 1 1 1 1
## 6 1 2 2 1 1
## RtsneKmeans_9 SAFE_3 SAFE_5 SAFE_9 SC3_3 SC3_5
## <character> <character> <character> <character> <character> <character>
## 1 9 2 1 3 1 3
## 2 9 2 1 5 1 3
## 3 9 2 1 3 1 3
## 4 9 2 1 5 1 3
## 5 9 2 1 5 1 3
## 6 9 2 1 5 1 3
## SC3_9 SC3svm_3 SC3svm_5 SC3svm_9 Seurat_9 TSCAN_3
## <character> <character> <character> <character> <character> <character>
## 1 4 3 3 3 5 1
## 2 4 3 3 3 5 1
## 3 4 3 3 3 5 3
## 4 4 3 3 3 5 1
## 5 4 3 3 3 5 2
## 6 4 3 3 3 5 1
## TSCAN_5 TSCAN_9 ascend_3 ascend_5 ascend_9 monocle_3
## <character> <character> <character> <character> <character> <character>
## 1 1 1 1 NA NA 3
## 2 1 2 1 NA NA 3
## 3 3 2 1 NA NA 3
## 4 1 1 1 NA NA 3
## 5 2 2 1 NA NA 3
## 6 1 1 1 NA NA 3
## monocle_5 monocle_9 pcaReduce_3 pcaReduce_5 pcaReduce_9
## <character> <character> <character> <character> <character>
## 1 3 3 1 5 5
## 2 3 3 1 5 5
## 3 3 3 1 5 5
## 4 3 3 1 5 5
## 5 3 3 1 5 5
## 6 3 3 1 5 5
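The reshaping step above can be hard to follow on the full data set, so here is a minimal sketch of the same unite/spread/left_join pattern on a small made-up table. The objects toy, toy_wide, toy_cells and toy_joined, the method names A and B, and the cell IDs are all hypothetical and exist only for illustration. The column names cell, method, k, cluster and Run mirror those used above.

```r
suppressPackageStartupMessages({
  library(dplyr)
  library(tidyr)
})

# Hypothetical long-format results: one row per (cell, method, k)
toy <- data.frame(
  cell    = rep(c("c1", "c2", "c3"), times = 4),
  method  = rep(c("A", "B"), each = 6),
  k       = rep(rep(c(3, 5), each = 3), times = 2),
  cluster = as.character(c(1, 1, 2,  1, 2, 3,  2, 2, 1,  1, 1, 1)),
  stringsAsFactors = FALSE
)

# Combine method and k into one key, then spread to wide format:
# one row per cell, one column per method_k combination
toy_wide <- toy %>%
  tidyr::unite(col = method_k, method, k, sep = "_", remove = TRUE) %>%
  tidyr::spread(key = method_k, value = cluster)

# Hypothetical cell annotation with one cell (c4) that was never clustered
toy_cells <- data.frame(Run = c("c1", "c2", "c3", "c4"),
                        stringsAsFactors = FALSE)

# left_join keeps every annotated cell; cells without results get NA
toy_joined <- dplyr::left_join(toy_cells, toy_wide, by = c("Run" = "cell"))
toy_joined
```

A left join is used so that every cell in the annotation is retained even when a method produced no result for it. This would be consistent with the NA entries seen for ascend_5 and ascend_9 in the output above, although the cause of those particular missing values is not stated here.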
iSEE
The resulting SingleCellExperiment can be interactively explored using, e.g., the iSEE package. This can be useful to gain additional understanding of the partitions inferred by the different clustering methods, to visualize these in low-dimensional representations (PCA or t-SNE), and to investigate how well they agree with known or inferred groupings of the cells.
if (require(iSEE)) {
  iSEE(dat)
}
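Agreement between a clustering and a known grouping can also be checked non-interactively. The following is a minimal base R sketch on made-up labels, not part of the package: the data frame toy, the group labels g1 to g3, the columns method_A and method_B, and the helper purity() are all hypothetical. The phenoid column name follows the annotation shown earlier.

```r
# Hypothetical annotation: a known grouping and two cluster assignments
toy <- data.frame(
  phenoid  = c("g1", "g1", "g2", "g2", "g3", "g3"),
  method_A = c("1", "1", "2", "2", "3", "3"),
  method_B = c("1", "2", "2", "2", "1", "1"),
  stringsAsFactors = FALSE
)

# Cross-tabulate known groups (rows) against inferred clusters (columns)
tab_A <- table(truth = toy$phenoid, cluster = toy$method_A)
tab_B <- table(truth = toy$phenoid, cluster = toy$method_B)

# Crude summary: fraction of cells that belong to the most frequent
# known group within their cluster (often called cluster purity)
purity <- function(tab) sum(apply(tab, 2, max)) / sum(tab)

purity(tab_A)
purity(tab_B)
```

With the object built in this vignette, a comparable table could presumably be obtained with something like table(dat$phenoid, dat$SC3_5), assuming those columns are present in colData(dat).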
sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.9-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.9-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] tidyr_0.8.3 dplyr_0.8.0.1
## [3] DuoClustering2018_1.2.0 SingleCellExperiment_1.6.0
## [5] SummarizedExperiment_1.14.0 DelayedArray_0.10.0
## [7] BiocParallel_1.18.0 matrixStats_0.54.0
## [9] Biobase_2.44.0 GenomicRanges_1.36.0
## [11] GenomeInfoDb_1.20.0 IRanges_2.18.0
## [13] S4Vectors_0.22.0 BiocGenerics_0.30.0
## [15] BiocStyle_2.12.0
##
## loaded via a namespace (and not attached):
## [1] viridis_0.5.1 httr_1.4.0
## [3] bit64_0.9-7 viridisLite_0.3.0
## [5] AnnotationHub_2.16.0 shiny_1.3.2
## [7] assertthat_0.2.1 interactiveDisplayBase_1.22.0
## [9] BiocManager_1.30.4 BiocFileCache_1.8.0
## [11] blob_1.1.1 GenomeInfoDbData_1.2.1
## [13] yaml_2.2.0 pillar_1.3.1
## [15] RSQLite_2.1.1 lattice_0.20-38
## [17] glue_1.3.1 digest_0.6.18
## [19] promises_1.0.1 XVector_0.24.0
## [21] colorspace_1.4-1 htmltools_0.3.6
## [23] httpuv_1.5.1 Matrix_1.2-17
## [25] plyr_1.8.4 pkgconfig_2.0.2
## [27] bookdown_0.9 zlibbioc_1.30.0
## [29] xtable_1.8-4 purrr_0.3.2
## [31] scales_1.0.0 later_0.8.0
## [33] tibble_2.1.1 ggplot2_3.1.1
## [35] lazyeval_0.2.2 mime_0.6
## [37] magrittr_1.5 crayon_1.3.4
## [39] mclust_5.4.3 memoise_1.1.0
## [41] evaluate_0.13 ggthemes_4.1.1
## [43] tools_3.6.0 stringr_1.4.0
## [45] munsell_0.5.0 AnnotationDbi_1.46.0
## [47] compiler_3.6.0 rlang_0.3.4
## [49] grid_3.6.0 RCurl_1.95-4.12
## [51] rappdirs_0.3.1 bitops_1.0-6
## [53] rmarkdown_1.12 ExperimentHub_1.10.0
## [55] gtable_0.3.0 DBI_1.0.0
## [57] curl_3.3 reshape2_1.4.3
## [59] R6_2.4.0 gridExtra_2.3
## [61] knitr_1.22 bit_1.1-14
## [63] stringi_1.4.3 Rcpp_1.0.1
## [65] dbplyr_1.4.0 tidyselect_0.2.5
## [67] xfun_0.6
Rue-Albrecht, K, F Marini, C Soneson, and ATL Lun. 2018. “iSEE: Interactive SummarizedExperiment Explorer.” F1000Research 7:741.