scTreeViz
is a package for interactive visualization and exploration of
single-cell RNA sequencing data. scTreeViz provides methods for
exploring hierarchical features (e.g. clusters in single-cell data at
different resolutions, or a taxonomic hierarchy in single-cell datasets),
while supporting other useful data visualization charts like heatmaps
for expression and scatter plots for dimensionality reductions like UMAP
or TSNE.
The first step in using the scTreeViz package is to wrap
datasets into TreeViz objects. The TreeViz
class extends SummarizedExperiment and provides
methods to interactively operate on the underlying
hierarchy and on the count or expression matrices. In this section, we show
several ways to generate a TreeViz object: from
existing single-cell packages (SingleCellExperiment or Seurat), or from a
raw count matrix and cluster hierarchy.
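To make "cluster hierarchy" concrete, here is a minimal base R sketch. It assumes the hierarchy is supplied as a data frame with one row per cell and one column per resolution, as in the raw count matrix example later in this document. The helper `is_nested` and the toy column values are hypothetical and not part of scTreeViz. Whether the package requires or repairs strict nesting is not covered here.

```r
# Toy hierarchy: one row per cell, one column per clustering resolution.
# Sizes and values are illustrative only.
n <- 8
hierarchy <- data.frame(
  cluster0 = rep(1, n),           # root: all cells together
  cluster1 = rep(1:2, each = 4),  # two clusters
  cluster2 = rep(1:4, each = 2)   # four clusters
)

# Hypothetical helper: a child level is nested in its parent level when
# every child cluster maps to exactly one parent cluster.
is_nested <- function(parent, child) {
  all(tapply(parent, child, function(p) length(unique(p))) == 1)
}

nested <- c(
  is_nested(hierarchy$cluster0, hierarchy$cluster1),
  is_nested(hierarchy$cluster1, hierarchy$cluster2)
)
print(nested)
```

A check like this can be useful before building the object, because a level whose clusters straddle two parents does not describe a tree.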
SingleCellExperiment
A number of single-cell datasets are available as
SingleCellExperiment objects through the
scRNAseq package. For this use case, we use the
ZeiselBrainData dataset. In addition, we calculate the
dimensionality reductions UMAP, TSNE and PCA with the functions
provided in the scater package.
# load the packages used in this section
library(scTreeViz)
library(scRNAseq)
library(scater)
# load dataset
sce <- ZeiselBrainData()
# Normalization
sce <- logNormCounts(sce)
# calculate UMAP, TSNE and PCA
sce <- runUMAP(sce)
sce <- runTSNE(sce)
sce <- runPCA(sce)
We provide the createFromSCE function to create a
TreeViz object from a SingleCellExperiment
object. The workflow works in one of two ways:
If no cluster information is available in the colData
of the SingleCellExperiment object, we create clusters at
different resolutions using the WalkTrap algorithm by
calling an internal function, generate_walktrap_hierarchy,
and use this cluster information for visualization.
treeViz <- createFromSCE(sce, reduced_dim = c("UMAP","PCA","TSNE"))
#> [1] "1.cluster1" "2.cluster2" "3.cluster3" "4.cluster4" "5.cluster5"
#> [6] "6.cluster6" "7.cluster7" "8.cluster8" "9.cluster9" "10.cluster10"
#> [11] "11.cluster11" "12.cluster12" "13.cluster13" "14.cluster14" "samples"
plot(treeViz)
If cluster information is already available in the colData of
the object, the user should set the flag parameter
check_coldata to TRUE and provide the prefix of
the columns where cluster information is stored.
# Forming clusters
set.seed(1000)
for (i in seq(10)) {
  clust.kmeans <- kmeans(reducedDim(sce, "TSNE"), centers = i)
  sce[[paste0("clust", i)]] <- factor(clust.kmeans$cluster)
}
treeViz <- createFromSCE(sce, check_coldata = TRUE, col_regex = "clust")
plot(treeViz)
Note: In both cases, the user needs to provide the names of the dimensionality reductions present in the object as a parameter.
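The prefix-based column selection described above can be pictured with plain data frames. The following is a minimal base R sketch, not the package's implementation: `cell_meta` stands in for the colData of the object, all column names are made up, and anchoring the pattern at the start of the name is an assumption about how a prefix would be matched.

```r
# Stand-in for colData(sce): per-cell metadata with some cluster columns.
# All names and values here are made up for illustration.
cell_meta <- data.frame(
  tissue = c("cortex", "cortex", "hippocampus", "hippocampus"),
  clust3 = factor(c(1, 2, 3, 3)),
  clust1 = factor(c(1, 1, 1, 1)),
  clust2 = factor(c(1, 1, 2, 2))
)

# Keep only the columns whose names start with the prefix
# (assumption: the prefix is anchored at the start of the name).
prefix <- "clust"
cluster_cols <- grep(paste0("^", prefix), colnames(cell_meta), value = TRUE)

# Order the columns by numeric suffix so resolutions run coarse to fine.
suffix <- as.integer(sub(prefix, "", cluster_cols, fixed = TRUE))
cluster_cols <- cluster_cols[order(suffix)]

hierarchy <- cell_meta[, cluster_cols, drop = FALSE]
print(colnames(hierarchy))
```

Choosing a prefix that matches only the cluster columns keeps unrelated metadata, such as the `tissue` column here, out of the hierarchy.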
Seurat
We use the dataset pbmc_small, available through Seurat,
to create a TreeViz object.
We then preprocess the data and find clusters at different resolutions.
library(Seurat)
# pbmc_small is the small example object that ships with Seurat
pbmc <- pbmc_small
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
pbmc <- NormalizeData(pbmc)
all.genes <- rownames(pbmc)
pbmc <- ScaleData(pbmc, vars.to.regress = "percent.mt")
pbmc <- FindVariableFeatures(object = pbmc)
pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))
pbmc <- FindNeighbors(pbmc, dims = 1:10)
pbmc <- FindClusters(pbmc, resolution = c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0), print.output = 0, save.SNN = TRUE)
pbmc
The measurements for the dimensionality reduction methods we want to
visualize are also added to the object via native functions in
Seurat. Since PCA is already added, we
calculate TSNE and UMAP.
We use the createFromSeurat function to create a
TreeViz object from a Seurat object. In addition to
the object, we pass the names of the dimensionality reductions present in the
object as a parameter in vector format to indicate that these measurements
should be added to the TreeViz object for visualization. If a
requested reduced dimension is not present, it is simply
ignored.
treeViz<- createFromSeurat(pbmc, check_metadata = TRUE, reduced_dim = c("umap","pca","tsne"))
#> [1] "6.cluster6" "10.cluster10" "11.cluster11" "samples"
#> [1] "umap" "pca" "tsne"
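plot(treeViz)
A TreeViz object can also be created directly from a raw count matrix and a cluster hierarchy.
n <- 64
# create a hierarchy: level i splits the n cells into 2^i equally sized clusters
df <- data.frame(cluster0 = rep(1, n))
for (i in seq_len(5)) {
  df[[paste0("cluster", i)]] <- rep(seq_len(2^i), each = ceiling(n / 2^i), length.out = n)
}
# generate a count matrix with 100 genes (rows) and n cells (columns)
counts <- matrix(rpois(100 * n, lambda = 10), ncol = n, nrow = 100)
colnames(counts) <- seq_len(n)
# create a `TreeViz` object
treeViz <- createTreeViz(df, counts)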
plot(treeViz)
Start the App from the treeViz object we created. This
adds a facetZoom to navigate the cluster hierarchy, a
heatmap of the top n most variable genes from the dataset,
where n is selected by the user, and one scatter plot for each of the
reduced dimensions.
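As a rough illustration of what "top n most variable genes" means, the following base R sketch ranks genes by their variance across cells and keeps the first n rows as heatmap input. The helper `top_variable_genes` and the toy matrix are hypothetical. The variability measure scTreeViz actually uses may differ, so treat this as a sketch of the idea and not the package's method.

```r
set.seed(1)
# Toy expression matrix: 100 genes (rows) by 64 cells (columns).
expr <- matrix(rpois(100 * 64, lambda = 10), nrow = 100, ncol = 64,
               dimnames = list(paste0("gene", 1:100), paste0("cell", 1:64)))

# Hypothetical helper: names of the top_n genes with the largest variance.
# If top_n exceeds the number of genes, all genes are returned.
top_variable_genes <- function(mat, top_n) {
  gene_var <- apply(mat, 1, var)
  ranked <- names(sort(gene_var, decreasing = TRUE))
  ranked[seq_len(min(top_n, nrow(mat)))]
}

top_genes <- top_variable_genes(expr, top_n = 10)

# Rows of the matrix that a heatmap of the top 10 genes would display.
heatmap_input <- expr[top_genes, , drop = FALSE]
print(dim(heatmap_input))
```

A smaller n keeps the heatmap readable and limits the amount of data sent to the browser.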
Users can also use the interface to explore the same dataset using different visualizations available through Epiviz.
Users can also add gene box plots using either the frontend
application or the R session. In the following example, we visualize
the 5th, 50th and 500th most variable genes as box plots.
To add one from the frontend, users need to select the
Add Visualization -> Gene Box Plot option from the menu and
then select the desired gene using the search pane in the
dialogue box that appears.
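To show the kind of data behind a gene box plot, here is a minimal base R sketch that picks the genes at variance ranks 5, 50 and 500 and summarizes each one per cluster. The toy matrix, the four-cluster assignment and the use of variance as the ranking are assumptions for illustration. This does not use the scTreeViz or Epiviz plotting functions.

```r
set.seed(2)
# Toy data: 600 genes by 64 cells, with a made-up 4-cluster assignment.
expr <- matrix(rpois(600 * 64, lambda = 10), nrow = 600, ncol = 64,
               dimnames = list(paste0("gene", 1:600), NULL))
cluster <- factor(rep(paste0("cluster", 1:4), each = 16))

# Rank genes from most to least variable and pick ranks 5, 50 and 500.
ranked <- names(sort(apply(expr, 1, var), decreasing = TRUE))
picked <- ranked[c(5, 50, 500)]

# Five-number summary (min, lower hinge, median, upper hinge, max) of each
# picked gene within each cluster; a box plot is built from these statistics.
box_stats <- lapply(picked, function(g) {
  sapply(split(expr[g, ], cluster), fivenum)
})
names(box_stats) <- picked
print(box_stats[[1]])
```

Each element of `box_stats` is a 5 by 4 matrix with one column per cluster, which makes it easy to compare how a gene's expression is distributed across clusters.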
sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] future_1.67.0 scRNAseq_2.23.1
#> [3] igraph_2.2.1 clustree_0.5.1
#> [5] ggraph_2.2.2 scater_1.39.0
#> [7] ggplot2_4.0.0 scran_1.39.0
#> [9] scuttle_1.21.0 SingleCellExperiment_1.33.0
#> [11] SC3_1.39.0 Seurat_5.3.1
#> [13] SeuratObject_5.2.0 sp_2.2-0
#> [15] scTreeViz_1.17.0 SummarizedExperiment_1.41.0
#> [17] Biobase_2.71.0 GenomicRanges_1.63.0
#> [19] Seqinfo_1.1.0 IRanges_2.45.0
#> [21] S4Vectors_0.49.0 BiocGenerics_0.57.0
#> [23] generics_0.1.4 MatrixGenerics_1.23.0
#> [25] matrixStats_1.5.0 epivizr_2.41.0
#> [27] BiocStyle_2.39.0
#>
#> loaded via a namespace (and not attached):
#> [1] epivizrData_1.39.0 goftest_1.2-3 Biostrings_2.79.1
#> [4] HDF5Array_1.39.0 vctrs_0.6.5 spatstat.random_3.4-2
#> [7] digest_0.6.37 png_0.1-8 proxy_0.4-27
#> [10] pcaPP_2.0-5 gypsum_1.7.0 ggrepel_0.9.6
#> [13] deldir_2.0-4 parallelly_1.45.1 alabaster.sce_1.9.0
#> [16] MASS_7.3-65 reshape2_1.4.4 httpuv_1.6.16
#> [19] foreach_1.5.2 withr_3.0.2 xfun_0.54
#> [22] survival_3.8-3 doRNG_1.8.6.2 memoise_2.0.1
#> [25] ggbeeswarm_0.7.2 zoo_1.8-14 pbapply_1.7-4
#> [28] DEoptimR_1.1-4 sys_3.4.3 KEGGREST_1.51.0
#> [31] promises_1.5.0 otel_0.2.0 httr_1.4.7
#> [34] restfulr_0.0.16 globals_0.18.0 fitdistrplus_1.2-4
#> [37] rhdf5filters_1.23.0 rhdf5_2.55.4 UCSC.utils_1.7.0
#> [40] miniUI_0.1.2 curl_7.0.0 ScaledMatrix_1.19.0
#> [43] h5mread_1.3.0 polyclip_1.10-7 ExperimentHub_3.1.0
#> [46] SparseArray_1.11.1 RBGL_1.87.0 xtable_1.8-4
#> [49] stringr_1.5.2 doParallel_1.0.17 evaluate_1.0.5
#> [52] S4Arrays_1.11.0 BiocFileCache_3.1.0 irlba_2.3.5.1
#> [55] filelock_1.0.3 ROCR_1.0-11 reticulate_1.44.0
#> [58] spatstat.data_3.1-9 magrittr_2.0.4 lmtest_0.9-40
#> [61] later_1.4.4 buildtools_1.0.0 viridis_0.6.5
#> [64] lattice_0.22-7 spatstat.geom_3.6-0 future.apply_1.20.0
#> [67] robustbase_0.99-6 scattermore_1.2 XML_3.99-0.19
#> [70] cowplot_1.2.0 RcppAnnoy_0.0.22 maketools_1.3.2
#> [73] class_7.3-23 pillar_1.11.1 nlme_3.1-168
#> [76] iterators_1.0.14 compiler_4.5.1 beachmat_2.27.0
#> [79] RSpectra_0.16-2 stringi_1.8.7 tensor_1.5.1
#> [82] GenomicAlignments_1.47.0 plyr_1.8.9 crayon_1.5.3
#> [85] abind_1.4-8 BiocIO_1.21.0 locfit_1.5-9.12
#> [88] graphlayouts_1.2.2 bit_4.6.0 dplyr_1.1.4
#> [91] codetools_0.2-20 BiocSingular_1.27.0 bslib_0.9.0
#> [94] e1071_1.7-16 alabaster.ranges_1.9.1 plotly_4.11.0
#> [97] mime_0.13 splines_4.5.1 Rcpp_1.1.0
#> [100] fastDummies_1.7.5 dbplyr_2.5.1 knitr_1.50
#> [103] blob_1.2.4 BiocVersion_3.23.1 AnnotationFilter_1.35.0
#> [106] WriteXLS_6.8.0 checkmate_2.3.3 listenv_0.10.0
#> [109] tibble_3.3.0 Matrix_1.7-4 statmod_1.5.1
#> [112] tweenr_2.0.3 pkgconfig_2.0.3 pheatmap_1.0.13
#> [115] tools_4.5.1 cachem_1.1.0 cigarillo_1.1.0
#> [118] RSQLite_2.4.3 viridisLite_0.4.2 DBI_1.2.3
#> [121] fastmap_1.2.0 rmarkdown_2.30 scales_1.4.0
#> [124] grid_4.5.1 ica_1.0-3 epivizrServer_1.39.0
#> [127] Rsamtools_2.27.0 AnnotationHub_4.1.0 sass_0.4.10
#> [130] FNN_1.1.4.1 patchwork_1.3.2 BiocManager_1.30.26
#> [133] dotCall64_1.2 graph_1.89.0 RANN_2.6.2
#> [136] alabaster.schemas_1.11.0 farver_2.1.2 tidygraph_1.3.1
#> [139] yaml_2.3.10 rtracklayer_1.69.1 cli_3.6.5
#> [142] purrr_1.1.0 lifecycle_1.0.4 uwot_0.2.3
#> [145] mvtnorm_1.3-3 backports_1.5.0 bluster_1.21.0
#> [148] BiocParallel_1.45.0 gtable_0.3.6 rjson_0.2.23
#> [151] ggridges_0.5.7 progressr_0.17.0 parallel_4.5.1
#> [154] limma_3.67.0 jsonlite_2.0.0 edgeR_4.9.0
#> [157] RcppHNSW_0.6.0 bitops_1.0-9 bit64_4.6.0-1
#> [160] Rtsne_0.17 alabaster.matrix_1.9.0 spatstat.utils_3.2-0
#> [163] BiocNeighbors_2.5.0 jquerylib_0.1.4 metapod_1.19.0
#> [166] alabaster.se_1.9.0 dqrng_0.4.1 spatstat.univar_3.1-4
#> [169] rrcov_1.7-7 lazyeval_0.2.2 alabaster.base_1.11.1
#> [172] shiny_1.11.1 htmltools_0.5.8.1 sctransform_0.4.2
#> [175] rappdirs_0.3.3 ensembldb_2.35.0 glue_1.8.0
#> [178] spam_2.11-1 httr2_1.2.1 XVector_0.51.0
#> [181] RCurl_1.98-1.17 gridExtra_2.3 R6_2.6.1
#> [184] tidyr_1.3.1 labeling_0.4.3 GenomicFeatures_1.63.1
#> [187] cluster_2.1.8.1 rngtools_1.5.2 Rhdf5lib_1.33.0
#> [190] GenomeInfoDb_1.47.0 DelayedArray_0.37.0 tidyselect_1.2.1
#> [193] vipor_0.4.7 ProtGenerics_1.43.0 ggforce_0.5.0
#> [196] AnnotationDbi_1.73.0 rsvd_1.0.5 KernSmooth_2.23-26
#> [199] S7_0.2.0 data.table_1.17.8 htmlwidgets_1.6.4
#> [202] RColorBrewer_1.1-3 rlang_1.1.6 spatstat.sparse_3.1-0
#> [205] spatstat.explore_3.5-3 beeswarm_0.4.0 OrganismDbi_1.53.2