The ScaledMatrix class

The ScaledMatrix class provides yet another method of running scale() on a matrix. In other words, these three operations are equivalent:
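The code chunk that produced the outputs below was lost in rendering; a minimal reconstruction would look like the following. The seed is unknown, so the exact values will not match, but the dimensions (6 x 10) are taken from the printed output.

```r
library(DelayedArray)
library(ScaledMatrix)

# A small random test matrix; the original seed is unknown.
mat <- matrix(rnorm(60), nrow=6, ncol=10)

smat1 <- scale(mat)                                  # ordinary matrix, realized in memory
smat2 <- scale(DelayedArray(mat))                    # delayed scaling, block-processed
smat3 <- ScaledMatrix(mat, center=TRUE, scale=TRUE)  # refactored operations

smat1
smat2
smat3
```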
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1.22731002 -0.8103571 -2.64985575 -0.4420285 1.2169369 0.04513855
## [2,] -0.95528444 -1.3623290 -0.02254971 0.3365111 -1.5907675 0.87612616
## [3,] 0.69312682 0.4470084 -0.86273407 -0.7687473 1.5310545 -0.35373062
## [4,] -0.13480477 0.9093283 -1.37152355 -1.4817372 0.6304335 0.26707124
## [5,] 1.22683532 -1.4033932 -1.24555367 1.0349000 -1.1849839 0.17208024
## [6,] -0.03727829 -0.3059770 -0.65830324 -1.1626487 0.7988388 -1.91970750
## [,7] [,8] [,9] [,10]
## [1,] -0.209379628 -0.1367500 0.5739843 0.72726336
## [2,] -0.570755365 0.1792940 1.1735877 0.07209481
## [3,] 0.005546874 -1.1296346 -1.1691520 1.39908028
## [4,] -1.011934036 0.3462437 1.3463385 0.49525978
## [5,] 0.424135419 -0.8146415 -1.3491765 -0.79271778
## [6,] 1.222595141 -0.3729114 -0.2404448 1.04167933
## <6 x 10> DelayedMatrix object of type "double":
## [,1] [,2] [,3] ... [,9] [,10]
## [1,] 1.22731002 -0.81035707 -2.64985575 . 0.57398425 0.72726336
## [2,] -0.95528444 -1.36232903 -0.02254971 . 1.17358768 0.07209481
## [3,] 0.69312682 0.44700845 -0.86273407 . -1.16915199 1.39908028
## [4,] -0.13480477 0.90932825 -1.37152355 . 1.34633852 0.49525978
## [5,] 1.22683532 -1.40339319 -1.24555367 . -1.34917651 -0.79271778
## [6,] -0.03727829 -0.30597698 -0.65830324 . -0.24044476 1.04167933
## <6 x 10> ScaledMatrix object of type "double":
## [,1] [,2] [,3] ... [,9] [,10]
## [1,] 1.22731002 -0.81035707 -2.64985575 . 0.57398425 0.72726336
## [2,] -0.95528444 -1.36232903 -0.02254971 . 1.17358768 0.07209481
## [3,] 0.69312682 0.44700845 -0.86273407 . -1.16915199 1.39908028
## [4,] -0.13480477 0.90932825 -1.37152355 . 1.34633852 0.49525978
## [5,] 1.22683532 -1.40339319 -1.24555367 . -1.34917651 -0.79271778
## [6,] -0.03727829 -0.30597698 -0.65830324 . -0.24044476 1.04167933
The biggest difference lies in how they behave in downstream matrix operations.

smat1 is an ordinary matrix, with the scaled and centered values fully realized in memory. Nothing too unusual here.

smat2 is a DelayedMatrix and undergoes block processing, whereby chunks are realized and operated on one at a time. This sacrifices speed for greater memory efficiency by avoiding a copy of the entire matrix. In particular, it preserves the structure of the original mat, e.g., a sparse or file-backed representation.

smat3 is a ScaledMatrix that refactors certain operations so that they can be applied to the original mat without any explicit scaling or centering. This takes advantage of the original data structure to speed up matrix multiplication and row/column sums, albeit at some cost to numerical precision.

Given an original matrix \(\mathbf{X}\) with \(n\) columns, a vector of column centers \(\mathbf{c}\) and a vector of column scaling values \(\mathbf{s}\), our scaled matrix can be written as:
\[ \mathbf{Y} = (\mathbf{X} - \mathbf{1} \cdot \mathbf{c}^T) \mathbf{S} \]
where \(\mathbf{S} = \text{diag}(s_1^{-1}, ..., s_n^{-1})\) and \(\mathbf{1}\) is a column vector of ones with one entry per row of \(\mathbf{X}\). If we wanted to right-multiply it with another matrix \(\mathbf{A}\), we would have:
\[ \mathbf{Y}\mathbf{A} = \mathbf{X}\mathbf{S}\mathbf{A} - \mathbf{1} \cdot \mathbf{c}^T \mathbf{S}\mathbf{A} \]
The right-most term is simply the outer product of \(\mathbf{1}\) with the row vector \(\mathbf{c}^T\mathbf{S}\mathbf{A}\), i.e., the same cheaply computed row vector is subtracted from every row. More importantly, we can use the matrix multiplication operator for \(\mathbf{X}\) with \(\mathbf{S}\mathbf{A}\), as this allows us to use highly efficient algorithms for certain data representations, e.g., sparse matrices.
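As a sanity check, this refactoring can be demonstrated in base R with a small random matrix (the dimensions here are arbitrary):

```r
set.seed(42)
X <- matrix(rnorm(50), nrow=10, ncol=5)
A <- matrix(rnorm(15), nrow=5, ncol=3)

cvec <- colMeans(X)      # column centers, as used by scale()
svec <- apply(X, 2, sd)  # column scaling values, as used by scale()

# Direct approach: realize the scaled matrix, then multiply.
direct <- scale(X) %*% A

# Refactored approach: multiply X by S A, then subtract the rank-1 correction.
SA <- A / svec  # equivalent to diag(1/svec) %*% A
refactored <- X %*% SA - outer(rep(1, nrow(X)), as.vector(crossprod(cvec, SA)))

all.equal(direct, refactored)  # TRUE, up to floating-point error
```

Note that the sparse multiplication happens in the `X %*% SA` step, which never touches a dense centered copy of `X`.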
library(Matrix)
mat <- rsparsematrix(20000, 10000, density=0.01)
smat <- ScaledMatrix(mat, center=TRUE, scale=TRUE)
blob <- matrix(runif(ncol(mat) * 5), ncol=5)
system.time(out <- smat %*% blob)

## user system elapsed
## 0.017 0.002 0.017
# The slower way with block processing.
da <- scale(DelayedArray(mat))
system.time(out2 <- da %*% blob)

## user system elapsed
## 13.142 5.807 13.040
The same logic applies to left-multiplication and cross-products. This allows us to speed up high-level operations involving matrix multiplication simply by switching to a ScaledMatrix, e.g., in the approximate PCA algorithms from the BiocSingular package.
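The code chunk behind the timing below was lost in rendering; continuing from the smat object above, it was presumably a call along these lines (the seed and rank are guesses, not taken from the source):

```r
library(BiocSingular)
set.seed(1000)
# Approximate PCA via random projection, operating on the ScaledMatrix.
system.time(pcs <- runPCA(smat, rank=5, BSPARAM=RandomParam()))
```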
## user system elapsed
## 8.692 16.026 6.466
Row and column sums are special cases of matrix multiplication and can be computed quickly:
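The two timings below were presumably produced by comparing the ScaledMatrix against the block-processed DelayedMatrix, continuing from the smat and da objects above (reconstructed, so the exact calls are an assumption):

```r
system.time(colSums(smat))  # fast: delegates to colSums on the sparse 'mat'
system.time(colSums(da))    # slow: block processing over the DelayedMatrix
```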
## user system elapsed
## 0.008 0.006 0.007
## user system elapsed
## 11.409 0.565 11.975
Subsetting, transposition and renaming of the dimensions are all
supported without loss of the ScaledMatrix
representation:
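The outputs below were presumably generated by something like the following, continuing from the smat object above (reconstructed; the subset and naming scheme are inferred from the printed output):

```r
smat[, 1:5]  # column subsetting
t(smat)      # transposition
rownames(smat) <- sprintf("GENE_%i", seq_len(nrow(smat)))
smat         # renamed dimensions
```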
## <20000 x 5> ScaledMatrix object of type "double":
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [2,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [3,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [4,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [5,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## ... . . . . .
## [19996,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [19997,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [19998,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [19999,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## [20000,] -0.003527673 0.004119697 -0.004345367 -0.006984045 -0.004873021
## <10000 x 20000> ScaledMatrix object of type "double":
## [,1] [,2] [,3] ... [,19999] [,20000]
## [1,] -0.003527673 -0.003527673 -0.003527673 . -0.003527673 -0.003527673
## [2,] 0.004119697 0.004119697 0.004119697 . 0.004119697 0.004119697
## [3,] -0.004345367 -0.004345367 -0.004345367 . -0.004345367 -0.004345367
## [4,] -0.006984045 -0.006984045 -0.006984045 . -0.006984045 -0.006984045
## [5,] -0.004873021 -0.004873021 -0.004873021 . -0.004873021 -0.004873021
## ... . . . . . .
## [9996,] -0.019102504 -0.019102504 -0.019102504 . -0.019102504 -0.019102504
## [9997,] -0.007964258 -0.007964258 -0.007964258 . -0.007964258 -0.007964258
## [9998,] -0.010970981 -0.010970981 -0.010970981 . -0.010970981 -0.010970981
## [9999,] 0.000217881 0.000217881 0.000217881 . 0.000217881 0.000217881
## [10000,] -0.010548209 -0.010548209 -0.010548209 . -0.010548209 -0.010548209
## <20000 x 10000> ScaledMatrix object of type "double":
## [,1] [,2] [,3] ... [,9999] [,10000]
## GENE_1 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_2 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_3 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_4 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_5 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## ... . . . . . .
## GENE_19996 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_19997 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_19998 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_19999 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
## GENE_20000 -0.003527673 0.004119697 -0.004345367 . 0.000217881 -0.010548209
Other operations will cause the ScaledMatrix to collapse
to the general DelayedMatrix representation, after which
point block processing will be used.
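For example, elementwise arithmetic is not one of the preserved operations; judging from the printed values, an addition like the following presumably produced the DelayedMatrix shown below:

```r
smat + 1  # collapses to a delayed operation on a DelayedMatrix
```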
## <20000 x 10000> DelayedMatrix object of type "double":
## [,1] [,2] [,3] ... [,9999] [,10000]
## GENE_1 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_2 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_3 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_4 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_5 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## ... . . . . . .
## GENE_19996 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_19997 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_19998 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_19999 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
## GENE_20000 0.9964723 1.0041197 0.9956546 . 1.0002179 0.9894518
For the most part, the implementation of the multiplication assumes that the \(\mathbf{A}\) matrix and the matrix product are small compared to \(\mathbf{X}\). It is also possible to multiply two ScaledMatrixes together if the underlying matrices have efficient operators for their product. If this is not the case, however, the ScaledMatrix offers little benefit to offset its increased overhead.
It is also worth noting that this speed-up is not entirely free. The expression above involves subtracting two matrices with potentially large values, which runs the risk of catastrophic cancellation. The example below demonstrates how a ScaledMatrix is more susceptible to loss of precision than a normal DelayedArray:
set.seed(1000)
mat <- matrix(rnorm(1000000), ncol=100000)
big.mat <- mat + 1e12
# The 'correct' value, unaffected by numerical precision.
ref <- rowMeans(scale(mat))
head(ref)

## [1] -0.0025584703 -0.0008570664 -0.0019225335 -0.0001039903 0.0024761772
## [6] 0.0032943203
# The value from scale'ing a DelayedArray.
library(DelayedArray)
smat2 <- scale(DelayedArray(big.mat))
head(rowMeans(smat2))

## [1] -0.0025583534 -0.0008571123 -0.0019226040 -0.0001039539 0.0024761618
## [6] 0.0032943783
# The value from a ScaledMatrix.
library(ScaledMatrix)
smat3 <- ScaledMatrix(big.mat, center=TRUE, scale=TRUE)
head(rowMeans(smat3))

## [1] -0.00256 -0.00160 -0.00096 -0.00304 0.00064 0.00352
In most practical applications, though, this does not seem to be a major concern, especially as most values (e.g., log-normalized expression matrices) lie close to zero anyway.
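The session information below was generated with the standard reproducibility call:

```r
sessionInfo()
```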
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] BiocSingular_1.25.1 ScaledMatrix_1.19.0 DelayedArray_0.37.0
## [4] SparseArray_1.9.2 S4Arrays_1.9.3 abind_1.4-8
## [7] IRanges_2.45.0 S4Vectors_0.47.6 MatrixGenerics_1.21.0
## [10] matrixStats_1.5.0 BiocGenerics_0.55.4 generics_0.1.4
## [13] Matrix_1.7-4 BiocStyle_2.37.1
##
## loaded via a namespace (and not attached):
## [1] jsonlite_2.0.0 compiler_4.5.1
## [3] BiocManager_1.30.26 rsvd_1.0.5
## [5] Rcpp_1.1.0 DelayedMatrixStats_1.31.0
## [7] parallel_4.5.1 jquerylib_0.1.4
## [9] BiocParallel_1.43.4 yaml_2.3.10
## [11] fastmap_1.2.0 lattice_0.22-7
## [13] R6_2.6.1 XVector_0.49.3
## [15] knitr_1.50 maketools_1.3.2
## [17] bslib_0.9.0 rlang_1.1.6
## [19] cachem_1.1.0 xfun_0.53
## [21] sass_0.4.10 sys_3.4.3
## [23] cli_3.6.5 digest_0.6.37
## [25] grid_4.5.1 irlba_2.3.5.1
## [27] sparseMatrixStats_1.21.0 lifecycle_1.0.4
## [29] evaluate_1.0.5 codetools_0.2-20
## [31] buildtools_1.0.0 beachmat_2.25.5
## [33] rmarkdown_2.30 tools_4.5.1
## [35] htmltools_0.5.8.1