library(Aerith)
The Aerith package streamlines stable isotope probing (SIP) proteomics workflows by combining data summarization, visualization, and result interpretation within a reproducible R environment. This vignette demonstrates how to explore peptide-spectrum match (PSM) SIP percentages using the curated demo.psm.txt dataset shipped with the package. Each section begins with a short overview, followed by runnable code and guidance on how to adapt the workflow to new datasets.
Peptide SIP percent summarizes the incorporation level of heavy isotopes within identified PSMs. Summaries help benchmark labeling efficiency, compare experimental conditions, and flag outliers before downstream modeling. The summaryPSMsipPCT helper accepts a tab-delimited file with SIP annotations and returns descriptive statistics for every label channel detected. When operating on custom datasets, verify that the file adheres to the Aerith column conventions documented in ?summaryPSMsipPCT, and adjust filtering parameters (for example, minimum score thresholds) upstream to focus on high-confidence identifications.
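The column check and upstream filtering mentioned above can be sketched in base R before a file is passed to summaryPSMsipPCT. This is a minimal illustration only: the column names (Filename, ScanNumber, Score), the score threshold of 20, and the stand-in file are all hypothetical and should be replaced with the conventions listed in ?summaryPSMsipPCT.

```r
# Hypothetical required columns; replace with those listed in ?summaryPSMsipPCT.
required_cols <- c("Filename", "ScanNumber", "Score")

# Write a tiny stand-in file so the sketch runs without real data.
psm_path <- tempfile(fileext = ".txt")
writeLines(c("Filename\tScanNumber\tScore",
             "run1\t101\t42.5",
             "run1\t102\t17.0"), psm_path)

# Read a single row to inspect the header cheaply.
header <- names(read.delim(psm_path, nrows = 1, check.names = FALSE))
missing_cols <- setdiff(required_cols, header)
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Apply a hypothetical minimum score threshold upstream of summarization.
psm <- read.delim(psm_path, check.names = FALSE)
high_conf <- psm[psm$Score >= 20, , drop = FALSE]

# Save the filtered table as tab-delimited text for downstream use.
filtered_path <- tempfile(fileext = ".txt")
write.table(high_conf, filtered_path, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

The filtered file path could then be supplied in place of demo_file, provided the retained columns still match what the function expects.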
demo_file <- system.file("extdata", "demo.psm.txt", package = "Aerith")
summaryStats <- summaryPSMsipPCT(demo_file)
print(summaryStats)
#> Count AveragePCT medianPCT madPCT sdPCT FDRpct LabelNumber
#> 1 1987 50.02265 50 4.4478 4.541849 0.6 1959
#> LabeledPCTmedian
#> 1 50
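The printed table reports the PSM count, the mean, median, median absolute deviation (MAD), and standard deviation of SIP percent, together with the FDRpct, LabelNumber, and LabeledPCTmedian columns. For the demo file, the mean (50.02) and median (50) across 1987 PSMs agree closely, which points to a distribution centered near 50%. The table does not list quantiles, so in routine analyses check that the median and spread align with expected enrichment levels; substantial deviations may indicate instrument artifacts or misassigned labels. For large experiments, consider stratifying inputs by experimental group prior to summarization to support targeted quality control.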
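To make the statistics in the table concrete, the sketch below computes the same kinds of summaries per group with base R. It assumes nothing about Aerith internals: the data frame, its column names (group, pct), and the helper summarize_pct are hypothetical. As a side note, the reported madPCT of 4.4478 equals 3 × 1.4826, which is what R's mad() returns with its default scaling constant for a raw median absolute deviation of 3; whether summaryPSMsipPCT calls mad() internally is an assumption.

```r
# Toy data: hypothetical SIP percentages for two hypothetical groups.
# The column names `group` and `pct` are illustrative, not Aerith conventions.
psm <- data.frame(
  group = rep(c("control", "labeled"), each = 5),
  pct   = c(1, 1, 2, 1, 3, 48, 50, 52, 50, 47)
)

# Summarize one numeric vector with base R statistics.
summarize_pct <- function(x) {
  data.frame(
    Count      = length(x),
    AveragePCT = mean(x),
    medianPCT  = median(x),
    madPCT     = mad(x),  # scaled by 1.4826 by default
    sdPCT      = sd(x)
  )
}

# Split by group, summarize each piece, and bind the rows together.
by_group <- do.call(rbind, lapply(split(psm$pct, psm$group), summarize_pct))
by_group
```

Stratifying this way keeps a low-enrichment group from pulling down the overall mean and hiding a well-labeled one.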
Density plots illustrate how isotope incorporation varies across all PSMs.
demo_file <- system.file("extdata", "demo.psm.txt", package = "Aerith")
p <- plotPSMsipPCT(demo_file)
p
The resulting visualization highlights the central tendency and spread of SIP incorporation. Tight, unimodal peaks near the theoretical enrichment confirm consistent labeling, whereas multimodal or broadened distributions suggest heterogeneous uptake or mixed populations. Use this insight to decide whether additional normalization, replicate screening, or targeted re-analysis is warranted before quantitative comparisons.
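One rough way to check for multimodality numerically is to count the major local maxima of a kernel density estimate. The sketch below uses base R's density() on made-up values; the data, the bandwidth of 3, and the 10% height cutoff are illustrative assumptions, not Aerith defaults, and the approach is a heuristic rather than a formal test.

```r
# Made-up SIP percentages forming two clusters, near 5% and near 50%.
x <- c(seq(3, 7, by = 0.5), seq(46, 54, by = 0.5))

# Kernel density estimate with an explicit (illustrative) bandwidth.
d <- density(x, bw = 3)

# A local maximum is where the slope changes from positive to negative.
slope_sign <- sign(diff(d$y))
is_peak <- c(FALSE, diff(slope_sign) < 0, FALSE)

# Keep only peaks above 10% of the tallest one to ignore tail noise.
is_major <- is_peak & d$y > 0.1 * max(d$y)
peak_locations <- d$x[is_major]
n_peaks <- length(peak_locations)

n_peaks
peak_locations
```

More than one major peak suggests that the PSMs may come from populations with different uptake, which is worth confirming on the density plot itself before splitting the data.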
Aerith brings automated feature extraction, SIP-aware scoring, and publication-ready graphics together in a single package. Its functions interoperate with widely adopted formats (mzML, pepXML, tabular exports) and follow tidy data principles, which simplifies workflows that would otherwise require bespoke scripting. Combining high-level summaries with customizable visual outputs supports both rapid assessments and in-depth exploration of SIP proteomics data.
sessionInfo()
#> R Under development (unstable) (2025-10-20 r88955)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] tidyr_1.3.1 ggplot2_4.0.1 stringr_1.6.0 dplyr_1.1.4 Aerith_0.99.11
#>
#> loaded via a namespace (and not attached):
#> [1] rlang_1.1.6 magrittr_2.0.4
#> [3] clue_0.3-66 matrixStats_1.5.0
#> [5] compiler_4.6.0 systemfonts_1.3.1
#> [7] vctrs_0.6.5 reshape2_1.4.5
#> [9] ProtGenerics_1.43.0 pkgconfig_2.0.3
#> [11] MetaboCoreUtils_1.19.1 crayon_1.5.3
#> [13] fastmap_1.2.0 XVector_0.51.0
#> [15] labeling_0.4.3 rmarkdown_2.30
#> [17] preprocessCore_1.73.0 ragg_1.5.0
#> [19] purrr_1.2.0 xfun_0.54
#> [21] MultiAssayExperiment_1.37.2 cachem_1.1.0
#> [23] jsonlite_2.0.0 DelayedArray_0.37.0
#> [25] BiocParallel_1.45.0 parallel_4.6.0
#> [27] cluster_2.1.8.1 R6_2.6.1
#> [29] bslib_0.9.0 stringi_1.8.7
#> [31] RColorBrewer_1.1-3 limma_3.67.0
#> [33] GenomicRanges_1.63.0 jquerylib_0.1.4
#> [35] Rcpp_1.1.0 Seqinfo_1.1.0
#> [37] SummarizedExperiment_1.41.0 iterators_1.0.14
#> [39] knitr_1.50 IRanges_2.45.0
#> [41] Matrix_1.7-4 igraph_2.2.1
#> [43] tidyselect_1.2.1 dichromat_2.0-0.1
#> [45] abind_1.4-8 yaml_2.3.11
#> [47] doParallel_1.0.17 codetools_0.2-20
#> [49] affy_1.89.0 lattice_0.22-7
#> [51] tibble_3.3.0 plyr_1.8.9
#> [53] Biobase_2.71.0 withr_3.0.2
#> [55] S7_0.2.1 evaluate_1.0.5
#> [57] Spectra_1.21.0 pillar_1.11.1
#> [59] affyio_1.81.0 BiocManager_1.30.27
#> [61] MatrixGenerics_1.23.0 foreach_1.5.2
#> [63] stats4_4.6.0 MSnbase_2.37.0
#> [65] MALDIquant_1.22.3 ncdf4_1.24
#> [67] generics_0.1.4 S4Vectors_0.49.0
#> [69] scales_1.4.0 glue_1.8.0
#> [71] lazyeval_0.2.2 tools_4.6.0
#> [73] mzID_1.49.0 data.table_1.17.8
#> [75] QFeatures_1.21.0 vsn_3.79.0
#> [77] mzR_2.45.0 fs_1.6.6
#> [79] XML_3.99-0.20 grid_4.6.0
#> [81] impute_1.85.0 MsCoreUtils_1.23.1
#> [83] PSMatch_1.15.0 cli_3.6.5
#> [85] textshaping_1.0.4 S4Arrays_1.11.1
#> [87] AnnotationFilter_1.35.0 pcaMethods_2.3.0
#> [89] gtable_0.3.6 sass_0.4.10
#> [91] digest_0.6.39 BiocGenerics_0.57.0
#> [93] SparseArray_1.11.7 ggrepel_0.9.6
#> [95] farver_2.1.2 htmltools_0.5.8.1
#> [97] lifecycle_1.0.4 statmod_1.5.1
#> [99] MASS_7.3-65