OmnipathR 3.5.25
1 Institute for Computational Biomedicine, Heidelberg University
To see a full list of datasets, call the omnipath_show_db function:
library(OmnipathR)
omnipath_show_db()
## # A tibble: 19 × 10
## name last_used lifet…¹ package loader loader_param latest_param loaded db key
## <chr> <dttm> <dbl> <chr> <chr> <list> <list> <lgl> <list> <chr>
## 1 Gene Ontolo… 2022-10-23 19:27:56 300 Omnipa… go_on… <named list> <named list> TRUE <named list> go_b…
## 2 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_f…
## 3 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_a…
## 4 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_a…
## 5 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_s…
## 6 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_c…
## 7 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_d…
## 8 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_c…
## 9 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_m…
## 10 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 11 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_m…
## 12 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 13 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 14 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_y…
## 15 GO annotati… NA 300 Omnipa… go_an… <named list> <lgl [1]> FALSE <lgl [1]> goa_…
## 16 UniProt-Gen… NA 300 Omnipa… unipr… <named list> <lgl [1]> FALSE <lgl [1]> up_gs
## 17 Ensembl org… 2022-10-23 19:27:30 10800 Omnipa… taxon… <NULL> <NULL> TRUE <tibble> orga…
## 18 All SwissPr… NA 10800 Omnipa… all_u… <named list> <lgl [1]> FALSE <lgl [1]> swis…
## 19 All TrEMBL … NA 10800 Omnipa… all_u… <named list> <lgl [1]> FALSE <lgl [1]> trem…
## # … with abbreviated variable name ¹lifetime
It returns a tibble where each dataset has a human-readable name and a key that can be used to refer to it. The table also shows whether the dataset is currently loaded, when it was last used, and the loader function with its arguments.
Datasets can be accessed with the get_db function. Ideally, you should call
this function every time you use the dataset. The first call loads the
dataset; subsequent calls return the already loaded copy. This way each
access is registered and extends the expiry time. Let’s load the human
UniProt-GeneSymbol table. Above we see that its key is up_gs.
up_gs <- get_db('up_gs')
up_gs
## # A tibble: 20,372 × 2
## From To
## <chr> <chr>
## 1 P63120 ERVK-19
## 2 Q96EC8 YIPF6
## 3 Q6ZMS4 ZNF852
## 4 Q8N8L2 ZNF491
## 5 Q15916 ZBTB6
## 6 O60384 ZNF861P
## 7 Q3MIS6 ZNF528
## 8 Q86UK7 ZNF598
## 9 Q6P280 ZNF529
## 10 Q969W1 ZDHHC16
## # … with 20,362 more rows
This dataset is a two-column data frame of SwissProt IDs and gene symbols.
Looking again at the datasets, we find that this dataset is now loaded and
the last_used timestamp is set to the time we called get_db:
omnipath_show_db()
## # A tibble: 19 × 10
## name last_used lifet…¹ package loader loader_param latest_param loaded db key
## <chr> <dttm> <dbl> <chr> <chr> <list> <list> <lgl> <list> <chr>
## 1 Gene Ontolo… 2022-10-23 19:27:56 300 Omnipa… go_on… <named list> <named list> TRUE <named list> go_b…
## 2 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_f…
## 3 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_a…
## 4 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_a…
## 5 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_s…
## 6 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_c…
## 7 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_d…
## 8 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_c…
## 9 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_m…
## 10 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 11 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_m…
## 12 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 13 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 14 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_y…
## 15 GO annotati… NA 300 Omnipa… go_an… <named list> <lgl [1]> FALSE <lgl [1]> goa_…
## 16 UniProt-Gen… 2022-10-23 19:28:00 300 Omnipa… unipr… <named list> <named list> TRUE <tibble> up_gs
## 17 Ensembl org… 2022-10-23 19:27:58 10800 Omnipa… taxon… <NULL> <NULL> TRUE <tibble> orga…
## 18 All SwissPr… NA 10800 Omnipa… all_u… <named list> <lgl [1]> FALSE <lgl [1]> swis…
## 19 All TrEMBL … NA 10800 Omnipa… all_u… <named list> <lgl [1]> FALSE <lgl [1]> trem…
## # … with abbreviated variable name ¹lifetime
The above table also contains a reference to the dataset and the arguments passed to the loader function:
d <- omnipath_show_db()
d %>% dplyr::pull(db) %>% magrittr::extract2(16)
## # A tibble: 20,372 × 2
## From To
## <chr> <chr>
## 1 P63120 ERVK-19
## 2 Q96EC8 YIPF6
## 3 Q6ZMS4 ZNF852
## 4 Q8N8L2 ZNF491
## 5 Q15916 ZBTB6
## 6 O60384 ZNF861P
## 7 Q3MIS6 ZNF528
## 8 Q86UK7 ZNF598
## 9 Q6P280 ZNF529
## 10 Q969W1 ZDHHC16
## # … with 20,362 more rows
d %>% dplyr::pull(latest_param) %>% magrittr::extract2(16)
## $to
## [1] "genesymbol"
##
## $organism
## [1] 9606
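The positional index 16 used above depends on the row order of the table, so selecting by key may be more robust. The sketch below illustrates the idea in base R on a small stand-in data frame; `toy_db` and `get_by_key` are hypothetical names introduced here, not part of OmnipathR, and only the `key` and `db` columns are mimicked. The same indexing should apply to the real table `d`.

```r
# Toy stand-in for the table returned by omnipath_show_db();
# only the `key` and `db` columns are reproduced here.
toy_db <- data.frame(key = c('go_basic', 'up_gs'), stringsAsFactors = FALSE)
toy_db$db <- list(
    NA,
    data.frame(
        From = c('P63120', 'Q96EC8'),
        To = c('ERVK-19', 'YIPF6'),
        stringsAsFactors = FALSE
    )
)

# Pick the element of the list column whose key matches,
# instead of relying on the row number.
get_by_key <- function(tbl, key) {
    tbl$db[[match(key, tbl$key)]]
}

up_gs_toy <- get_by_key(toy_db, 'up_gs')
```

With the real table, `get_by_key(d, 'up_gs')` would be the analogous call, assuming the `key` column holds the full keys.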
If we call get_db again, the timestamp is updated, resetting the expiry
counter:
up_gs <- get_db('up_gs')
omnipath_show_db()
## # A tibble: 19 × 10
## name last_used lifet…¹ package loader loader_param latest_param loaded db key
## <chr> <dttm> <dbl> <chr> <chr> <list> <list> <lgl> <list> <chr>
## 1 Gene Ontolo… 2022-10-23 19:27:56 300 Omnipa… go_on… <named list> <named list> TRUE <named list> go_b…
## 2 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_f…
## 3 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_a…
## 4 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_a…
## 5 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_s…
## 6 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_c…
## 7 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_d…
## 8 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_c…
## 9 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_m…
## 10 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 11 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_m…
## 12 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 13 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_p…
## 14 Gene Ontolo… NA 300 Omnipa… go_on… <named list> <lgl [1]> FALSE <lgl [1]> go_y…
## 15 GO annotati… NA 300 Omnipa… go_an… <named list> <lgl [1]> FALSE <lgl [1]> goa_…
## 16 UniProt-Gen… 2022-10-23 19:28:11 300 Omnipa… unipr… <named list> <named list> TRUE <tibble> up_gs
## 17 Ensembl org… 2022-10-23 19:27:58 10800 Omnipa… taxon… <NULL> <NULL> TRUE <tibble> orga…
## 18 All SwissPr… NA 10800 Omnipa… all_u… <named list> <lgl [1]> FALSE <lgl [1]> swis…
## 19 All TrEMBL … NA 10800 Omnipa… all_u… <named list> <lgl [1]> FALSE <lgl [1]> trem…
## # … with abbreviated variable name ¹lifetime
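The load-once, refresh-on-access behaviour described above can be illustrated with a small base R sketch. This is not OmnipathR's implementation: `registry`, `register`, `get_cached` and `sweep_expired` are hypothetical names, and the lifetime is assumed to be measured in seconds.

```r
# A toy registry, NOT OmnipathR's implementation: one record per key,
# holding the loader, the lifetime (assumed seconds) and the last access time.
registry <- new.env()

register <- function(key, loader, lifetime = 300) {
    assign(
        key,
        list(loader = loader, lifetime = lifetime,
             loaded = FALSE, db = NULL, last_used = NA),
        envir = registry
    )
}

# Load on first access, reuse afterwards; every access refreshes last_used.
get_cached <- function(key, now = Sys.time()) {
    rec <- get(key, envir = registry)
    if (!rec$loaded) {
        rec$db <- rec$loader()
        rec$loaded <- TRUE
    }
    rec$last_used <- now
    assign(key, rec, envir = registry)
    rec$db
}

# Drop datasets that have not been accessed within their lifetime.
sweep_expired <- function(now = Sys.time()) {
    for (key in ls(registry)) {
        rec <- get(key, envir = registry)
        age <- as.numeric(difftime(now, rec$last_used, units = 'secs'))
        if (rec$loaded && age > rec$lifetime) {
            rec$db <- NULL
            rec$loaded <- FALSE
            assign(key, rec, envir = registry)
        }
    }
}

# Usage: count how many times the (toy) loader actually runs.
n_loads <- 0
register('toy', function() { n_loads <<- n_loads + 1; c(a = 1) }, lifetime = 300)
t0 <- Sys.time()
x <- get_cached('toy', now = t0)        # first access: the loader runs
x <- get_cached('toy', now = t0 + 200)  # second access: cached copy reused
sweep_expired(now = t0 + 400)           # 200 s since last access: kept
sweep_expired(now = t0 + 600)           # 400 s since last access: dropped
```

The point of the sketch is that expiry is counted from the last access rather than from the first load, which is why calling the accessor on every use keeps a dataset in memory.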
The loaded datasets live in an environment that belongs to the OmnipathR
package. Normally users don’t need to access this environment. As we see
below, omnipath_show_db presents all the information available by directly
looking at the environment:
OmnipathR:::omnipath.env$db$up_gs
## $name
## [1] "UniProt-GeneSymbol table"
##
## $last_used
## [1] "2022-10-23 19:28:11 EDT"
##
## $lifetime
## [1] 300
##
## $package
## [1] "OmnipathR"
##
## $loader
## [1] "uniprot_full_id_mapping_table"
##
## $loader_param
## $loader_param$to
## [1] "genesymbol"
##
## $loader_param$organism
## [1] 9606
##
##
## $latest_param
## $latest_param$to
## [1] "genesymbol"
##
## $latest_param$organism
## [1] 9606
##
##
## $loaded
## [1] TRUE
##
## $db
## # A tibble: 20,372 × 2
## From To
## <chr> <chr>
## 1 P63120 ERVK-19
## 2 Q96EC8 YIPF6
## 3 Q6ZMS4 ZNF852
## 4 Q8N8L2 ZNF491
## 5 Q15916 ZBTB6
## 6 O60384 ZNF861P
## 7 Q3MIS6 ZNF528
## 8 Q86UK7 ZNF598
## 9 Q6P280 ZNF529
## 10 Q969W1 ZDHHC16
## # … with 20,362 more rows
The default expiry of datasets is given by the option omnipath.db_lifetime.
By calling omnipath_save_config, this option is saved to the default config
file and will be valid in all subsequent sessions. Otherwise it’s valid only
in the current session.
options(omnipath.db_lifetime = 600)
omnipath_save_config()
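For a change that should last only part of a session, base R's `options` and `getOption` are enough. The following is a minimal sketch that does not call `omnipath_save_config`, so nothing is written to the config file; the variable names `old` and `lifetime_now` are illustrative.

```r
# Set the default lifetime for the current session only. options()
# invisibly returns the previous values, so they can be restored later.
old <- options(omnipath.db_lifetime = 600)

# Read the value back to confirm what is currently in effect.
lifetime_now <- getOption('omnipath.db_lifetime')

# Restore whatever was set before (the option is removed if it was unset).
options(old)
```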
The built-in dataset definitions are in a JSON file shipped with the package. The easiest way to view it is through the git web interface.
Currently there is no API for adding custom datasets, but it would be easy to implement. It would be a matter of providing a JSON similar to the above, or calling a function. Please open an issue if you are interested in this feature.
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_GB
## [4] LC_COLLATE=C LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] OmnipathR_3.5.25 BiocStyle_2.25.0
##
## loaded via a namespace (and not attached):
## [1] progress_1.2.2 tidyselect_1.2.0 xfun_0.34 bslib_0.4.0 purrr_0.3.5
## [6] vctrs_0.5.0 generics_0.1.3 htmltools_0.5.3 yaml_2.3.6 utf8_1.2.2
## [11] rlang_1.0.6 jquerylib_0.1.4 pillar_1.8.1 later_1.3.0 withr_2.5.0
## [16] glue_1.6.2 DBI_1.1.3 selectr_0.4-2 rappdirs_0.3.3 bit64_4.0.5
## [21] readxl_1.4.1 lifecycle_1.0.3 stringr_1.4.1 cellranger_1.1.0 rvest_1.0.3
## [26] evaluate_0.17 knitr_1.40 tzdb_0.3.0 fastmap_1.1.0 parallel_4.2.1
## [31] curl_4.3.3 fansi_1.0.3 Rcpp_1.0.9 readr_2.1.3 backports_1.4.1
## [36] checkmate_2.1.0 BiocManager_1.30.18 cachem_1.0.6 vroom_1.6.0 jsonlite_1.8.3
## [41] bit_4.0.4 hms_1.1.2 digest_0.6.30 stringi_1.7.8 bookdown_0.29
## [46] dplyr_1.0.10 cli_3.4.1 tools_4.2.1 magrittr_2.0.3 logger_0.2.2
## [51] sass_0.4.2 tibble_3.1.8 tidyr_1.2.1 crayon_1.5.2 pkgconfig_2.0.3
## [56] ellipsis_0.3.2 xml2_1.3.3 prettyunits_1.1.1 assertthat_0.2.1 rmarkdown_2.17
## [61] httr_1.4.4 R6_2.5.1 igraph_1.3.5 compiler_4.2.1