Chapter 2 Learning R
2.1 Links to R tutorials
There are many, many online resources available for learning how to use R. To name a few:
- The R for Data Science book, which is a fairly enjoyable read though it focuses heavily on a specific dialect of the R language.
- A free course from Codecademy, which uses a web-based console; this allows people to start learning without actually installing R on their own computers.
- A free course from EdX, which focuses on the use of R’s statistical functionality.
- An Introduction to R, a definitive description of R that is best read after some basic familiarity has been established.
We will not attempt to repeat the contents of these resources here, as they already do a good job of explaining themselves.
2.2 Code formatting
This book contains code chunks interspersed with results, plots and explanatory text. Code chunks contain R code that is to be evaluated, and interested readers can copy-paste these lines into the R console to try them out for themselves. Each code chunk looks like this:
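For instance, a minimal chunk might be written as follows; the variable names are purely illustrative and are chosen so that the result matches the output shown further down.

```r
# Hypothetical variables, used only for illustration.
a <- 4
b <- 6
a + b
```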
Terms are colored differently depending on their category - this is mostly aesthetic and can be ignored for the time being. If a code chunk produces any visible output, it is shown in another chunk like so:
## [1] 10
Alternatively, as a figure:
Any text after a # is considered a comment and is ignored when running the code. The content of output chunks is always prefixed with ## so that users can copy-paste sections of code without having to explicitly remove the lines containing the results.
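To illustrate why this prefix is convenient, the sketch below pastes a chunk together with its output line; the variable name is hypothetical. R treats the output line as a comment, so the pasted text runs unchanged.

```r
total <- 5 * 2  # everything after the '#' on this line is ignored
total
## [1] 10
```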
In some chapters, chunks may also be hidden in collapsible boxes. These usually contain code that sets up objects for later steps but is otherwise not particularly interesting (e.g., downloading files, formatting data), and so they are hidden to avoid distracting the reader.
All chapters will finish with a printout of the session information. This describes the system on which the chapter was compiled and the versions of all packages that were used, which is useful for reproducing old results and diagnosing changes due to package updates.
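Readers can produce the same kind of printout for their own session with base R's sessionInfo(). The sketch below is a minimal illustration; the object name is an arbitrary choice, and the output will differ from system to system.

```r
# Capture the session information for the current R session.
info <- sessionInfo()

# The R version string, e.g., for reporting alongside results.
info$R.version$version.string

# Print the full report: platform, locale, attached and loaded packages.
print(info)
```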
2.3 Getting help
If you have a question about how a function works, it can often be answered by the function’s documentation.
This is accessible by prefixing the function name with ? in the R console.
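As a brief illustration, the base R function mean() is used below as an arbitrary example; any function name can be substituted.

```r
# Open the documentation for mean().
?mean

# Equivalent call using the help() function.
help("mean")

# Search all installed documentation when the function name is not known.
help.search("standard deviation")
```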
More general questions on how to use a package may be answered by the package’s vignette, if it is available. (One aspect of Bioconductor software that distinguishes it from CRAN packages is the required documentation of packages and workflows.)
vignette(package='SingleCellExperiment') # list all available vignettes
vignette(package='SingleCellExperiment', topic='intro') # open specific vignette
Beyond the R console, there are myriad online resources to get help. The R for Data Science book has a great section dedicated to looking for help outside of R. For example, Stack Overflow’s R tag is a helpful resource for asking and exploring general R programming questions.
For Bioconductor specifically, the support site hosts a question-and-answer forum that is actively updated by both users and package developers. This should generally be the first port of call for questions that are not answered by any existing documentation.
Users can also connect to the Bioconductor community through our Slack group, which hosts various channels dedicated to packages and workflows. The Bioc-community Slack is a great way to stay in the loop on the latest developments happening across Bioconductor, and we recommend exploring the “Channels” section to find topics of interest.
2.4 Beyond the basics
Once you are comfortable with the basic concepts of the language, the following resources take things to the next level:
- Advanced R, as its name suggests, goes through some of the more advanced concepts in the language.
- The aptly named What They Forgot to Teach You About R discusses topics such as file naming, maintaining an R installation, and reproducible analysis habits.
- The R Inferno dives into many of the unique quirks of R and some of the common user mistakes.
- Happy Git and GitHub for the useR describes how to use the Git version control system with R.
Over time, you may accumulate a collection of your own functions that you might want to re-use across projects or even share with other people. This can be done easily by creating your own R package. The R Packages book provides a user-friendly guide for doing so; more experienced developers may prefer to consult Writing R Extensions, the definitive documentation for the R packaging system. Bioconductor itself also provides some educational resources for package development within the Bioconductor context.
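As a rough sketch of the first step, base R's package.skeleton() can generate the directory layout of a new package from existing functions. The function add_one(), the package name mytools and the temporary output directory are all hypothetical choices for illustration; the resources above describe the recommended workflows in full.

```r
# Hypothetical helper function that we might want to reuse across projects.
pkg_env <- new.env()
pkg_env$add_one <- function(x) x + 1

# Write the skeleton into a fresh temporary directory to avoid clutter.
pkg_parent <- tempfile("pkgdemo")
dir.create(pkg_parent)

# Generate the skeleton of a package called 'mytools' (hypothetical name).
utils::package.skeleton(name = "mytools", list = "add_one",
    environment = pkg_env, path = pkg_parent)

# Inspect the generated files and directories.
list.files(file.path(pkg_parent, "mytools"))
```

The generated DESCRIPTION file and help pages contain placeholders that need to be filled in before the package can be built and installed.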
Session Info
R version 4.3.0 RC (2023-04-13 r84269)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
Matrix products: default
BLAS: /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BiocStyle_2.28.0 rebook_1.10.0
loaded via a namespace (and not attached):
[1] cli_3.6.1 knitr_1.42 rlang_1.1.0
[4] xfun_0.39 highr_0.10 CodeDepends_0.6.5
[7] jsonlite_1.8.4 dir.expiry_1.8.0 htmltools_0.5.5
[10] XML_3.99-0.14 graph_1.78.0 sass_0.4.5
[13] stats4_4.3.0 rmarkdown_2.21 evaluate_0.20
[16] jquerylib_0.1.4 filelock_1.0.2 fastmap_1.1.1
[19] yaml_2.3.7 bookdown_0.33 BiocManager_1.30.20
[22] compiler_4.3.0 codetools_0.2-19 digest_0.6.31
[25] R6_2.5.1 bslib_0.4.2 tools_4.3.0
[28] BiocGenerics_0.46.0 cachem_1.0.7