--- title: "CoreGx: Class and Function Abstractions for PharmacoGx, RadioGx and ToxicoGx" author: - name: Petr Smirnov - name: Ian Smith - name: Christopher Eeles - name: Benjamin Haibe-Kains output: BiocStyle::html_document vignette: | %\VignetteEngine{knitr::knitr} %\VignetteIndexEntry{CoreGx: Class and Function Abstractions} --- # CoreGx ```{r setup, include=FALSE} library(knitr) knitr::opts_chunk$set(echo = TRUE) ``` This package provides a foundation for the PharmacoGx, RadioGx and ToxicoGx packages. It is not intended for standalone use, only as a dependency for the aforementioned software. Its' existence allows abstracting generic definitions, method definitions and class structures common to all three of the Gx suite packages. ## Importing and Using CoreGx Load the pacakge: ```{r coregx_load, hide=TRUE, message=FALSE} library(CoreGx) library(Biobase) library(SummarizedExperiment) ``` ## The CoreSet Class The CoreSet class is intended as a general purpose data structure for storing multiomic treatment response data. Extensions of this class have been customized for their respective fields of study. For example, the PharmacoSet class inherits form the CoreSet and is specialized for storing and analyzing drug sensitivity and perturbation experiments on cancer cell lines together with associated multiomic data for each sample treatment. The RadioSet class serves a role similar to the PharmacoSet with radiation instead of drug treaments. Finally, the ToxicoSet class is used to store toxicity data for healthy human and rat hepatocytes along with the associated multiomic profile for each treatment. ```{r CoreSet_class} getClass("CoreSet") ``` The `annotation` slot holds the CoreSet name, the original constructor call, and a range of metadata about the R session in which the constructor was called. This allows easy comparison of CoreSet versions across time and ensures the code used to generate a CoreSet is documented and reproducible. The `molecularProfiles` slot contains a list of `SummarizedExperiment` objects for each multi-omic molecular datatype available for a given experiment. Within the `SummarizedExperiments` are feature and sample annotations for each data type. The `cell` slot contains a `data.frame` with annotations for cell lines used in the sensitivty and/or perturbation slots. The `datasetType` slot contains a character vector indicating the experiment type the `CoreSet` contains. The `sensitivty` slot contains a list of raw, curated and meta data for sensitivity experiments. The `perturbation` slot contains a list of raw, curated and meta data for perturbation experiments. The `curation` slot contains a list of ground truth curations sample identifiers such as cell line names/ids, tissue names/ids, drug names/ids, etc. This slot is to assist in curating across experiment and molecular profile slots to esnure consistent nomenclature. The class provides a set of standardized accessor methods which allow easy curation, annotation and retrieval of data associated with a specfic treatment response experiment. All accessors are implemented as generics to allow new methods to be defined on classes inheriting from the CoreSet. ```{r CoreSet_accessors} methods(class="CoreSet") ``` We have provided a sample CoreSet in this package. In the below code we load the example cSet and demonstrate a few of the accessor methods. ```{r the_cSet_object} data(clevelandSmall_cSet) clevelandSmall_cSet ``` Access specific molecular profiles: ```{r cSet_accessor_demo} mProf <- molecularProfiles(clevelandSmall_cSet, "rna") mProf[seq_len(5), seq_len(5)] ``` Access cell line curations: ```{r cSet_accessor_demo1} cInfo <- cellInfo(clevelandSmall_cSet) cInfo[seq_len(5), seq_len(5)] ``` Access sensitivty data: ```{r cSet_accessor_demo2} sensProf <- sensitivityProfiles(clevelandSmall_cSet) sensProf[seq_len(5), seq_len(5)] ``` For more information about the accessor methods available for the `CoreSet` class please see the `class?CoreSet` help page. ## Extending the CoreSet Class Given that the CoreSet class is intended for extension, we will show some examples of how to define a new class based on it and implement new methods for the generics provided for the CoreSet class. Here we will define a new class, the `DemoSet`, with an additional slot, the `demoSlot`. We will then view the available methods for this class as well as define new S4 methods on it. ```{r defining_a_new_class} DemoSet <- setClass("DemoSet", representation(demoSlot="character"), contains="CoreSet") getClass("DemoSet") ``` Here we can see the class extending `CoreSet` has all of the same slots as the original `CoreSet`, plus the new slot we defined: `demoSlot`. We can see which methods are available for this new class. ```{r inherit_methods} methods(class="DemoSet") ``` We see that all the accessors defined for the `CoreSet` are also defined for the inherting `DemoSet`. These methods all assume the inherit slots have the same structure as the `CoreSet`. If this is not true, for example, if molecularProfiles holds `ExpressionSets` instead of `SummarizedExperiments`, we can redefine existing methods as follows: ```{r defining_methods_for_a_new_class} clevelandSmall_dSet <- DemoSet(clevelandSmall_cSet) class(clevelandSmall_dSet@molecularProfiles$rna) expressionSets <- lapply(molecularProfilesSlot(clevelandSmall_dSet), as, 'ExpressionSet') molecularProfilesSlot(clevelandSmall_dSet) <- expressionSets # Now this will error tryCatch({molecularProfiles(clevelandSmall_dSet, 'rna')}, error=function(e) print(paste("Error: ", e$message))) ``` Since we changed the data in the `molecularProfiles` slot of the `DemoSet`, the original method from `CoreGx` no longer works. Thus we get an error when trying to access that slot. To fix this we will need to set a new S4 method for the molecularProfiles generic function defined in `CoreGx`. ```{r redefine_the_method} setMethod(molecularProfiles, signature("DemoSet"), function(object, mDataType) { pData(object@molecularProfiles[[mDataType]]) }) ``` This new method is now called whenever we use the `molecularProfiles` method on a `DemoSet`. Since the new method uses `ExpressionSet` accessor methods instead of `SummarizedExperiment` accessor methods, we now expect to be able to access the data in our modified slot. ```{r testing_new_method} # Now we test our new method mProf <- molecularProfiles(clevelandSmall_dSet, 'rna') head(mProf)[seq_len(5), seq_len(5)] ``` We can see our new method works! In order to finish updating the methods for our new class, we would have to redefine all the generics which access the modified slot. However, additional work needs to be done to define accessors for the new `demoSlot`. Since no generics are available in CoreGx to access this slot, we need to first define a generic, then implement methods which distpatch on the 'DemoSet' class to retrieve data int he slot. ```{r defining_setter_methods} # Define generic for setter method setGeneric('demoSlot<-', function(object, value) standardGeneric('demoSlot<-')) # Define a setter method setReplaceMethod('demoSlot', signature(object='DemoSet', value="character"), function(object, value) { object@demoSlot <- value return(object) }) # Lets add something to our demoSlot demoSlot(clevelandSmall_dSet) <- c("This", "is", "the", "demoSlot") ``` ```{r defining_getter_methods} # Define generic for getter method setGeneric('demoSlot', function(object, ...) standardGeneric("demoSlot")) # Define a getter method setMethod("demoSlot", signature("DemoSet"), function(object) { paste(object@demoSlot, collapse=" ") }) # Test our getter method demoSlot(clevelandSmall_dSet) ``` Now you should have all the knowledge you need to extend the CoreSet class for use in other treatment-response experiments! For more information about this package and the possibility of collaborating on its extension please contact benjamin.haibe.kains@utoronto.ca.