The software in this package aims to support refinements and functional interpretation of members of a collection of association statistics on a family of feature \(\times\) genome hypotheses. provide a basis for refinement or functional interpretation.
We take for granted the use of the gQTL* infrastructure for testing and management of test results. We use for examples elements of the geuvPack and geuvStore2 packages.
We work with a ciseStore
instance based on a small subset of transcriptome-wide cis-eQTL tests for GEUVADIS FPKM data. The overall testing procedure was conducted for all SNP:probe pairs for which SNP minor allele frequency (MAF) is at least 1% and for which the minimum distance between SNP and either boundary of the gene coding region for the probe is at most 1 million bp.
library(geuvStore2)
library(gQTLBase)
library(gQTLstats)
library(parallel)
nco = detectCores()
library(doParallel)
## Loading required package: foreach
## Loading required package: iterators
registerDoSEQ()
if (.Platform$OS.type != "windows") {
registerDoParallel(cores=max(c(1, floor(nco/2))))
}
prst = makeGeuvStore2()
Quantile estimation is very memory-efficient, based on a temporary ff representation of the vector of all association test results.
qassoc = storeToQuantiles(prst, field="chisq",
probs=c(seq(0,.999,.001), 1-(c(1e-4,1e-5,1e-6))))
tail(qassoc)
## 99.7% 99.8% 99.9% 99.99% 99.999% 99.9999%
## 20.82392 29.06245 54.93786 193.82782 220.55134 235.75722
Because we compute fixed breaks, contributions to the overall histogram can be assembled in parallel, with small footprint. This is a tremendous reduction of data.
hh = storeToHist( prst, breaks= c(0,qassoc,1e9) )
tail(hh$counts)
## [1] 4333 1933 416 0 0 1
FDR computation is post-hoc relative to filtering that need not be specified prior to testing. For illustration, we survey the results in geuvStore2 to obtain FDRs for each SNP:probe pair in two forms. First, we obtain FDR without any filtering. Second, we compute an a FDR for those SNP:probe pairs separated by at most 500kb, and for which the MAF for the SNP is at least 5 per cent.
rawFDR = storeToFDR(prst,
xprobs=c(seq(.05,.95,.05),.975,.990,.995,.9975,.999,
.9995, .9999, .99999) )
## counting tests...
## counting #NA...
## obtaining assoc quantiles...
## computing perm_assoc histogram....
dmfilt = function(x) # define the filtering function
x[ which(x$MAF >= 0.05 & x$mindist <= 500000) ]
filtFDR = storeToFDR(prst,
xprobs=c(seq(.05,.95,.05),.975,.990,.995,.9975,.999,
.9995, .9999, .99999), filter = dmfilt )
## counting tests...
## counting #NA...
## obtaining assoc quantiles...
## computing perm_assoc histogram....
rawFDR
## FDRsupp instance with 27 rows.
## assoc fdr ncalls avg.false
## 5% 0.003666364 0.9503595 5874027 5582437
## 10% 0.014663963 0.9489049 5564867 5280530
## 15% 0.033307623 0.9466335 5255708 4975229
## ...
## assoc fdr ncalls avg.false
## 99.95% 157.9120 5.390964e-05 3091.59300 0.1666667
## 99.99% 193.8278 2.695482e-04 618.31860 0.1666667
## 99.999% 220.5513 0.000000e+00 61.83186 0.0000000
## No interpolating function is available; use 'setFDRfunc'.
filtFDR
## FDRsupp instance with 27 rows.
## assoc fdr ncalls avg.false
## 5% 0.004052362 0.9461793 2102872 1989694
## 10% 0.016469885 0.9415121 1992194 1875675
## 15% 0.037696938 0.9363941 1881517 1761841
## ...
## assoc fdr ncalls avg.false
## 99.95% 193.8278 0 1106.77450 0
## 99.99% 194.9828 0 221.35490 0
## 99.999% 228.3958 0 22.13549 0
## No interpolating function is available; use 'setFDRfunc'.
The filtering leads to a lower FDR for a given strength of association. This is an inspiration for sensitivity analysis. Even with 5 million observations there is an effect of histogram bin selection in summarizing the permutation distribution of association. This can be seen fairly clearly in the wiggliness of the trace over the unfiltered association score:FDR plot.
rawtab = getTab(rawFDR)
filttab = getTab(filtFDR)
plot(rawtab[-(1:10),"assoc"],
-log10(rawtab[-(1:10),"fdr"]+1e-6), log="x", axes=FALSE,
xlab="Observed association", ylab="-log10 plugin FDR")
axis(1, at=c(seq(0,10,1),100,200))
axis(2)
points(filttab[-(1:10),1], -log10(filttab[-(1:10),2]+1e-6), pch=2)
legend(1, 5, pch=c(1,2), legend=c("all loci", "MAF >= 0.05 & dist <= 500k"))
We’ll address this below by fitting smooth functions for the score:FDR relationship.
The storeToFDRByProbe
FDR function examines the maximal association score by gene, for observed and permuted measures.
Good performance of this procedure is obtained by using group_by
and summarize
utilities of dplyr. Iteration employs foreach.
fdbp = storeToFDRByProbe( prst, xprobs=c(seq(.025,.975,.025),.99))
tail(getTab(fdbp),5)
fdAtM05bp = storeToFDRByProbe( prst, filter=function(x) x[which(x$MAF > .05)],
xprobs=c(seq(.025,.975,.025),.99))
tail(getTab(fdAtM05bp),5)
We’ll focus here on all-pairs analysis, with and without filtering.
Especially in this small example there will be some wiggling or even non-monotonicity in the trace of empirical FDR against association. We want to be able to compute the approximate FDR quickly and with minimal assumptions and pathology. To accomplish this, we will bind an interpolating model to the FDR estimates that we have. Interpolation will be accomplished with scatterplot smoothing in the mgcv framework.
The code that is used to fit the interpolating model is
fdrmod = gam(-log10(fdr+fudge)~s(assoc,bs="tp"), data=...,
subset=assoc<(1.1*maxch))
where fudge defaults to 1e-6 and maxch defaults to 30
library(mgcv)
## Loading required package: nlme
##
## Attaching package: 'nlme'
## The following object is masked from 'package:BBmisc':
##
## collapse
## The following object is masked from 'package:IRanges':
##
## collapse
## This is mgcv 1.8-31. For overview type 'help("mgcv-package")'.
rawFDR = setFDRfunc(rawFDR)
filtFDR = setFDRfunc(filtFDR)
par(mfrow=c(2,2))
txsPlot(rawFDR)
txsPlot(filtFDR)
directPlot(rawFDR)
directPlot(filtFDR)
More work is needed on assessing tolerability of relative error in FDR interpolation.
Recall that dmfilt
is a function that obtains the SNP-probe pairs for which SNP has MAF at least five percent and SNP-probe distance at most 500kbp.
We use the FDRsupp
instances with ciseStore
to list the SNP-probe pairs with FDR lying beneath a given upper bound.
Unfiltered pairs:
rawEnum = enumerateByFDR(prst, rawFDR, threshold=.05)
rawEnum[order(rawEnum$chisq,decreasing=TRUE)[1:3]]
## GRanges object with 3 ranges and 15 metadata columns:
## seqnames ranges strand | paramRangeID REF
## <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
## 55 14 106552724 * | ENSG00000211968.2 C
## 15 1 54683925 * | ENSG00000231581.1 G
## 15 1 54685855 * | ENSG00000231581.1 G
## ALT chisq permScore_1 permScore_2 permScore_3 permScore_4
## <CharacterList> <numeric> <numeric> <numeric> <numeric> <numeric>
## 55 T 244.347 0.0501552 0.0619204 1.11877 0.0128366
## 15 A 242.443 3.8005324 0.0717419 0.31365 0.0787366
## 15 A 242.443 3.8005324 0.0717419 0.31365 0.0787366
## permScore_5 permScore_6 snp MAF probeid mindist
## <numeric> <numeric> <character> <numeric> <character> <numeric>
## 55 0.0108659 0.000311471 rs587662269 0.00449438 ENSG00000211968.2 525649
## 15 4.9822362 0.044368761 rs6621 0.12584270 ENSG00000231581.1 6256
## 15 4.9822362 0.044368761 rs33988698 0.12584270 ENSG00000231581.1 4326
## estFDR
## <numeric>
## 55 0
## 15 0
## 15 0
## -------
## seqinfo: 86 sequences from hg19 genome
length(rawEnum)
## [1] 44750
A small quantity of metadata is bound into the resulting GRanges
instance.
names(metadata(rawEnum))
## [1] "sessInfo" "init.Random.seed" "dimSummex"
## [4] "rowRangesSummex" "vcf.tf" "vcfHeader"
## [7] "requestSize" "nRequestsSatisfied" "dimliteGT"
## [10] "theCall" "enumCall" "enumSess"
## [13] "fdrCall"
Pairs meeting MAF and distance conditions are obtained with a filter
setting to the enumerating function.
filtEnum = enumerateByFDR(prst, filtFDR, threshold=.05,
filter=dmfilt)
filtEnum[order(filtEnum$chisq,decreasing=TRUE)[1:3]]
## GRanges object with 3 ranges and 15 metadata columns:
## seqnames ranges strand | paramRangeID REF
## <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
## 15 1 54683925 * | ENSG00000231581.1 G
## 15 1 54685855 * | ENSG00000231581.1 G
## 15 1 54683014 * | ENSG00000231581.1 C
## ALT chisq permScore_1 permScore_2 permScore_3 permScore_4
## <CharacterList> <numeric> <numeric> <numeric> <numeric> <numeric>
## 15 A 242.443 3.80053 0.0717419 0.313650 0.0787366
## 15 A 242.443 3.80053 0.0717419 0.313650 0.0787366
## 15 G 241.973 4.22501 0.0223164 0.209243 0.0274453
## permScore_5 permScore_6 snp MAF probeid mindist
## <numeric> <numeric> <character> <numeric> <character> <numeric>
## 15 4.98224 0.0443688 rs6621 0.125843 ENSG00000231581.1 6256
## 15 4.98224 0.0443688 rs33988698 0.125843 ENSG00000231581.1 4326
## 15 4.32654 0.0510412 rs1410896 0.122472 ENSG00000231581.1 7167
## estFDR
## <numeric>
## 15 0
## 15 0
## 15 0
## -------
## seqinfo: 86 sequences from hg19 genome
length(filtEnum)
## [1] 81837
The yield of an enumeration procedure depends on filtering based on SNP-gene distance and SNP MAF. This can be illustrated as follows, with minimal computational effort owing to the retention of genome-scale permutations and the use of the plug-in FDR algorithm.
data(sensByProbe) # see example(senstab) for construction approach
tab = senstab( sensByProbe )
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
plot(tab)
If we wish to maximize the yield of eQTL enumeration at FDR at most 0.05, we can apply a filter to the store.
flens = storeApply( prst, function(x) {
length(x[ which(x$MAF >= .08 & x$mindist <= 25000), ] )
})
sum(unlist(flens))
## [1] 175988
This is a count of gene-snp pairs satisfying structural and genetic criteria.
In the case of geuFPKM
there is some relevant metadata in the rowRanges
element. We will bind that into the collection of significant findings.
library(geuvPack)
data(geuFPKM)
basic = mcols(rowRanges(geuFPKM))[, c("gene_id", "gene_status", "gene_type",
"gene_name")]
rownames(basic) = basic$gene_id
extr = basic[ filtEnum$probeid, ]
mcols(filtEnum) = cbind(mcols(filtEnum), extr)
stopifnot(all.equal(filtEnum$probeid, filtEnum$gene_id))
filtEnum[1:3]
## GRanges object with 3 ranges and 19 metadata columns:
## seqnames ranges strand | paramRangeID REF
## <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
## 1 1 940005 * | ENSG00000215915.5 A
## 1 1 941539 * | ENSG00000215915.5 C
## 1 1 1357992 * | ENSG00000215915.5 C
## ALT chisq permScore_1 permScore_2 permScore_3 permScore_4
## <CharacterList> <numeric> <numeric> <numeric> <numeric> <numeric>
## 1 G 6.01991 0.1080638 0.000327623 0.0865779 0.0043722
## 1 T 6.41084 0.0936188 0.007244507 0.0756189 0.3935886
## 1 T 6.10357 1.3783709 0.184207535 0.6716683 3.3793913
## permScore_5 permScore_6 snp MAF probeid mindist
## <numeric> <numeric> <character> <numeric> <character> <numeric>
## 1 0.0766232 0.3580752 rs2799056 0.4146067 ENSG00000215915.5 445064
## 1 0.0731399 0.3412637 rs9778087 0.4191011 ENSG00000215915.5 443530
## 1 1.3557961 0.0174709 rs3737716 0.0651685 ENSG00000215915.5 27077
## estFDR gene_id gene_status gene_type gene_name
## <numeric> <character> <character> <character> <character>
## 1 0.0488083 ENSG00000215915.5 KNOWN protein_coding ATAD3C
## 1 0.0314624 ENSG00000215915.5 KNOWN protein_coding ATAD3C
## 1 0.0445583 ENSG00000215915.5 KNOWN protein_coding ATAD3C
## -------
## seqinfo: 86 sequences from hg19 genome
We have a utility to create an annotated Manhattan plot for a search cis to a gene. The basic ingredients are
ciseStore
instance for basic location and association informationFDRsupp
instance that includes the function that maps from association scores to FDR, and the filter employed during FDR estimationhmm878
GRanges instance in gQTLstats/data.It is important to recognize that, given an FDRsupp
instance we can compute the FDR for any association score, but validity of the FDR attribution requires that we refrain from computing it for any locus excluded by filtering. the manhWngr
executes the FDRsupp
-resident filter by default.
data(hmm878)
library(geuvStore2)
prst = makeGeuvStore2()
myg = "ENSG00000183814.10" # LIN9
data(filtFDR)
library(ggplot2)
manhWngr( store = prst, probeid = myg, sym="LIN9",
fdrsupp=filtFDR, namedGR=hmm878 )
## assuming Homo.sapiens for gene models
## 'select()' returned 1:1 mapping between keys and columns
For a dynamic visualization procedure, see the vjcitn/gQTLbrowse github archive.
We can use VariantAnnotation to establish basic structural characteristics for all filtered variants. This code is blocked owing to changes to seqlevelsStyle that need to be reconciled at a deeper level. Please notify stvjc at channing.harvard.edu if there is a need to run this code.
suppressPackageStartupMessages({
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
})
txdb = TxDb.Hsapiens.UCSC.hg19.knownGene
seqlevelsStyle(filtEnum) = "UCSC"
#seqinfo(filtEnum) = seqinfo(txdb)
seqlengths(filtEnum)[paste0("chr", c(1:22,"M"))] =
seqlengths(txdb)[paste0("chr", c(1:22,"M"))]
filtEnum = keepStandardChromosomes(filtEnum)
suppressWarnings({
allv = locateVariants(filtEnum, txdb, AllVariants()) # multiple recs per eQTL
})
table(allv$LOCATION)
hits = findOverlaps( filtEnum, allv )
filtEex = filtEnum[ queryHits(hits) ]
mcols(filtEex) = cbind(mcols(filtEex), mcols(allv[subjectHits(hits)])[,1:7])
filtEex[1:3]
The resulting table is SNP:transcript specific, and will likely need further processing.
The following tasks need to be addressed in the modeling of phenorelevance
We will make a temporary reconstruction of geuvStore2 contents with the enhanced information.
The workhorse function is AllAssoc. The interface is
args(AllAssoc)
## function (summex, vcf.tf, variantRange, rhs = ~1, nperm = 3,
## genome = "hg19", assayind = 1, lbmaf = 1e-06, lbgtf = 1e-06,
## dropUnivHet = TRUE, infoFields = c("LDAF", "SVTYPE"))
## NULL
This differs from cisAssoc through the addition of a variantRange
argument.
The basic operation will be as follows. For a given RangedSummarizedExperiment instance summex
, all features will be tested for association with all SNP in the variantRange
restriction of the VCF identified in vcf.tf
. The basic iteration strategy is
tile the genome to obtain chunks of SNPs
decompose the SE into chunks of transcriptome (or other ’ome)
for each chunk of SNPs, for each chunk of transcriptome, seek associations and retain the top K in a buffering structure
Management of this buffering structure needs work.
require(GenomeInfoDb)
require(geuvPack)
require(Rsamtools)
## Loading required package: Rsamtools
## Loading required package: Biostrings
## Loading required package: XVector
##
## Attaching package: 'Biostrings'
## The following object is masked from 'package:base':
##
## strsplit
data(geuFPKM) # get a ranged summarized expt
lgeu = geuFPKM[ which(seqnames(geuFPKM)=="chr20"), ] # limit to chr20
seqlevelsStyle(lgeu) = "NCBI"
tf20 = TabixFile(system.file("vcf/c20exch.vcf.gz", package="gQTLstats"))
if (require(VariantAnnotation)) scanVcfHeader(tf20)
## Loading required package: VariantAnnotation
##
## Attaching package: 'VariantAnnotation'
## The following object is masked from 'package:base':
##
## tabulate
## class: VCFHeader
## samples(1092): HG00096 HG00097 ... NA20826 NA20828
## meta(2): fileformat reference
## fixed(2): FILTER ALT
## info(22): LDAF AVGPOST ... VT SNPSOURCE
## geno(3): GT DS GL
set.seed(1234)
mysr = GRanges("20", IRanges(33.099e6, 33.52e6))
lita = AllAssoc(geuFPKM[1:10,], tf20, mysr)
## Warning in .local(x, ...): non-diploid variants are set to NA
## checking for universal heterozygous loci for exclusion (as dropUnivHet == TRUE) ...
## done checking.
## Warning in col.summary(gtdata[[1]]): 238 rows were empty - ignored when
## calculating call rates
## Warning in col.summary(gtdata[[1]]): non-diploid variants are set to NA
## Warning in col.summary(gtdata$genotypes): 238 rows were empty - ignored when
## calculating call rates
names(mcols(lita))
## [1] "paramRangeID" "REF"
## [3] "ALT" "ENSG00000152931.6_obs"
## [5] "ENSG00000183696.9_obs" "ENSG00000139269.2_obs"
## [7] "ENSG00000169129.8_obs" "ENSG00000134602.11_obs"
## [9] "ENSG00000136237.12_obs" "ENSG00000259425.1_obs"
## [11] "ENSG00000242284.2_obs" "ENSG00000235027.1_obs"
## [13] "ENSG00000228169.3_obs" "ENSG00000152931.6_permScore_1"
## [15] "ENSG00000183696.9_permScore_1" "ENSG00000139269.2_permScore_1"
## [17] "ENSG00000169129.8_permScore_1" "ENSG00000134602.11_permScore_1"
## [19] "ENSG00000136237.12_permScore_1" "ENSG00000259425.1_permScore_1"
## [21] "ENSG00000242284.2_permScore_1" "ENSG00000235027.1_permScore_1"
## [23] "ENSG00000228169.3_permScore_1" "ENSG00000152931.6_permScore_2"
## [25] "ENSG00000183696.9_permScore_2" "ENSG00000139269.2_permScore_2"
## [27] "ENSG00000169129.8_permScore_2" "ENSG00000134602.11_permScore_2"
## [29] "ENSG00000136237.12_permScore_2" "ENSG00000259425.1_permScore_2"
## [31] "ENSG00000242284.2_permScore_2" "ENSG00000235027.1_permScore_2"
## [33] "ENSG00000228169.3_permScore_2" "ENSG00000152931.6_permScore_3"
## [35] "ENSG00000183696.9_permScore_3" "ENSG00000139269.2_permScore_3"
## [37] "ENSG00000169129.8_permScore_3" "ENSG00000134602.11_permScore_3"
## [39] "ENSG00000136237.12_permScore_3" "ENSG00000259425.1_permScore_3"
## [41] "ENSG00000242284.2_permScore_3" "ENSG00000235027.1_permScore_3"
## [43] "ENSG00000228169.3_permScore_3" "snp"
## [45] "MAF" "z.HWE"
## [47] "probeid"
The trans search for this segment of chr20 proceeds by obtaining additional association scores for additional genes.
litb = AllAssoc(geuFPKM[11:20,], tf20, mysr)
## Warning in .local(x, ...): non-diploid variants are set to NA
## checking for universal heterozygous loci for exclusion (as dropUnivHet == TRUE) ...
## done checking.
## Warning in col.summary(gtdata[[1]]): 238 rows were empty - ignored when
## calculating call rates
## Warning in col.summary(gtdata[[1]]): non-diploid variants are set to NA
## Warning in col.summary(gtdata$genotypes): 238 rows were empty - ignored when
## calculating call rates
litc = AllAssoc(geuFPKM[21:30,], tf20, mysr)
## Warning in .local(x, ...): non-diploid variants are set to NA
## checking for universal heterozygous loci for exclusion (as dropUnivHet == TRUE) ...
## done checking.
## Warning in col.summary(gtdata[[1]]): 238 rows were empty - ignored when
## calculating call rates
## Warning in .local(x, ...): non-diploid variants are set to NA
## Warning in col.summary(gtdata$genotypes): 238 rows were empty - ignored when
## calculating call rates
Now we want to reduce this information by collecting the strongest associations over the 30 genes tested.
buf = gQTLstats:::collapseToBuf(lita, litb, frag="_obs")
buf
## GRanges object with 504 ranges and 7 metadata columns:
## seqnames ranges strand | REF ALT
## <Rle> <IRanges> <Rle> | <DNAStringSet> <CharacterList>
## rs6120668 20 33099793 * | A G
## rs6059887 20 33101102 * | C G
## rs6059890 20 33101653 * | G C
## rs6088514 20 33102835 * | A G
## rs6058070 20 33103521 * | G C
## ... ... ... ... . ... ...
## rs4911451 20 33512466 * | T G
## rs6088650 20 33514465 * | T C
## rs725521 20 33516071 * | T C
## rs1801310 20 33517014 * | A G
## rs6087651 20 33518353 * | C T
## snp MAF z.HWE scorebuf
## <character> <numeric> <numeric> <matrix>
## rs6120668 rs6120668 0.4371585 -1.209946 2.89733:2.40489:2.29748:...
## rs6059887 rs6059887 0.4371585 -1.209946 2.89733:2.40489:2.29748:...
## rs6059890 rs6059890 0.4371585 -1.209946 2.89733:2.40489:2.29748:...
## rs6088514 rs6088514 0.0655738 -0.257106 3.31124:3.02917:2.96208:...
## rs6058070 rs6058070 0.4344262 -1.342787 3.12023:2.43881:2.36666:...
## ... ... ... ... ...
## rs4911451 rs4911451 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs6088650 rs6088650 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs725521 rs725521 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs1801310 rs1801310 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs6087651 rs6087651 0.407104 0.100832 6.54073:4.0809:3.05186:...
## elnames
## <matrix>
## rs6120668 ENSG00000242284.2:ENSG00000259425.1:ENSG00000247157.2:...
## rs6059887 ENSG00000242284.2:ENSG00000259425.1:ENSG00000247157.2:...
## rs6059890 ENSG00000242284.2:ENSG00000259425.1:ENSG00000247157.2:...
## rs6088514 ENSG00000247157.2:ENSG00000183696.9:ENSG00000136237.12:...
## rs6058070 ENSG00000242284.2:ENSG00000205981.2:ENSG00000247157.2:...
## ... ...
## rs4911451 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs6088650 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs725521 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs1801310 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs6087651 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## -------
## seqinfo: 1 sequence from hg19 genome; no seqlengths
buf = gQTLstats:::collapseToBuf(buf, litc, frag="_obs")
buf
## GRanges object with 504 ranges and 7 metadata columns:
## seqnames ranges strand | REF ALT
## <Rle> <IRanges> <Rle> | <DNAStringSet> <CharacterList>
## rs6120668 20 33099793 * | A G
## rs6059887 20 33101102 * | C G
## rs6059890 20 33101653 * | G C
## rs6088514 20 33102835 * | A G
## rs6058070 20 33103521 * | G C
## ... ... ... ... . ... ...
## rs4911451 20 33512466 * | T G
## rs6088650 20 33514465 * | T C
## rs725521 20 33516071 * | T C
## rs1801310 20 33517014 * | A G
## rs6087651 20 33518353 * | C T
## snp MAF z.HWE scorebuf
## <character> <numeric> <numeric> <matrix>
## rs6120668 rs6120668 0.4371585 -1.209946 2.91722:2.89733:2.40489:...
## rs6059887 rs6059887 0.4371585 -1.209946 2.91722:2.89733:2.40489:...
## rs6059890 rs6059890 0.4371585 -1.209946 2.91722:2.89733:2.40489:...
## rs6088514 rs6088514 0.0655738 -0.257106 3.57588:3.31124:3.02917:...
## rs6058070 rs6058070 0.4344262 -1.342787 3.12023:2.43881:2.36666:...
## ... ... ... ... ...
## rs4911451 rs4911451 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs6088650 rs6088650 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs725521 rs725521 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs1801310 rs1801310 0.407104 0.100832 6.54073:4.0809:3.05186:...
## rs6087651 rs6087651 0.407104 0.100832 6.54073:4.0809:3.05186:...
## elnames
## <matrix>
## rs6120668 ENSG00000198632.7:ENSG00000242284.2:ENSG00000259425.1:...
## rs6059887 ENSG00000198632.7:ENSG00000242284.2:ENSG00000259425.1:...
## rs6059890 ENSG00000198632.7:ENSG00000242284.2:ENSG00000259425.1:...
## rs6088514 ENSG00000017260.13:ENSG00000247157.2:ENSG00000183696.9:...
## rs6058070 ENSG00000242284.2:ENSG00000205981.2:ENSG00000247157.2:...
## ... ...
## rs4911451 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs6088650 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs725521 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs1801310 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## rs6087651 ENSG00000152931.6:ENSG00000158482.8:ENSG00000259425.1:...
## -------
## seqinfo: 1 sequence from hg19 genome; no seqlengths
Let’s do the same buffering process for the first permutation.
pbuf = gQTLstats:::collapseToBuf(lita, litb, frag="_permScore_1")
pbuf = gQTLstats:::collapseToBuf(pbuf, litc, frag="_permScore_1")
pbuf
## GRanges object with 504 ranges and 7 metadata columns:
## seqnames ranges strand | REF ALT
## <Rle> <IRanges> <Rle> | <DNAStringSet> <CharacterList>
## rs6120668 20 33099793 * | A G
## rs6059887 20 33101102 * | C G
## rs6059890 20 33101653 * | G C
## rs6088514 20 33102835 * | A G
## rs6058070 20 33103521 * | G C
## ... ... ... ... . ... ...
## rs4911451 20 33512466 * | T G
## rs6088650 20 33514465 * | T C
## rs725521 20 33516071 * | T C
## rs1801310 20 33517014 * | A G
## rs6087651 20 33518353 * | C T
## snp MAF z.HWE scorebuf
## <character> <numeric> <numeric> <matrix>
## rs6120668 rs6120668 0.4371585 -1.209946 4.59945:4.06076:3.83062:...
## rs6059887 rs6059887 0.4371585 -1.209946 4.59945:4.06076:3.83062:...
## rs6059890 rs6059890 0.4371585 -1.209946 4.59945:4.06076:3.83062:...
## rs6088514 rs6088514 0.0655738 -0.257106 2.50149:2.39476:2.03576:...
## rs6058070 rs6058070 0.4344262 -1.342787 4.72893:3.96870:3.87404:...
## ... ... ... ... ...
## rs4911451 rs4911451 0.407104 0.100832 2.78448:2.65189:2.26759:...
## rs6088650 rs6088650 0.407104 0.100832 2.78448:2.65189:2.26759:...
## rs725521 rs725521 0.407104 0.100832 2.78448:2.65189:2.26759:...
## rs1801310 rs1801310 0.407104 0.100832 2.78448:2.65189:2.26759:...
## rs6087651 rs6087651 0.407104 0.100832 2.78448:2.65189:2.26759:...
## elnames
## <matrix>
## rs6120668 ENSG00000235027.1:ENSG00000257773.1:ENSG00000249263.2:...
## rs6059887 ENSG00000235027.1:ENSG00000257773.1:ENSG00000249263.2:...
## rs6059890 ENSG00000235027.1:ENSG00000257773.1:ENSG00000249263.2:...
## rs6088514 ENSG00000147164.6:ENSG00000205981.2:ENSG00000228169.3:...
## rs6058070 ENSG00000235027.1:ENSG00000249263.2:ENSG00000257773.1:...
## ... ...
## rs4911451 ENSG00000163207.5:ENSG00000235027.1:ENSG00000183696.9:...
## rs6088650 ENSG00000163207.5:ENSG00000235027.1:ENSG00000183696.9:...
## rs725521 ENSG00000163207.5:ENSG00000235027.1:ENSG00000183696.9:...
## rs1801310 ENSG00000163207.5:ENSG00000235027.1:ENSG00000183696.9:...
## rs6087651 ENSG00000163207.5:ENSG00000235027.1:ENSG00000183696.9:...
## -------
## seqinfo: 1 sequence from hg19 genome; no seqlengths
We can compare the distributions of maximal association per SNP as observed or under permutation.
plot(density(buf$scorebuf[,1]))
lines(density(pbuf$scorebuf[,1]), lty=2)