I plot the q-values (FDR-adjusted \(p\)-values) corresponding to the genome-wide significance \(p\)-value threshold of \(5\times 10^{-8}\). From this, I decide to use an FDR threshold of \(5\times 10^{-6}\) to call significant associations.
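A minimal sketch of this calculation for one simulation (object names are illustrative):

```r
# p: raw p-values from one simulation
q <- p.adjust(p, method = "BH")   # q-values (BH-adjusted p-values)

# q-value corresponding to the genome-wide significance threshold
q_at_gw <- max(q[p <= 5e-8])
```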
For each simulation, I calculate the sensitivity (the proportion of truly associated SNPs that are called significant) and the specificity (the proportion of non-associated SNPs that are not called significant), where “significant” means \(p_{BH}\leq 5\times 10^{-6}\), “associated” means \(r^2>0.8\) with a causal variant (CV) and “not associated” means \(r^2<0.2\) with all CVs. To check whether the FDR is actually controlled, I compare the specificity to one minus the FDR threshold (\(1-5\times 10^{-6}\)). But it seems that none of the methods are controlling the FDR…
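A minimal sketch of this calculation for a single simulation (function and object names are illustrative):

```r
# p_bh: BH-adjusted p-values; r2_cv: each SNP's maximum r^2 with any causal
# variant (CV); SNPs with 0.2 <= r2_cv <= 0.8 fall into neither group
sens_spec <- function(p_bh, r2_cv, fdr_thr = 5e-6) {
  sig <- p_bh <= fdr_thr
  assoc <- r2_cv > 0.8       # "associated": r^2 > 0.8 with a CV
  not_assoc <- r2_cv < 0.2   # "not associated": r^2 < 0.2 with all CVs
  c(sensitivity = mean(sig[assoc]),
    specificity = mean(!sig[not_assoc]))
}
```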
We first sought to investigate how our method compared to that described in Liley and Wallace (2020), which we call “empirical cFDR”. When leveraging independent auxiliary data, we found that the performance of our method was comparable to that of empirical cFDR in terms of sensitivity and specificity. For simulations representing iteratively leveraging \(p\)-values from related traits, we found that across iterations the empirical cFDR method had greater sensitivity, at the expense of losing specificity and therefore FDR control. On the other hand, for functional cFDR the sensitivity steadily increased across iterations, with a much less severe drop in specificity in later iterations. Because empirical cFDR requires a leave-one-out procedure that functional cFDR does not (empirical CDFs are more heavily influenced by correlated values than KDEs), our method is also faster: a single simulation leveraging auxiliary data for 80,000 SNPs takes approximately 2 minutes, as opposed to 5 minutes for empirical cFDR (using one core of an Intel Xeon E5-2670 processor running at 2.6 GHz).
The results when using functional cFDR with auxiliary data in the \([0,1]\) range are similar to those obtained with the original empirical method. However, the original empirical method does not support auxiliary data outside of this range, whilst functional cFDR can leverage auxiliary data from any arbitrary distribution. When iterating over independent auxiliary data sampled from a mixture normal distribution, the sensitivity and specificity of the resultant v-values remain constant, showing that the functional cFDR method does not systematically adjust the test statistics when the auxiliary data being leveraged is irrelevant.
When leveraging relevant auxiliary data sampled from a mixture normal distribution, the test statistics for associated SNPs are systematically re-weighted, shown by an increase in sensitivity without a loss in specificity.
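A minimal sketch of how such auxiliary data can be simulated (the mixture components, proportions and shift used here are illustrative, not the values used in the simulations):

```r
set.seed(1)
n <- 80000
assoc <- runif(n) < 0.01  # hypothetical indicator of truly associated SNPs

# "irrelevant" auxiliary data: the same mixture normal for every SNP,
# regardless of association status
q_irrelevant <- rnorm(n, mean = 0, sd = ifelse(runif(n) < 0.5, 1, 3))

# "relevant" auxiliary data: associated SNPs are drawn from a shifted
# component, so low values of q are enriched for true signals
q_relevant <- ifelse(assoc,
                     rnorm(n, mean = -2, sd = 1),
                     rnorm(n, mean = 0, sd = ifelse(runif(n) < 0.5, 1, 3)))
```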
We next sought to investigate the impact of iterating over auxiliary data sampled from the same distribution, so that the auxiliary data vectors we iterate over are correlated with one another (\(r_{Spearman}\approx 0.2\)). The purpose of these results is to instill caution in users: they should only iterate over auxiliary data which they believe captures different information. When iterating over the same auxiliary data, the sensitivity increases drastically, but at the expense of a huge drop in specificity.
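A quick sanity check before iterating is to inspect the rank correlation between the auxiliary data vectors (`q1` and `q2` are illustrative names):

```r
# a high Spearman correlation suggests the two vectors capture
# overlapping rather than different information
cor(q1, q2, method = "spearman")
```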
[Note - I need to run this again using the final subset of SNPs]
Functional cFDR can be used to re-weight GWAS \(p\)-values by leveraging scores measuring SNP functionality. For example, GenoCanyon scores, which represent the union of functional elements in different cell types, are readily available for millions of SNPs. Leveraging the GenoCanyon score for each SNP in our asthma data set, we compare the re-weighted \(p\)-values from functional cFDR with the posterior probabilities obtained from GenoWAP.
The results when applying GenoWAP and functional cFDR (both leveraging GenoCanyon scores) are largely comparable, and are validated in the larger UKBB data set. We ranked 1,978,017 SNPs in the Demenais study based on their \(p\)-values (Demenais), v-values (functional cFDR) and posterior scores (GenoWAP). Then, for each of the 4,247 SNPs that passed genome-wide significance in the UKBB data set, we compared the rank of the unweighted \(p\)-value (Demenais) with (1) the rank of the (negative) posterior score from GenoWAP and (2) the rank of the v-value from functional cFDR. 60.7% of these SNPs had an equal or improved rank after applying GenoWAP, compared with 64.7% after applying functional cFDR, and 82.8% of the SNPs that had an improved or equal rank were shared between the two methods. Similarly, for the 1,964,405 SNPs that were not genome-wide significant in UKBB, 44.8% had a decreased or equal rank after applying GenoWAP, compared with 48.3% when applying functional cFDR (and **% of these SNPs were shared by both methods).
A total of 297 loci passed the genome-wide significance level in the UKBB validation data set. Within each of the 297 loci, we identified the SNP with the lowest UKBB \(p\)-value (the “lead SNP”), and compared the rank of its unweighted \(p\)-value (Demenais) with (1) the rank of its (negative) posterior score from GenoWAP and (2) the rank of its v-value from functional cFDR. 70.4% of lead SNPs had an improved or equal rank after applying GenoWAP, compared with 73.7% when applying functional cFDR. 86.9% (199/229) of the loci that had an improved or equal rank were shared between the two methods.
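A minimal sketch of the SNP-level rank comparison (object names are illustrative; the locus-level comparison is the same restricted to lead SNPs):

```r
# p_demenais: unweighted GWAS p-values; v: v-values from functional cFDR;
# post: GenoWAP posterior scores; p_ukbb: p-values in the UKBB validation set
rank_p    <- rank(p_demenais)
rank_v    <- rank(v)
rank_gwap <- rank(-post)        # negate: larger posterior = stronger signal

gw_sig <- p_ukbb <= 5e-8        # SNPs genome-wide significant in UKBB

# proportion with an improved or equal rank after re-weighting
mean(rank_gwap[gw_sig] <= rank_p[gw_sig])  # GenoWAP
mean(rank_v[gw_sig]    <= rank_p[gw_sig])  # functional cFDR
```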
Ideally, we would compare our results to those from FINDOR, but this means that we should leverage the same functional data. In the FINDOR analysis, I leverage data for the 96 annotations in the baseline-LD v2.2 model. However, I don’t want to iterate 96 times, and some of the annotations are not functional (LD-based and MAF-bin annotations). I remove these non-functional annotations and perform PCA (hoping that the non-monotonicity problem goes away).
I check a few methods:
1. Run PCAmix on the functional subset of annotations (77 of these), then iterate over the 5 dimensions with the largest t-statistics when regressing the co-ordinates on the original \(p\)-values (see the sketch after this list). Dimension 1 is first, which is monotonic, then dimension 9, which is not monotonic.
2. Run PCAmix on the functional subset of annotations (77 of these), then on each iteration use regression to pick out the dimension to iterate over. Dimension 1 is first, which is monotonic, then dimension 4, which is not monotonic.
3. Run sparse group lasso on the functional subset of annotations and iterate over the annotations picked out. (The trade-off between the two types of sparsity is controlled by the alpha parameter: alpha = 1 is the lasso fit, so coefficients are shrunk to 0 regardless of groups; alpha = 0 means that all elements in a group have null or non-null coefficients.) But the annotations that are picked out are all binary, so this highlights binary cFDR rather than our functional cFDR method.
4. Run the lasso to pick out significant annotations and then use PCAmix. But using `lambda.1se` doesn’t pick out any annotations, whilst `lambda.min` picks out 54. Only 1 dimension is significant (dimension 48) when regressing the lasso PCAmix dimensions on the \(\chi^2\) values.
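A minimal sketch of the PCAmix-plus-regression dimension selection (method 1, with the regression direction of method 1 swapped for the log(p)-on-co-ordinates version mentioned below) and the lasso annotation selection (method 4), assuming `X` is an all-numeric data frame of the 77 functional annotations, `p` the GWAS \(p\)-values and `chisq` the corresponding \(\chi^2\) statistics:

```r
library(PCAmixdata)  # PCAmix: PCA for mixed quantitative/qualitative data
library(glmnet)      # cv.glmnet: cross-validated lasso

# PCA of the annotations (treated as quantitative here for simplicity;
# categorical annotations would go in X.quali)
pca <- PCAmix(X.quanti = X, ndim = 20, graph = FALSE)
coords <- pca$ind$coord

# pick the 5 dimensions with the largest absolute t-statistics when
# regressing log(p) on the co-ordinates
fit <- lm(log(p) ~ coords)
tstats <- summary(fit)$coefficients[-1, "t value"]
top5 <- order(abs(tstats), decreasing = TRUE)[1:5]

# lasso selection of annotations (method 4): lambda.1se vs lambda.min
cvfit <- cv.glmnet(as.matrix(X), chisq)
b_1se <- as.matrix(coef(cvfit, s = "lambda.1se"))[-1, 1]
b_min <- as.matrix(coef(cvfit, s = "lambda.min"))[-1, 1]
sel_1se <- which(b_1se != 0)   # annotations selected at lambda.1se
sel_min <- which(b_min != 0)   # annotations selected at lambda.min
```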
Comments and queries
Asthma application - leaning towards just using our monotonicity hack (pick out 5 dimensions by regressing log(p) on the co-ordinates and then use the monotonicity hack if required). E.g. method 1, which picks out dimensions 1, 9, 19, 6 and 3, accounts for 20% of the variability.
Should I amend the functional cFDR functions to output BH-adjusted v-values rather than raw v-values?
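(For reference, this would just be a call to `p.adjust`, assuming `v` holds the raw v-values:)

```r
v_bh <- p.adjust(v, method = "BH")  # BH-adjusted v-values
```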
Binary cFDR.
I think I’m going to go with “flexible cFDR” or “extended cFDR”.