I’ve started looking at the data from https://www.nature.com/articles/s41588-019-0505-9#Sec1 (Calderon et al., 2019). Specifically, dataset3 which contains significant stimulation-associated ATAC-seq peaks in a variety of immune cells (\(FDR<0.01\)).
I’ve mapped the SNPs from the Robertson 2020 T1D GWAS (https://www.biorxiv.org/content/10.1101/2020.06.19.158071v1) to any significant stimulation-associated peaks. I use the \(p\)-value of association from the meta-analysis of case-control and family-based estimates.
For example, in Robertson et al., (2020) they suggest that rs72928038 is the causal variant in the \(\textit{BACH2}\) gene region (although they incorrectly refer to this SNP as rs72938038 in the manuscript abstract). Our corrcoverage method also picked this SNP out as the CV in this region.
Across all the immune cells, this SNP only falls in a significant stimulation-associated ATAC-seq peak in effector CD4+ T cells. The logFC=-1.67756
value indicates that the peak is greater in unstimulated cells than stimulated cells.
## [1] "Effector_CD4pos_T_S-Effector_CD4pos_T_U"
## MarkerName ichipreg chromosome position ID alleleA alleleB
## 1: chr6:90267049:G:A yes 6 90267049 rs72928038 G A
## Effect_EUR SE_EUR AF_EUR EffectphaseIandII SEphaseIandII pphaseIandII
## 1: 0.172309 0.0202005 0.1731371 0.1635 0.0168 2.529e-22
## hg19_pos logFC AveExpr t P.Value adj.P.Val B
## 1: 90976768 -1.67756 1.500223 -3.895642 0.0001394576 0.004728978 0.4942096
## peak_id contrast start end chr
## 1: 54569 4 90975833 90977316 chr6
I look at the correlation between logFC values across conditions for the T1D SNPs (\(q\); if the SNP doesn’t fall in a significant peak then it is given a logFC
value of 0). We see B cells clustering, various T cell subtypes clustering and monocytes and mature NK cells seperating.
qs_logFC <- lapply(res, function(x){
x$logFC[is.na(x$logFC)] <- 0
x$logFC
})
library(pheatmap)
pheatmap(cor(do.call(cbind, qs_logFC)), display_numbers = TRUE)
An idea would be to use the T1D GWAS \(p\)-values as \(p\) in cFDR and leverage the logFC
values. However, very few of the T1D SNPs fall into significiant stimulation-associated ATAC peaks.
The proportion of SNPs that do not fall into a significant ATAC-peak in each condition is shown below:
lapply(qs_logFC, function(x) length(which(x==0))/length(x))
## $`Bulk_B_S-Bulk_B_U`
## [1] 0.9972784
##
## $`CD8pos_T_S-CD8pos_T_U`
## [1] 0.9901304
##
## $`Central_memory_CD8pos_T_S-Central_memory_CD8pos_T_U`
## [1] 0.9881444
##
## $`Effector_CD4pos_T_S-Effector_CD4pos_T_U`
## [1] 0.9816202
##
## $`Effector_memory_CD8pos_T_S-Effector_memory_CD8pos_T_U`
## [1] 0.9856424
##
## $`Follicular_T_Helper_S-Follicular_T_Helper_U`
## [1] 0.9825516
##
## $`Gamma_delta_T_S-Gamma_delta_T_U`
## [1] 0.9940589
##
## $`Mature_NK_S-Mature_NK_U`
## [1] 0.9994434
##
## $`Mem_B_S-Mem_B_U`
## [1] 0.9979385
##
## $`Memory_Teffs_S-Memory_Teffs_U`
## [1] 0.977535
##
## $`Memory_Tregs_S-Memory_Tregs_U`
## [1] 0.9948589
##
## $`Monocytes_S-Monocytes_U`
## [1] 0.9999413
##
## $`Naive_B_S-Naive_B_U`
## [1] 0.9957638
##
## $`Naive_CD8_T_S-Naive_CD8_T_U`
## [1] 0.9961596
##
## $`Naive_Teffs_S-Naive_Teffs_U`
## [1] 0.9825278
##
## $`Naive_Tregs_S-Naive_Tregs_U`
## [1] 0.9998867
##
## $`Regulatory_T_S-Regulatory_T_U`
## [1] 0.9920968
##
## $`Th1_precursors_S-Th1_precursors_U`
## [1] 0.9776049
##
## $`Th17_precursors_S-Th17_precursors_U`
## [1] 0.9816565
##
## $`Th2_precursors_S-Th2_precursors_U`
## [1] 0.9810048
I think the data is too sparse to use cFDR in this way.
Instead, I investigate whether this data can be useful to prioritise fine-mapped SNPs (from GUESSFM in immunochip regions).
I see whether any fine-mapped SNPs fall into any of the significant stimulation-associated peaks. 258 out of the 3612 SNPs fall into such peaks. Comparing the PPs for those SNPs that fall into these peaks and those that don’t, we see that there isn’t much difference.
## Warning: NAs introduced by coercion
## [1] 1.751487e-02 4.463159e-03 1.032981e-02 1.032981e-02 1.034641e-03
## [6] 9.777626e-04 9.659399e-04 9.291001e-04 4.831635e-03 2.979361e-03
## [11] 4.992721e-03 3.191598e-02 5.527923e-03 5.254263e-03 1.125276e-02
## [16] 6.335970e-03 1.610099e-03 5.075188e-03 3.278932e-01 4.719541e-02
## [21] 3.054125e-04 3.261938e-04 2.531616e-04 1.354480e-02 7.103902e-04
## [26] 2.471800e-04 6.892095e-03 2.800839e-02 4.981114e-02 3.506642e-02
## [31] 5.749459e-03 7.183161e-03 3.472816e-03 6.128756e-03 7.117475e-03
## [36] 4.605429e-03 6.682921e-03 4.582785e-03 6.682554e-03 4.568754e-03
## [41] NA 4.464318e-03 3.064808e-03 1.479767e-04 1.576589e-03
## [46] 1.661200e-03 1.982979e-03 1.902357e-03 2.802051e-03 2.515725e-03
## [51] 2.565474e-03 2.568997e-03 2.944060e-03 3.994876e-03 6.514253e-03
## [56] 6.977154e-03 4.128837e-03 9.594409e-02 9.594409e-02 1.747487e-02
## [61] 1.094140e-02 7.181717e-01 1.421529e-01 1.921072e-01 NA
## [66] NA 3.159697e-03 4.184411e-04 4.613359e-02 4.613359e-02
## [71] 2.406655e-02 5.160325e-02 3.181202e-05 3.261307e-05 9.823060e-04
## [76] 7.711309e-03 1.378706e-02 3.452535e-03 7.276422e-03 7.652751e-03
## [81] 3.164903e-03 4.522758e-03 4.424692e-03 1.923178e-03 1.162119e-03
## [86] 2.044447e-03 2.267487e-03 6.842337e-04 1.834072e-03 5.758467e-03
## [91] 6.515168e-03 4.322066e-03 2.805850e-02 5.407387e-02 5.537699e-02
## [96] 2.049351e-03 1.723237e-03 1.996392e-03 1.617782e-02 5.614851e-04
## [101] 4.564098e-04 4.114185e-04 7.172281e-04 2.128756e-04 3.196552e-04
## [106] 3.197025e-04 2.397953e-04 4.943888e-04 1.027246e-02 1.025522e-02
## [111] 9.864498e-02 2.846133e-03 2.533766e-03 5.879079e-03 2.587420e-03
## [116] 9.107300e-02 6.851087e-02 8.194620e-02 8.496321e-04 8.312506e-04
## [121] 7.226226e-03 8.124640e-03 8.012775e-03 8.603975e-03 7.751372e-03
## [126] NA 2.249115e-02 7.424428e-01 1.411434e-02 2.064321e-02
## [131] 2.575877e-01 6.053995e-03 2.853635e-01 4.514695e-03 3.925788e-03
## [136] 8.165681e-03 5.130291e-03 3.914713e-03 2.164883e-02 6.391962e-03
## [141] 3.278436e-03 1.105648e-02 5.735602e-04 5.856577e-04 7.215018e-02
## [146] 7.215018e-02 6.904728e-02 6.131015e-04 6.880149e-02 6.093419e-04
## [151] 1.021231e-02 3.145773e-03 1.489722e-01 4.318087e-02 5.866830e-04
## [156] 1.283955e-03 1.048954e-02 1.926414e-02 5.127204e-01 7.506242e-03
## [161] 6.462077e-03 6.791220e-03 7.731364e-03 2.118929e-02 1.459075e-02
## [166] 1.797221e-03 1.755946e-02 1.658057e-02 1.950617e-02 6.057796e-02
## [171] 6.216390e-02 5.627531e-05 3.363373e-04 3.439321e-04 3.637896e-04
## [176] 3.362721e-04 3.385012e-04 3.546080e-04 3.546080e-04 3.794357e-04
## [181] 3.362721e-04 8.358353e-05 5.393640e-02 1.123277e-02 6.489534e-01
## [186] 9.742872e-03 2.878103e-04 9.872701e-02 3.042943e-04 3.479917e-02
## [191] 3.344597e-02 3.346390e-02 3.345903e-02 3.345903e-02 3.575665e-02
## [196] 2.137051e-02 3.210469e-03 1.480818e-03 1.535025e-03 1.525499e-03
## [201] 1.035605e-03 7.358710e-04 1.538794e-04 1.369361e-02 2.360346e-03
## [206] 3.213706e-02 6.176134e-03 5.795382e-04 4.782437e-03 6.188101e-03
## [211] 5.391730e-03 5.565050e-03 1.428015e-02 1.275601e-02 2.320828e-02
## [216] 5.027115e-02 8.796092e-03 5.341985e-03 7.027657e-04 6.945212e-02
## [221] 3.529393e-04 2.265702e-03 4.862118e-04 5.310963e-01 2.660564e-02
## [226] 5.864262e-04 3.702085e-03 1.953951e-01 1.799459e-02 1.040603e-02
## [231] 1.042768e-02 1.650826e-01 5.522610e-02 7.921473e-05 1.659734e-03
## [236] 1.437380e-03 7.510794e-02 1.347827e-01 6.264266e-04 2.315042e-03
## [241] 3.811841e-04 1.279520e-03 1.481390e-03 1.713126e-03 5.407633e-02
## [246] 5.346586e-02 4.873717e-02 2.194690e-02 3.217126e-02 4.169222e-02
## [251] 5.487649e-03 1.056871e-02 9.100420e-03 4.172347e-03 3.259407e-03
## [256] 2.199341e-04 3.430155e-04 3.702522e-02
## Warning in na.omit(as.numeric(fm_res[match(names(final_out), fm_res$Rsid), : NAs
## introduced by coercion
## Warning in na.omit(as.numeric(fm_res[-match(names(final_out), fm_res$Rsid), :
## NAs introduced by coercion
Go through the fine-mapping results matched to ATAC-peaks and see if there is anything interesting.
IKZF1 CHiP-Seq data: