GoShifter (Genomic Annotation Shifter) uses a circularised permutation method to test for functional enrichment of GWAS variants. Unlike SNP-matching methods, which require the matching (confounding) parameters to be specified in advance, GoShifter does not require any prior knowledge of the confounding factors, because the null distribution is derived within the tested loci themselves.

The paper discusses two specific types of confounding that the authors claim the method accounts for:

  1. Trait-associated SNPs often map to regions with greater gene density, genetic variation and LD than the rest of the genome.

  2. Functional annotations that colocalise are often enriched within trait-associated loci (e.g. DHSs colocalise with exons). An annotation could therefore be labelled as enriched only because it colocalises with another annotation that is actually enriched.


Method

  1. Derive set of potentially functional variants (index SNPs + those with \(r^2>0.8\) in 1000 Genomes Project European samples).

  2. Define each locus as the region between the furthest linked SNPs, extended by twice the median size of the tested annotation (X). The extension ensures the locus is large enough to test the significance of an overlap even when the index variant has no other variants in linkage.

  3. Quantify the proportion of loci in which at least one SNP in LD overlapped X.

  4. Circularise each locus and randomly shift the X sites within it while keeping the SNP locations fixed, then quantify the proportion of loci overlapping X. Repeat this many times to generate a null distribution.

  5. Compute the P value as the proportion of iterations in which the number of overlapping loci is equal to or greater than the number observed for the tested SNPs.
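
Steps 3 to 5 above can be sketched in a few lines of Python. This is a minimal illustration, not the GoShifter implementation: the locus representation (a dict with `start`, `end`, `snps` and `annot` keys), the half-open integer intervals, and all function names are assumptions made for the example. Annotation intervals are assumed to be already clipped to the locus boundaries.

```python
import random

def overlaps(snps, intervals):
    """True if any SNP position falls inside any half-open interval [start, end)."""
    return any(s <= p < e for p in snps for s, e in intervals)

def shift_circular(intervals, start, end, offset):
    """Shift intervals by `offset` inside the locus [start, end), wrapping around its end."""
    length = end - start
    shifted = []
    for s, e in intervals:
        s2 = start + (s - start + offset) % length
        e2 = s2 + (e - s)
        if e2 <= end:
            shifted.append((s2, e2))
        else:
            # The interval runs past the locus end: wrap the remainder to the start.
            shifted.append((s2, end))
            shifted.append((start, start + (e2 - end)))
    return shifted

def shifting_p_value(loci, n_iter=1000, seed=0):
    """Return (observed overlapping loci, P value) using circular shifts within each locus."""
    rng = random.Random(seed)
    observed = sum(overlaps(l["snps"], l["annot"]) for l in loci)
    hits = 0
    for _ in range(n_iter):
        count = 0
        for l in loci:
            # SNPs stay fixed; only the annotation is shifted by a random offset.
            offset = rng.randrange(l["end"] - l["start"])
            shifted = shift_circular(l["annot"], l["start"], l["end"], offset)
            count += overlaps(l["snps"], shifted)
        # Step 5: count iterations at least as extreme as the observed overlap.
        hits += count >= observed
    return observed, hits / n_iter
```

Counting overlapping loci rather than the proportion gives the same P value, since the number of loci is constant across iterations.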


The method is extended for “stratified enrichment of an association” whereby enrichment of an annotation (X) is calculated whilst controlling for a potentially colocalising second annotation (Y).

  1. Fragment each locus on the basis of the presence of Y while fixing the relative positions of the SNPs and annotation X (splitting X annotations if they partially overlap Y).

  2. Concatenate these fragments (preserving the relationships and relative positions among X, Y and the SNPs in the locus in both segments).

  3. To generate the stratified null distribution, circularise and randomly shift X within the two segments (overlapping Y and not overlapping Y) independently and quantify the proportion of loci that had at least one SNP that overlapped X in either region.

  4. Define the P value of the enrichment as the proportion of iterations in which the number of loci with SNPs overlapping X exceeded the number of loci overlapping X prior to shifting.
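
The stratified procedure can be sketched in the same spirit. The example below is a simplified illustration under stated assumptions, not the authors' code: each locus is represented as three hypothetical per-base 0/1 lists (`x_mask`, `y_mask`, `snp_mask`) of equal length, and the function names are invented for the example. Per-base masks are slow for real loci but make the fragment-and-concatenate step easy to see.

```python
import random

def rotate(values, k):
    """Circularly shift a list to the right by k positions."""
    if not values:
        return values
    k %= len(values)
    return values[-k:] + values[:-k]

def split_by_y(x_mask, y_mask, snp_mask):
    """Concatenate bases inside Y and bases outside Y into two segments.

    Each segment is a pair (x values, snp flags) kept in genomic order,
    so relative positions of X and the SNPs within a segment are preserved.
    """
    segments = {True: ([], []), False: ([], [])}
    for x, y, s in zip(x_mask, y_mask, snp_mask):
        xs, ss = segments[bool(y)]
        xs.append(x)
        ss.append(s)
    return segments[True], segments[False]

def locus_hit(segments, rng=None):
    """True if any SNP overlaps X in either segment; shift X first if rng is given."""
    for xs, ss in segments:
        if rng is not None and xs:
            # X is shifted independently within each segment; SNPs stay fixed.
            xs = rotate(xs, rng.randrange(len(xs)))
        if any(x and s for x, s in zip(xs, ss)):
            return True
    return False

def stratified_p_value(loci, n_iter=1000, seed=0):
    """Return (observed overlapping loci, P value) for X stratified on Y."""
    rng = random.Random(seed)
    split = [split_by_y(*locus) for locus in loci]
    observed = sum(locus_hit(s) for s in split)
    # Strict ">" follows the wording of step 4 ("exceeded").
    exceed = sum(
        sum(locus_hit(s, rng) for s in split) > observed
        for _ in range(n_iter)
    )
    return observed, exceed / n_iter
```

Because X is only ever shifted within its own segment, an X site that lies entirely inside Y can never land on a SNP outside Y, which is what removes the contribution of colocalisation with Y from the null distribution.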


Other info