Multiple causal variants

This describes deprecated functions - the SuSiE approach is more accurate and should be used instead

We load some simulated data.

## This is a new update to coloc.
data(coloc_test_data)
attach(coloc_test_data) # contains D3, D4 that we will use in this vignette

First, let us do a standard coloc (single causal variant) analysis to serve as a baseline comparison. The analysis concludes there is colocalisation, because it “sees” the SNPs on the left which are strongly associated with both traits. But it misses the SNPs on the right of the top left plot which are associated with only one trait.

library(coloc)
my.res <- coloc.abf(dataset1=D3, dataset2=D4)
## PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
##  1.39e-15  2.84e-07  5.35e-12  9.76e-05  1.00e+00 
## [1] "PP abf for shared variant: 100%"
class(my.res)
## [1] "coloc_abf" "list"
## print.coloc_abf
my.res
## Coloc analysis of trait 1, trait 2
## 
## SNP Priors
##    p1    p2   p12 
## 1e-04 1e-04 1e-05
## 
## Hypothesis Priors
##         H0    H1    H2       H3    H4
##  0.9894755 0.005 0.005 2.45e-05 5e-04
## 
## Posterior
##        nsnps           H0           H1           H2           H3           H4 
## 5.000000e+01 1.386329e-15 2.843375e-07 5.351202e-12 9.763482e-05 9.999021e-01
sensitivity(my.res,"H4 > 0.9")
## Results pass decision rule H4 > 0.9

Even though the sensitivity analysis itself looks good, the Manhattan plots suggest we are violating the assumption of a single causal variant per trait.
We can use =finemap.signals= to test whether there are additional signals after conditioning.

finemap.signals(D3,method="cond")
##       s5      s38 
## 8.268599 6.621488
finemap.signals(D4,method="cond")
##       s5 
## 6.557987

Note that every colocalisation conditions out every other signal except one for each trait. For that reason, trying to colocalise many signals per trait is not recommended. Instead, use pthr to set the significance (p value) required to call a signal. If you set if too low, you will capture signals that are non-significant, or too high and you will miss true signals. pthr=5e-8 would correspond to a genome-wide significance level for common variants in a European study, but we typically choose a slightly relaxed pthr=1e-6 on the basis that if there is one GW-significant signal in a region, we expect there is a greater chance for secondary signals to exist.

finemap.signals(D3,method="cond",pthr=1e-20) ## too small
## NULL
finemap.signals(D4,method="cond",pthr=0.1) ## too big
##        s5       s31       s41 
##  6.557987  2.736165 -3.713089

Now we can ask coloc to consider these as separate signals using the coloc.signals() function.

res <- coloc.signals(D3,D4,method="cond",p12=1e-6,pthr=1e-6)
## PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
##  1.39e-14  2.84e-06  5.35e-11  9.75e-04  9.99e-01 
## [1] "PP abf for shared variant: 99.9%"
## PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
##  1.69e-08  2.59e-04  6.51e-05  1.00e+00  4.45e-06 
## [1] "PP abf for shared variant: 0.000445%"
res
## Coloc analysis of trait 1, trait 2
## 
## SNP Priors
##    p1    p2   p12 
## 1e-04 1e-04 1e-06
## 
## Hypothesis Priors
##         H0    H1    H2       H3    H4
##  0.9899255 0.005 0.005 2.45e-05 5e-05
## 
## Posterior
##    nsnps hit1 hit2           H0           H1           H2         H3
## 1:    50  s38   s5 1.685256e-08 2.589837e-04 6.505054e-05 0.99967150
## 2:    50   s5   s5 1.371408e-14 2.812772e-06 5.293606e-11 0.01085724
##              H4
## 1: 4.450517e-06
## 2: 9.891399e-01

Note that because we are doing multiple colocalisations, sensitivity() needs to know which to consider:

sensitivity(res,"H4 > 0.9",row=1)
## Results fail decision rule H4 > 0.9

sensitivity(res,"H4 > 0.9",row=2)
## Results pass decision rule H4 > 0.9

Because these signals are truly independent, we could also split them by using masking, which doesn’t condition at all, but restricts the search space for colocalisation to SNPs which are not in LD with any-but-one of each signal SNP. Here you also need to consider r2thr which sets the maximum \(r^2\) between two SNPs for them to be considered independent.

finemap.signals(D3,method="mask")
##       s5      s49 
## 8.268599 4.926711
finemap.signals(D4,method="mask")
##       s5 
## 6.557987
resm=coloc.signals(D3,D4,method="mask",p12=1e-6,pthr=1e-6,r2thr=0.01)
## PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
##  1.39e-14  2.84e-06  5.35e-11  9.75e-04  9.99e-01 
## [1] "PP abf for shared variant: 99.9%"
## 1
## dropping 0/50 : 0 + 0
## 2
## dropping 46/50 : 46 + 0
resm
## Coloc analysis of trait 1, trait 2
## 
## SNP Priors
##    p1    p2   p12 
## 1e-04 1e-04 1e-06
## 
## Hypothesis Priors
##         H0    H1    H2       H3    H4
##  0.9899255 0.005 0.005 2.45e-05 5e-05
## 
## Posterior
##    nsnps hit1 hit2           H0           H1           H2         H3
## 1:    50  s49   s5 1.237770e-04 1.354141e-04 4.777764e-01 0.52196383
## 2:    50   s5   s5 1.371408e-14 2.812772e-06 5.293606e-11 0.01085724
##              H4
## 1: 5.542865e-07
## 2: 9.891399e-01
sensitivity(resm,"H4 > 0.9",row=1)
## Results fail decision rule H4 > 0.9

sensitivity(resm,"H4 > 0.9",row=2)
## Results pass decision rule H4 > 0.9