Performs the colocalisation tests described in Plagnol et al (2009) and Wallace et al (2012).

coloc.bma(
  df1,
  df2,
  snps = intersect(setdiff(colnames(df1), c(response1, stratum1)),
    setdiff(colnames(df2), c(response2, stratum2))),
  response1 = "Y",
  response2 = "Y",
  stratum1 = NULL,
  stratum2 = NULL,
  family1 = "binomial",
  family2 = "binomial",
  bayes = !is.null(bayes.factor),
  thr = 0.01,
  nsnps = 2,
  n.approx = 1001,
  bayes.factor = NULL,
  plot.coeff = FALSE,
  r2.trim = 0.95,
  quiet = FALSE,
  bma = FALSE,
  ...
)

Arguments

df1, df2

Each is a dataframe, containing response and potential explanatory variables for two independent datasets.

snps

The SNPs to consider as potential explanatory variables

response1, response2

The names of the response variables in df1 and df2 respectively

stratum1

optional column name of df1 that gives stratum information

stratum2

optional column name of df2 that gives stratum information

family1, family2

the error family for use in glm

bayes

Logical, indicating whether to perform Bayesian inference for the coefficient of proportionality, eta. If bayes.factor is supplied, Bayes factors are additionally computed for the specificed values. This can add a little time as it requires numerical integration, so can be set to FALSE to save time in simulations, for example.

thr

posterior probability threshold used to trim SNP list. Only SNPs with a marginal posterior probability of inclusion greater than this with one or other trait will be included in the full BMA analysis

nsnps

number of SNPs required to model both traits. The BMA analysis will average over all possible nsnp SNP models, subject to thr above.

n.approx

number of values at which to numerically approximate the posterior

bayes.factor

if true, compare specific models

plot.coeff

deprecated

r2.trim

for pairs SNPs with r2>r2.trim, only one SNP will be retained. This avoids numerical instability problems caused by including two highly correlated SNPs in the model.

quiet

suppress messages about how the model spaced is trimmed for BMA

bma

if true (default), average over models

...

other parameters passed to coloc.test

Value

a coloc or colocBayes object

Details

This is a test for proportionality of regression coefficients from two independent regressions. Analysis can either be based on a profile likelihood approach, where the proportionality coefficient, eta, is replaced by its maximum likelihood value, and inference is based on a chisquare test (p.value), or taking a hybrid-Bayesian approach and integrating the p value over the posterior distribution of eta, which gives a posterior predictive p value. The Bayesian approach can also be used to give a credible interval for eta. See the references below for further details.

References

Wallace et al (2012). Statistical colocalisation of monocyte gene expression and genetic risk variants for type 1 diabetes. Hum Mol Genet 21:2815-2824. http://europepmc.org/abstract/MED/22403184

Plagnol et al (2009). Statistical independence of the colocalized association signals for type 1 diabetes and RPS26 gene expression on chromosome 12q13. Biostatistics 10:327-34. http://www.ncbi.nlm.nih.gov/pubmed/19039033

Examples

## simulate covariate matrix (X) and continuous response vector (Y) ## for two populations/triats Y1 and Y2 depend equally on f1 and f2 ## within each population, although their distributions differ between ## populations. They are compatible with a null hypothesis that they ## share a common causal variant set.seed(1) X1 <- matrix(rbinom(2000,1,0.4),ncol=4) Y1 <- rnorm(500,rowSums(X1[,1:2]),2) X2 <- matrix(rbinom(2000,1,0.6),ncol=4) Y2 <- rnorm(500,rowSums(X2[,1:2]),5) boxplot(list(Y1,Y2),names=c("Y1","Y2"))
## fit and store linear model objects colnames(X1) <- colnames(X2) <- sprintf("f%s",1:ncol(X1)) summary(lm1 <- lm(Y1~f1+f2+f3+f4,data=as.data.frame(X1)))
#> #> Call: #> lm(formula = Y1 ~ f1 + f2 + f3 + f4, data = as.data.frame(X1)) #> #> Residuals: #> Min 1Q Median 3Q Max #> -6.4771 -1.4158 -0.0845 1.3748 7.1976 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 0.2863 0.1680 1.704 0.08903 . #> f1 0.6611 0.1855 3.565 0.00040 *** #> f2 0.5185 0.1819 2.850 0.00455 ** #> f3 0.3080 0.1851 1.664 0.09670 . #> f4 -0.2047 0.1823 -1.123 0.26196 #> --- #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 #> #> Residual standard error: 1.996 on 495 degrees of freedom #> Multiple R-squared: 0.04906, Adjusted R-squared: 0.04137 #> F-statistic: 6.384 on 4 and 495 DF, p-value: 5.152e-05 #>
summary(lm2 <- lm(Y2~f1+f2+f3+f4,data=as.data.frame(X2)))
#> #> Call: #> lm(formula = Y2 ~ f1 + f2 + f3 + f4, data = as.data.frame(X2)) #> #> Residuals: #> Min 1Q Median 3Q Max #> -13.6564 -3.1427 0.0631 3.2446 13.5783 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -0.8462 0.6420 -1.318 0.1881 #> f1 1.0047 0.4658 2.157 0.0315 * #> f2 1.0072 0.4736 2.127 0.0339 * #> f3 0.5396 0.4735 1.139 0.2551 #> f4 1.0390 0.4663 2.228 0.0263 * #> --- #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 #> #> Residual standard error: 5.119 on 495 degrees of freedom #> Multiple R-squared: 0.02742, Adjusted R-squared: 0.01957 #> F-statistic: 3.489 on 4 and 495 DF, p-value: 0.007988 #>
## test colocalisation using bma df1=as.data.frame(cbind(Y1=Y1,X1)) df2=as.data.frame(cbind(Y2=Y2,X2)) result <- coloc.bma( df1, df2, snps=colnames(X1), response1="Y1", response2="Y2", family1="gaussian", family2="gaussian", nsnps=2,bayes.factor=c(1,2,3))
#> Dropped 0 of 4 SNPs due to LD: r2 > 0.95.
#> 4 SNPs remain.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Restricting model space.
#> 2 SNPs have single SNP posterior probabilities < 0.01
#> Models containing only these SNPs will not be explored.
#> Fitting 5 multi SNP models to dataset 1
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Fitting 5 multi SNP models to dataset 2
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Averaging coloc testing over 3 models with posterior probabilities >= 2e-08 #> ...
result
#> eta.hat n ppp.value #> 1.6564113 2.0000000 0.8085261
plot(result)
#> Error: Elements must equal the number of rows or 1
## test colocalisation when one dataset contains a stratifying factor in column named "s" df1$s <- rbinom(500,1,0.5) result <- coloc.bma( df1, df2, snps=colnames(X1), response1="Y1", response2="Y2", stratum1="s", family1="gaussian", family2="gaussian", nsnps=2,bayes.factor=c(1,2,3))
#> Dropped 0 of 4 SNPs due to LD: r2 > 0.95.
#> 4 SNPs remain.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Restricting model space.
#> 1 SNPs have single SNP posterior probabilities < 0.01
#> Models containing only these SNPs will not be explored.
#> Fitting 6 multi SNP models to dataset 1
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Fitting 6 multi SNP models to dataset 2
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Warning: Recycling array of length 1 in vector-array arithmetic is deprecated. #> Use c() or as.vector() instead.
#> Averaging coloc testing over 3 models with posterior probabilities >= 2.2e-08 #> ...
result
#> eta.hat n ppp.value #> 1.658547 2.000000 0.810177
plot(result)
#> Error: Elements must equal the number of rows or 1