Uses the BIC approximation to calculate approximate Bayes factors for specified models

abf.calc(y, x, models, family = "binomial", q = NULL,
  method = c("speedglm", "glm.fit", "glm"), R2 = NULL, snp.data = NULL,
  return.R2 = FALSE, verbose = FALSE, parallel.dir = NULL)

Arguments

y

response vector

x

explanatory variables, a SnpMatrix object, data.frame or numeric matrix

models

vector of models to consider

family

family argument to pass to glm. Currently only "binomial" and "gaussian" are implemented for the glm.fit method.

q

optional vector of covariates to include in all models

method

fitting function to use: the speedglm library (if available), glm.fit or glm. The default, speedglm, should be faster but, like glm.fit, is less forgiving of problems such as occasional missing values. If your code gives errors that look glm-related, try running with method="glm" first.

R2

matrix giving pairwise r-squared measures of LD from which tag SNPs will be calculated when not directly available

snp.data

if R2 is missing, it is calculated from this SnpMatrix object

return.R2

if TRUE, return the calculated R2 matrix. Useful if you are analysing several strata of a population and wish to avoid repeating the calculation.

verbose

print lots of progress messages if TRUE. Default is FALSE.

parallel.dir

optional directory name to enable manual parallelisation.

Value

a data.frame containing model name, ABF, and an indicator of whether the ABF was calculated directly or via a tag SNP

Details

The central idea of GUESSFM is to use GUESS to rapidly survey the model space for a tagged version of the data, and select a set of plausible models. From there, the models tagged by the top model from GUESS should be evaluated using abf.calc.

The use of a parallel file enables abf.calc to be run in two ways. If you name a file that does not exist, objects will be saved to that file to enable subsets of models to be fitted in a parallel fashion. If you name a file that exists, it is assumed to be the joined results of such model fits and will be loaded. Without a parallel file, all models will be fitted, which may take a long time, particularly for glms. See vignette for more information.

After abf.calc, you may want to use abf2snpmod to generate a snpmod with the same structure as that returned by read.snpmod.
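
For readers unfamiliar with the BIC approximation named in the description, the following is a minimal sketch of the general technique using base R only. It is not the internal code of abf.calc: the helper log_abf, the simulated variables x1 and x2, and the choice of an intercept-only null model are all assumptions made for illustration. It relies on the standard approximation log ABF = -(BIC_model - BIC_null) / 2.

```r
# Simulated data; all names here are illustrative, not part of GUESSFM.
set.seed(1)
n  <- 500
x1 <- rbinom(n, 2, 0.3)   # genotype-like predictor given a real effect
x2 <- rbinom(n, 2, 0.4)   # genotype-like predictor with no effect
y  <- rbinom(n, 1, plogis(-0.5 + 0.6 * x1))
dat <- data.frame(y, x1, x2)

# Log approximate Bayes factor of a model against the intercept-only
# null model (an assumption of this sketch), via the BIC approximation:
#   log ABF ~ -(BIC_model - BIC_null) / 2
log_abf <- function(formula, data, family = binomial()) {
  null_formula <- update(formula, . ~ 1)
  fit  <- glm(formula, data = data, family = family)
  fit0 <- glm(null_formula, data = data, family = family)
  -(BIC(fit) - BIC(fit0)) / 2
}

# Evaluate a small set of candidate models on the log scale.
labf <- c(x1   = log_abf(y ~ x1, dat),
          x2   = log_abf(y ~ x2, dat),
          both = log_abf(y ~ x1 + x2, dat))
print(labf)
```

Working on the log scale avoids overflow when Bayes factors are very large. Each extra parameter costs log(n) / 2 on this scale, so a model is favoured over the null only when its gain in log-likelihood exceeds that penalty.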