qc a GUESSFM run

qc(object, data)

# S4 method for ppnsnp,missing
qc(object)

# S4 method for snpmod,SnpMatrix
qc(object, data)

# S4 method for list,ANY
qc(object)

Arguments

object

snpmod object or object returned by pp.nsnp

data

SnpMatrix data for LD calculation if object is a snpmod

Value

When object is the result of pp.nsnp, a data.frame of the traits in that object together with QC measures. When object is a snpmod, a data.frame of models with their size and maximum r squared.

Details

With all genetic data, we use some QC measures to identify "bad" SNPs. The qc() functions in GUESSFM attempt to flag features that experience suggests are related to spurious differential SNP calls between cases and controls.

The function pp.nsnp generates a posterior distribution for the number of SNPs in a model. We expect this posterior distribution to have some right skew (as does the binomial or beta binomial prior) and be unimodal. Experience suggests that a posterior that does not have these properties may have favoured models with "bad" SNPs. Running qc on the object returned by pp.nsnp will flag these issues.
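The two properties described above can be checked on any discrete distribution over model size. The following is a minimal base R sketch and not the check that qc() itself performs: check_nsnp_posterior is a hypothetical helper, and both the input format (a plain numeric vector of probabilities for sizes 0, 1, 2, ...) and the use of the third standardised moment as the skewness measure are assumptions made for illustration.

```r
# Hypothetical helper, not part of GUESSFM. `prob` holds posterior
# probabilities for model sizes 0, 1, 2, ... (assumed input format).
check_nsnp_posterior <- function(prob, sizes = seq_along(prob) - 1) {
  prob <- prob / sum(prob)  # normalise in case of rounding

  # Unimodal: non-decreasing up to the mode, non-increasing after it
  mode_idx <- which.max(prob)
  rising <- prob[seq_len(mode_idx)]
  falling <- prob[mode_idx:length(prob)]
  unimodal <- all(diff(rising) >= 0) && all(diff(falling) <= 0)

  # Skewness as the third standardised moment (NaN if all mass is on one size)
  mu <- sum(sizes * prob)
  sigma <- sqrt(sum((sizes - mu)^2 * prob))
  skew <- sum((sizes - mu)^3 * prob) / sigma^3

  list(unimodal = unimodal, skewness = skew, right_skewed = skew > 0)
}

# A binomial-shaped distribution: one mode, long right tail
good <- dbinom(0:10, size = 10, prob = 0.2)
# An invented two-peaked distribution of the kind that would warrant a closer look
bad <- c(0.05, 0.30, 0.10, 0.05, 0.10, 0.30, 0.10)

good_check <- check_nsnp_posterior(good)
bad_check <- check_nsnp_posterior(bad)
```

Here good_check reports a unimodal, right-skewed shape, while bad_check reports that the distribution is not unimodal.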

You can also call qc directly on a snpmod object. This may take a little longer; it attempts to estimate the maximum r squared between the SNPs in each model. GUESS has a prior that should prevent highly correlated SNPs from both being placed in the same model. Sometimes two correlated SNPs may indeed be required to model a trait, but experience with imputed data suggests that when a majority of models above a given size contain highly correlated SNPs, there is a problem with differential genotype calling that requires further investigation.
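To make the "maximum r squared within a model" idea concrete, here is a small base R sketch. It is not GUESSFM's implementation: max_r2_in_model is a hypothetical helper, the genotypes are simulated, and squared Pearson correlation on a numeric 0/1/2 matrix is used as a stand-in for LD computed from a SnpMatrix.

```r
# Hypothetical helper, not part of GUESSFM. `genotypes` is a numeric matrix
# (samples x SNPs, coded 0/1/2) with SNP names as column names (assumed format).
max_r2_in_model <- function(genotypes, snps) {
  if (length(snps) < 2) return(0)  # a single-SNP model has no SNP pairs
  r2 <- cor(genotypes[, snps, drop = FALSE], use = "pairwise.complete.obs")^2
  max(r2[upper.tri(r2)])           # largest pairwise r squared in the model
}

# Simulated genotypes: snpB is a noisy copy of snpA, snpC is independent
set.seed(1)
n <- 200
g1 <- rbinom(n, 2, 0.3)
g2 <- g1
redraw <- sample(n, 10)
g2[redraw] <- rbinom(10, 2, 0.3)
g3 <- rbinom(n, 2, 0.4)
genotypes <- cbind(snpA = g1, snpB = g2, snpC = g3)

# Each model is a character vector of SNP names
models <- list(c("snpA", "snpC"), c("snpA", "snpB"), "snpC")

res <- data.frame(
  model = sapply(models, paste, collapse = ","),
  size = lengths(models),
  max.r2 = sapply(models, max_r2_in_model, genotypes = genotypes)
)
```

In this toy data the model containing snpA and snpB has a much larger max.r2 than the model containing snpA and snpC, which is the pattern the paragraph above describes as worth investigating when it appears in most of the larger models.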