qc.Rd
qc a GUESSFM run
    qc(object, data)

    # S4 method for ppnsnp,missing
    qc(object)

    # S4 method for snpmod,SnpMatrix
    qc(object, data)

    # S4 method for list,ANY
    qc(object)
object | snpmod object or object returned by |
---|---|
data | SnpMatrix data for LD calculation if object is a snpmod |
A data.frame of the traits in pp.nsnp together with QC measures, or a data.frame of models with their associated size and maximum r squared.
With all genetic data, we use some QC measures to determine "bad" SNPs. The qc() functions in GUESSFM attempt to flag features that experience suggests are related to spurious differential SNP calls between cases and controls.
The function pp.nsnp generates a posterior distribution for the number of SNPs in a model. We expect this posterior distribution to have some right skew (as does the binomial or beta binomial prior) and be unimodal. Experience suggests that a posterior that does not have these properties may have favoured models with "bad" SNPs. Running qc on the object returned by pp.nsnp will flag these issues.
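
The two properties named above, unimodality and right skew, can be illustrated in base R. This is only a sketch of the idea, not GUESSFM's implementation: the helper `check_nsnp_posterior` is hypothetical, and it assumes the posterior is supplied as a plain numeric vector of probabilities for model sizes 0, 1, 2, ...

```r
# Hypothetical helper (not part of GUESSFM): checks the shape of a posterior
# over the number of SNPs in a model. pp[i] is the probability of size i - 1.
check_nsnp_posterior <- function(pp) {
  stopifnot(is.numeric(pp), all(pp >= 0), abs(sum(pp) - 1) < 1e-8)
  n <- seq_along(pp) - 1

  # Count modes: pad with -Inf so that end points can be peaks, drop flat
  # steps, then count changes from rising (+1) to falling (-1).
  d <- sign(diff(c(-Inf, pp, -Inf)))
  d <- d[d != 0]
  n_modes <- sum(diff(d) == -2)

  # Standardised third central moment; positive values mean right skew.
  mu <- sum(n * pp)
  m2 <- sum((n - mu)^2 * pp)
  m3 <- sum((n - mu)^3 * pp)
  skewness <- if (m2 > 0) m3 / m2^1.5 else 0

  list(n_modes = n_modes,
       skewness = skewness,
       flag = n_modes != 1 || skewness <= 0)
}

# A binomial-shaped posterior: unimodal and right skewed.
good <- dbinom(0:10, size = 10, prob = 0.2)
# An invented bimodal posterior of the kind that would warrant a closer look.
bad <- c(0.05, 0.30, 0.10, 0.05, 0.35, 0.15)

res_good <- check_nsnp_posterior(good)
res_bad <- check_nsnp_posterior(bad)
```

Here `res_good$flag` and `res_bad$flag` indicate whether each invented posterior departs from the expected shape. The criteria used by qc itself may differ.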
You can also call qc directly on a snpmod object.
This may take a little longer, and attempts to estimate the
maximum r squared between SNPs in any model. GUESS has a prior
which should enforce that highly correlated SNPs are not both
placed in a model. Sometimes it may be that two correlated SNPs
are indeed required to model a trait, but experience with imputed
data suggests that when a majority of models above a given size
contain highly correlated SNPs, there is a problem with
differential genotype calling which requires further
investigation.
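
The maximum r squared per model, and the share of larger models containing highly correlated SNPs, can be sketched in base R as follows. This is an illustration under stated assumptions rather than what qc does internally: genotypes are held in a plain numeric matrix instead of a SnpMatrix, r squared comes from `cor()`, the helper `max_r2` is hypothetical, and the size cut-off of 2 and r squared threshold of 0.8 are arbitrary choices for the example.

```r
# Invented genotypes (0/1/2 allele counts); SNPs "a" and "b" are identical,
# so their r squared is 1, while "c" is only moderately correlated with them.
G <- cbind(a = c(0, 1, 2, 0, 1, 2, 0, 1),
           b = c(0, 1, 2, 0, 1, 2, 0, 1),
           c = c(0, 0, 1, 1, 2, 2, 0, 1))

# Each model is the set of SNPs it contains.
models <- list("a", c("a", "c"), c("a", "b"), c("a", "b", "c"))

# Hypothetical helper: largest pairwise r squared among the SNPs in one model.
max_r2 <- function(snps, G) {
  if (length(snps) < 2) return(0)   # no pairs to compare
  r2 <- cor(G[, snps, drop = FALSE])^2
  max(r2[upper.tri(r2)])
}

# One row per model: its SNPs, its size and its maximum r squared.
qc_tab <- data.frame(
  model = vapply(models, paste, character(1), collapse = "+"),
  size = lengths(models),
  max.r2 = vapply(models, max_r2, numeric(1), G = G)
)

# Proportion of models of at least the chosen size with high LD between SNPs.
big <- qc_tab$size >= 2
prop_high <- mean(qc_tab$max.r2[big] > 0.8)
```

If `prop_high` exceeds one half, a majority of the larger models contain highly correlated SNPs, which is the pattern described above as warranting further investigation.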