Published August 1, 2008 | Version v1
Journal article | Open

Linkage Disequilibrium-Based Quality Control for Large-Scale Genetic Studies

  • 1. University of Michigan
  • 2. University of Chicago

Description

Quality control (QC) is a critical step in large-scale studies of genetic variation. While, on average, high-throughput single nucleotide polymorphism (SNP) genotyping assays are now very accurate, the errors that remain tend to cluster into a small percentage of "problem" SNPs, which exhibit unusually high error rates. Because most large-scale studies of genetic variation are searching for phenomena that are rare (e.g., SNPs associated with a phenotype), even this small percentage of problem SNPs can cause important practical problems. Here we describe and illustrate how patterns of linkage disequilibrium (LD) can be used to improve QC in large-scale, population-based studies. Unlike existing filters, such as Hardy-Weinberg equilibrium (HWE) tests or call-rate thresholds, this approach can reduce genotyping error rates by automatically correcting some genotyping errors. Applying this LD-based QC procedure to data from The International HapMap Project, we identify over 1,500 SNPs that likely have high error rates in the CHB and JPT samples and estimate corrected genotypes. Our method is implemented in the software package fastPHASE, available from the Stephens Lab website (http://stephenslab.uchicago.edu/software.html).

Files

journal.pgen.1000147.pdf

Files (671.0 kB)

Name Size
Article
md5:e9b54ea64a1adecda57e899dac4b964f
416.9 kB
md5:61a792298bf8c7ae7e4c681ffc3ef565
254.1 kB

Additional details

Identifiers

DOI
10.1371/journal.pgen.1000147
Other
oai:uchicago.tind.io:10309

Funding

National Institutes of Health
1RO1HG/LM02585-01
National Institutes of Health
HL084729-02

UChicago Information

Division(s)
Biological Sciences Division, Physical Sciences Division
Department(s)
Human Genetics, Statistics