Published October 23, 2009 | Version v1
Journal article Open

Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data

  • 1. Los Alamos National Laboratory
  • 2. University of Chicago
  • 3. Cornell University

Description

Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate models we numerically compute the expected spectrum using a diffusion approximation to the one-locus, two-allele Wright-Fisher process, involving up to three simultaneous populations. Our approach is a composite likelihood scheme, since linkage between neutral loci alters the variance but not the expectation of the frequency spectrum. We thus use bootstraps incorporating linkage to estimate uncertainties for parameters and significance values for hypothesis tests. Our method can also incorporate selection on single sites, predicting the joint distribution of selected alleles among populations experiencing a bevy of evolutionary forces, including expansions, contractions, migrations, and admixture. We model human expansion out of Africa and the settlement of the New World, using 5 Mb of noncoding DNA resequenced in 68 individuals from 4 populations (YRI, CHB, CEU, and MXL) by the Environmental Genome Project. We infer divergence between West African and Eurasian populations 140 thousand years ago (95% confidence interval: 40–270 kya). This is earlier than estimates from other genetic studies, in part because we incorporate migration. We estimate the European (CEU) and East Asian (CHB) divergence time to be 23 kya (95% c.i.: 17–43 kya), long after archeological evidence places modern humans in Europe. Finally, we estimate divergence between East Asians (CHB) and Mexican-Americans (MXL) of 22 kya (95% c.i.: 16.3–26.9 kya), and our analysis yields no evidence for subsequent migration. Furthermore, combining our demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).
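
The two central objects in the description above, the joint frequency spectrum and a composite likelihood over its entries, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `joint_sfs` and `poisson_composite_loglik` are hypothetical, the toy counts are invented for illustration, and treating each spectrum entry as an independent Poisson count is an assumption made here to keep the example short. The expected spectrum `model_fs` is taken as given; computing it from a diffusion approximation is outside the scope of this sketch.

```python
from math import lgamma, log


def joint_sfs(counts_a, counts_b, n_a, n_b):
    """Tabulate a two-population joint frequency spectrum.

    counts_a[k] and counts_b[k] are the derived-allele counts of SNP k
    in samples of n_a and n_b chromosomes. Entry fs[i][j] is the number
    of SNPs seen i times in population A and j times in population B.
    """
    fs = [[0] * (n_b + 1) for _ in range(n_a + 1)]
    for i, j in zip(counts_a, counts_b):
        fs[i][j] += 1
    return fs


def poisson_composite_loglik(model_fs, data_fs):
    """Composite log-likelihood of data_fs given expected counts model_fs.

    Assumption (for illustration): each entry is an independent Poisson
    count whose mean is the matching model entry. The two corner entries
    (absent in both samples, fixed in both samples) are skipped because
    they do not correspond to sites that are variable in the sample.
    Model entries in the remaining cells must be positive.
    """
    n_a = len(data_fs) - 1
    n_b = len(data_fs[0]) - 1
    total = 0.0
    for i in range(n_a + 1):
        for j in range(n_b + 1):
            if (i, j) in ((0, 0), (n_a, n_b)):
                continue
            m = model_fs[i][j]
            d = data_fs[i][j]
            total += d * log(m) - m - lgamma(d + 1)
    return total


# Toy usage: four SNPs typed in two chromosomes from each population.
data = joint_sfs([1, 1, 2, 0], [0, 0, 1, 2], n_a=2, n_b=2)
candidate = [[cell + 0.5 for cell in row] for row in data]  # invented model
score = poisson_composite_loglik(candidate, data)
```

Because linked sites are not independent, a score like this is a composite likelihood rather than a full likelihood, which is why the description turns to bootstraps that incorporate linkage for uncertainty estimates.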

Files

journal.pgen.1000695.pdf

Files (1.8 MB)

Name Size
Article
md5:1f07dae79eea201c61df795772da1b6e
422.6 kB
md5:a3ea251e0f1f1517035544aa51875d19
1.4 MB

Additional details

Identifiers

DOI
10.1371/journal.pgen.1000695
Other
oai:uchicago.tind.io:10322

Funding

National Science Foundation
PHY05-51164
National Institutes of Health
1R01GM83606
National Institutes of Health
2R01HG003229
Department of Energy
DE-AC52-06NA25396

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Human Genetics