Published December 1, 2009 | Version v1
Journal article Open

Laplacian Eigenfunctions Learn Population Structure

  • 1. University of Chicago

Description

Principal components analysis has been used for decades to summarize genetic variation across geographic regions and to infer population migration history. More recently, with the advent of genome-wide association studies of complex traits, it has become a commonly used tool for detecting and correcting confounding due to population structure. However, principal components are generally sensitive to outliers, and concerns have recently been raised about their interpretation. Motivated by geometric learning, we describe a method based on spectral graph theory. Regarding each study subject as a node, with suitably defined weights on its edges to close neighbors, one can form a weighted graph. We suggest using the spectrum of the associated graph Laplacian operator, namely the Laplacian eigenfunctions, to infer population structure. In simulations and in real data on a ring species of birds, Laplacian eigenfunctions reveal more meaningful and less noisy structure in the underlying population than principal components do. The proposed approach is simple and computationally fast. We expect it to become a basic method for population genetics and disease association studies.
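
As a rough illustration of the construction described above (subjects as nodes, weighted edges to close neighbors, eigenfunctions of the graph Laplacian), the following is a minimal sketch and not the article's implementation. The function name `laplacian_eigenfunctions`, the Euclidean distance on centered genotypes, the k-nearest-neighbor rule, the heat-kernel weights with a median bandwidth, and the symmetric normalized Laplacian are all assumptions made for this example; the article defines its own weights and neighborhoods.

```python
import numpy as np


def laplacian_eigenfunctions(X, k=5, n_components=2):
    """Sketch: embed subjects using eigenvectors of a graph Laplacian.

    X is an (n_subjects, n_markers) array. All modelling choices below
    (distance, kNN graph, heat-kernel weights) are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Center each marker, then compute pairwise squared Euclidean distances.
    Xc = X - X.mean(axis=0)
    sq = np.sum(Xc ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Xc @ Xc.T), 0.0)
    np.fill_diagonal(D2, 0.0)

    # Heat-kernel weights; the median bandwidth is an assumed heuristic.
    t = np.median(D2[D2 > 0])
    W = np.exp(-D2 / t)

    # Keep an edge only if one subject is among the other's k nearest neighbors.
    order = D2.copy()
    np.fill_diagonal(order, np.inf)  # exclude self-matches
    nn = np.argsort(order, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask |= mask.T  # symmetrize the neighbor relation
    W = np.where(mask, W, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; skip the first
    # (near-zero) eigenpair and keep the next n_components.
    vals, vecs = np.linalg.eigh(L)
    return vals[1:n_components + 1], vecs[:, 1:n_components + 1]


# Toy usage on random 0/1/2 genotype codes (synthetic data, not from the article).
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(40, 300))
eigenvalues, embedding = laplacian_eigenfunctions(toy_genotypes, k=5, n_components=2)
```

Each column of `embedding` plays the role that a principal component would play in a PCA-based analysis: plotting the first two columns against each other gives a two-dimensional summary of the subjects. If the neighbor graph is disconnected, several eigenvalues are near zero and the leading columns mainly indicate component membership.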

Files

journal.pone.0007928.pdf

Files (1.0 MB)

Name Size
Article md5:e4c6b4c5819f0176b9299b78ffa76be1 785.7 kB
md5:4e5a2d16f02acaf172cd8ca5c09a62c6 219.6 kB

Additional details

Identifiers

DOI
10.1371/journal.pone.0007928
Other
oai:uchicago.tind.io:10679

Funding

National Institutes of Health
R01 HG001645

UChicago Information

Division(s)
Biological Sciences Division, Physical Sciences Division
Department(s)
Computer Science, Human Genetics, Radiology, Statistics