Published August 10, 2018 | Version v1
Journal article Open

Genetic architecture of gene expression traits across diverse populations

  • 1. Loyola University Chicago
  • 2. University of Chicago
  • 3. University of California Los Angeles
  • 4. University of Washington
  • 5. Wake Forest University

Description

For many complex traits, gene regulation is likely to play a crucial mechanistic role. How the genetic architectures of complex traits vary between populations, and the subsequent effects on genetic prediction, are not well understood, in part due to the historical paucity of GWAS in populations of non-European ancestry. We used data from the MESA (Multi-Ethnic Study of Atherosclerosis) cohort to characterize the genetic architecture of gene expression within and between diverse populations. Genotype and monocyte gene expression data were available for individuals with African American (AFA, n = 233), Hispanic (HIS, n = 352), and European (CAU, n = 578) ancestry. We performed expression quantitative trait loci (eQTL) mapping in each population and show that genetic correlation of gene expression depends on shared ancestry proportions. Using elastic net modeling with cross-validation to optimize genotypic predictors of gene expression in each population, we show that the genetic architecture of gene expression for most predictable genes is sparse. We found that the best-predicted gene in each population, TACSTD2 in AFA and CHURC1 in CAU and HIS, had similar prediction performance across populations, with R² > 0.8 in each population. However, we identified a subset of genes that are well predicted in one population but poorly predicted in another. We show that these differences in predictive performance are due to allele frequency differences between populations. Using genotype weights trained in MESA to predict gene expression in independent populations showed that a training set with ancestry similar to the test set is better at predicting gene expression in test populations, demonstrating an urgent need for diverse population sampling in genomics. Our predictive models and performance statistics in diverse cohorts are made publicly available for use in transcriptome mapping methods at https://github.com/WheelerLab/DivPop.
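
As an illustration of why an elastic net fit yields sparse predictors, the following is a minimal sketch of elastic net regression by coordinate descent on toy data. It is not the authors' pipeline: the function names, the toy genotype matrix, the mixing parameter alpha = 0.5, and the penalty lam = 0.5 are all assumptions chosen for illustration, and the cross-validation step used to select the penalty is omitted.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values inside [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha=0.5, n_iter=100):
    """Coordinate descent for the objective
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||^2).
    X is a list of rows (samples) of centered SNP dosages; y is centered expression.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # a monomorphic SNP carries no information
            # Covariance of SNP j with the residual that excludes its own effect.
            rho = sum(c * r for c, r in zip(col, resid)) / n + scale * beta[j]
            new = soft_threshold(rho, lam * alpha) / (scale + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta


# Toy data (hypothetical, not from MESA): three mutually orthogonal,
# centered SNP columns measured in four samples.
X = [[-1, -1, -1],
     [-1,  1,  1],
     [ 1, -1,  1],
     [ 1,  1, -1]]
# Expression driven mostly by SNP 0, with a weak contribution from SNP 1.
y = [2 * row[0] + 0.1 * row[1] for row in X]

weights = elastic_net(X, y, lam=0.5, alpha=0.5)
n_nonzero = sum(1 for w in weights if w != 0.0)
print(weights, n_nonzero)
```

On this toy input only the first SNP keeps a non-zero weight (about 1.4); the weak and null SNPs are set exactly to zero by the L1 part of the penalty, which is the sense in which the fitted model is sparse. In practice the penalty would be chosen by cross-validation within each population rather than fixed by hand.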

Data availability

MESA genotype data are available at dbGaP (phs000209.v13.p3) and expression data at GEO (GSE56045). HapMap and Geuvadis expression data are at ArrayExpress (E-MTAB-264 and E-GEUV-1) and genotype data are at http://www.internationalgenome.org/. Framingham Heart Study genotype and expression data are at dbGaP (phs000007.v29.p1). Summary statistics and predictive models of gene expression developed in this study are made publicly available at https://github.com/WheelerLab/DivPop.

Files

journal.pgen.1007586.pdf

Files (9.8 MB)

Article (4.3 MB)
md5:fd8dee5fda59a76f07d2cd75eef7ce15
Supporting information (5.5 MB)
md5:24dcad07da26a5dda65c8225c53d7eae
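
The md5 values listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using only the Python standard library; the dictionary keys and the helper names `md5_of_file` and `matches_record` are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib

# Checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "article": "fd8dee5fda59a76f07d2cd75eef7ce15",
    "supporting_information": "24dcad07da26a5dda65c8225c53d7eae",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, key):
    """True if the file at `path` has the checksum listed under `key`."""
    return md5_of_file(path) == EXPECTED_MD5[key]
```

For example, `matches_record("journal.pgen.1007586.pdf", "article")` should return True if the article PDF was saved under that name in the working directory and downloaded without corruption.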

Additional details

Identifiers

DOI
10.1371/journal.pgen.1007586
Other
oai:uchicago.tind.io:6550

Funding

National Human Genome Research Institute
Academic Research Enhancement Award
Loyola University Chicago
Loyola University Chicago
Carbon Undergraduate Research Fellowship
Loyola University Chicago
Biology Summer Research Fellowship
Loyola University Chicago
Mulcahy Scholars Program
Unknown funder
R01 MH107666
National Heart, Lung, and Blood Institute
Unknown funder
HHSN268201500003I
Unknown funder
N01-HC-95159
Unknown funder
N01-HC-95160
Unknown funder
N01-HC-95161
Unknown funder
N01-HC-95162
Unknown funder
N01-HC-95163
Unknown funder
N01-HC-95164
Unknown funder
N01-HC-95165
Unknown funder
N01-HC-95166
Unknown funder
N01-HC-95167
Unknown funder
N01-HC-95168
Unknown funder
N01-HC-95169
Unknown funder
UL1-TR-000040
Unknown funder
UL1-TR-001079
Unknown funder
UL1-TR-001420
Unknown funder
UL1-TR-001881
Unknown funder
DK063491
National Heart, Lung, and Blood Institute
N02-HL-64278
NIA
1R01HL101250-01
National Heart, Lung, and Blood Institute
N01-HC-25195
National Heart, Lung, and Blood Institute
HHSN268201500001I
National Heart, Lung, and Blood Institute
N02-HL-64278
Andrew D. Johnson
Christopher J. O'Donnell

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Medicine