Published August 15, 2008 | Version v1
Journal article Open

Selective Constraints in Experimentally Defined Primate Regulatory Regions

  • 1. McGill University
  • 2. University of Chicago

Description

Changes in gene regulation may be important in evolution. However, the evolutionary properties of regulatory mutations are currently poorly understood. This is partly the result of incomplete annotation of functional regulatory DNA in many species. For example, transcription factor binding sites (TFBSs), a major component of eukaryotic regulatory architecture, are typically short, degenerate, and therefore difficult to differentiate from randomly occurring, nonfunctional sequences. Furthermore, although sites such as TFBSs can be computationally predicted using evolutionary conservation as a criterion, estimates of the true level of selective constraint (defined as the fraction of strongly deleterious mutations occurring at a locus) in regulatory regions will, by definition, be upwardly biased in datasets that are a priori evolutionarily conserved. Here we investigate the fitness effects of regulatory mutations using two complementary datasets of human TFBSs that are likely to be relatively free of ascertainment bias with respect to evolutionary conservation but, importantly, are supported by experimental data. The first is a collection of about 2,100 human TFBSs drawn from the literature in the TRANSFAC database, and the second is derived from several recent high-throughput chromatin immunoprecipitation coupled with genomic microarray (ChIP-chip) analyses. We also define a set of putative cis-regulatory modules (pCRMs) by spatially clustering multiple TFBSs that regulate the same gene. We find that a relatively high proportion (∼37%) of mutations at TFBSs are strongly deleterious, similar to that at a 2-fold degenerate protein-coding site. However, constraint is significantly reduced in human and chimpanzee pCRMs and ChIP-chip sequences, relative to macaques. We estimate that the fraction of regulatory mutations that have been driven to fixation by positive selection in humans is not significantly different from zero.
We also find that the level of selective constraint in our TFBSs, pCRMs, and ChIP-chip sequences is negatively correlated with the expression breadth of the regulated gene, whereas the opposite relationship holds at that gene's nonsynonymous and synonymous sites. Finally, we find that the rate of protein evolution in a transcription factor appears to be positively correlated with the breadth of expression of the gene it regulates. Our study suggests that strongly deleterious regulatory mutations are considerably more likely (1.6-fold) to occur in tissue-specific than in housekeeping genes, implying that there is a fitness cost to increasing "complexity" of gene expression.
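
As a rough illustration of the quantity discussed above, selective constraint is commonly estimated by comparing the substitutions observed in a region with the number expected if the region evolved at a putatively neutral rate. The following is a minimal sketch under that assumption, not the authors' actual procedure; the function name, its parameters, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def selective_constraint(observed_subs: float, sites: int, neutral_rate: float) -> float:
    """Estimate constraint as 1 - observed / expected substitutions.

    Assumption (not taken from the abstract): the expected count is the
    number of sites multiplied by a neutral substitution rate per site,
    for example one estimated from nearby putatively neutral sequence.
    """
    if sites <= 0 or neutral_rate <= 0:
        raise ValueError("sites and neutral_rate must be positive")
    expected_subs = sites * neutral_rate
    return 1.0 - observed_subs / expected_subs


# Made-up illustrative inputs, not data from the study:
# 10,000 sites, a neutral rate of 0.01 substitutions per site,
# and 63 substitutions actually observed in the region.
constraint = selective_constraint(observed_subs=63, sites=10_000, neutral_rate=0.01)
print(f"Estimated constraint: {constraint:.2f}")
```

Under this simple estimator, a value near zero means the region evolves about as fast as the neutral reference, while a value near one means almost all new mutations have been removed by selection. Negative values can arise from sampling noise or from accelerated evolution.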

Files

journal.pgen.1000157.pdf

Files (1.3 MB)

Article
md5:e1fb57ddfc13b4265668b1e799a789c9
302.0 kB
md5:c05a18a43bb610000a60c6ee0112387c
993.6 kB

Additional details

Identifiers

DOI
10.1371/journal.pgen.1000157
Other
oai:uchicago.tind.io:10310

Funding

FRSQ
postdoctoral fellowship
CIHR
209549
National Institutes of Health
GM077959

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Human Genetics