Published July 30, 2025 | Version v1
Journal article | Open access

Positional distribution of transcription factor binding sites in the human genome

  • 1. Academia Sinica
  • 2. Institute of Advanced Study in Science and Technology
  • 3. University of Chicago

Description

As transcription factors (TFs) play a major role in gene regulation, we studied their binding motifs (positional weight matrices, PWMs) and binding sites (TFBSs) in the human genome, and how TFs bind DNA motifs, including the involvement of binding co-factors. Using the chromatin immunoprecipitation sequencing data recently released by ENCODE (Encyclopedia of DNA Elements), we obtained new PWMs for 196 TFs and revised PWMs for 119 TFs. From these and the PWMs previously obtained for 235 TFs, we inferred the canonical PWMs for 500 TFs, including 243 new PWMs. Analysis revealed that most TFBSs are in introns (42.6%) and intergenic regions (31.6%), with only 11.3% in promoters. However, the TFBS density is considerably higher in promoters, showing a bell-shaped distribution of TFBSs with a peak at the transcription start site. Many TFBSs lie close to CTCF (CCCTC-binding factor) binding sites. Tethered binding is far more frequent than co-binding, with the latter often requiring co-factors.
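
The contrast between where most TFBSs fall and where they are densest can be illustrated with a short calculation. The sketch below is only an illustration, not the authors' method: it divides each region's share of TFBSs (taken from the description above) by that region's share of genome length. The genome-length fractions and the function name `relative_density` are hypothetical placeholders and are not taken from the article.

```python
def relative_density(site_share, length_share):
    """Return, per region, the share of sites divided by the share of genome length.

    A value above 1 means the region holds more sites than its length alone
    would predict; a value below 1 means it holds fewer.
    """
    return {region: site_share[region] / length_share[region] for region in site_share}


# Fractions of all TFBSs per region, as reported in the description above.
site_share = {"intron": 0.426, "intergenic": 0.316, "promoter": 0.113}

# HYPOTHETICAL fractions of genome length per region, for illustration only.
# These numbers are assumptions and do not come from the article.
length_share = {"intron": 0.45, "intergenic": 0.50, "promoter": 0.02}

density = relative_density(site_share, length_share)

# Print regions from most to least enriched.
for region, value in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region}: {value:.2f}x")
```

Under these assumed length fractions, promoters rank first by relative density even though they hold the smallest share of sites, because they cover a far smaller part of the genome than introns or intergenic regions. Actual region lengths depend on the annotation used.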

Data availability

All relevant data are within the manuscript and its Supporting Information files.

Files

journal.pone.0329226.pdf

Files (6.7 MB)

Name Size
Supporting information
md5:4c3ddbc59a4ab123e8e4305a291971d3
5.4 MB
Article
md5:9edcdb17f34056957515ce0b36a052dc
1.3 MB

Additional details

Identifiers

DOI
10.1371/journal.pone.0329226
Other
oai:uchicago.tind.io:15891

Funding

National Science and Technology Council
NSTC 112-2311-B-001-045

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Ecology and Evolution