Published December 17, 2024 | Version v1
Journal article Open

The theory of massively repeated evolution and full identifications of cancer-driving nucleotides (CDNs)

  • 1. Sun Yat-sen University
  • 2. Chinese Academy of Sciences
  • 3. University of Chicago

Description

Tumorigenesis, like most complex genetic traits, is driven by the joint actions of many mutations. At the nucleotide level, such mutations are cancer-driving nucleotides (CDNs). The full sets of CDNs are necessary, and perhaps even sufficient, for the understanding and treatment of each cancer patient. Currently, only a small fraction of CDNs is known, as most mutations accrued in tumors are not drivers. We now develop the theory of CDNs on the basis that cancer evolution is massively repeated in millions of individuals. Hence, any advantageous mutation should recur frequently and, conversely, any mutation that does not is either a passenger or a deleterious mutation. In the TCGA cancer database (sample size n=300–1000), a point mutation may recur in i out of n patients. This study explores a wide range of mutation characteristics to determine the limit of recurrences (i*) driven solely by neutral evolution. Since no neutral mutation can reach i*=3, all mutations recurring at i≥3 are CDNs. The theory shows the feasibility of identifying almost all CDNs if n increases to 100,000 for each cancer type. At present, fewer than 10% of CDNs have been identified. When the full sets of CDNs are identified, the evolutionary mechanism of tumorigenesis in each case can be known and, importantly, gene-targeted therapy will be far more effective in treatment and robust against drug resistance.
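
The recurrence rule in the description can be illustrated with a minimal sketch. It assumes a hypothetical input format, one (patient ID, site) pair per observed point mutation, and applies the cutoff i ≥ 3 stated above. The function names, site labels, and toy data are made up for illustration and are not taken from the authors' scripts.

```python
from collections import defaultdict

I_STAR = 3  # recurrence cutoff stated in the description (i >= 3)


def recurrence_counts(mutations):
    """Return, for each site, the number of distinct patients (i) carrying it."""
    patients_by_site = defaultdict(set)
    for patient_id, site in mutations:
        # A set ensures a patient is counted once per site,
        # even if the same mutation is listed twice.
        patients_by_site[site].add(patient_id)
    return {site: len(patients) for site, patients in patients_by_site.items()}


def candidate_cdns(mutations, i_star=I_STAR):
    """Keep only sites whose recurrence i reaches the cutoff i_star."""
    counts = recurrence_counts(mutations)
    return {site: i for site, i in counts.items() if i >= i_star}


# Toy data (hypothetical site labels, not real coordinates).
toy = [
    ("P1", "siteA"), ("P2", "siteA"), ("P3", "siteA"),  # i = 3
    ("P1", "siteB"), ("P4", "siteB"),                    # i = 2
    ("P2", "siteC"), ("P2", "siteC"),                    # same patient twice, i = 1
]

print(candidate_cdns(toy))  # {'siteA': 3}
```

In this sketch only siteA reaches the cutoff. With n in the range of 300–1000 patients, the same counting would apply to each cancer type separately.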

Data availability

The key scripts used in this study are available on GitLab; a copy is archived at Zhang, 2024. A subset of key example files for breast cancer analysis can be found in the "/example_data_files" directory. The complete list of CDNs analyzed in this study is provided in Supplementary file 1.

Files

elife-99340-v1.pdf

Files (3.8 MB)

Article, 3.2 MB (md5:07b95d9f6aeb2f4947f40aa4268633e7)
560.7 kB (md5:9d8b7731b9ff02bdc31552ab63f3508a)

Additional details

Identifiers

DOI: 10.7554/eLife.99340.3
Other: oai:uchicago.tind.io:14285

Funding

  • National Natural Science Foundation of China: 32150006
  • Guangdong Key R&D Project of China: 2022B1111030001
  • National Natural Science Foundation of China: 32293193
  • National Natural Science Foundation of China: 32293190
  • Yunnan Revitalization Talent Support Program Top Team: 202405AS350022
  • National Natural Science Foundation of China: 82341092
  • National Natural Science Foundation of China: 32200493
  • National Key Research and Development Program of China: 2021YFC2301300
  • National Key Research and Development Program of China: 2021YFC0863400
  • Yunnan Revitalization Talent Support Program Yunling Scholar Project
  • National Natural Science Foundation of China: 32370659
  • Guangdong Basic and Applied Basic Research Foundation: 2023A1515010016

UChicago Information

Division(s): Biological Sciences Division
Department(s): Ecology and Evolution