Published December 17, 2024 | Version v1
Journal article Open

Characterization of cancer-driving nucleotides (CDNs) across genes, cancer types, and patients

  • 1. Sun Yat-sen University
  • 2. Chinese Academy of Sciences
  • 3. Jinan University
  • 4. Southern Medical University
  • 5. University of Chicago

Description

A central goal of cancer genomics is to identify, in each patient, all the cancer-driving mutations. Among them, point mutations are referred to as cancer-driving nucleotides (CDNs), which recur in cancers. The companion study shows that the probability of i recurrent hits in n patients would decrease exponentially with i; hence, any mutation with i ≥ 3 hits in The Cancer Genome Atlas (TCGA) database is a high-probability CDN. This study characterizes the 50–150 CDNs identifiable for each cancer type of TCGA (while anticipating 10 times more undiscovered ones) as follows: (i) CDNs tend to code for amino acids of divergent chemical properties. (ii) At the genic level, far more CDNs (more than fivefold) fall on noncanonical than canonical cancer-driving genes (CDGs). Most undiscovered CDNs are expected to be on unknown CDGs. (iii) CDNs tend to be more widely shared among cancer types than canonical CDGs, mainly because of the higher resolution at the nucleotide than the whole-gene level. (iv) Most important, among the 50–100 coding region mutations carried by a cancer patient, 5–8 CDNs are expected but only 0–2 CDNs have been identified at present. This low level of identification has hampered functional test and gene-targeted therapy. We show that, by expanding the sample size to 10⁵, most CDNs can be identified. Full CDN identification will then facilitate the design of patient-specific targeting against multiple CDN-harboring genes.
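The recurrence criterion described above (a site hit in i ≥ 3 patients is treated as a high-probability CDN) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `candidate_cdns`, the record layout (patient ID, site label), and the toy data are all assumptions made for illustration, and a real analysis would also need the mutation-rate modeling of the companion study.

```python
from collections import defaultdict


def candidate_cdns(records, min_hits=3):
    """Return {site: hit_count} for sites mutated in >= min_hits distinct patients.

    `records` is an iterable of (patient_id, site) pairs. Both the layout and
    the site labels are hypothetical; they are not taken from the study's scripts.
    """
    patients_by_site = defaultdict(set)
    for patient_id, site in records:
        # Use a set so a site reported twice for one patient counts as one hit.
        patients_by_site[site].add(patient_id)
    return {
        site: len(patients)
        for site, patients in patients_by_site.items()
        if len(patients) >= min_hits
    }


# Toy data (invented): one site recurs in three patients, the others do not.
records = [
    ("P1", "chr1:100:A>G"),
    ("P2", "chr1:100:A>G"),
    ("P3", "chr1:100:A>G"),
    ("P1", "chr2:200:C>T"),
    ("P1", "chr2:200:C>T"),  # duplicate call for the same patient
    ("P2", "chr2:200:C>T"),
    ("P4", "chr3:300:G>A"),
]

print(candidate_cdns(records))  # {'chr1:100:A>G': 3}
```

Counting distinct patients rather than raw rows keeps duplicated calls from inflating i; the threshold is left as a parameter because the abstract's i ≥ 3 cutoff is tied to the TCGA sample size.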

Data availability

The scripts for generating the key results of this study and the accompanying paper (Zhang et al., 2024) are available at GitLab (copy archived at Zhang, 2024). Example files for breast cancer analysis have also been included. The complete set of CDNs can be found in Supplementary file 1 of the accompanying paper (Zhang et al., 2024).

Files

elife-99341-v1.pdf

Files (2.0 MB)

Additional file, 195.5 kB
md5:80660afd053e3563844e58b2050a3e9e
Article, 1.8 MB
md5:ce5ddaa557ebb77cd0119e076a0f3d6f

Additional details

Identifiers

DOI
10.7554/eLife.99341.3
Other
oai:uchicago.tind.io:14286

Funding

National Natural Science Foundation of China
32150006
Guangdong Key R&D Project of China
2022B1111030001
National Natural Science Foundation of China
32293193
National Natural Science Foundation of China
32293190
National Natural Science Foundation of China
82341092
National Key Research and Development Program of China
2021YFC0863300
National Key Research and Development Program of China
2021YFC0863400
National Natural Science Foundation of China
32200493
National Natural Science Foundation of China
32370659
Guangdong Basic and Applied Basic Research Foundation
2023A1515010016

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Ecology and Evolution