Published December 17, 2024 | Version v1
Journal article
Open
Characterization of cancer-driving nucleotides (CDNs) across genes, cancer types, and patients
Creators
- 1. Sun Yat-sen University
- 2. Chinese Academy of Sciences
- 3. Jinan University
- 4. Southern Medical University
- 5. University of Chicago
Description
A central goal of cancer genomics is to identify, in each patient, all the cancer-driving mutations. Among them, point mutations are referred to as cancer-driving nucleotides (CDNs), which recur in cancers. The companion study shows that the probability of i recurrent hits in n patients would decrease exponentially with i; hence, any mutation with i ≥ 3 hits in The Cancer Genome Atlas (TCGA) database is a high-probability CDN. This study characterizes the 50–150 CDNs identifiable for each cancer type in TCGA (while anticipating 10 times more undiscovered ones) as follows: (i) CDNs tend to code for amino acids of divergent chemical properties. (ii) At the genic level, far more CDNs (more than fivefold) fall on noncanonical than on canonical cancer-driving genes (CDGs). Most undiscovered CDNs are expected to be on unknown CDGs. (iii) CDNs tend to be more widely shared among cancer types than canonical CDGs, mainly because of the higher resolution at the nucleotide level than at the whole-gene level. (iv) Most important, among the 50–100 coding region mutations carried by a cancer patient, 5–8 CDNs are expected but only 0–2 CDNs have been identified at present. This low level of identification has hampered functional testing and gene-targeted therapy. We show that, by expanding the sample size to 10^5, most CDNs can be identified. Full CDN identification will then facilitate the design of patient-specific targeting against multiple CDN-harboring genes.
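The recurrence criterion in the description (a mutation with i ≥ 3 hits is treated as a high-probability CDN) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is provided in their GitLab scripts. The function name `call_cdns`, the tuple layout, and the choice to key a site by chromosome, position, reference base, and alternate base are all assumptions made for illustration, and the input shown in the test data is synthetic.

```python
from collections import defaultdict


def call_cdns(mutations, min_hits=3):
    """Return sites mutated in at least `min_hits` distinct patients.

    `mutations` is an iterable of (patient_id, chrom, pos, ref, alt) tuples.
    This layout is hypothetical; real mutation files would need parsing first.
    """
    patients_by_site = defaultdict(set)
    for patient_id, chrom, pos, ref, alt in mutations:
        # A set ensures each patient counts once per site, so duplicated
        # records from the same patient do not inflate the hit count i.
        patients_by_site[(chrom, pos, ref, alt)].add(patient_id)

    # Keep only sites whose recurrence i reaches the threshold (i >= 3 by default).
    return {
        site: len(patients)
        for site, patients in patients_by_site.items()
        if len(patients) >= min_hits
    }
```

Counting distinct patients rather than raw records is a design choice of this sketch: recurrence is meant to reflect independent hits across patients, so repeated entries for one patient are collapsed.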
Data availability
The scripts for generating the key results of this study and the accompanying paper (Zhang et al., 2024) are available at GitLab (copy archived at Zhang, 2024). Example files for breast cancer analysis have also been included. The complete set of CDNs can be found in Supplementary file 1 of the accompanying paper (Zhang et al., 2024).
elife-99341-v1.pdf
Files (2.0 MB)
| Name | Size |
|---|---|
| Additional file (md5:80660afd053e3563844e58b2050a3e9e) | 195.5 kB |
| Article (md5:ce5ddaa557ebb77cd0119e076a0f3d6f) | 1.8 MB |
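The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_record` are hypothetical, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5):
    """Return True if the file's MD5 equals the checksum shown on the record."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, after saving the Article file locally, `matches_record(local_path, "ce5ddaa557ebb77cd0119e076a0f3d6f")` should return `True` if the download is complete, where `local_path` is the assumed location of the saved file.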
Additional details
Identifiers
- DOI: 10.7554/eLife.99341.3
- Other: oai:uchicago.tind.io:14286
Funding
- National Natural Science Foundation of China: 32150006
- Guangdong Key R&D Project of China: 2022B1111030001
- National Natural Science Foundation of China: 32293193
- National Natural Science Foundation of China: 32293190
- National Natural Science Foundation of China: 82341092
- National Key Research and Development Program of China: 2021YFC0863300
- National Key Research and Development Program of China: 2021YFC0863400
- National Natural Science Foundation of China: 32200493
- National Natural Science Foundation of China: 32370659
- Guangdong Basic and Applied Basic Research Foundation: 2023A1515010016