Published March 6, 2017 | Version v1
Journal article Open

Robust stratification of breast cancer subtypes using differential patterns of transcript isoform expression

Description

Breast cancer, the second leading cause of cancer death among women worldwide, is a heterogeneous disease with multiple subtypes. These subtypes carry important implications for prognosis and therapy. The subtypes are known to differ not only in biological behavior but also in gene expression profile. However, it has not been rigorously explored whether particular transcript isoforms are also differentially expressed among breast cancer subtypes, or whether transcript isoforms from the same sets of genes can be used to differentiate subtypes. To address these questions, we analyzed the patterns of transcript isoform expression using a small set of RNA-sequencing data for eleven estrogen receptor-positive (ER+) and fourteen triple-negative (TN) tumors. We identified specific sets of isoforms that distinguish these tumor subtypes with higher fidelity than standard mRNA expression profiles. We found that alternate promoter usage, alternative splicing, and alternate 3'UTR usage are differentially regulated in breast cancer subtypes. Profiling of isoform expression in a second, independent cohort of 68 tumors confirmed that expression of splice isoforms differentiates breast cancer subtypes. Furthermore, analysis of RNAseq data from 594 cases from the TCGA cohort confirmed the ability of isoform usage to distinguish breast cancer subtypes. Using our expression data, we also identified several RNA processing factors that were differentially expressed between tumor subtypes and/or regulated by estrogen receptor, including YBX1, YBX2, MAGOH, MAGOHB, and PCBP2. RNAi knock-down of these RNA processing factors in MCF7 cells altered isoform expression. These results indicate that global dysregulation of splicing in breast cancer occurs in a subtype-specific and reproducible manner and is driven by specific differentially expressed RNA processing factors.
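
To illustrate why isoform-level expression can separate subtypes when gene-level expression does not, the following is a minimal sketch in Python. It is not the analysis pipeline used in the article. The gene name `GENE_X`, the isoform names, the sample labels, and all FPKM values are hypothetical and chosen only for illustration. The sketch computes, for each sample, the gene-level total and the fraction of that total contributed by each isoform.

```python
# Toy FPKM values for one hypothetical gene ("GENE_X") with two isoforms.
# All names and numbers are made up for illustration; they are NOT data
# from the study described above.
samples = {
    "ER_pos_1": {"GENE_X-201": 18.0, "GENE_X-202": 2.0},
    "ER_pos_2": {"GENE_X-201": 16.0, "GENE_X-202": 4.0},
    "TN_1": {"GENE_X-201": 3.0, "GENE_X-202": 17.0},
    "TN_2": {"GENE_X-201": 5.0, "GENE_X-202": 15.0},
}


def gene_level(fpkm_by_isoform):
    """Gene-level expression as the sum of its isoform FPKMs."""
    return sum(fpkm_by_isoform.values())


def isoform_fractions(fpkm_by_isoform):
    """Fraction of the gene's total expression contributed by each isoform."""
    total = gene_level(fpkm_by_isoform)
    if total == 0:
        # Avoid division by zero for genes that are not expressed.
        return {iso: 0.0 for iso in fpkm_by_isoform}
    return {iso: value / total for iso, value in fpkm_by_isoform.items()}


for name, fpkm in samples.items():
    fractions = isoform_fractions(fpkm)
    rounded = {iso: round(frac, 2) for iso, frac in fractions.items()}
    print(name, gene_level(fpkm), rounded)
```

In this toy input every sample has the same gene-level total, so a gene-level profile cannot tell the two groups apart, while the dominant isoform differs between the ER+ and TN samples. Real data would require normalization and statistical testing across many genes and samples, which this sketch omits.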

Data availability

FPKMs and other expression data will be available at Gene Expression Omnibus (GEO). Raw sequencing data will be hosted at the Protected Data Cloud at Bionimbus (https://bionimbus-pdc.opensciencedatacloud.org/). To obtain access to the raw sequencing data, you must have your eRA, Shibboleth, or OAuth identifier added to the authorization list of a project. To do so, please send an email to accounts@occ-data.org; you will receive a Data Use Agreement in reply. Once the agreement is signed and returned, the raw sequencing data will be made available for download. RNAseq data are available at GEO under accession number GSE94899.

Files

journal.pgen.1006589.pdf

Files (5.5 MB)

Name Size
Article
md5:1ac98f2ef975a5684fe65d60b29df9fc
1.5 MB
md5:99cb2c6e1842e7c0d1c4710ad7de47f1
4.1 MB

Additional details

Identifiers

DOI
10.1371/journal.pgen.1006589
Other
oai:uchicago.tind.io:6749

Funding

NCI
1K08CA148912

UChicago Information

Division(s)
Biological Sciences Division, Institutes & Centers
Center(s) or Institute(s)
Institute for Genomics and Systems Biology