Liming Chen (a,b,1), Piroon Jenjaroenpun (c,1), Andrea Mun Ching Pillai (a,1), Anna V. Ivshina (c), Ghim Siong Ow (c), Motakis Efthimios (c), Tang Zhiqun (c), Tuan Zea Tan (d), Song-Choon Lee (a), Keith Rogers (a), Jerrold M. Ward (a), Seiichi Mori (e),
David J. Adams (f), Nancy A. Jenkins (a,g), Neal G. Copeland (a,g,2), Kenneth Hon-Kim Ban (a,h), Vladimir A. Kuznetsov (c,i,2), and Jean Paul Thiery (a,d,h,2)
a Institute of Molecular and Cell Biology, Singapore 138673;
b Jiangsu Key Laboratory for Molecular and Medical Biotechnology, College of Life Sciences,
Nanjing Normal University, Nanjing 210023, People's Republic of China;
c Division of Genome and Gene Expression Data Analysis, Bioinformatics Institute,
d Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599;
e Japanese Foundation of Cancer Research, Tokyo 1358550, Japan;
f Experimental Cancer Genetics, Hinxton Campus, Wellcome Trust Sanger Institute, Cambridge CB10 1HH, United Kingdom;
g Cancer Biology Program, Methodist Hospital Research Institute, Houston, TX 77030;
h Department of Biochemistry, Yong Loo Lin School of Medicine,
National University of Singapore, Singapore 117597; and
i School of Computer Science and Engineering, Nanyang Technological University,
1 L.C., P.J., and A.M.C.P. contributed equally to this work.
2 To whom correspondence may be addressed. Email: firstname.lastname@example.org,
email@example.com, or firstname.lastname@example.org.
Published in Proc Natl Acad Sci U. S. A. on 14 March 2017.
Robust prognostic gene signatures and therapeutic targets are difficult to derive from expression profiling because of the significant heterogeneity within breast cancer (BC) subtypes. Here, we performed forward genetic screening in mice using Sleeping Beauty transposon mutagenesis to identify candidate BC driver genes in an unbiased manner, using a stabilized N-terminal truncated β-catenin gene as a sensitizer. We identified 134 mouse susceptibility genes from 129 common insertion sites within 34 mammary tumors. Of these, 126 genes were orthologous to protein-coding genes in the human genome (hereafter, human BC susceptibility genes, hBCSGs), 70% of which are previously reported cancer-associated genes and ∼16% of which are known BC suppressor genes. Network analysis revealed a gene hub consisting of E1A binding protein P300 (EP300), CD44 molecule (CD44), neurofibromin (NF1), and phosphatase and tensin homolog (PTEN), which are linked to a significant number of mutated hBCSGs. From our survival prediction analysis of the expression of human BC genes in 2,333 BC cases, we isolated a six-gene-pair classifier that stratifies BC patients with high confidence into prognostically distinct low-, moderate-, and high-risk subgroups. Furthermore, we proposed prognostic classifiers identifying three basal and three claudin-low tumor subgroups. Intriguingly, our hBCSGs are mostly unrelated to cell cycle/mitosis genes and are distinct from the prognostic signatures currently used for stratifying BC patients. Our findings illustrate the strength and validity of integrating functional mutagenesis screens in mice with human cancer transcriptomic data to identify highly prognostic BC subtyping biomarkers.
Fig. Thirty-one of 126 hBCSGs code for proteins involved in a tumor-suppression network with EP300 as the hub. The boxes below the gene names indicate the annotation of those genes: blue boxes denote BC-mutated, red boxes denote cancer-driver mutated genes, green boxes denote tumor-suppressor–like genes, and yellow boxes denote oncogene-like genes.