Funder: BioTalent Canada Student Internship ; Funder: Canadian Institutes of Health Research (CIHR) Canadian Graduate Scholarship ; Funder: Ontario Institute for Cancer Research (OICR) Investigator Awards provided by the Government of Ontario; Terry Fox Research Institute (TFRI) and Canadian Institutes of Health Research (CIHR) New Investigator Awards. ; Multi-omics datasets represent distinct aspects of the central dogma of molecular biology. Such high-dimensional molecular profiles pose challenges to data interpretation and hypothesis generation. ActivePathways is an integrative method that discovers significantly enriched pathways across multiple datasets using statistical data fusion, rationalizes contributing evidence and highlights associated genes. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we integrated genes with coding and non-coding mutations and revealed frequently mutated pathways and additional cancer genes with infrequent mutations. We also analyzed prognostic molecular pathways by integrating genomic and transcriptomic features of 1780 breast cancers and highlighted associations with immune response and anti-apoptotic signaling. Integration of ChIP-seq and RNA-seq data for master regulators of the Hippo pathway across normal human tissues identified processes of tissue regeneration and stem cell regulation. ActivePathways is a versatile method that improves systems-level understanding of cellular organization in health and disease through integration of multiple molecular datasets and pathway annotations.
The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancer across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project that was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit similar gene expression signatures as samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments. ; B.J.R. received funding from NIH grants U24CA211000 and R01HG007069. J.M.S. received funding from NIH grants U24CA143858, R01CA180778, and U24CA210990. J.R. received funding from the Ontario Institute for Cancer Research (OICR) Investigator Award provided by the Government of Ontario, Operating Grant from Cancer Research Society (CRS) (#21089), the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (#RGPIN-2016-06485), and the Canadian Institutes of Health Research (CIHR) Project Grant. K.M. received funding from IWT/SBO NEMOA and FWO 3G046318 and G.0371.06 grants. J.M.G.I. received funding from the Novo Nordisk Foundation (NNF17OC0027594 and NNF14CC0001) and the Innovation Fund Denmark (5184-00102B). S.B. received funding from the Novo Nordisk Foundation (NNF17OC0027594 and NNF14CC0001). J.B. received funding from the BioTalent Canada Student Internship Program. A.V. and M.V. received funding from the Joint BSC-IRB-CRG Program in Computational Biology and the Severo Ochoa Award (SEV 2015-0493). M.A.R. was supported in part by the National Cancer Institute of the NIH (Cancer Target Discovery and Development Network grant U01CA217875)