For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale.
Improved risk stratification and prognosis prediction in sepsis is a critical unmet need. Clinical severity scores and available assays such as blood lactate reflect global illness severity with suboptimal performance, and do not specifically reveal the underlying dysregulation of sepsis. Here, we present prognostic models for 30-day mortality generated independently by three scientific groups using 12 discovery cohorts containing transcriptomic data collected from primarily community-onset sepsis patients. Predictive performance is validated in five cohorts of community-onset sepsis patients, in which the models show summary AUROCs ranging from 0.765 to 0.89. Similar performance is observed in four cohorts of hospital-acquired sepsis. Combining the new gene-expression-based prognostic models with prior clinical severity scores leads to significant improvement in prediction of 30-day mortality as measured via AUROC and net reclassification improvement index. These models provide an opportunity to develop molecular bedside tests that may improve risk stratification and mortality prediction in patients with sepsis.
[...] by NIGMS Glue Grant Legacy Award R24GM102656. J.F.B.-M., R.A., and E.T. were supported by Instituto de Salud Carlos III (grants EMER07/050, PI13/02110, PI16/01156). R.J.L. was supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under award number UL1TR001417. The CAPSOD study was supported by NIH (U01AI066569, P20RR016480, HHSN266200400064C). P.K. is supported by grants from the Bill & Melinda Gates Foundation, R01 AI125197-01, 1U19AI109662, and U19AI057229, outside the submitted work.
The GAinS study was supported by the National Institute for Health Research through the Comprehensive Clinical Research Network for patient recruitment; Wellcome Trust (Grants 074318 [to J.C.K.], and 090532/Z/09/Z [core facilities Wellcome Trust Centre for Human Genetics including High-Throughput Genomics Group]); European Research Council under the European Union's Seventh Framework Programme (FP7/2007–2013)/ERC Grant agreement no. 281824 (to J.C.K.); the Medical Research Council (98082 [to J.C.K.]); UK Intensive Care Society; and NIHR Oxford Biomedical Research Centre. The Duke HAI study was supported by a research agreement between Duke University and Novartis Vaccines and Diagnostics, Inc. According to the terms of the agreement, representatives of the sponsor had an opportunity to review and comment on a draft of the manuscript. The authors had full control of the analyses, the preparation of the manuscript, and the decision to submit the manuscript for publication. For the University of Florida 'P50' Study, data were obtained from the Sepsis and Critical Illness Research Center (SCIRC) at the University of Florida College of Medicine, which is supported in part by NIGMS P50 GM111152. This work was supported by the Defense Advanced Research Projects Agency and the Army Research Office through Grant W911NF-15-1-0107.
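The sepsis abstract above reports model performance as AUROC and net reclassification improvement (NRI) for 30-day mortality. As a rough illustration of what those two measures compute, the sketch below evaluates a clinical-only risk score against a combined score. It is not the authors' pipeline: the function names, the 0.5 risk threshold, the two-category form of NRI, and all data values are assumptions made up for this example.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (event, non-event) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic; tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def two_category_nri(old_risk, new_risk, labels, threshold=0.5):
    """Two-category NRI at a single (hypothetical) risk threshold.

    NRI = (net fraction of events moved up)
        + (net fraction of non-events moved down).
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    for old, new, y in zip(old_risk, new_risk, labels):
        old_high, new_high = old >= threshold, new >= threshold
        if new_high and not old_high:      # reclassified to high risk
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif old_high and not new_high:    # reclassified to low risk
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1
    n_event = sum(labels)
    n_nonevent = len(labels) - n_event
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Synthetic toy data (invented for illustration): 1 = died within 30 days.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
clinical_risk = [0.6, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.3]  # clinical score only
combined_risk = [0.8, 0.6, 0.4, 0.4, 0.2, 0.5, 0.1, 0.2]  # clinical + gene expression

print("AUROC, clinical only:", auroc(clinical_risk, labels))
print("AUROC, combined:     ", auroc(combined_risk, labels))
print("Two-category NRI:    ", two_category_nri(clinical_risk, combined_risk, labels))
```

The pairwise AUROC is quadratic in cohort size and is used here only because it makes the definition explicit. Studies of this kind may also use a continuous (category-free) NRI, which the two-category version here does not reproduce.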
Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of the TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types.