Uganda Genome Resource enables insights into population history and genomic discovery in Africa
Summary GWAS and allele frequency data are publicly available at https://www.ebi.ac.uk/gwas/downloads/summary-statistics. The combined UG2G+AGV imputation panel is available for imputation from the Haplotype Reference Consortium: http://www.haplotype-reference-consortium.org/participating-cohorts. All individual-level phenotype, genotype and sequence data are available under managed access to researchers. Requests for access to the phenotypic data will be granted for all research consistent with the consent provided by participants. This would include any research in the context of health and disease that does not involve identifying the participants in any way. The UMIC committees are responsible for curation, storage, and sharing of phenotypic and genetic data under managed access. The array data and the low- and high-depth sequence data have been deposited at the European Genome-phenome Archive (EGA, http://www.ebi.ac.uk/ega/; accession numbers EGAS00001001558/EGAD00010000965, EGAS00001000545/EGAD00001001639 and EGAS00001000545/EGAD00001005346, respectively). Requests for access to data may be directed to segun.fatumo@mrcuganda.org. While data cannot be released on public databases as this would conflict with the study protocol and participant consent under which data were collected, we aim to facilitate data access for all bona fide researchers. Applications are reviewed by an independent data access committee (DAC), and access is granted within two weeks of submission if the request is consistent with the consent provided by participants. The data producers may be consulted by the DAC to evaluate potential ethical conflicts. Requestors also sign an agreement which governs the terms on which access to data is granted. ; The file associated with this record is under embargo until 12 months after publication, in accordance with the publisher's self-archiving policy. The full text may be available through the publisher links provided above.
; Genomic studies in African populations provide unique opportunities to understand disease aetiology, human diversity and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals, and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans is best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, haematological, lipid and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region. ; This work was funded by the Wellcome Trust, the Wellcome Sanger Institute (WT098051), the UK Medical Research Council (G0901213-92157, G0801566, and MR/K013491/1), and the Medical Research Council/Uganda Virus Research Institute Uganda Research Unit on AIDS core funding. This work was funded in part by IAVI with the generous support of the United States Agency for International Development (USAID) and other donors. The full list of IAVI donors is available at http://www.iavi.org. The contents of this manuscript are the responsibility of IAVI and co-authors and do not necessarily reflect the views of USAID or the US Government. DG is funded by a UKRI HDRUK Innovation Fellowship (reference number MR/S003711/1).
We thank the African Partnership for Chronic Disease Research (APCDR) for providing a network to support this study as well as a repository for deposition of curated data. We thank all study participants who contributed to this study. We also acknowledge the National Institute for Health Research Cambridge Biomedical Research Centre. The authors wish to acknowledge the use of the Uganda Medical Informatics Centre (UMIC) compute cluster. Computational support from UMIC was made possible through funding from the Medical Research Council (MC_EX_MR/L016273/1). We acknowledge the Sanger core pipeline teams for their help with sequencing and mapping the whole genome sequence data. The authors acknowledge with thanks the participants in the AADM project, their families and their physicians. The study was supported in part by the Intramural Research Program of the National Institutes of Health in the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute (NHGRI), the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the Center for Information Technology, and the Office of the Director at the National Institutes of Health (1ZIAHG200362). NS's research is supported by the Wellcome Trust (Grant Codes WT098051 and WT091310), the EU FP7 (EPIGENESYS Grant Code 257082 and BLUEPRINT Grant Code HEALTH-F5-2011-282510) and the National Institute for Health Research Blood and Transplant Research Unit (NIHR BTRU) in Donor Health and Genomics at the University of Cambridge in partnership with NHS Blood and Transplant (NHSBT). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR, the Department of Health or NHSBT. DG was funded by the MRC (MR/S003711/1). AJM was funded by the Wellcome Trust (WT106289). We acknowledge use of summary data from the Global Lipids Genetics Consortium (GLGC) (Consortium, 2013).
We acknowledge the H3Africa Bioinformatics Network (H3ABioNet) Node, National Biotechnology Development Agency (NABDA), Federal Ministry of Science and Technology (FMST) Abuja, Nigeria for funding SF for his post-doctoral research. DNC wishes to acknowledge the financial support of Qiagen Inc through a License Agreement with Cardiff University. We also acknowledge the 1000 Genomes Project, UK10K, the Simons Foundation Genome Diversity Project and the African Genome Variation Project (AGVP) for providing data resources that were used to contextualise the UG2G data. The GATK3 program was made available through the generosity of the Medical and Population Genetics program at the Broad Institute, Inc. The research was partially supported by the NIHR Leicester Biomedical Research Centre; the views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. L.V. Wain holds a GSK/British Lung Foundation Chair in Respiratory Research. ; Peer-reviewed ; Post-print