GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing
The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help to fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, with a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel can impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with mononeuritis of the lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies. ; GCAT|Genomes for Life, a cohort study of the Genomes of Catalonia, Fundació Institut Germans Trias i Pujol (IGTP); IGTP is part of the CERCA Program/Generalitat de Catalunya; GCAT is supported by Acción de Dinamización del ISCIII-MINECO; Ministry of Health of the Generalitat of Catalunya [ADE 10/00026]; Agència de Gestió d'Ajuts Universitaris i de Recerca (AGAUR) [2017-SGR 529]; B.C. is supported by national grants [PI18/01512]; X.F. 
is supported by the VEIS project [001-P-001647] (co-funded by the European Regional Development Fund (ERDF), 'A way to build Europe'); a full list of the investigators who contributed to the generation of the GCAT data is available from www.genomesforlife.com/; Severo Ochoa Program, awarded by the Spanish Government [SEV-2011-00067 and SEV2015-0493]; Spanish Ministry of Science [TIN2015-65316-P]; Innovation and by the Generalitat de Catalunya [2014-SGR-1051 to D.T.]; Agencia Estatal de Investigación (AEI, Spain) [BFU2016-77244-R and PID2019-107836RB-I00]; European Regional Development Fund (FEDER, EU) (to M.C.); Spanish Ministry of Science and Innovation [FPI BES-2016-0077344 to J.V.M.]; C.S. received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement [H2020-MSCA-COFUND-2016-754433]; this study made use of data generated by the UK10K Consortium from UK10K COHORT IMPUTATION [EGAS00001000713]; formal agreement with the Barcelona Supercomputing Center (BSC); this study made use of data generated by the Genome of the Netherlands project, which is funded by the Netherlands Organization for Scientific Research [184021007], allowing us to use the GoNL reference panel containing SVs, upon request (GoNL Data Access request 2019203); this study also used data generated by the Haplotype Reference Consortium (HRC) accessed through the European Genome-phenome Archive with the accession number EGAD00001002729; formal agreement of the Barcelona Supercomputing Center (BSC) with WTSI; this study made use of data generated by the 1000 Genomes Project (1000G), accessed through the FTP portal (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/); this study used the GeneHancer-for-AnnotSV dump for GeneCards Suite Version 4.14, through a formal agreement between the BSC and the Weizmann Institute of Science. 
; Peer Reviewed ; "Article signed by 21 authors: Jordi Valls-Margarit, Iván Galván-Femenía, Daniel Matías-Sánchez, Natalia Blay, Montserrat Puiggròs, Anna Carreras, Cecilia Salvoro, Beatriz Cortés, Ramon Amela, Xavier Farre, Jon Lerga-Jaso, Marta Puig, Jose Francisco Sánchez-Herrero, Victor Moreno, Manuel Perucho, Lauro Sumoy, Lluís Armengol, Olivier Delaneau, Mario Cáceres, Rafael de Cid, David Torrents" ; Postprint (published version)