Computational pan-genomics: Status, promises and challenges

Marschall, Tanja; Marz, Manja; Abeel, Thomas; Dijkstra, Louis; Dutilh, Bas; Ghaffaari, Ali; Kersey, Paul; Kloosterman, Wigard; Mäkinen, Veli; Novak, Adam M.; Paten, Benedict; Porubsky, David; Rivals, Eric; Alkan, Can; Baaijens, Jasmijn A.; de Bakker, Paul; Boeva, Valentina; Bonnal, Raoul J.P.; Chiaromonte, Francesca; Chikhi, Rayan; Ciccarelli, Francesca D.; Cijvat, Robin; Datema, Erwin; van Duijn, Cornelia; Eichler, Evan; Ernst, Corinna; Eskin, E.; Garrison, Erik; El-Kebir, Mohammed; Klau, Gunnar W.; Korbel, Jan; Lameijer, Eric-Wubbo; Langmead, Benjamin; Martin, Marcel; Medvedev, Paul; Mu, John C.; Neerincx, Pieter B T; Ouwens, Klaasjan; Peterlongo, Pierre; Pisanti, Nadia; Rahmann, S.; Raphael, Benjamin J.; Reinert, Knut; de Ridder, Dick; de Ridder, Jeroen; Schlesner, Matthias; Schulz-Trieglaff, Ole; Sanders, Ashley D.; Sheikhizadeh, Siavash; Shneider, Carl; Smit, Sandra; Valenzuela, Daniel; Wang, Jiayin; Wessels, Lodewyk; Zhang, Ying; Guryev, Victor; Vandin, Fabio; Ye, Kai; Schönhuth, Alexander

doi:10.1093/bib/bbw089

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.
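The shift from string to graph representations mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the class `SequenceGraph`, the helper `build_toy_graph`, the node identifiers and the toy sequences are all hypothetical and chosen only for illustration. Nodes carry short sequence labels, directed edges say which nodes may follow one another, and each genome in the collection corresponds to a path through the graph.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SequenceGraph:
    """Toy sequence graph (illustrative assumption, not the article's data structure)."""

    nodes: dict[int, str] = field(default_factory=dict)          # node id -> sequence label
    edges: set[tuple[int, int]] = field(default_factory=set)     # directed (source, target) pairs

    def add_node(self, node_id: int, sequence: str) -> None:
        self.nodes[node_id] = sequence

    def add_edge(self, source: int, target: int) -> None:
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError(f"unknown node in edge ({source}, {target})")
        self.edges.add((source, target))

    def spell(self, path: list[int]) -> str:
        """Concatenate node labels along a path, checking that each step is an edge."""
        for source, target in zip(path, path[1:]):
            if (source, target) not in self.edges:
                raise ValueError(f"no edge from node {source} to node {target}")
        return "".join(self.nodes[node_id] for node_id in path)


def build_toy_graph() -> SequenceGraph:
    """Build a four-node graph with one single-nucleotide variant (hypothetical data)."""
    graph = SequenceGraph()
    graph.add_node(1, "ACG")   # shared prefix
    graph.add_node(2, "T")     # allele carried by the linear reference
    graph.add_node(3, "A")     # alternative allele
    graph.add_node(4, "GGC")   # shared suffix
    for source, target in [(1, 2), (1, 3), (2, 4), (3, 4)]:
        graph.add_edge(source, target)
    return graph


graph = build_toy_graph()
reference_sequence = graph.spell([1, 2, 4])    # path taken by the linear reference
alternative_sequence = graph.spell([1, 3, 4])  # path taken by the variant genome
print(reference_sequence, alternative_sequence)
```

In this toy setting a string reference would store only the first path, whereas the graph stores the shared prefix and suffix once and keeps both alleles as alternative branches.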

Additional Metadata
Keywords	Data structures, Haplotypes, Pan-genome, Read mapping, Sequence graph
Persistent URL	doi.org/10.1093/bib/bbw089, hdl.handle.net/1765/112706
Journal	Briefings in Bioinformatics
Citation	Marschall, T., Marz, M. (Manja), Abeel, T. (Thomas), Dijkstra, L. (Louis), Dutilh, B., Ghaffaari, A. (Ali), … Schönhuth, A. (Alexander). (2018). Computational pan-genomics: Status, promises and challenges. Briefings in Bioinformatics, 19(1), 118–135. doi:10.1093/bib/bbw089

