
Abstract

doi: 10.1099/mgen.0.000123.001.

Yersinia pestis is the causative agent of the bubonic plague, a disease responsible for several dramatic historical pandemics. Progress in ancient DNA (aDNA) sequencing has made it possible to sequence whole genomes of important human pathogens, including the ancient Y. pestis strains responsible for outbreaks of the bubonic plague in London in the 14th century and in Marseille in the 18th century, among others. However, aDNA sequencing data are still characterized by short reads and non-uniform coverage, so assembling ancient pathogen genomes remains challenging, which often prevents a detailed study of genome rearrangements. It has recently been shown that comparative scaffolding approaches can improve the assembly of ancient Y. pestis genomes at a chromosome level. In the present work, we address the last step of genome assembly, the gap-filling stage. We describe an optimization-based method, AGapEs (ancestral gap estimation), to fill in inter-contig gaps using a combination of a template obtained from related extant genomes and aDNA reads. We show how this approach can be used to refine comparative scaffolding by selecting contig adjacencies supported by a mix of unassembled aDNA reads and comparative signal. We applied our method to two Y. pestis data sets, from the London and Marseille outbreaks, and obtained highly improved assemblies for both genomes, comprising, respectively, five and six scaffolds, with 95 % of the assemblies supported by ancient reads. We analysed the genome evolution between the two ancient genomes in terms of genome rearrangements, and observed a high level of synteny conservation between these strains.

2017-07-08
2024-04-18
