
Abstract

Illumina sequencing allows rapid, cheap and accurate whole genome bacterial analyses, but short reads (<300 bp) do not usually enable complete genome assembly. Long-read sequencing greatly assists with resolving complex bacterial genomes, particularly when combined with short-read Illumina data (hybrid assembly). However, it is not clear how different long-read sequencing methods affect hybrid assembly accuracy. Relative automation of the assembly process is also crucial to facilitating high-throughput complete bacterial genome reconstruction, avoiding multiple bespoke filtering and data manipulation steps. In this study, we compared hybrid assemblies for 20 bacterial isolates, including two reference strains, using Illumina sequencing and long reads from either Oxford Nanopore Technologies (ONT) or SMRT Pacific Biosciences (PacBio) sequencing platforms. We chose isolates from the family Enterobacteriaceae, as these frequently have highly plastic, repetitive genetic structures, and complete genome reconstruction for these species is relevant for a precise understanding of the epidemiology of antimicrobial resistance. We assembled genomes using the hybrid assembler Unicycler and compared different read processing strategies, as well as comparing to long-read-only assembly with Flye followed by short-read polishing with Pilon. Hybrid assembly with either PacBio or ONT reads facilitated high-quality genome reconstruction, and was superior to the long-read assembly and polishing approach evaluated with respect to accuracy and completeness. Combining ONT and Illumina reads fully resolved most genomes without additional manual steps, and at a lower consumables cost per isolate in our setting. Automated hybrid assembly is a powerful tool for complete and accurate bacterial genome assembly.
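As a rough illustration of the two strategies compared above (hybrid assembly with Unicycler versus long-read-only assembly with Flye followed by Pilon polishing), the sketch below only builds the command lines and does not run them. It is not the configuration used in the study: all file names are hypothetical, the helper function names are invented for this example, only basic documented options of each tool are shown, and it assumes a `pilon` wrapper executable is on the PATH (some installations instead invoke Pilon via `java -jar`).

```python
import shlex
from typing import List


def unicycler_hybrid_cmd(short_1: str, short_2: str, long_reads: str, out_dir: str) -> List[str]:
    """Build a Unicycler hybrid-assembly command (paired Illumina reads plus long reads)."""
    return ["unicycler", "-1", short_1, "-2", short_2, "-l", long_reads, "-o", out_dir]


def flye_cmd(long_reads: str, out_dir: str, platform: str = "ont") -> List[str]:
    """Build a Flye long-read-only assembly command for raw ONT or PacBio reads."""
    read_flag = {"ont": "--nano-raw", "pacbio": "--pacbio-raw"}[platform]
    return ["flye", read_flag, long_reads, "--out-dir", out_dir]


def pilon_cmd(draft_fasta: str, short_read_bam: str, out_prefix: str) -> List[str]:
    """Build a Pilon polishing command.

    The BAM file is assumed to contain Illumina reads already aligned to the
    draft assembly, sorted and indexed (alignment step not shown here).
    """
    return ["pilon", "--genome", draft_fasta, "--frags", short_read_bam, "--output", out_prefix]


# Hypothetical file names, for illustration only.
hybrid = unicycler_hybrid_cmd("isolate_R1.fastq.gz", "isolate_R2.fastq.gz",
                              "isolate_ont.fastq.gz", "unicycler_out")
long_only = flye_cmd("isolate_ont.fastq.gz", "flye_out", platform="ont")
polish = pilon_cmd("flye_out/assembly.fasta", "illumina_vs_flye.bam", "flye_pilon")

for cmd in (hybrid, long_only, polish):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Keeping the commands as data makes it straightforward to apply the same steps to many isolates without bespoke manual handling.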

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
/content/journal/mgen/10.1099/mgen.0.000294
2019-09-01
2024-04-26
