Abstract

To benchmark algorithms for automated plasmid sequence reconstruction from short-read sequencing data, we selected 42 publicly available complete bacterial genome sequences spanning 12 genera, containing 148 plasmids. We predicted plasmids from short-read data with four programs (PlasmidSPAdes, Recycler, cBar and PlasmidFinder) and compared the outcome to the reference sequences. PlasmidSPAdes reconstructs plasmids based on coverage differences in the assembly graph. It reconstructed most of the reference plasmids (recall=0.82), but approximately a quarter of the predicted plasmid contigs were false positives (precision=0.75). PlasmidSPAdes merged 84 % of the predictions from genomes with multiple plasmids into a single bin. Recycler searches the assembly graph for sub-graphs corresponding to circular sequences and correctly predicted small plasmids, but failed with long plasmids (recall=0.12, precision=0.30). cBar, which applies pentamer frequency analysis to detect plasmid-derived contigs, showed a recall and precision of 0.76 and 0.62, respectively. However, cBar categorizes contigs as plasmid-derived and does not bin the different plasmids. PlasmidFinder, which searches for replicons, had the highest precision (1.0), but was restricted by the contents of its database and the contig length obtained from de novo assembly (recall=0.36). PlasmidSPAdes and Recycler detected putative small plasmids (<10 kbp), which were also predicted as plasmids by cBar, but were absent in the original assembly. This study shows that it is possible to automatically predict small plasmids. Prediction of large plasmids (>50 kbp) containing repeated sequences remains challenging and limits the high-throughput analysis of plasmids from short-read whole-genome sequencing data.
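
The recall and precision values quoted above can be read as follows; this is a minimal sketch that assumes each contig is simply labelled plasmid or non-plasmid. The abstract does not say how the study matched predictions to reference sequences, so the set-based comparison, the function name `precision_recall` and the contig identifiers are illustrative assumptions, not the authors' method.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for two collections of contig identifiers.

    predicted: contigs a tool labelled as plasmid-derived
    reference: contigs that truly belong to plasmids
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predicted plasmid contigs that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of true plasmid contigs that were recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the study.
reference_plasmid_contigs = {"c1", "c2", "c3", "c4"}
predicted_plasmid_contigs = {"c1", "c2", "c3", "c9"}

p, r = precision_recall(predicted_plasmid_contigs, reference_plasmid_contigs)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case one of the four predicted contigs is a false positive, which corresponds to the "approximately a quarter" reading of precision=0.75 given for PlasmidSPAdes.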

/content/journal/mgen/10.1099/mgen.0.000128
2017-08-18
2024-03-28
