Abstract

The ability to distinguish circulating pathogen clones from one another is a fundamental requirement for understanding the epidemiology of infectious diseases. Phylogenetic analysis of genomic data can provide a powerful platform to identify lineages within bacterial populations, and thus inform outbreak investigation and the study of transmission dynamics. However, resolving differences between pathogens in low-variant (LV) populations, which have low median pairwise single nucleotide variant (SNV) distances, remains a major challenge. Here we present rPinecone, an R package designed to define sub-lineages within closely related LV populations. rPinecone uses a root-to-tip directional approach to define sub-lineages within a phylogenetic tree according to SNV distance from the ancestral node. The utility of this software was demonstrated using both simulated outbreaks and real genomic data from two LV populations: a hospital outbreak of methicillin-resistant Staphylococcus aureus and endemic Salmonella Typhi from rural Cambodia. rPinecone identified the transmission branches of the hospital outbreak and geographically confined lineages in Cambodia. Sub-lineages identified by rPinecone in both analyses were phylogenetically robust. It is anticipated that rPinecone can be used to discriminate between lineages of bacteria from LV populations where other methods fail, enabling a deeper understanding of infectious disease epidemiology for public health purposes.
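
The root-to-tip idea described above can be illustrated with a minimal sketch. This is not rPinecone's implementation or API (rPinecone is an R package); it is a simplified Python illustration under stated assumptions: the tree is a hypothetical toy dictionary with branch lengths already expressed in SNVs, and a sub-lineage is taken to be the first node, encountered while walking from the root, whose descendant tips all lie within a chosen SNV threshold of that node. The names `TOY_TREE`, `tips_below` and `define_sublineages` are invented for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy tree: node -> list of (child, branch length in SNVs).
# Nodes that do not appear as keys are tips (isolates).
TOY_TREE: Dict[str, List[Tuple[str, int]]] = {
    "root": [("n1", 5), ("n2", 6)],
    "n1": [("A", 1), ("B", 0)],
    "n2": [("C", 1), ("n3", 4)],
    "n3": [("D", 0), ("E", 1)],
}


def tips_below(tree, node, dist=0):
    """Return (tip, SNV distance from `node`) for every tip under `node`."""
    children = tree.get(node, [])
    if not children:
        return [(node, dist)]
    found = []
    for child, snvs in children:
        found.extend(tips_below(tree, child, dist + snvs))
    return found


def define_sublineages(tree, root, threshold):
    """Walk from the root towards the tips (assumed, simplified rule).

    The first node whose tips are all within `threshold` SNVs of it
    becomes one sub-lineage; otherwise the walk continues into its children.
    """
    clusters = []
    stack = [root]
    while stack:
        node = stack.pop()
        tips = tips_below(tree, node)
        if all(d <= threshold for _, d in tips):
            clusters.append(sorted(t for t, _ in tips))
        else:
            stack.extend(child for child, _ in tree[node])
    return sorted(clusters)


if __name__ == "__main__":
    for group in define_sublineages(TOY_TREE, "root", threshold=2):
        print(group)
```

With a threshold of 2 SNVs, the toy tree splits into three groups (A and B; C alone; D and E), whereas a threshold larger than every root-to-tip distance places all five tips in a single group. The threshold is the only tuning parameter in this sketch; the published package may apply additional criteria not shown here.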

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
/content/journal/mgen/10.1099/mgen.0.000264
2019-03-28
2024-04-25
Supplements

Supplementary File 1

Supplementary File 2

Supplementary File 3
