Abstract

The standard workhorse for genomic analysis of the evolution of bacterial populations is phylogenetic modelling of mutations in the core genome. However, a notable amount of information about evolutionary and transmission processes in diverse populations can be lost unless the accessory genome is also taken into consideration. Here, we introduce panini (Pangenome Neighbour Identification for Bacterial Populations), a computationally scalable method for identifying the neighbours of each isolate in a data set using unsupervised machine learning based on the t-SNE (t-distributed stochastic neighbour embedding) algorithm. panini is browser-based and integrates with the Microreact platform for rapid online visualization and exploration of both core and accessory genome evolutionary signals, together with relevant epidemiological, geographical, temporal and other metadata. Several case studies with single- and multi-clone pneumococcal populations are presented to demonstrate its ability to identify biologically important signals from gene content data. panini is available at http://panini.pathogen.watch, and its code at http://gitlab.com/cgps/panini.
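
To make the idea of neighbour identification from gene content concrete, the following is a minimal sketch, not panini's actual implementation. It computes the Gaussian neighbour probabilities p(j|i) that form the high-dimensional similarity step of t-SNE, using a toy gene presence/absence matrix. The isolate names, the matrix values, the helper function names and the fixed bandwidth `sigma` are all assumptions made for illustration; a full t-SNE run would tune the bandwidth per isolate from a perplexity setting and then optimise a low-dimensional embedding.

```python
import math

# Toy gene presence/absence matrix (hypothetical data): each isolate maps to
# a list of accessory genes, where 1 = gene present and 0 = gene absent.
gene_content = {
    "isolate_A": [1, 1, 0, 0, 1, 0],
    "isolate_B": [1, 1, 0, 0, 0, 0],
    "isolate_C": [0, 0, 1, 1, 0, 1],
    "isolate_D": [0, 0, 1, 1, 1, 1],
}


def hamming(u, v):
    """Count the genes whose presence/absence differs between two isolates.

    For 0/1 vectors this equals the squared Euclidean distance used by t-SNE.
    """
    return sum(a != b for a, b in zip(u, v))


def neighbour_probabilities(profiles, sigma=1.0):
    """Return p(j|i) for every ordered pair of distinct isolates.

    Uses a Gaussian kernel on the squared distance, as in the input
    similarities of t-SNE, but with one fixed bandwidth (an assumption)
    rather than a per-isolate perplexity search.
    """
    names = list(profiles)
    probs = {}
    for i in names:
        weights = {
            j: math.exp(-hamming(profiles[i], profiles[j]) / (2 * sigma ** 2))
            for j in names
            if j != i
        }
        total = sum(weights.values())
        probs[i] = {j: w / total for j, w in weights.items()}
    return probs


def nearest_neighbour(probs):
    """Pick, for each isolate, the isolate with the highest p(j|i)."""
    return {i: max(row, key=row.get) for i, row in probs.items()}


probs = neighbour_probabilities(gene_content)
nearest = nearest_neighbour(probs)
for isolate, neighbour in nearest.items():
    print(isolate, "->", neighbour)
```

In this toy data set, isolates A and B differ in a single gene, as do C and D, so each pair ends up as mutual nearest neighbours. This is the kind of accessory-genome grouping that a core-genome phylogeny alone would not reveal.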

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
/content/journal/mgen/10.1099/mgen.0.000220
2018-11-22
2024-03-29
