Toolbox for bacterial
population analysis
using NGS
INTRODUCTION OF BACTERIAL POPULATION GENOMICS AND EVOLUTION
MIRKO ROSSI
ASS. PROF. ENVIRONMENTAL HYGIENE, FACULTY OF VETERINARY MEDICINE
I’m a vet, not a bioinformatician.. I’m a good example of an end user!
I do not want to teach population genetics today … just give you some tips on how to do it using
NGS in bacteria
If you are interested in bacterial population analysis … we are organizing an ad hoc course in
Spring ..
There are many more programs and pipelines.. These are the ones I like, know and use
If you want the slides, send me an email: mirko.rossi@helsinki.fi
If you are an MSc student in bioinformatics and are interested in a thesis in applied bioinformatics in public
health microbiology and pathogen surveillance, please contact me ..
Bacterial population
A group of individuals of the same species
POPULATIONS, not individuals, evolve
Population and community are two different concepts … WE ARE SPEAKING OF INDIVIDUALS OF
THE SAME SPECIES!!!! … although the definition of species in bacteriology is quite vague
Population genomics attempts to understand the population through whole-genome analysis of a sample
of it -> investigating the variation of a subset of individual members of the population
“Sequence data is ideal for this, as the differences between individuals are often tiny (i.e. there
is very little variation) since they belong to a single population, and DNA sequence data allows
us to detect single nucleotide changes (i.e. provides high resolution)” (Kat Holt)
The sample is a subset of the population
Population (universe, reality, state of nature, truth) -> parameters
Sample (finite, random; subject to noise, error, perturbation) -> statistics
Statistical inference: Extract maximum information from
sample in order to draw conclusions about population
Inductive not deductive
Source John Bunge
How many samples do I need to
sequence?
It depends on your question!
Accuracy is important.. but big numbers help!
Draft genomes are enough. Closing a genome is a waste of time and money!
Good draft: 100 €/sample; closed genome: > 3000 €/sample
Include in your analysis as much diversity as possible (time, space, phenotypes,...)
Sequence as much as you can … just stop before you go broke!!
1000 strains < 100 000 €
Bacterial population… different levels
Population of H. pylori living in a single stomach
Population of H. pylori circulating globally
What do we want to measure?
Genetic Drift
◦ the change in the gene pool of a small population due to chance
Natural Selection
◦ Alleles that increase fitness accumulate in the population
◦ Causes ADAPTATION of populations
Gene Flow
◦ is genetic exchange due to the migration of individuals between populations
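To make the "small population, due to chance" part of genetic drift concrete, here is a minimal sketch. It assumes a neutral Wright-Fisher model for a haploid population, which is my choice for illustration and not something taken from the slides; the function name, population sizes and seed are all hypothetical.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=0):
    """Track the frequency of one neutral allele under random sampling only."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Every individual of the next generation copies a randomly chosen parent,
        # so it carries the allele with probability equal to the current frequency.
        carriers = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = carriers / pop_size
        trajectory.append(freq)
    return trajectory


# Hypothetical population sizes: the smaller one is expected to wander more.
small = wright_fisher(pop_size=20, start_freq=0.5, generations=50)
large = wright_fisher(pop_size=2000, start_freq=0.5, generations=50)
print("small population, final frequency:", small[-1])
print("large population, final frequency:", large[-1])
```

No selection or migration is modelled here, so any change in frequency comes from sampling alone; in small populations the allele will often drift to loss or fixation.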
How do we measure (using NGS)?
Identify variants:
◦ SNP approach
◦ Gene-by-gene approach
Define which part of the gene pool is common in all the individuals of the population (core)
and which part is not (accessory)
Use of phylogenetic frameworks for reconstructing genealogy and non-phylogenetic
clustering methods for inferring population structure
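The core/accessory split described above comes down to set operations once every strain has a list of genes (or ortholog clusters). A minimal sketch, assuming the gene content is already available as Python sets; the strain and gene names below are toy data, not results.

```python
def split_pangenome(gene_content):
    """gene_content: dict mapping strain name -> set of gene (cluster) names.

    Needs at least one strain.
    """
    gene_sets = list(gene_content.values())
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    accessory = pan - core                   # genes missing from at least one strain
    return core, accessory, pan


# Hypothetical toy data
strains = {
    "strain1": {"gyrA", "recA", "dnaA", "cagA"},
    "strain2": {"gyrA", "recA", "dnaA"},
    "strain3": {"gyrA", "recA", "dnaA", "vacA", "cagA"},
}
core, accessory, pan = split_pangenome(strains)
print(sorted(core))       # ['dnaA', 'gyrA', 'recA']
print(sorted(accessory))  # ['cagA', 'vacA']
```

In practice a strict "present in all strains" rule is sensitive to assembly gaps in draft genomes, so the threshold is a design choice.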
Applications
Outbreak determination
Pathogen transmission
Understanding epidemics
Pathogen surveillance
Understanding evolution of bacteria
….
@jennifergardy
Identifying variants: SNP approaches
sample -> NGS (WGS) -> reads -> mapping to reference -> VCF/FASTA file with SNPs
• Needs a reference strain
• Monomorphic (Clonal) species
• Recombination/Horizontal gene transfer is a
problem
• Difficult to create a nomenclature
Source J. Carriço
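Once every isolate has been mapped to the same reference, the SNP approach reduces to comparing columns of an alignment. The sketch below assumes the alignment is already in memory as equal-length strings and keeps only columns where all isolates have an unambiguous base; the sequences and names are made up, and real pipelines work from VCF files with quality filters that are not shown here.

```python
def variable_sites(alignment):
    """Return 0-based positions where all sequences have A/C/G/T and they differ."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for pos in range(length):
        column = {alignment[name][pos].upper() for name in names}
        # Skip columns with gaps or ambiguous bases, keep polymorphic ones.
        if column <= set("ACGT") and len(column) > 1:
            sites.append(pos)
    return sites


def snp_distance(seq_a, seq_b, sites):
    """Count differences between two sequences over the selected sites."""
    return sum(1 for pos in sites if seq_a[pos].upper() != seq_b[pos].upper())


# Hypothetical toy alignment against a reference
aln = {
    "ref":  "ATGCCGTA",
    "iso1": "ATGTCGTA",
    "iso2": "ATGTCG-A",
    "iso3": "ACGTCGTA",
}
sites = variable_sites(aln)
print(sites)                                         # [1, 3]
print(snp_distance(aln["ref"], aln["iso3"], sites))  # 2
```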
Identifying variants: Gene-by-gene
sample -> NGS (WGS) -> reads -> assembly -> contigs -> central nomenclature server (schemas, allele definitions and identifiers) -> output: allelic profile
• No need for a reference strain
• Buffers the effect of recombination
• Simpler to create a nomenclature
• Population structure of non-monomorphic species
• Multiple schemas can be defined for a single species
Source J. Carriço
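The idea of a central nomenclature server can be sketched as a lookup table that gives every distinct sequence at a locus a stable number, so that each genome becomes an allelic profile. This is a toy, in-memory stand-in under my own assumptions (class name, numbering rule and sequences are hypothetical); real servers also curate alleles and check sequence quality.

```python
class AlleleNomenclature:
    """Toy stand-in for a nomenclature server: locus -> {sequence: allele number}."""

    def __init__(self):
        self._alleles = {}

    def allele_id(self, locus, sequence):
        known = self._alleles.setdefault(locus, {})
        seq = sequence.upper()
        if seq not in known:
            # A sequence never seen at this locus gets the next free number.
            known[seq] = len(known) + 1
        return known[seq]

    def profile(self, genes):
        """genes: dict locus -> sequence found in one assembly."""
        return {locus: self.allele_id(locus, seq) for locus, seq in sorted(genes.items())}


server = AlleleNomenclature()
p1 = server.profile({"locusA": "ATGAAA", "locusB": "ATGCCC"})
p2 = server.profile({"locusA": "ATGAAG", "locusB": "ATGCCC"})
print(p1)  # {'locusA': 1, 'locusB': 1}
print(p2)  # {'locusA': 2, 'locusB': 1}
```

Because identical sequences always map to the same number, profiles produced in different labs stay comparable as long as they query the same server.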
Sequencing platforms
Loman et al., 2012, Nature Reviews Microbiology
… I’m just using Illumina
For both de novo and re-sequencing
At the moment Illumina gives the
best benefit-cost ratio:
• High throughput
• Accuracy
• Possibility of multiplexing
• Reasonable workflow time
• Easily accessible
For small genomes (1 to 2 Mb) it is
nowadays possible to sequence at
~90 euro/sample with a minimum of
40x coverage
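The coverage figure can be checked with simple arithmetic: expected coverage = number of reads x read length / genome size. A short sketch; the genome size and read length below are example values I picked, not numbers from the slides.

```python
import math


def expected_coverage(n_reads, read_length, genome_size):
    """Average depth if reads were spread evenly over the genome."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


genome_size = 1_700_000  # hypothetical 1.7 Mb genome
read_length = 150        # hypothetical read length in bases
n = reads_needed(40, read_length, genome_size)
print(n)                                               # 453334
print(expected_coverage(n, read_length, genome_size))  # slightly above 40
```

This is an average; real depth varies along the genome, so planning for some margin above the minimum is sensible.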
I have the reads for each strain.. OK, and now?
An overview of the main programs, platforms and approaches … sometimes it is a matter of style!
I want some results from reads…
You can always map your reads against a close reference genome using "classical" short-read
aligners and extract SNPs: BWA for example
Here just a (long) list http://omictools.com/read-alignment-c83-p1.html
Now you just need to decide the reference genome
Note that you might need to select more than one reference genome to tune your analysis
…Be aware that software designed specifically for
bacterial genomes is available
Assembly-free analyses
SNP CALLING AND CORE GENOME ALIGNMENTS - REFERENCE BASED MAPPING
Snippy
◦ One-by-one
◦ Combine a set of results that use the same reference to
generate a core SNP alignment
◦ A lot of output files
◦ Variants: SNPs, MNPs, INDELs, MIX
Input Requirements
◦ a reference genome in FASTA or GENBANK
format (can be in multiple contigs)
◦ query sequence read files in FASTQ or FASTA
format (can be .gz compressed)
Wombac
◦ Fast and “dirty”; several samples in a run
◦ Computations can be re-used for building new trees
◦ looks for substitution SNPs, not indels, and it may
miss some SNPs
Input Requirements
◦ a reference genome in FASTA or GENBANK format
(can be in multiple contigs)
◦ query sequences in
◦ a folder containing FASTQ short reads: eg. R1.fq.gz R2.fq.gz
◦ a multi-FASTA file: eg. contigs.fa or NC_273461.fna
◦ a .tar.gz file containing FASTA contig files: eg.
Ecoli_K12mut.contig.tar.gz (from EBI/NCBI)
https://github.com/tseemann/wombac
https://github.com/tseemann/snippy
@torstenseemann
Assembly-free analyses
SHORT READ SEQUENCE TYPING
SRST2
◦ Designed specifically for bacterial genomes
◦ Queries Illumina sequence data against an MLST database and/or a database
of gene sequences
◦ Reports the presence of STs (allele designation) and/or reference genes
Input Requirements
◦ Query: Illumina reads (fastq.gz format; other options are available)
◦ A FASTA reference sequence database to match to:
◦ For MLST, this means a FASTA file of all allele sequences. If you want to assign STs, you also need a
tab-delimited file which defines the ST profiles as a combination of alleles.
◦ For resistance/virulence genes, this means a fasta file of all the resistance genes/alleles that you
want to screen for, clustered into gene groups.
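The tab-delimited ST profile file mentioned above is a lookup table: each row pairs one ST with one combination of allele numbers. The sketch below shows that lookup only. It assumes a simplified layout (an "ST" column followed by one column per locus) with made-up loci and numbers, and it is not how SRST2 itself calls alleles from reads.

```python
import csv
import io


def load_st_profiles(handle):
    """Read a tab-delimited table: header 'ST' plus locus names, one row per ST."""
    reader = csv.DictReader(handle, delimiter="\t")
    loci = [name for name in reader.fieldnames if name != "ST"]
    table = {}
    for row in reader:
        # The combination of alleles, in locus order, is the key for the ST.
        table[tuple(row[locus] for locus in loci)] = row["ST"]
    return loci, table


def assign_st(alleles, loci, table):
    """alleles: dict locus -> allele number called for one isolate."""
    key = tuple(str(alleles[locus]) for locus in loci)
    return table.get(key, "unknown")


# Hypothetical profile definitions (three loci, two STs)
profiles_tsv = "ST\tgeneA\tgeneB\tgeneC\n1\t1\t1\t1\n2\t1\t2\t1\n"
loci, table = load_st_profiles(io.StringIO(profiles_tsv))
print(assign_st({"geneA": 1, "geneB": 2, "geneC": 1}, loci, table))  # 2
print(assign_st({"geneA": 9, "geneB": 2, "geneC": 1}, loci, table))  # unknown
```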
https://github.com/katholt/srst2
@DrKatHolt
Stand-alone pipeline for SNP variants
Nullarbor
◦ Clean reads
◦ Species identification -> k-mer analysis against a known genome database (Kraken)
◦ De novo assembly
◦ Annotation
◦ MLST
◦ Resistome
◦ SNP Variants
https://github.com/tseemann/nullarbor
@torstenseemann
… or you might prefer to assemble your
genomes!
When you know little or nothing about your dataset (it is not possible to select a
reference genome)
In case of deep comparative genomics, when you are also interested in the accessory
genome (genes absent from your reference)
To extract the pangenome
Because having all your dataset assembled will facilitate downstream
applications
To develop common NOMENCLATURE
The never-ending nomenclature story…
Source J. Carriço
Assembly short reads
REFERENCE BASED ASSEMBLY
Mira (best assembler … for geeks since 1999 )
◦ multi-pass assembler/mapper for small genomes
(up to 150 Mb)
◦ has full overview on the whole project at any time
of the assembly, using all available data and
learning from mistakes
◦ Marks places of interest with tags so that these can
be found quickly in finishing programs
◦ Can also do de novo and hybrid assembly
Input Requirements
◦ various formats (CAF, FASTA, FASTQ or PHD) from
Sanger, 454, Ion Torrent, Illumina
DE NOVO ASSEMBLY
SPAdes (a very good assembler for lazy people)
◦ Intended for both standard isolates and single-cell
MDA bacterial assemblies
◦ It does its job, and does it very well
◦ Simple to run
spades.py --careful -1 R1.fastq.gz -2 R2.fastq.gz -o output_folder
◦ Can use Nanopore and PacBio reads for hybrid
assembly
Andrey’s lecture from WBG2014
https://docs.google.com/presentation/d/1wjrJGKhQQEHDwHF5OhQQyKnj5_c7duTAQjcDsBHTkWQ/edit#slide=id.g47b5b1626_0793
http://sourceforge.net/projects/mira-assembler/
http://bioinf.spbau.ru/spades
@BaCh_mira
Pangenome alignment
(up to 50 strains)
MUGSY
Genomes should be very similar
Mugsy (like Mauve) generates a
multi-block local alignment
The alignment is in MAF format
MAUVE
Large-scale evolutionary events
It can align more divergent strains than Mugsy:
as little as 50% nucleotide identity
It aligns the pan-genome
Complete genome alignment in the eXtended
Multi-FastA (XMFA) format
Lists groups of genes that are predicted to be
positionally orthologous
GUI available
http://mugsy.sourceforge.net/
http://darlinglab.org/mauve/
Core genome alignment
PARSNP
Designed to align the core genome of hundreds to thousands of bacterial genomes within a few
minutes to few hours
Strains must be very similar… it uses MUMi to select the nearest genomes: only the ones with
distance <= 0.01 are included, all others are discarded.
Input can be both draft assemblies and finished genomes, and output includes variant (SNP)
calls, core genome phylogeny and multi-alignments
Results are visualized using a GUI
https://harvest.readthedocs.org/en/latest/content/parsnp.html
Gene-by-gene: pangenome, core genome, accessory genome
assembly -> structural annotation (Prodigal, Prokka, RAST) -> ortholog clustering (OrthAgogue, Roary)
Structural annotation
PRODIGAL
Gene finder
Very fast -> 3000 genomes in about a week (8 CPUs,
16 GB RAM)
Prodigal can be run in one step on a single
genomic sequence or on a draft genome
containing many sequences.
It does not need to be supplied with any
knowledge of the organism, as it learns all the
properties it needs to on its own.
PROKKA
Structural and functional annotation
Fast automatic annotation -> < 15 min on multiple cores
Several dependencies -> tedious to install (… I told you I’m very lazy!)
http://www.slideshare.net/torstenseemann/prokka-rapid-bacterial-genome-annotation-abphm-2013?related=1
https://github.com/hyattpd/prodigal/wiki
https://github.com/tseemann/prokka
Ortholog clustering
ORTHAGOGUE
high speed estimation of homology relations
within and between species in massive data
sets
Easy to use and offers flexibility through a
range of optional parameters
Input = all-against-all BLAST tabular output
Output = mcl file
-u -o XX -> ignore e-value, use BLAST score,
exclude proteins with overlap < XX
ROARY
high speed stand alone pan genome pipeline
128 samples can be analysed in under 1 hour
using 1 GB of RAM and a single processor
Input = GFF3 format produced by Prokka
roary -e --mafft *.gff
FastTree -nt -gtr core_gene_alignment.aln > my_tree.newick
Output = several files
https://code.google.com/p/orthagogue/
http://sanger-pathogens.github.io/Roary/
Gene-by-gene: pangenome, core genome, accessory genome
Ortholog clustering results -> ad hoc scripts -> core genome, accessory genome, pangenome
(everything included in Roary but not in OrthAgogue)
Phylogeny: RAxML, FastTree, BEAST
Population structure: BAPS, STRUCTURE
Recombination: BRATNEXTGEN, GUBBINS
cgMLST and wgMLST
[Figure: loci L1 to L9 shown across Strain 1 to Strain 6]
Core Genome -> cgMLST Accessory genome
Core Genome+ Accessory Genome = PanGenome -> wgMLST
Source J. Carriço
@jacarrico
cgMLST and wgMLST
Open source
BACTERIAL ISOLATE GENOME SEQUENCE DATABASE
◦ Jolley & Maiden 2010, BMC Bioinformatics 11:595 - http://pubmlst.org/software/database/bigsdb/
◦ PROs: Freely available, open-source, handles thousands of genomes, has several schemas implemented
for MLST for several bacterial species, and some extended MLST and core genome MLST (mainly Neisseria
sp. but soon to be expanded)
◦ CONs: Requires Perl knowledge to install and maintain
Source J. Carriço
@jacarrico
cgMLST and wgMLST
Commercial software
RIDOM SEQSPHERE+
◦ http://www.ridom.com/seqsphere/
◦ with client server solutions from assembly to allele calling and visualization for core genome MLST
(MLST+/ cgMLST)
APPLIED MATHS - BIONUMERICS 7.5
◦ http://www.applied-maths.com/news/bionumerics-version-75-released
◦ Commercial software with client server solutions from assembly to allele calling and visualization for
whole genome MLST (wgMLST)
Source J. Carriço
@jacarrico
cgMLST with Genome Profiler
Indexes the alleles of the loci that are shared by the bacterial isolates, using both BLASTN and
BLASTX
Transforms WGS data into allele profile data
Using a reference genome -> it attempts to account for gene paralogy using conserved gene
neighborhoods
http://jcm.asm.org/content/53/5/1765.abstract
cgMLST with Genome Profiler
Input files
◦ reference genome in gbk format (even in multi-gbk format from RAST) or a multi-FASTA file of the allele
sequences
◦ Query genomes in FASTA format (complete or draft – in contigs)
If you run the data for the first time, you use one of the genomes as reference to build a new
cgMLST scheme (ad hoc mode):
◦ perl GeP.pl -r NC_017282.gbk -g genome_list.txt
Data can be run with the cgMLST scheme created previously by GeP:
◦ perl GeP.pl -g genome_list.txt -o
Or you could use a multi-FASTA file of the allele sequences (nt) as reference (in this case all
possible paralogs are excluded - a fixed number of 999999999 will be assigned to expect-d)
◦ perl GeP.pl -r NC_017282.ffn -g genome_list.txt -n
cgMLST with Genome Profiler
Output files:
◦ output.txt -> records the information of all the loci in each of the test genome sequences
◦ difference_matrix.html -> contains a summary of the analysis and a matrix of pairwise
differences between the allelic profiles of the samples.
◦ Splitstree.nex -> allele profile of the isolates in NEXUS format, which can be opened in
SplitsTree 4
◦ allele_profile.txt -> matrix of allele profiles (input file for STRUCTURE and BAPS)
◦ core_genomes.fas  alignment of the core genome in FASTA format
https://www.dropbox.com/sh/02pt21410hla1rf/AADGNL7W6Uxsb5cAR0kffSaUa?dl=0
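A matrix of pairwise differences between allelic profiles, like the one described for difference_matrix.html, can be sketched as follows. The profiles are held in memory as lists of allele numbers; the isolate names, the values and the rule of skipping loci that are missing in either isolate are my assumptions, not GeP's documented behaviour or file format.

```python
def allelic_distance(profile_a, profile_b, missing=None):
    """Count loci where both isolates have an allele call and the calls differ."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a != missing and b != missing and a != b
    )


def difference_matrix(profiles):
    """profiles: dict isolate -> list of allele numbers in the same locus order."""
    names = sorted(profiles)
    matrix = [
        [allelic_distance(profiles[row], profiles[col]) for col in names]
        for row in names
    ]
    return names, matrix


# Hypothetical allele profiles over four loci (None = locus not found)
profiles = {
    "isolateA": [1, 1, 2, 5],
    "isolateB": [1, 2, 2, 5],
    "isolateC": [3, 2, None, 4],
}
names, matrix = difference_matrix(profiles)
for name, row in zip(names, matrix):
    print(name, row)
# isolateA [0, 1, 3]
# isolateB [1, 0, 2]
# isolateC [3, 2, 0]
```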
Inferring recombination events
GUBBINS
Iteratively identifies loci containing elevated
densities of base substitutions while
concurrently constructing a phylogeny based
on the putative point mutations outside of
these regions
Runs in only a few hours on alignments of
hundreds of bacterial genome sequences.
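To give a feel for what "elevated densities of base substitutions" means, the sketch below counts SNPs in fixed windows and flags the crowded ones. This is only an illustration under simple assumptions (non-overlapping windows, a fixed threshold, made-up positions); it is not the Gubbins algorithm, which works iteratively along the phylogeny with a statistical test.

```python
def dense_windows(snp_positions, genome_length, window=1000, min_snps=5):
    """Return (start, end, count) for windows holding at least min_snps SNPs."""
    counts = {}
    for pos in snp_positions:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        index = pos // window  # which window this SNP falls into
        counts[index] = counts.get(index, 0) + 1
    flagged = []
    for index in sorted(counts):
        if counts[index] >= min_snps:
            start = index * window
            end = min(start + window, genome_length)
            flagged.append((start, end, counts[index]))
    return flagged


# Hypothetical 0-based SNP positions in a 10 kb alignment
snps = [120, 4300, 4310, 4350, 4420, 4490, 4700, 9800]
print(dense_windows(snps, genome_length=10_000))  # [(4000, 5000, 6)]
```

SNPs inside flagged regions are candidates for having arrived by recombination rather than point mutation, which is why such regions are left out before building the tree.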
BRATNEXTGEN
Bayesian analysis of recombinations in whole-
genome DNA sequence data
Uses a GUI
Divides the genome into segments, then for
each segment, detects genetically distinct
clusters of isolates and estimates the
probabilities of recombination events
Runs efficiently on a desktop computer .. I
tested up to 100 .. results after an overnight (O/N) run
https://github.com/sanger-pathogens/Gubbins
http://www.helsinki.fi/bsg/software/BRAT-NextGen/
https://www.dropbox.com/s/gppp5xs2pkw87ms/BratNextGen_manual.pdf?dl=0
Phylogeny (phylogeography) visualization
A directory for tree visualization
http://www.informatik.uni-rostock.de/~hs162/treeposter/poster.html
My favorite tree editor/viewer
http://itol.embl.de/
A very nice tool for phylogeography
http://microreact.org/showcase/

More Related Content

What's hot

Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Li Shen
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencingDenis C. Bauer
 
Bioinformatics workshop Sept 2014
Bioinformatics workshop Sept 2014Bioinformatics workshop Sept 2014
Bioinformatics workshop Sept 2014LutzFr
 
Introduction to NGS
Introduction to NGSIntroduction to NGS
Introduction to NGScursoNGS
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowBrian Krueger
 
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...Joseph Hughes
 
ECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing TutorialECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing TutorialThomas Keane
 
Semiconductor Sequencing Applications for Plant Sciences
Semiconductor Sequencing Applications for Plant SciencesSemiconductor Sequencing Applications for Plant Sciences
Semiconductor Sequencing Applications for Plant SciencesThermo Fisher Scientific
 
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...EMC
 
Lab in a Suitcase and Other Adventures with Nanopore Sequencing
Lab in a Suitcase and Other Adventures with Nanopore SequencingLab in a Suitcase and Other Adventures with Nanopore Sequencing
Lab in a Suitcase and Other Adventures with Nanopore Sequencingscalene
 
Sequencing 2016
Sequencing 2016Sequencing 2016
Sequencing 2016Surya Saha
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngsDin Apellidos
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation SequencingSajad Rafatiyan
 
RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities Paolo Dametto
 

What's hot (20)

Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencing
 
Bioinformatics workshop Sept 2014
Bioinformatics workshop Sept 2014Bioinformatics workshop Sept 2014
Bioinformatics workshop Sept 2014
 
Biotech autumn2012-02-ngs2
Biotech autumn2012-02-ngs2Biotech autumn2012-02-ngs2
Biotech autumn2012-02-ngs2
 
Ngs intro_v6_public
 Ngs intro_v6_public Ngs intro_v6_public
Ngs intro_v6_public
 
Introduction to NGS
Introduction to NGSIntroduction to NGS
Introduction to NGS
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can Know
 
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
 
ECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing TutorialECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing Tutorial
 
Semiconductor Sequencing Applications for Plant Sciences
Semiconductor Sequencing Applications for Plant SciencesSemiconductor Sequencing Applications for Plant Sciences
Semiconductor Sequencing Applications for Plant Sciences
 
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
 
Lab in a Suitcase and Other Adventures with Nanopore Sequencing
Lab in a Suitcase and Other Adventures with Nanopore SequencingLab in a Suitcase and Other Adventures with Nanopore Sequencing
Lab in a Suitcase and Other Adventures with Nanopore Sequencing
 
Sequencing 2016
Sequencing 2016Sequencing 2016
Sequencing 2016
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
 
Introduction to next generation sequencing
Introduction to next generation sequencingIntroduction to next generation sequencing
Introduction to next generation sequencing
 
Ngs part i 2013
Ngs part i 2013Ngs part i 2013
Ngs part i 2013
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 
New generation Sequencing
New generation Sequencing New generation Sequencing
New generation Sequencing
 
RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities
 

Viewers also liked

What can we do with microbial WGS data? - t.seemann - mc gill summer 2016 - ...
What can we do with microbial WGS data?  - t.seemann - mc gill summer 2016 - ...What can we do with microbial WGS data?  - t.seemann - mc gill summer 2016 - ...
What can we do with microbial WGS data? - t.seemann - mc gill summer 2016 - ...Torsten Seemann
 
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015Torsten Seemann
 
The Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsThe Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsExternalEvents
 
Whole Genome Sequencing (WGS): How significant is it for food safety?
Whole Genome Sequencing (WGS): How significant is it for food safety? Whole Genome Sequencing (WGS): How significant is it for food safety?
Whole Genome Sequencing (WGS): How significant is it for food safety? FAO
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...nist-spin
 
Metagenomics sequencing
Metagenomics sequencingMetagenomics sequencing
Metagenomics sequencingcdgenomics525
 
Aug2015 deanna church analytical validation
Aug2015 deanna church analytical validationAug2015 deanna church analytical validation
Aug2015 deanna church analytical validationGenomeInABottle
 
Making Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and AnnotationsMaking Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and AnnotationsJoão André Carriço
 
Whole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiologyWhole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiologyPhilip Ashton
 
Genome Wide Methodologies and Future Perspectives
 Genome Wide Methodologies and Future Perspectives Genome Wide Methodologies and Future Perspectives
Genome Wide Methodologies and Future PerspectivesBrian Krueger
 
The Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingThe Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingEmiliano De Cristofaro
 
Innovative NGS Library Construction Technology
Innovative NGS Library Construction TechnologyInnovative NGS Library Construction Technology
Innovative NGS Library Construction TechnologyQIAGEN
 
DNA Sequencing from Single Cell
DNA Sequencing from Single CellDNA Sequencing from Single Cell
DNA Sequencing from Single CellQIAGEN
 
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practiceAug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practiceGenomeInABottle
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global communityExternalEvents
 
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun SequencesTools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun SequencesSurya Saha
 

Viewers also liked (20)

What can we do with microbial WGS data? - t.seemann - mc gill summer 2016 - ...
What can we do with microbial WGS data?  - t.seemann - mc gill summer 2016 - ...What can we do with microbial WGS data?  - t.seemann - mc gill summer 2016 - ...
What can we do with microbial WGS data? - t.seemann - mc gill summer 2016 - ...
 
EU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for Harmonization
EU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for HarmonizationEU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for Harmonization
EU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for Harmonization
 
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
 
Poster ESHG
Poster ESHGPoster ESHG
Poster ESHG
 
The Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsThe Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groups
 
Whole Genome Sequencing (WGS): How significant is it for food safety?
Whole Genome Sequencing (WGS): How significant is it for food safety? Whole Genome Sequencing (WGS): How significant is it for food safety?
Whole Genome Sequencing (WGS): How significant is it for food safety?
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
 
Rossen eccmid2015v1.5
Rossen eccmid2015v1.5Rossen eccmid2015v1.5
Rossen eccmid2015v1.5
 
Metagenomics sequencing
Metagenomics sequencingMetagenomics sequencing
Metagenomics sequencing
 
Aug2015 deanna church analytical validation
Aug2015 deanna church analytical validationAug2015 deanna church analytical validation
Aug2015 deanna church analytical validation
 
Proposal for 2016 survey of WGS capacity in EU/EEA Member States
Proposal for 2016 survey of WGS capacity in EU/EEA Member StatesProposal for 2016 survey of WGS capacity in EU/EEA Member States
Proposal for 2016 survey of WGS capacity in EU/EEA Member States
 
Making Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and AnnotationsMaking Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and Annotations
 
Whole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiologyWhole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiology
 
Genome Wide Methodologies and Future Perspectives
 Genome Wide Methodologies and Future Perspectives Genome Wide Methodologies and Future Perspectives
Genome Wide Methodologies and Future Perspectives
 
The Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingThe Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome Sequencing
 
Innovative NGS Library Construction Technology
Innovative NGS Library Construction TechnologyInnovative NGS Library Construction Technology
Innovative NGS Library Construction Technology
 
DNA Sequencing from Single Cell
DNA Sequencing from Single CellDNA Sequencing from Single Cell
DNA Sequencing from Single Cell
 
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practiceAug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practice
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global community
 
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun SequencesTools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences
 

Similar to Bacterial Population Genomics Using Next-Generation Sequencing

CS Lecture 2017 04-11 from Data to Precision Medicine
CS Lecture 2017 04-11 from Data to Precision MedicineCS Lecture 2017 04-11 from Data to Precision Medicine

Bacterial Population Genomics Using Next-Generation Sequencing

  • 1. Toolbox for bacterial population analysis using NGS INTRODUCTION OF BACTERIAL POPULATION GENOMICS AND EVOLUTION MIRKO ROSSI ASS. PROF. ENVIRONMENTAL HYGIENE, FACULTY OF VETERINARY MEDICINE
  • 2. I’m a vet, not a bioinformatician … I’m a good example of an end user! I do not want to teach population genetics today … just give you some tips on how to do it using NGS in bacteria. If you are interested in bacterial population analysis … we are organizing an ad hoc course in Spring. There are several more software packages/pipelines; these are the ones I like/I know/I apply. If you want the slides, send an email to me: mirko.rossi@helsinki.fi. If you are an MSc student in bioinformatics and interested in a thesis in applied bioinformatics in public health microbiology and pathogen surveillance, please contact me.
  • 3. Bacterial population: a group of individuals of the same species. POPULATIONS, not individuals, evolve. Population and community are two different concepts … WE ARE SPEAKING OF INDIVIDUALS OF THE SAME SPECIES! … although the definition of species in bacteriology is quite vague. Population genomics attempts to understand the population by whole-genome analysis of a sample of it -> investigating the variation of a subset of individual members of the population. “Sequence data is ideal for this, as the differences between individuals are often tiny (i.e. there is very little variation) since they belong to a single population, and DNA sequence data allows us to detect single nucleotide changes (i.e. provides high resolution)” (Kat Holt)
  • 4. The sample is a subset of the population. Population: universe, reality, state of nature, truth; described by parameters. Sample: finite, random, subject to noise, error and perturbation; described by statistics. Statistical inference: extract maximum information from the sample in order to draw conclusions about the population. Inductive, not deductive. Source: John Bunge
  • 5. How many samples do I need to sequence? It depends on your question! Accuracy is important … but big numbers help! Draft genomes are enough. Closing a genome is a waste of time and money! Good draft: 100 €/sample -> closed: > 3000 €/sample. Include in your analysis as much diversity as possible (time, space, phenotypes, ...). Sequence as much as you can … just stop before you go broke! 1000 strains < 100 000 €
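The cost figures on slide 5 can be turned into a quick budget check. This is only a sketch of the arithmetic: the function names are made up for illustration, and the per-sample prices are the rough figures quoted on the slide (3000 € is a lower bound for a closed genome).

```python
def sequencing_budget(n_strains, cost_per_sample_eur):
    """Total cost in euro for sequencing n_strains at a flat per-sample price."""
    return n_strains * cost_per_sample_eur


def affordable_strains(budget_eur, cost_per_sample_eur):
    """How many whole samples fit in a given budget (integer division)."""
    return budget_eur // cost_per_sample_eur


if __name__ == "__main__":
    # Prices taken from the slide: ~100 EUR per draft, at least 3000 EUR per closed genome.
    print(sequencing_budget(1000, 100))
    print(affordable_strains(100_000, 3000))
```

With the same 100 000 € the draft route covers about 1000 strains, while the closed-genome route covers only a few dozen, which is the point the slide makes.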
  • 6. Bacterial population … different levels: population of H. pylori living in a single stomach; population of H. pylori circulating globally
  • 7. What do we want to measure? Genetic Drift ◦ the change in the gene pool of a small population due to chance. Natural Selection ◦ Alleles that increase fitness accumulate in the population ◦ Causes ADAPTATION of populations. Gene Flow ◦ genetic exchange due to the migration of individuals between populations
  • 8. How do we measure (using NGS)? Identify variants: ◦ SNP approach ◦ Gene-by-gene approach. Define which part of the gene pool is common to all the individuals of the population (core) and which part is not (accessory). Use phylogenetic frameworks for reconstructing genealogy and non-phylogenetic clustering methods for inferring population structure
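The core/accessory definition on slide 8 maps directly onto set operations. A minimal sketch, assuming each strain has already been reduced to a set of gene (cluster) names; the strain and gene names below are invented for illustration.

```python
def split_pangenome(gene_content):
    """Split gene content into core, accessory and pan genome.

    gene_content: dict mapping strain name -> set of gene cluster names.
    Core = genes present in every strain; accessory = everything else.
    """
    gene_sets = list(gene_content.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets) if gene_sets else set()
    accessory = pan - core
    return core, accessory, pan


if __name__ == "__main__":
    toy = {
        "strain1": {"gyrA", "recA", "cagA"},
        "strain2": {"gyrA", "recA"},
        "strain3": {"gyrA", "recA", "vacA"},
    }
    core, accessory, pan = split_pangenome(toy)
    print(sorted(core), sorted(accessory), sorted(pan))
```

Real pipelines usually relax "present in all" to a threshold (for example present in most isolates) to tolerate draft assemblies; that choice is not stated on the slide and is left out here.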
  • 9. Applications: outbreak determination; pathogen transmission; understanding epidemics; pathogen surveillance; understanding evolution of bacteria …. @jennifergardy
  • 10. Identifying variants: SNP approaches. Workflow: sample -> NGS -> WGS reads -> mapping to reference -> VCF/FASTA file with SNPs • Needs a reference strain • Monomorphic (clonal) species • Recombination/horizontal gene transfer is a problem • Difficult to create a nomenclature. Source: J. Carriço
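To make the SNP approach on slide 10 concrete, here is a small sketch of what "a FASTA file with SNPs" boils down to once every sample has been mapped to the same reference: find the alignment columns that vary and count pairwise differences. It assumes equal-length, already-aligned sequences and ignores gaps and ambiguous bases; it is not how any particular mapper or variant caller works internally.

```python
def variable_sites(alignment):
    """Return 0-based positions where at least two different bases occur.

    alignment: dict mapping sample name -> aligned sequence (equal lengths).
    Only A/C/G/T are compared; gaps and N are ignored.
    """
    seqs = [s.upper() for s in alignment.values()]
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for pos in range(length):
        bases = {s[pos] for s in seqs} & set("ACGT")
        if len(bases) > 1:
            sites.append(pos)
    return sites


def snp_distance(seq_a, seq_b):
    """Number of positions where two aligned sequences carry different bases."""
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in "ACGT" and b in "ACGT"
    )


if __name__ == "__main__":
    aln = {"ref": "ACGTACGT", "s1": "ACGTACGA", "s2": "ACCTACGT"}
    print(variable_sites(aln))
    print(snp_distance(aln["s1"], aln["s2"]))
```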
  • 11. Identifying variants: Gene-by-gene. Workflow: sample -> NGS -> WGS reads -> assembly -> contigs -> central nomenclature server (schemas, allele definitions and identifiers) -> output: allelic profile • No need for a reference strain • Buffers the effect of recombination • Simpler to create a nomenclature • Population structure of non-monomorphic species • Multiple schemas can be defined for a single species. Source: J. Carriço
  • 12. Sequence platforms: Loman et al., 2012, Nature Reviews Microbiology
  • 13. … I’m just using Illumina, for both de novo and re-sequencing. At the moment Illumina gives the best benefit-cost ratio: • High throughput • Accuracy • Possibility for multiplexing • Reasonable workflow time • Easily accessible. For small genomes (1 to 2 Mb) it is nowadays possible to sequence at ~90 euro/sample with a minimum of 40x coverage
  • 14. I have the reads for each strain … OK, and now? An overview of the main programs, platforms and approaches … sometimes it is a question of style!
  • 15. I want some results from reads … You can always map your reads against a closely related reference genome using "classical" short-read aligners and extract SNPs: BWA, for example. Here is just a (long) list: http://omictools.com/read-alignment-c83-p1.html Now you just need to decide on the reference genome. Note that you might need to select more than one reference genome to tune your analysis … Be aware that software designed specifically for bacterial genomes is available
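The gene-by-gene idea on slide 11 can be sketched as a tiny in-memory "nomenclature server": every distinct sequence seen at a locus gets the next integer identifier, and a genome is summarised as a vector of those identifiers. The class and function names are hypothetical, the use of 0 for a missing locus is an assumption made here, and real schemes add quality checks that are omitted.

```python
class AlleleNomenclature:
    """Toy allele registry: locus -> {sequence: allele identifier}."""

    def __init__(self):
        self._alleles = {}

    def call(self, locus, sequence):
        """Return the allele id for a sequence, registering it if it is new."""
        known = self._alleles.setdefault(locus, {})
        seq = sequence.upper()
        if seq not in known:
            known[seq] = len(known) + 1  # ids start at 1 per locus
        return known[seq]


def allelic_profile(nomenclature, genome_loci, scheme):
    """Allele ids for each locus in the scheme; 0 marks a locus not found."""
    return [
        nomenclature.call(locus, genome_loci[locus]) if locus in genome_loci else 0
        for locus in scheme
    ]


if __name__ == "__main__":
    scheme = ["L1", "L2"]
    server = AlleleNomenclature()
    print(allelic_profile(server, {"L1": "ATG", "L2": "CCC"}, scheme))
    print(allelic_profile(server, {"L1": "ATG", "L2": "CCT"}, scheme))
    print(allelic_profile(server, {"L1": "ATA"}, scheme))
```

Because identifiers depend only on the registry and not on a reference strain, the same profile can be compared across labs as long as everyone uses the same central registry, which is the nomenclature advantage the slide lists.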
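The "minimum 40x coverage" figure on slide 13 follows from simple arithmetic: coverage is the total number of sequenced bases divided by the genome size. A sketch under stated assumptions: the 150 bp read length used in the example is an assumption of this sketch, not a value given on the slide.

```python
import math


def expected_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, genome_size_bp, read_length_bp):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Assumed example: a 1.5 Mb genome sequenced with 150 bp reads.
    print(expected_coverage(400_000, 150, 1_500_000))
    print(reads_needed(40, 2_000_000, 150))
```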
  • 16. Assembly-free analyses: SNP CALLING AND CORE GENOME ALIGNMENTS - REFERENCE BASED MAPPING. Snippy ◦ One-by-one ◦ a set of results using the same reference can be combined to generate a core SNP alignment ◦ A lot of output files ◦ Variants: SNPs, MNPs, INDELs, MIX. Input Requirements ◦ a reference genome in FASTA or GENBANK format (can be in multiple contigs) ◦ query sequence read files in FASTQ or FASTA format (can be .gz compressed). Wombac ◦ Fast and “dirty”; several samples in a run ◦ Computations can be re-used for building new trees ◦ looks for substitution SNPs, not indels, and it may miss some SNPs. Input Requirements ◦ a reference genome in FASTA or GENBANK format (can be in multiple contigs) ◦ query sequences in one of: ◦ a folder containing FASTQ short reads: e.g. R1.fq.gz R2.fq.gz ◦ a multi-FASTA file: e.g. contigs.fa or NC_273461.fna ◦ a .tar.gz file containing FASTA contig files: e.g. Ecoli_K12mut.contig.tar.gz (from EBI/NCBI) https://github.com/tseemann/wombac https://github.com/tseemann/snippy @torstenseemann
  • 17. Assembly-free analyses: SHORT READ SEQUENCE TYPING. Srst2 ◦ designed specifically for bacterial genomes ◦ Queries Illumina sequence data against an MLST database and/or a database of gene sequences ◦ Reports the presence of STs (allele designation) and/or reference genes. Input Requirements ◦ Query: Illumina reads (fastq.gz format, but other options exist) ◦ A FASTA reference sequence database to match to: ◦ For MLST, this means a FASTA file of all allele sequences. If you want to assign STs, you also need a tab-delimited file which defines the ST profiles as a combination of alleles. ◦ For resistance/virulence genes, this means a FASTA file of all the resistance genes/alleles that you want to screen for, clustered into gene groups. https://github.com/katholt/srst2 @DrKatHolt
  • 18. Stand-alone pipeline for SNP variants: Nullarbor ◦ Clean reads ◦ Species identification -> k-mer analysis against known genome database (Kraken) ◦ De novo assembly ◦ Annotation ◦ MLST ◦ Resistome ◦ SNP variants https://github.com/tseemann/nullarbor @torstenseemann
  • 19. … or you might prefer to assemble your genome! When you know little or nothing about your dataset (it is not possible to select a reference genome). In case of deep comparative genomics, when you are also interested in the accessory genome (genes absent from your reference). To extract the pangenome. Because having all your dataset assembled will facilitate downstream applications. To develop a common NOMENCLATURE
  • 21. Assembling short reads. REFERENCE BASED ASSEMBLY: Mira (best assembler … for geeks since 1999) ◦ multi-pass assembler/mapper for small genomes (up to 150 Mb) ◦ has a full overview of the whole project at any time of the assembly, using all available data and learning from mistakes ◦ marks places of interest with tags so that these can be found quickly in finishing programs ◦ can also do de novo and hybrid assembly. Input Requirements ◦ various formats (CAF, FASTA, FASTQ or PHD) from Sanger, 454, Ion Torrent, Illumina. DE NOVO ASSEMBLY: SPAdes (a very good assembler for lazy people) ◦ is intended for both standard isolates and single-cell MDA bacteria assemblies ◦ It does its work, and very well ◦ Simple to run: spades.py --careful -1 R1.fastq.gz -2 R2.fastq.gz -o output_folder ◦ Can use Nanopore and PacBio for hybrid assembly. Andrey’s lecture from WBG2014: https://docs.google.com/presentation/d/1wjrJGKhQQEHDwHF5OhQQyKnj5_c7duTAQjcDsBHTkWQ/edit#slide=id.g47b5b1626_0793 http://sourceforge.net/projects/mira-assembler/ http://bioinf.spbau.ru/spades @BaCh_mira
  • 22. Pangenome alignment (up to 50 strains). MUGSY: Genomes should be very similar. Mugsy (also Mauve) generates a multiple-block local alignment. Alignment format is MAF. MAUVE: Large-scale evolutionary events. It can align more divergent strains than Mugsy: as little as 50% nucleotide identity. It aligns the pan-genome. Complete genome alignment in eXtended Multi-FastA (XMFA) format. Lists groups of genes that are predicted to be positionally orthologous. GUI available. http://mugsy.sourceforge.net/ http://darlinglab.org/mauve/
  • 23. Core genome alignment. PARSNP: Designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Very, very similar strains … it uses MUMi to select the nearest genomes: only the ones with distance <= 0.01 are included, all others are discarded. Input can be both draft assemblies and finished genomes, and output includes variant (SNP) calls, core genome phylogeny and multi-alignments. Results are visualized using a GUI. https://harvest.readthedocs.org/en/latest/content/parsnp.html
  • 24. Gene-by-gene: pangenome, core genome, accessory genome. Workflow: assembly -> structural annotation (Prodigal, Prokka, RAST) -> ortholog clustering (OrthAgogue, Roary)
  • 25. Structural annotation. PRODIGAL: Gene finder. Very fast -> 3000 genomes in ~ a week (8 CPUs, 16 GB RAM). Prodigal can be run in one step on a single genomic sequence or on a draft genome containing many sequences. It does not need to be supplied with any knowledge of the organism, as it learns all the properties it needs on its own. PROKKA: Structural and functional annotation. Fast automatic annotation -> in multi-core < 15 min. Several dependencies -> tedious to install (… I told you I’m very lazy!) http://www.slideshare.net/torstenseemann/prokka-rapid-bacterial-genome-annotation-abphm-2013?related=1 https://github.com/hyattpd/prodigal/wiki https://github.com/tseemann/prokka
  • 26. Ortholog clustering. ORTHAGOGUE: high-speed estimation of homology relations within and between species in massive data sets; easy to use and offers flexibility through a range of optional parameters. Input = all-against-all BLAST tabular output. Output = mcl file. -u -o XX -> ignore e-value, use BLAST score, exclude proteins with overlap < XX. ROARY: high-speed stand-alone pan genome pipeline; 128 samples can be analysed in under 1 hour using 1 GB of RAM and a single processor. Input = GFF3 format produced by Prokka. roary -e --mafft *.gff then FastTree -nt -gtr core_gene_alignment.aln > my_tree.newick Output = several files. https://code.google.com/p/orthagogue/ http://sanger-pathogens.github.io/Roary/
  • 27. Gene-by-gene: pangenome, core genome, accessory genome. Ortholog clustering results -> ad hoc scripts -> Core Genome, Accessory Genome, Pangenome. Phylogeny: RAxML, FastTree, BEAST. Everything included in Roary but not in OrthAgogue. Population structure: BAPS, STRUCTURE. Recombination: BRATNEXTGEN, GUBBINS
  • 28. cgMLST and wgMLST. [Figure: presence of loci L1 to L9 across Strain 1 to Strain 6, split into core genome and accessory genome] Core Genome -> cgMLST. Core Genome + Accessory Genome = PanGenome -> wgMLST. Source: J. Carriço @jacarrico
  • 29. cgMLST and wgMLST: open source. BACTERIAL ISOLATE GENOME SEQUENCE DATABASE ◦ Jolley & Maiden 2010, BMC Bioinformatics 11:595 - http://pubmlst.org/software/database/bigsdb/ ◦ PROs: Freely available, open-source, handles thousands of genomes, has several schemas implemented for MLST for several bacterial species, and some extended MLST and core genome MLST (mainly Neisseria sp. but soon to be expanded) ◦ CONs: Requires Perl knowledge to install and maintain. Source: J. Carriço @jacarrico
  • 30. cgMLST and wgMLST Commercial software RIDOM SEQSPHERE+ ◦ http://www.ridom.com/seqsphere/ ◦ with client server solutions from assembly to allele calling and visualization for core genome MLST (MLST+/ cgMLST) APPLIED MATHS - BIONUMERICS 7.5 ◦ http://www.applied-maths.com/news/bionumerics-version-75-released ◦ Commercial software with client server solutions from assembly to allele calling and visualization for whole genome MLST (wgMLST) Source J. Carriço @jacarrico
  • 31. cgMLST with Genome Profiler. Indexes alleles of the loci that are shared by the bacterial isolates, implementing both BLASTN and BLASTX. Transforms WGS data into allele profile data. Using a reference genome -> it attempts to account for gene paralogy using conserved gene neighborhoods. http://jcm.asm.org/content/53/5/1765.abstract
  • 32. cgMLST with Genome Profiler. Input files ◦ reference genome in gbk format (even in multi-gbk format from RAST) or a multi-FASTA file of the allele sequences ◦ Query genomes in FASTA format (complete or draft, in contigs). If you run the data for the first time, you use one of the genomes as reference to build a new cgMLST scheme (ad hoc mode): ◦ perl GeP.pl -r NC_017282.gbk -g genome_list.txt Data can be run with the cgMLST scheme created previously by GeP: ◦ perl GeP.pl -g genome_list.txt -o Or you could use a multi-FASTA file of the allele sequences (nt) as reference (in this case all possible paralogs are excluded; a fixed number of 999999999 will be assigned to expect-d): ◦ perl GeP.pl -r NC_017282.ffn -g genome_list.txt -n
  • 33. cgMLST with Genome Profiler. Output files: ◦ output.txt -> records the information of all the loci in each of the test genome sequences ◦ difference_matrix.html -> contains a summary of the analysis and a matrix of pairwise differences between the allelic profiles of the samples ◦ Splitstree.nex -> allele profile of the isolates in NEXUS format, which can be opened in SplitsTree 4 ◦ allele_profile.txt -> matrix of allele profiles (input file for STRUCTURE and BAPS) ◦ core_genomes.fas -> alignment of the core genome in FASTA format https://www.dropbox.com/sh/02pt21410hla1rf/AADGNL7W6Uxsb5cAR0kffSaUa?dl=0
  • 34. Inferring recombination events. GUBBINS: Iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions. Runs in only a few hours on alignments of hundreds of bacterial genome sequences. BRATNEXTGEN: Bayesian analysis of recombinations in whole-genome DNA sequence data. Uses a GUI. Divides the genome into segments, then for each segment detects genetically distinct clusters of isolates and estimates the probabilities of recombination events. Runs efficiently on a desktop computer … I tested up to 100 … results after an overnight run. https://github.com/sanger-pathogens/Gubbins http://www.helsinki.fi/bsg/software/BRAT-NextGen/ https://www.dropbox.com/s/gppp5xs2pkw87ms/BratNextGen_manual.pdf?dl=0
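The "matrix of pairwise differences between the allelic profiles" mentioned on slide 33 can be illustrated with a short sketch. This is not Genome Profiler's code: the function names are hypothetical, and treating allele 0 as "locus missing, do not count" is an assumption made for this example.

```python
def profile_distance(profile_a, profile_b, missing=0):
    """Count loci where two allelic profiles carry different alleles.

    Loci marked as missing in either profile are skipped (an assumption).
    """
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a != b and a != missing and b != missing
    )


def difference_matrix(profiles):
    """Return (sorted names, square matrix of pairwise allele differences).

    profiles: dict mapping isolate name -> list of allele ids, same locus order.
    """
    names = sorted(profiles)
    matrix = [
        [profile_distance(profiles[a], profiles[b]) for b in names] for a in names
    ]
    return names, matrix


if __name__ == "__main__":
    toy = {
        "iso1": [1, 1, 1, 1],
        "iso2": [1, 2, 1, 0],
        "iso3": [2, 2, 3, 1],
    }
    names, matrix = difference_matrix(toy)
    for name, row in zip(names, matrix):
        print(name, row)
```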
  • 35. Phylogeny (phylogeography) visualization A directory for tree visualization http://www.informatik.uni-rostock.de/~hs162/treeposter/poster.html My favorite tree editor/viewer http://itol.embl.de/ A very nice tool for phylogeography http://microreact.org/showcase/