Scaffolding using long nanopore reads
and more
Hans Jansen
Christiaan Henkel
senior scientist
Dutch SME at Bioscience Park in Leiden, the Netherlands
• High throughput drug screens, and toxicity assays in zebrafish larvae
• Fish fertility (eel, pike perch, sole) to aid sustainable aquaculture
• Sequencing (genomes, transcriptomes)
• Bioinformatics
ZF-screens B.V.
Genome projects
Common carp (Cyprinus carpio)
High throughput screening model
Genome and transcriptomes
European and Japanese eel (Anguilla anguilla and Anguilla japonica)
Completing the life cycle in aquaculture
Genome and transcriptomes
King cobra (Ophiophagus hannah)
Evolution and toxins
Genome and transcriptomes
But the quality of these genomes can be improved
But MAP is much more. It is about being a community and a playground to test new
applications. As Gordon Sanghera (CEO of ONT) said: "MAP will never end. There will
always be a MAP."
So if you think your application can benefit from nanopore sensing, then come join
MAP and play with us.
MAP is visible as a web portal with information from ONT and a social-media-like system with
blogs, comments, likes, and a forum to ask for advice.
MinION Access Program
We entered when MAP started.
Our first MinION arrived in April 2014 and the first kits in June.
Since then we have run 30 flow cells.
MAPpers competition
We topped the leaderboard on read length and yield, so we now have three MinIONs.
MinION Access Program and ZF-genomics
Longest 2D read: 93.5 Kbp
Longest template read: 120 Kbp (231 Kbp)
Highest yield: 1.32 Gevents
[Figure: template and 2D yield over the past year. Base pairs sequenced (Mbp, axis 0 to 350) per run for runs 1 to 30; series: template and 2D; chemistries labelled R6, R7, and R7.3.]
Scaffolding genomes using long reads
or
How to untangle the assembly graph
Cheap short read sequencing technology has been used to generate many draft genomes
[Figure labels: unique sequence in, repeat, unique sequence out]
Draft genomes made with short read data suffer from a fundamental problem.
Reads that are shorter than the length of a repeat can’t connect the unique sequence in with
the unique sequence out
Genomic sequences
Short reads
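The repeat problem above can be illustrated with a small sketch. It is a toy model, not part of the original slides: the function names, the 20 kb toy genome, the repeat position, and the one-base anchor are all assumptions made for illustration. Only the 1160 bp repeat length and the ~12 Kb typical read length are borrowed from figures quoted elsewhere in this talk.

```python
def bridges_repeat(read_start, read_length, repeat_start, repeat_length, min_anchor=1):
    """Return True if the read covers the whole repeat plus at least
    min_anchor bases of unique sequence on each side (hypothetical rule)."""
    read_end = read_start + read_length        # exclusive end coordinate
    repeat_end = repeat_start + repeat_length  # exclusive end coordinate
    return (read_start <= repeat_start - min_anchor
            and read_end >= repeat_end + min_anchor)


def count_bridging_reads(read_length, repeat_start, repeat_length,
                         genome_length, min_anchor=1):
    """Count the read start positions on a linear toy genome that yield
    a read connecting 'unique sequence in' with 'unique sequence out'."""
    return sum(
        bridges_repeat(start, read_length, repeat_start, repeat_length, min_anchor)
        for start in range(genome_length - read_length + 1)
    )


# Toy genome of 20 kb with one 1160 bp repeat starting at position 5000.
short_read_hits = count_bridging_reads(100, 5000, 1160, 20000)
long_read_hits = count_bridging_reads(12000, 5000, 1160, 20000)
print(short_read_hits, long_read_hits)
```

In this model a 100 bp read can never bridge the repeat, whatever its start position, while a 12 kb read bridges it from many start positions.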
[Figure labels: unique sequence in, repeat, unique sequence out]
Long reads can help to resolve repeat areas in the assembly graph
And the resulting contigs will now look like this:
Untangle
Assembly and scaffolding strategy
Task                                          Software
1. Short read correction                      Quake (not for small genomes)
2. Short read assembly                        Velvet
3. MinION read alignment to Velvet contigs    LAST
4. Link filtering and contig tiling           Untangle script
5. Path detachment around repeats             Untangle script
6. Bubble popping                             Untangle script
7. Delete unconfirmed connections             Untangle script
8. Contig extraction                          Untangle script
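As an illustration of step 4, the sketch below shows one way links between contigs could be derived from long reads that align to more than one contig. It is not the Untangle script, whose internals are not shown in these slides. The input tuple layout, the function name, and the `min_support` threshold are assumptions, and contig orientation is ignored for brevity.

```python
from collections import Counter, defaultdict


def links_from_alignments(alignments, min_support=1):
    """Derive contig-to-contig links from read alignments.

    alignments: iterable of (read_id, contig, read_start, read_end) tuples,
    a hypothetical simplified form of aligner output.
    Returns a dict mapping an unordered contig pair to the number of
    reads that place the two contigs next to each other.
    """
    hits_by_read = defaultdict(list)
    for read_id, contig, read_start, read_end in alignments:
        hits_by_read[read_id].append((read_start, read_end, contig))

    support = Counter()
    for hits in hits_by_read.values():
        hits.sort()  # order the hits along the read
        for (_, _, left), (_, _, right) in zip(hits, hits[1:]):
            if left != right:
                support[tuple(sorted((left, right)))] += 1

    # Filtering: keep only links seen in at least min_support reads.
    return {pair: n for pair, n in support.items() if n >= min_support}


example = [
    ("read1", "c1", 0, 5000), ("read1", "c2", 5100, 9000),
    ("read2", "c2", 0, 3000), ("read2", "c1", 3100, 7000),
    ("read3", "c3", 0, 4000),  # aligns to a single contig: no link
    ("read4", "c2", 0, 2000), ("read4", "c3", 2100, 6000),
]
print(links_from_alignments(example))
print(links_from_alignments(example, min_support=2))
```

Raising `min_support` trades sensitivity for specificity: links seen in a single read are dropped, which removes chimeric or spurious connections at the cost of weakly covered true ones.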
Agrobacterium strain NCPPB 1771
Agrobacteria are the cause of crown gall
disease, a tumorous growth of plant tissue.
Agrobacteria transfer part of their (plasmid)
DNA to their host and this feature is used
widely in plant research to genetically
modify plants.
Agrobacteria have two chromosomes, and
carry several plasmids. This strain also
carries active transposons.
NCPPB 1771 assembly graph
25× transposon →
(1160 bp)
8× transposon →
(873 bp)
4× rRNA →
(6.4 Kb)
271 nodes, 311 connections
154 contigs
N50 = 198 Kb
Sum = 5.87 Mb
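The N50 quoted above is a standard assembly statistic: the contig length at which contigs of that length or longer contain at least half of all assembled bases. A minimal sketch of the calculation follows. The function name and the toy contig lengths are illustrative only and are not taken from the NCPPB 1771 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig where the cumulative length
        # reaches half of the total assembly size.
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


toy_lengths = [2, 3, 4, 5, 6]  # hypothetical contig lengths
print(n50(toy_lengths))
```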
• Alignment: LAST with optimized settings
• Links: alignment filtering and contig tiling
• 7328 reads aligned to contigs
• 438 reads aligned to multiple contigs
• 585 links between contigs
• 13158 reads on R6 and R7 chemistry
• 73.8 Mb total yield (template and 2D)
• 5–85970 nt length, typical ~12 Kb
MinION sequencing and scaffolding
Links between nodes are specific
[Figure legend: marked links are confirmed by PCR]
Final assembly graph after scaffolding
• 271 nodes + 312 connections → 49 nodes + 5 connections
• 154 contigs → ~8 contigs
• Complete chromosome 2 (1.2 Mb), pTi (190 Kb), cryptic megaplasmid (746 Kb)
• Slight residual fragmentation of chromosome 1
MinION Analysis and Reference Consortium
MARC is a consortium within MAP that seeks to establish sources of variation,
optimize protocols and analysis.
It is open science. Data is shared in the consortium and will be made available
through ENA.
~100 people have signed up. ~7 experimental groups and ~4 analysis groups
are actively working.
It is managed through a weekly teleconference (TC).
Different phases in MARC
Phase 1 is about being as standard as possible and establishing the variation in the
system and between sites.
This is done by 5 labs: one in the Netherlands, two in the UK, and two in the USA (east and west coast).
Phase 2 is all about tweaking the protocol. Things like DNA isolation, shearing
(or not), running scripts, and DNA modifications will be addressed in this phase.
Phase 3 is about examples of applications.
In phase 1 the 5 participating labs received Escherichia coli str. K-12 substr. MG1655.
Each lab performed DNA isolation, library prep, and sequencing according to a detailed protocol.
Per lab, a total of 4 libraries with 2 different kits were prepared and run.
This provides an excellent data set to understand sources of variance in ONT data.
[Figure: Read Counts. Four panels plotting Run1 against Run2 for each sample (CSH, UCSC, UEA, WTCHG, ZF): Total Traces, Template Reads, Complement Reads, 2D Reads.]
[Figure: Read Length Statistics. Nine panels plotting Run1 against Run2 for each sample (CSH, UCSC, UEA, WTCHG, ZF): Template Mean, Template Median, Template STDEV, Complement Mean, Complement Median, Complement STDEV, 2D Mean, 2D Median, 2D STDEV.]
[Figure: Read Alignments. Six panels plotting Run1 against Run2 per sample. All five sites (CSH, UCSC, UEA, WTCHG, ZF): Template % aligned, Complement % aligned, 2D % aligned. Four sites (CSH, UCSC, UEA, ZF): Template 4 Sites, Complement 4 Sites, 2D 4 Sites.]
With the data of the first 10 runs analyzed we can already see that read length has a
stronger lab effect than base pair identity to the reference.
Another set of 10 phase 1 runs is currently being analyzed and will give a clearer
picture on variability.
Experiments for phase 2 will start shortly, while in parallel phase 3 experiments and
analysis are being done.
Conclusions and perspectives
The king cobra genome
Rapid expansion of the 3FTx (three-finger toxin) gene family in
the king cobra
London Calling 2015
Highlights from Clive Brown’s talk
• Improvements to the basecaller. There’s still room for improvement.
• Read until (and barcoding).
• Fast mode on the MinION MkI (500 bp/sec instead of 30)
• New 3000 channel ASIC with crumpet chip design to separate the ASIC and fluidics parts.
• MinION MkII and PromethION will have this new ASIC.
• Library prep on beads to reduce amounts of DNA needed (lower ng to pg).
• Direct RNA sequencing.
• Simplified sample preparation and VolTRAX.
• Pricing will be “pay as you go”. The initial payment for hardware includes some hours of sequencing.
• MkI: $270 and 3 hrs of sequencing (~3 Gbp in fast mode).
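The "~3 Gbp in fast mode" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes 512 channels that all sequence continuously for the full 3 hours. The channel count is an assumption not stated on the slide, and real runs will fall short of this upper bound. The function name is illustrative.

```python
def max_yield_bp(speed_bp_per_s, hours, active_channels):
    """Upper bound on yield if every channel sequences without pause."""
    return speed_bp_per_s * hours * 3600 * active_channels


CHANNELS = 512  # assumed number of channels, not given on the slide

fast = max_yield_bp(500, 3, CHANNELS)  # fast mode, 500 bp/sec
slow = max_yield_bp(30, 3, CHANNELS)   # original speed, 30 bp/sec
print(f"fast mode: {fast / 1e9:.2f} Gbp, 30 bp/sec: {slow / 1e9:.2f} Gbp")
```

Under these assumptions the fast-mode bound is about 2.8 Gbp, which is consistent with the rounded "~3 Gbp" on the slide.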
Acknowledgements
Prof. Dr. Paul Hooykaas, Leiden University
Christiaan Henkel
senior scientist
Leiden University
Ron Dirks (CEO of ZF-screens B.V.)
All members of the MARC consortium
Ewan Birney, EMBL-EBI
Justin O’Grady, UEA
Sara Goodwin, CSHL
David Buck, WTCHG Oxford
Vadim Zalunin, EMBL-EBI
Miten Jain, UCSC
Matt Loose, Nottingham
Jared Simpson, OICR, Toronto

BioSB meeting 2015


Editor's notes

  1. Excuse me if I sound like an ONT salesperson, but the truth is that nanopore sensing is a very powerful method to measure many different things, and it will show up in many different places in your life over the next decade or two.