ION TORRENT DATA ANALYSIS
SPECIFICALLY THE 400BP CHIP
By Ronak Shah
OUTLINE
• Sedum album Illumina Reference Stats
• 400-bp reads analysis
  • BWASW
  • TMAP
• Error Corrected 100bp, 200bp and 400bp reads analysis
  • BWASW
  • TMAP
• Assembly
  • Original 400bp data
  • Clipped data
  • Clipped and quality trimmed
  • All Chips Original data
  • All Chips Error Corrected data
5/18/2012
IonTorrentDataAnalysis@MonsantoCo
SEDUM ALBUM REFERENCE
Made Using Illumina Hiseq Data
ILLUMINA INPUT DATA: 20BN BASE PAIR, 180MN READS, 148X COVERAGE
Read           Read Type               Read Len (Mean)  Insert Size  Total bases     Total (read pairs)  Total (reads)  Estimated Coverage (X)
Hiseq          Paired End              100              400          3,168,468,000   15,842,340          31,684,680     22
Hiseq          Paired End              100              400          3,320,637,800   16,603,189          33,206,378     23
Hiseq          Paired End              100              400          3,238,613,000   16,193,065          32,386,130     23
Hiseq          Short Overlap merged    178              335          2,211,006,384   -                   12,401,099     16
Hiseq          Short Overlap merged    178              335          2,209,098,067   -                   12,391,800     16
Hiseq          Short Overlap merged    178              335          2,178,151,877   -                   12,215,609     15
Hiseq          Short Overlap unmerged  100              335          1,550,423,600   -                   15,504,236     11
Hiseq          Short Overlap unmerged  100              335          1,551,440,000   -                   15,514,400     11
Hiseq          Short Overlap unmerged  100              335          1,519,735,600   -                   15,197,356     11
Total          -                       -                -            20,947,574,328  48,638,594          180,501,688    148
CLC-Bio Input  -                       -                -            20,947,574,328  -                   180,501,688    -
SEDUM ILLUMINA REFERENCE
FLOW CYTOMETRY GENOME SIZE: 142MB; ESTIMATED GENOME SIZE: 180MB; CURRENT GENOME SIZE: 255MB; N50 (SCAFFOLD): 2.8KB; N50 (CONTIG): 1.6KB
Scaffolding Stats
Number of scaffolds                                                  219,455
Total size of scaffolds                                              267,197,078
Total scaffold length as percentage of assumed genome size           2
Longest scaffold                                                     124,757
Shortest scaffold                                                    200
Number of scaffolds > 1K nt                                          63,451   28.90%
Number of scaffolds > 10K nt                                         2,498    1.10%
Number of scaffolds > 100K nt                                        1        0.00%
Mean scaffold size                                                   1,218
Median scaffold size                                                 464
N50 scaffold length                                                  2,848
Percentage of assembly in scaffolded contigs                         52.40%
Percentage of assembly in unscaffolded contigs                       47.60%
Average number of contigs per scaffold                               1.3
Average length of break (>25 Ns) between contigs in scaffold         162
Contigs Stats
Number of contigs                                                    292,607
Number of contigs in scaffolds                                       113,984
Number of contigs not in scaffolds                                   178,623
Total size of contigs                                                255,346,080
Longest contig                                                       56,108
Shortest contig                                                      176
Number of contigs > 1K nt                                            67,774   23.20%
Number of contigs > 10K nt                                           641      0.20%
Mean contig size                                                     873
Median contig size                                                   412
N50 contig length                                                    1,615
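For readers less familiar with the N50 values quoted in these stats, the following is a minimal sketch of one common definition: sort the lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The function name `n50` and the toy lengths are illustrative assumptions, not part of the original analysis, and the assembler used for these slides may apply a slightly different convention.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, not taken from the Sedum assembly.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

Because N50 is weighted by length, it sits well above the median when an assembly has many short contigs, which matches the gap between median and N50 in the tables above.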
ION TORRENT 400 BP CHIP
READ ANALYSIS
ION TORRENT INPUT DATA: 5BN BASE PAIR, 19MN READS, 37X COVERAGE
Read         Read Type   Read Len (Mean)  Total bases    Total reads  Estimated Coverage (X)
Ion Torrent  400bp chip  286              897,163,323    3,130,643    6
Ion Torrent  400bp chip  241              931,376,271    3,850,295    7
Ion Torrent  400bp chip  269              1,113,089,592  4,126,822    8
Ion Torrent  400bp chip  252              1,098,412,220  4,350,400    8
Ion Torrent  400bp chip  274              1,207,920,840  4,408,077    9
Total        -           -                5,247,962,246  19,866,237   37
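The estimated coverage column appears to be total bases divided by genome size. The sketch below reproduces the per-chip values under the assumption that the 142MB flow cytometry genome size from the reference slide was the denominator; the slides do not state this explicitly, and the names `GENOME_SIZE_BP` and `estimated_coverage` are illustrative.

```python
# Assumption: coverage is computed against the 142MB flow cytometry estimate.
GENOME_SIZE_BP = 142_000_000


def estimated_coverage(total_bases, genome_size=GENOME_SIZE_BP):
    """Return fold coverage as sequenced bases per genome base."""
    return total_bases / genome_size


# Total bases for the five 400bp chip runs, as listed in the input table.
chip_bases = [
    897_163_323,
    931_376_271,
    1_113_089_592,
    1_098_412_220,
    1_207_920_840,
]

per_chip = [round(estimated_coverage(bases)) for bases in chip_bases]
combined = round(estimated_coverage(sum(chip_bases)))

print(per_chip)   # rounded fold coverage for each chip
print(combined)   # rounded fold coverage for all five chips together
```

Under that assumption the rounded values agree with the table, which suggests the same 142MB figure was used throughout the deck.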
ALIGNERS USED: BWASW AND TMAP
• Both aligners were run with their default parameters.
• The default scoring penalties are the same for both:
  • Mismatch penalty: 3
  • Gap open penalty: 5
  • Gap extension penalty: 2
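To make the penalties above concrete, here is a minimal sketch of how an affine-gap alignment score could be tallied from them. Two things are assumptions rather than facts from the slides: the match score of 1, and the convention that a gap of length k costs one gap open penalty plus k gap extension penalties. The function `alignment_score` is hypothetical and is not part of either aligner.

```python
MATCH_SCORE = 1            # assumption: the slide does not list a match score
MISMATCH_PENALTY = 3       # from the slide
GAP_OPEN_PENALTY = 5       # from the slide
GAP_EXTENSION_PENALTY = 2  # from the slide


def alignment_score(matches, mismatches, gap_lengths):
    """Score an alignment summarised by match/mismatch counts and gap lengths."""
    score = matches * MATCH_SCORE - mismatches * MISMATCH_PENALTY
    for length in gap_lengths:
        # Assumed convention: each gap pays the open penalty once,
        # plus the extension penalty for every gapped base.
        score -= GAP_OPEN_PENALTY + length * GAP_EXTENSION_PENALTY
    return score


# Hypothetical read: 95 matches, 2 mismatches, one 1-base gap and one 3-base gap.
print(alignment_score(95, 2, [1, 3]))
```

With these values a single-base indel costs more than two mismatches, which is worth keeping in mind when reading the insertion and deletion rates on the following slides.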
MERGED BWA RESULTS: 25% INSERTION RATE; 33% DELETION RATE; 85% MISMATCH
Mapping Results
reads             23,277,245
mapped reads      21,124,134
mapped bases      3,622,712,040
perfectly mapped  3,143,328
len max           433
len mean          171
len stdev         82
mapq mean         95
mapq stdev        87
snp rate          4%
ins rate          25%
del rate          33%
pct mismatch      85%
base qual mean    22
base qual stdev   9
MERGED BWA RESULTS: 91% READS MAPPED
• Total Number of Reads: 23.3M
• Number of Reads Mapped: 21.1M
• Percentage of Reads Mapped: 91%
MERGED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Mean Base Quality
• Quality keeps dropping after 100 bp.
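As a reminder of what these quality values mean, the sketch below converts between a quality score and an error probability, assuming the scores are on the usual Phred scale (Q = -10 * log10(p)). The function names are illustrative, and the slides do not say how the quality plots were generated.

```python
import math


def phred_to_error_probability(quality):
    """Convert a Phred-scaled quality score to a per-base error probability."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability):
    """Convert a per-base error probability back to a Phred-scaled quality."""
    return -10 * math.log10(probability)


# A quality of 22 is the mean base quality reported in the mapping results.
for q in (30, 22, 15, 10):
    print(q, round(phred_to_error_probability(q), 4))
```

Note that a mean quality score is not the same as the quality of the mean error probability, so a figure such as "base qual mean 22" should be read as a rough guide only.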
MERGED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Per Base Quality
MERGED BWA RESULTS: LOW ERRORS AT THE START; HIGH ERRORS AT THE END
• Error Profiles:
• The profiles show that the mismatch, insertion and deletion rates are high overall. They are lowest at the start of the read and increase gradually with position along the read.
MERGED BWA RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION
• Error Profiles
MERGED BWA RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
• K-mer Profile
• K-mers are over-represented from read position 150 to 450.
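A positional k-mer profile of this kind can be approximated by counting, for each start position in the read, how often each k-mer occurs there. The sketch below is a simplified illustration under that assumption; the tool that produced the plot is not named in the slides, and `kmer_counts_by_position` and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def kmer_counts_by_position(reads, k):
    """Map each read start position to a Counter of the k-mers seen there."""
    counts = defaultdict(Counter)
    for read in reads:
        for start in range(len(read) - k + 1):
            counts[start][read[start:start + k]] += 1
    return counts


# Hypothetical toy reads, far shorter than real Ion Torrent reads.
toy_reads = ["ACGTAC", "ACGTTT", "TTGTAC"]
profile = kmer_counts_by_position(toy_reads, k=3)

for position in sorted(profile):
    kmer, count = profile[position].most_common(1)[0]
    print(position, kmer, count)
```

A k-mer whose count at late positions is much higher than expected from its overall frequency would show up as over-representation toward the end of the read, as described on this slide.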
MERGED TMAP RESULTS: 28% INSERTION; 34% DELETION; 88% MISMATCH
Mapping Results
reads             19,866,237
mapped reads      17,795,383
mapped bases      3,381,672,736
perfectly mapped  2,053,578
len max           433
len mean          190
len stdev         79
mapq mean         14
mapq stdev        10
snp rate          5%
ins rate          28%
del rate          34%
pct mismatch      88%
base qual mean    22
base qual stdev   9
MERGED TMAP RESULTS: 90% READS MAPPED
• Total Number of Reads: 19.9M
• Number of Reads Mapped: 17.8M
• Percentage of Reads Mapped: 90%
MERGED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Mean Base Quality
• Quality keeps dropping after 100 bp.
MERGED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Per Base Quality
MERGED TMAP RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION
• Error Profiles
MERGED TMAP RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
• K-mer Profile
• K-mers are over-represented from read position 150 to 450.
ERROR CORRECTED 100BP,
200BP AND 400BP READS
ANALYSIS
ION TORRENT INPUT DATA: 13BN BASE PAIR, 72MN READS, 95X COVERAGE
Read         Read Type                           Read Len (Mean)  Total bases     Total reads  Estimated Coverage (X)
Ion Torrent  ORG 100bp, 200bp, 400bp chip        187              13,521,610,812  72,058,773   95
Ion Torrent  Corrected 100bp, 200bp, 400bp chip  187              13,479,341,388  72,058,773   95
ORG BWA RESULTS: 21% INSERTION; 27% DELETION; 81% MISMATCH
CORRECTED BWA RESULTS: 10% INSERTION; 15% DELETION; 70% MISMATCH
BWA Mapping Results  ORG             Corrected
reads                80,098,562      79,986,695
mapped reads         71,611,630      75,639,986
mapped bases         10,456,280,566  14,695,848,107
perfectly mapped     13,729,260      23,006,719
len max              433             678
len mean             146             194
len stdev            66              83
mapq mean            97              100
mapq stdev           86              88
snp rate             3.2%            2%
ins rate             21%             10%
del rate             27%             15%
pct mismatch         81%             70%
base qual mean       21              20
base qual stdev      6               6
ORG BWA RESULTS: 89% READS MAPPED
CORRECTED BWA RESULTS: 95% READS MAPPED
BWA Mapping Results         ORG    Corrected
Total Number of Reads       80.1M  80.0M
Number of Reads Mapped      71.6M  75.6M
Percentage of Reads Mapped  89%    95%
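The mapped percentages can be rechecked directly from the raw read counts in the BWA mapping tables. The following is a small sketch of that arithmetic; the helper name `percent_mapped` is illustrative.

```python
def percent_mapped(mapped_reads, total_reads):
    """Return the percentage of reads that were mapped."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# Raw counts taken from the ORG and Corrected BWA mapping results.
org_pct = percent_mapped(71_611_630, 80_098_562)
corrected_pct = percent_mapped(75_639_986, 79_986_695)

print(round(org_pct), round(corrected_pct))
print(round(corrected_pct - org_pct, 1))  # gain in percentage points
```

Recomputing from the raw counts is a quick way to catch rounded summary figures that have drifted from the underlying tables.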
CORRECTED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Quality keeps dropping after 100 bp.
CORRECTED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Per Base Quality
CORRECTED BWA RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT ABOUT 10 PERCENTAGE POINTS LOWER THAN ORG READS
• Error Profiles
CORRECTED BWA RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
• K-mer Profile
• K-mers are over-represented from read position 250 to 450.
ORG TMAP RESULTS: 20% INSERTION; 23% DELETION; 84% MISMATCH
CORRECTED TMAP RESULTS: 13% INSERTION; 18% DELETION; 74% MISMATCH
TMAP Mapping Results  ORG             Corrected
reads                 72,058,773      72,058,773
mapped reads          65,224,903      68,116,303
mapped bases          12,211,168,843  12,763,573,084
perfectly mapped      10,436,368      18,029,367
len max               638             678
len mean              187             187
len stdev             81              81
mapq mean             14              13
mapq stdev            10              10
snp rate              3%              3%
ins rate              20%             13%
del rate              23%             18%
pct mismatch          84%             74%
base qual mean        20              20
base qual stdev       6               6
ORG TMAP RESULTS: 91% READS MAPPED
CORRECTED TMAP RESULTS: 95% READS MAPPED
TMAP Mapping Results        ORG    Corrected
Total Number of Reads       72.1M  72.1M
Number of Reads Mapped      65.2M  68.1M
Percentage of Reads Mapped  91%    95%
CORRECTED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Quality keeps dropping after 200 bp.
CORRECTED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
• Per Base Quality
CORRECTED TMAP RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT ABOUT 10 PERCENTAGE POINTS LOWER THAN ORG READS
• Error Profiles
CORRECTED TMAP RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
• K-mer Profile
• K-mers are over-represented from read position 150 to 450.
ASSEMBLY
400 BP READS: N50 421BP; MAX CONTIG 4.6KB; TOTAL BASES 201MB
400 Bp Reads Assembly Stats
Number of contigs           517,835
Total size of contigs       201,990,292
Longest contig              4,684
Shortest contig             23
Number of contigs > 1K nt   11,939   2.30%
Number of contigs > 10K nt  0        0.00%
Mean contig size            390
Median contig size          329
N50 contig length           421
400 BP READS: N50 426BP; MAX CONTIG 4.2KB; TOTAL BASES 201MB
• Reads clipped at length 450
400 Bp Reads clipped at length 450 Assembly Stats
Number of contigs           509,308
Total size of contigs       201,527,141
Longest contig              4,272
Shortest contig             23
Number of contigs > 1K nt   13,781   2.70%
Number of contigs > 10K nt  0        0.00%
Mean contig size            396
Median contig size          331
N50 contig length           426
400 BP READS: N50 430BP; MAX CONTIG 5.4KB; TOTAL BASES 192MB
• Reads clipped at length 450 with minimum quality of 15
400 Bp Reads clipped at length 450 qual 15 Assembly Stats
Number of contigs           478,037
Total size of contigs       192,109,210
Longest contig              5,378
Shortest contig             23
Number of contigs > 1K nt   16,737   3.50%
Number of contigs > 10K nt  0        0.00%
Mean contig size            402
Median contig size          324
N50 contig length           430
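The slides do not say which tool or exact rule was used for clipping and quality trimming. As a rough illustration only, the sketch below clips a read to at most 450 bases and then removes trailing bases whose quality is below 15; trimming from the 3' end is an assumption, and `clip_and_trim` is a hypothetical helper.

```python
def clip_and_trim(sequence, qualities, max_length=450, min_quality=15):
    """Clip a read to max_length, then trim low-quality bases from its 3' end."""
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    seq = sequence[:max_length]
    quals = qualities[:max_length]
    end = len(seq)
    # Assumed rule: drop trailing bases until one meets the quality threshold.
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    return seq[:end], quals[:end]


# Hypothetical read with quality falling off toward the end.
trimmed_seq, trimmed_quals = clip_and_trim(
    "ACGTACGT", [30, 30, 20, 18, 15, 14, 10, 2]
)
print(trimmed_seq, trimmed_quals)
```

Trimming of this kind removes bases rather than reads, which is consistent with the total assembled bases dropping from 201MB to 192MB while the N50 changes only slightly.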
ORG READS: N50 397BP; MAX CONTIG 5.9KB; TOTAL BASES 185MB
Org Reads Assembly Stats
Number of contigs           486,255
Total size of contigs       185,584,458
Longest contig              5,878
Shortest contig             24
Number of contigs > 1K nt   15,386   3.20%
Number of contigs > 10K nt  0        0.00%
Mean contig size            382
Median contig size          299
N50 contig length           397
ERROR CORRECTED READS: N50 550BP; MAX CONTIG 28KB; TOTAL BASES 203MB
Error Corrected Reads Assembly Stats
Number of contigs           424,264
Total size of contigs       203,921,151
Longest contig              28,009
Shortest contig             24
Number of contigs > 1K nt   33,025   7.80%
Number of contigs > 10K nt  43       0.01%
Mean contig size            481
Median contig size          328
N50 contig length           550
QUESTIONS
ACKNOWLEDGEMENTS
• Todd Michael
• Randall Kerstetter
• Shiaw-Pyng Yang
• Ryan Richt
• Xuefeng Zhou
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Ion torrent data analysis

  • 1. ION TORRENT DATA ANALYSIS SPECIFICALLY THE 400BP CHIP By Ronak Shah
  • 2. OUTLINE
      - Sedum album Illumina Reference Stats
      - 400-bp reads analysis
          - BWASW
          - TMAP
      - Error Corrected 100bp, 200bp and 400bp reads analysis
          - BWASW
          - TMAP
      - Assembly
          - Original 400bp data
          - Clipped data
          - Clipped and quality trimmed
          - All Chips Original data
          - All Chips Error Corrected data
      5/18/2012 IonTorrentDataAnalysis@MonsantoCo
  • 3. SEDUM ALBUM REFERENCE: Made Using Illumina HiSeq Data
  • 4. ILLUMINA INPUT DATA: 20BN BASE PAIR, 180MN READS, 148X COVERAGE
      Read | Read Type | Read Len (Mean) | Insert Size | Total bases | Total (read pairs) | Total (reads) | Estimated Coverage (X)
      Hiseq | Paired End | 100 | 400 | 3,168,468,000 | 15,842,340 | 31,684,680 | 22
      Hiseq | Paired End | 100 | 400 | 3,320,637,800 | 16,603,189 | 33,206,378 | 23
      Hiseq | Paired End | 100 | 400 | 3,238,613,000 | 16,193,065 | 32,386,130 | 23
      Hiseq | Short Overlap merged | 178 | 335 | 2,211,006,384 | | 12,401,099 | 16
      Hiseq | Short Overlap merged | 178 | 335 | 2,209,098,067 | | 12,391,800 | 16
      Hiseq | Short Overlap merged | 178 | 335 | 2,178,151,877 | | 12,215,609 | 15
      Hiseq | Short Overlap unmerged | 100 | 335 | 1,550,423,600 | | 15,504,236 | 11
      Hiseq | Short Overlap unmerged | 100 | 335 | 1,551,440,000 | | 15,514,400 | 11
      Hiseq | Short Overlap unmerged | 100 | 335 | 1,519,735,600 | | 15,197,356 | 11
      Total | | | | 20,947,574,328 | 48,638,594 | 180,501,688 | 148
      CLC-Bio Input | | | | 20,947,574,328 | | 180,501,688 |
  • 5. SEDUM ILLUMINA REFERENCE
      Flow cytometry genome size: 142 Mb; estimated genome size: 180 Mb; current genome size: 255 Mb; N50 (scaffold): 2.8 kb; N50 (contig): 1.6 kb
      Scaffolding Stats
          Number of scaffolds: 219,455
          Total size of scaffolds: 267,197,078
          Total scaffold length as percentage of assumed genome size: 2
          Longest scaffold: 124,757
          Shortest scaffold: 200
          Number of scaffolds > 1K nt: 63,451 (28.90%)
          Number of scaffolds > 10K nt: 2,498 (1.10%)
          Number of scaffolds > 100K nt: 1 (0.00%)
          Mean scaffold size: 1,218
          Median scaffold size: 464
          N50 scaffold length: 2,848
          Percentage of assembly in scaffolded contigs: 52.40%
          Percentage of assembly in unscaffolded contigs: 47.60%
          Average number of contigs per scaffold: 1.3
          Average length of break (>25 Ns) between contigs in scaffold: 162
      Contigs Stats
          Number of contigs: 292,607
          Number of contigs in scaffolds: 113,984
          Number of contigs not in scaffolds: 178,623
          Total size of contigs: 255,346,080
          Longest contig: 56,108
          Shortest contig: 176
          Number of contigs > 1K nt: 67,774 (23.20%)
          Number of contigs > 10K nt: 641 (0.20%)
          Mean contig size: 873
          Median contig size: 412
          N50 contig length: 1,615
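      The N50 values quoted on this slide follow the usual definition: the length L such that contigs (or scaffolds) of length L or longer contain at least half of the assembled bases. A minimal sketch of that calculation, using a toy list of lengths rather than the Sedum assembly (the function name `n50` and the toy data are illustrative assumptions, not taken from the slides):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical contig lengths, not the Sedum data).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

      Because N50 is weighted by bases, a few long scaffolds can lift it well above the median, which is consistent with the slide reporting a scaffold N50 of 2,848 against a median scaffold size of 464.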
  • 6. ION TORRENT 400 BP CHIP READ ANALYSIS
  • 7. ION TORRENT INPUT DATA: 5BN BASE PAIR, 19MN READS, 37X COVERAGE
      Read | Read Type | Read Len (Mean) | Total bases | Total reads | Estimated Coverage (X)
      Ion Torrent | 400bp chip | 286 | 897,163,323 | 3,130,643 | 6
      Ion Torrent | 400bp chip | 241 | 931,376,271 | 3,850,295 | 7
      Ion Torrent | 400bp chip | 269 | 1,113,089,592 | 4,126,822 | 8
      Ion Torrent | 400bp chip | 252 | 1,098,412,220 | 4,350,400 | 8
      Ion Torrent | 400bp chip | 274 | 1,207,920,840 | 4,408,077 | 9
      Total | | | 5,247,962,246 | 19,866,237 | 37
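      The coverage column can be reproduced as total bases divided by genome size. The slides do not state which genome size was used; the sketch below assumes the 142 Mb flow cytometry estimate from slide 5, which reproduces the rounded figures in the table. The names `GENOME_SIZE_BP` and `coverage` are illustrative.

```python
# Assumption: coverage is computed against the 142 Mb flow cytometry
# genome size quoted on slide 5; the slides do not say so explicitly.
GENOME_SIZE_BP = 142_000_000

# Total bases per 400bp chip run, copied from the table above.
chip_bases = [
    897_163_323,
    931_376_271,
    1_113_089_592,
    1_098_412_220,
    1_207_920_840,
]


def coverage(total_bases, genome_size=GENOME_SIZE_BP):
    """Estimated fold coverage: sequenced bases per genome base."""
    return total_bases / genome_size


per_chip = [round(coverage(b)) for b in chip_bases]
overall = round(coverage(sum(chip_bases)))
print(per_chip, overall)
```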
  • 8. ALIGNERS USED: BWASW AND TMAP
      Both aligners were run with their default parameters, which for both are:
      - Mismatch penalty: 3
      - Gap open penalty: 5
      - Gap extension penalty: 2
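      To illustrate how these penalties combine, here is a minimal sketch of an affine-gap alignment score. It assumes a match score of 1 and that a gap of length k costs open + k * extension; both are assumptions about the scoring convention, not something stated on the slide. The function `alignment_score` is hypothetical and is not part of either aligner.

```python
MISMATCH_PENALTY = 3   # from the slide
GAP_OPEN_PENALTY = 5   # from the slide
GAP_EXT_PENALTY = 2    # from the slide
MATCH_SCORE = 1        # assumption: not stated on the slide


def alignment_score(matches, mismatches, gap_lengths):
    """Score an alignment under an affine gap model.

    gap_lengths holds the length of each contiguous insertion or
    deletion; each gap of length k is charged open + k * extension
    (an assumed convention).
    """
    score = matches * MATCH_SCORE - mismatches * MISMATCH_PENALTY
    for k in gap_lengths:
        score -= GAP_OPEN_PENALTY + k * GAP_EXT_PENALTY
    return score


# Hypothetical alignment: 95 matches, 2 mismatches, gaps of 1 and 3 bases.
print(alignment_score(95, 2, [1, 3]))
```

      Under these assumptions, a single one-base indel costs more than two mismatches, so indel-rich reads lose alignment score quickly.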
  • 9. MERGED BWA RESULTS: 25% INSERTION RATE; 33% DELETION RATE; 85% MISMATCH
      Mapping Results
          reads: 23,277,245
          mapped reads: 21,124,134
          mapped bases: 3,622,712,040
          perfectly mapped: 3,143,328
          len max: 433
          len mean: 171
          len stdev: 82
          mapq mean: 95
          mapq stdev: 87
          snp rate: 4%
          ins rate: 25%
          del rate: 33%
          pct mismatch: 85%
          base qual mean: 22
          base qual stdev: 9
  • 10. MERGED BWA RESULTS: 91% READS MAPPED
      - Total Number of Reads: 23.3M
      - Number of Reads Mapped: 21.1M
      - Percentage of Reads Mapped: 91%
  • 11. MERGED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Mean Base Quality (figure): quality keeps dropping after 100 bp.
  • 12. MERGED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Per Base Quality (figure)
  • 13. MERGED BWA RESULTS: LOW ERRORS AT THE START; HIGH ERRORS AT THE END
      Error Profiles: mismatch, insertion and deletion rates are high overall. They are low at the start of the read and increase gradually with position along the read.
  • 14. MERGED BWA RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      Error Profiles (figure)
  • 15. MERGED BWA RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      Error Profiles (figure)
  • 16. MERGED BWA RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      K-mer Profile: K-mers are over-represented from position 150 to 450.
  • 17. MERGED TMAP RESULTS: 28% INSERTION; 34% DELETION; 88% MISMATCH
      Mapping Results
          reads: 19,866,237
          mapped reads: 17,795,383
          mapped bases: 3,381,672,736
          perfectly mapped: 2,053,578
          len max: 433
          len mean: 190
          len stdev: 79
          mapq mean: 14
          mapq stdev: 10
          snp rate: 5%
          ins rate: 28%
          del rate: 34%
          pct mismatch: 88%
          base qual mean: 22
          base qual stdev: 9
  • 18. MERGED TMAP RESULTS: 90% READS MAPPED
      - Total Number of Reads: 19.9M
      - Number of Reads Mapped: 17.8M
      - Percentage of Reads Mapped: 90%
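      The mapped-read percentages on the BWA and TMAP summary slides can be checked directly from the read counts in the corresponding mapping-results tables (slides 9 and 17). A minimal sketch; the helper `percent_mapped` and the dictionary layout are illustrative choices.

```python
def percent_mapped(mapped, total):
    """Percentage of reads that were mapped to the reference."""
    return 100.0 * mapped / total


# (mapped reads, total reads) as reported on slides 9 and 17.
runs = {
    "BWA-SW": (21_124_134, 23_277_245),
    "TMAP": (17_795_383, 19_866_237),
}

rounded = {name: round(percent_mapped(m, t)) for name, (m, t) in runs.items()}
for name, pct in rounded.items():
    print(f"{name}: {pct}% of reads mapped")
```

      Note that the BWA-SW read total (23.3M) is larger than the 19.9M input reads listed on slide 7, so the two percentages are not computed over the same denominator.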
  • 19. MERGED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Mean Base Quality (figure): quality keeps dropping after 100 bp.
  • 20. MERGED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Per Base Quality (figure)
  • 21. MERGED TMAP RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      Error Profiles (figure)
  • 22. MERGED TMAP RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      Error Profiles (figure)
  • 23. MERGED TMAP RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      K-mer Profile: K-mers are over-represented from position 150 to 450.
  • 24. ERROR CORRECTED 100BP, 200BP AND 400BP READS ANALYSIS
  • 25. ION TORRENT INPUT DATA: 13BN BASE PAIR, 72MN READS, 95X COVERAGE
      Read | Read Type | Read Len (Mean) | Total bases | Total reads | Estimated Coverage (X)
      Ion Torrent ORG | 100bp, 200bp, 400bp chip | 187 | 13,521,610,812 | 72,058,773 | 95
      Ion Torrent Corrected | 100bp, 200bp, 400bp chip | 187 | 13,479,341,388 | 72,058,773 | 95
  • 26. ORG BWA RESULTS: 21% INSERTION; 27% DELETION; 81% MISMATCH
      CORRECTED BWA RESULTS: 10% INSERTION; 15% DELETION; 70% MISMATCH
      Metric | ORG BWA Mapping Results | Corrected BWA Mapping Results
      reads | 80,098,562 | 79,986,695
      mapped reads | 71,611,630 | 75,639,986
      mapped bases | 10,456,280,566 | 14,695,848,107
      perfectly mapped | 13,729,260 | 23,006,719
      len max | 433 | 678
      len mean | 146 | 194
      len stdev | 66 | 83
      mapq mean | 97 | 100
      mapq stdev | 86 | 88
      snp rate | 3.2% | 2%
      ins rate | 21% | 10%
      del rate | 27% | 15%
      pct mismatch | 81% | 70%
      base qual mean | 21 | 20
      base qual stdev | 6 | 6
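      The effect of error correction on the BWA results can be summarised as the drop in each error rate, in percentage points, between the original (ORG) and corrected reads. A minimal sketch using the rates on this slide; the dictionary names are illustrative.

```python
# Error rates (%) from the ORG and corrected BWA mapping results above.
org_rates = {"ins": 21, "del": 27, "mismatch": 81}
corrected_rates = {"ins": 10, "del": 15, "mismatch": 70}

# Reduction in percentage points for each error type.
reduction = {
    kind: org_rates[kind] - corrected_rates[kind] for kind in org_rates
}
print(reduction)
```

      Each rate falls by roughly 11 to 12 percentage points, while the rates themselves remain high after correction.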
  • 27. ORG BWA RESULTS: 89% READS MAPPED
      CORRECTED BWA RESULTS: 95% READS MAPPED
      Metric | ORG BWA Mapping Results | Corrected BWA Mapping Results
      Total Number of Reads | 80.1M | 80.0M
      Number of Reads Mapped | 71.6M | 75.6M
      Percentage of Reads Mapped | 89% | 95%
  • 28. CORRECTED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Mean Base Quality (figure): quality keeps dropping after 100 bp.
  • 29. CORRECTED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Per Base Quality (figure)
  • 30. CORRECTED BWA RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% SMALLER THAN ORG READS
      Error Profiles (figure)
  • 31. CORRECTED BWA RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% SMALLER THAN ORG READS
      Error Profiles (figure)
  • 32. CORRECTED BWA RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      K-mer Profile: K-mers are over-represented from position 250 to 450.
  • 33. ORG TMAP RESULTS: 20% INSERTION; 23% DELETION; 84% MISMATCH
      CORRECTED TMAP RESULTS: 13% INSERTION; 18% DELETION; 74% MISMATCH
      Metric | ORG TMAP Mapping Results | Corrected TMAP Mapping Results
      reads | 72,058,773 | 72,058,773
      mapped reads | 65,224,903 | 68,116,303
      mapped bases | 12,211,168,843 | 12,763,573,084
      perfectly mapped | 10,436,368 | 18,029,367
      len max | 638 | 678
      len mean | 187 | 187
      len stdev | 81 | 81
      mapq mean | 14 | 13
      mapq stdev | 10 | 10
      snp rate | 3% | 3%
      ins rate | 20% | 13%
      del rate | 23% | 18%
      pct mismatch | 84% | 74%
      base qual mean | 20 | 20
      base qual stdev | 6 | 6
  • 34. ORG TMAP RESULTS: 91% READS MAPPED
      CORRECTED TMAP RESULTS: 95% READS MAPPED
      Metric | ORG TMAP Mapping Results | Corrected TMAP Mapping Results
      Total Number of Reads | 72.1M | 72.1M
      Number of Reads Mapped | 65.2M | 68.1M
      Percentage of Reads Mapped | 91% | 95%
  • 35. CORRECTED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Mean Base Quality (figure): quality keeps dropping after 200 bp.
  • 36. CORRECTED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      Per Base Quality (figure)
  • 37. CORRECTED TMAP RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% SMALLER THAN ORG READS
      Error Profiles (figure)
  • 38. CORRECTED TMAP RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% SMALLER THAN ORG READS
      Error Profiles (figure)
  • 39. CORRECTED TMAP RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      K-mer Profile: K-mers are over-represented from position 150 to 450.
  • 41. 400 BP READS: N50 421BP; MAX CONTIG 4.6KB; TOTAL BASES 201MB
      400 Bp Reads Assembly Stats
          Number of contigs: 517,835
          Total size of contigs: 201,990,292
          Longest contig: 4,684
          Shortest contig: 23
          Number of contigs > 1K nt: 11,939 (2.30%)
          Number of contigs > 10K nt: 0 (0.00%)
          Mean contig size: 390
          Median contig size: 329
          N50 contig length: 421
  • 42. 400 BP READS: N50 426BP; MAX CONTIG 4.2KB; TOTAL BASES 201MB
      400 Bp Reads Assembly Stats (reads clipped at length 450)
          Number of contigs: 509,308
          Total size of contigs: 201,527,141
          Longest contig: 4,272
          Shortest contig: 23
          Number of contigs > 1K nt: 13,781 (2.70%)
          Number of contigs > 10K nt: 0 (0.00%)
          Mean contig size: 396
          Median contig size: 331
          N50 contig length: 426
  • 43. 400 BP READS: N50 430BP; MAX CONTIG 5.4KB; TOTAL BASES 192MB
      400 Bp Reads Assembly Stats (reads clipped at length 450 with minimum quality of 15)
          Number of contigs: 478,037
          Total size of contigs: 192,109,210
          Longest contig: 5,378
          Shortest contig: 23
          Number of contigs > 1K nt: 16,737 (3.50%)
          Number of contigs > 10K nt: 0 (0.00%)
          Mean contig size: 402
          Median contig size: 324
          N50 contig length: 430
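      The slides do not say which tool or algorithm performed the clipping and quality trimming. As one possible reading, the sketch below clips each read to at most 450 bases and then removes trailing bases whose quality is below 15. The function `clip_and_trim` and the 3'-end trimming rule are assumptions made for illustration, not the method used in the analysis.

```python
def clip_and_trim(seq, quals, max_len=450, min_qual=15):
    """Clip a read to max_len, then trim low-quality bases from its end.

    seq is the base string and quals the per-base Phred scores.
    Trimming stops at the first base (scanning from the 3' end)
    whose quality is at least min_qual.
    """
    seq, quals = seq[:max_len], quals[:max_len]
    end = len(quals)
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    return seq[:end], quals[:end]


# Hypothetical read: the last two bases fall below Q15 and are removed.
read, scores = clip_and_trim("ACGTACGT", [30, 30, 20, 15, 14, 20, 10, 5])
print(read, scores)
```

      This rule keeps an internal low-quality base (the Q14 at position 5 in the example) because only the read end is trimmed; a sliding-window trimmer would behave differently.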
  • 44. ORG READS: N50 397BP; MAX CONTIG 5KB; TOTAL BASES 185MB
      Org Reads Assembly Stats
          Number of contigs: 486,255
          Total size of contigs: 185,584,458
          Longest contig: 5,878
          Shortest contig: 24
          Number of contigs > 1K nt: 15,386 (3.20%)
          Number of contigs > 10K nt: 0 (0.00%)
          Mean contig size: 382
          Median contig size: 299
          N50 contig length: 397
  • 45. ERROR CORRECTED READS: N50 550BP; MAX CONTIG 28KB; TOTAL BASES 203MB
      Error Corrected Reads Assembly Stats
          Number of contigs: 424,264
          Total size of contigs: 203,921,151
          Longest contig: 28,009
          Shortest contig: 24
          Number of contigs > 1K nt: 33,025 (7.80%)
          Number of contigs > 10K nt: 43 (0.00%)
          Mean contig size: 481
          Median contig size: 328
          N50 contig length: 550
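      As a consistency check on the five assembly slides (41 to 45), the mean contig size should equal total contig size divided by the number of contigs. The sketch below recomputes it from the reported totals; it reads the contig count on slide 41, printed as "51,7835", as 517,835, which is an interpretation. The labels in `assemblies` are shorthand for the slide titles.

```python
# (number of contigs, total size of contigs, reported mean contig size)
# Slide 41's contig count is read as 517,835 (an interpretation of "51,7835").
assemblies = {
    "400bp original": (517_835, 201_990_292, 390),
    "400bp clipped": (509_308, 201_527_141, 396),
    "400bp clipped + Q15": (478_037, 192_109_210, 402),
    "all chips original": (486_255, 185_584_458, 382),
    "all chips corrected": (424_264, 203_921_151, 481),
}

computed = {
    name: round(total / count) for name, (count, total, _) in assemblies.items()
}
for name, (_, _, reported) in assemblies.items():
    print(f"{name}: computed {computed[name]}, reported {reported}")
```

      All five computed means agree with the reported values, which supports reading the slide 41 contig count as 517,835.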
  • 47. ACKNOWLEDGEMENTS
      - Todd Michael
      - Randall Kerstetter
      - Shiaw-Pyng Yang
      - Ryan Richt
      - Xuefeng Zhou