SV Detection via Anchored Assembly
How can we best call structural variants?
Becky Drees, Jeremy Bruestle, Cheinan Marks
SV Detection via Anchored Assembly 
Brief Description of Anchored Assembly Method 
Testing vs GIAB Variant Set & Validated SV Sets 
How Do We Describe SVs from Detected Breakpoints? 
Please do not distribute without permission. 
Input data 
• Any species with a draft genome
• Existing NGS data
• No special library prep
• ~20x coverage per ploidy
Step 1: Read Correction 
A* error correction
[Figure: K-mer Quality Score Distribution, plotting K-mer Count against Total K-mer Quality Score]
• Similar to Euler or Quake
• Corrects the read without using reference information
• Reduces error from 1% to 0.01%
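As a rough illustration of reference-free correction, the sketch below uses a plain k-mer count spectrum, which is the general idea behind Quake-style correctors. It is not the A* search or the quality-score weighting named on the slide. The function names and the `k` and `min_count` parameters are assumptions chosen for the example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer across all reads (no reference is consulted)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def has_weak_kmer(read, counts, k, min_count):
    """True if any k-mer of the read is seen fewer than min_count times."""
    return any(
        counts[read[i:i + k]] < min_count
        for i in range(len(read) - k + 1)
    )


def correct_read(read, counts, k, min_count):
    """Try each single-base substitution and return the first candidate
    whose k-mers are all well supported. Returns the read unchanged when
    it is already clean or when no single substitution repairs it."""
    if not has_weak_kmer(read, counts, k, min_count):
        return read
    for pos in range(len(read)):
        for base in "ACGT":
            if base == read[pos]:
                continue
            candidate = read[:pos] + base + read[pos + 1:]
            if not has_weak_kmer(candidate, counts, k, min_count):
                return candidate
    return read


reads = ["ACGTACGTAC"] * 5 + ["ACGTACCTAC"]  # last read has one error
counts = kmer_counts(reads, k=4)
print(correct_read("ACGTACCTAC", counts, k=4, min_count=2))
```

A production corrector would also weigh base qualities and search multi-error corrections, which is where a cost-guided search such as A* becomes useful.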
Step 2: Remove Reference Matches 
• Remove reads that are an exact match to reference
• Significantly reduces the complexity of the graph
• Reduces required memory usage (40GB for whole human genome)
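The filtering step can be pictured with the toy sketch below. It assumes the reference fits in a single in-memory string and that a read matching either strand counts as a reference match; both are assumptions made for the example, and a real genome would need an index rather than substring search. `drop_reference_matches` is a hypothetical name.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def drop_reference_matches(reads, reference):
    """Keep only reads that do not occur exactly in the reference
    on either strand (strand handling is an assumption)."""
    return [
        read for read in reads
        if read not in reference
        and reverse_complement(read) not in reference
    ]


reference = "ACGTTGCAAGGCT"
reads = ["GTTGCA", "AGCCTT", "GTTGAA"]
# The first read matches the forward strand, the second matches the
# reverse strand, and the third differs from the reference.
print(drop_reference_matches(reads, reference))
```

Only the reads that survive this filter need to enter the graph, which is why the graph shrinks so sharply.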
Step 3: Read Overlap Graph 
[Figure: Read overlap assembly. Reads (R1, R2, R3, ...) are nodes; edges carry numeric labels (7, 8, 9).]
• Construct a read overlap graph with the remaining reads
• Provides more context than a k-mer-based de Bruijn graph
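A minimal sketch of a read overlap graph follows, assuming exact suffix-to-prefix overlaps and an all-against-all comparison. The read names, sequences, and the `min_overlap` threshold are illustrative; a practical implementation would use an index instead of comparing every pair.

```python
def overlap_length(a, b, min_overlap):
    """Length of the longest suffix of `a` that equals a prefix of `b`,
    or 0 if it is shorter than min_overlap."""
    for n in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def build_overlap_graph(reads, min_overlap):
    """Map each read name to {successor name: overlap length}."""
    graph = {name: {} for name in reads}
    for a_name, a_seq in reads.items():
        for b_name, b_seq in reads.items():
            if a_name == b_name:
                continue
            n = overlap_length(a_seq, b_seq, min_overlap)
            if n:
                graph[a_name][b_name] = n
    return graph


reads = {
    "R1": "GACTTAGATA",
    "R2": "CTTAGATAAC",
    "R3": "TTAGATAACA",
}
print(build_overlap_graph(reads, min_overlap=5))
```

Because each node is a whole read, an edge records how two full reads fit together, whereas a de Bruijn graph only records adjacency between k-mers.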
Step 4: Anchoring 
• Anchor assemblies to reference coordinates
• Provide breakpoint information while keeping reference bias low
[Figure: Anchoring]
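One way to picture anchoring is sketched below, under the assumption that an assembly is anchored when its first and last few bases each match the reference exactly and at a single position. The slides do not say how anchors are actually chosen, so `anchor_assembly`, `unique_position`, and `anchor_len` are hypothetical.

```python
def unique_position(reference, seq):
    """0-based start of `seq` in `reference` if it occurs exactly once,
    otherwise None."""
    first = reference.find(seq)
    if first == -1 or reference.find(seq, first + 1) != -1:
        return None
    return first


def anchor_assembly(assembly, reference, anchor_len):
    """Place both ends of an assembly on the reference.

    Returns half-open (start, end) reference intervals for the left and
    right anchors, or None if either end is missing or ambiguous."""
    left = unique_position(reference, assembly[:anchor_len])
    right = unique_position(reference, assembly[-anchor_len:])
    if left is None or right is None:
        return None
    return {
        "left": (left, left + anchor_len),
        "right": (right, right + anchor_len),
    }


reference = "GGCATGACTTAGCCGTTAACGTAGGA"
assembly = "CATGACTT" + "ACGTAGGA"  # middle of the reference is missing
print(anchor_assembly(assembly, reference, anchor_len=6))
```

Only the anchors touch the reference; the sequence between them comes from the reads alone, which is one way to keep reference bias low.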
Step 5: Variant Validation 
• Assemble variant sequence from read overlap graph
• Computes minimal cost variation (similar to Smith-Waterman)
• Calls variants and applies QC to remove likely false positives
[Figure: Variant validation. Overlapping 10-base reads labeled R2 to R6 (GACTTAGATA, CTTAGATAAC, TTAGATAACA, AGATAACATT, GATAACATTG) tile into the assembled sequence GACTTAGATAACATTG, shown against the reference.]
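The "minimal cost variation" idea can be sketched with a standard dynamic-programming alignment. The version below computes a global minimum edit cost between a reference segment and an assembled segment, which is related to but simpler than Smith-Waterman local alignment. The unit costs and the function name are assumptions, not the scoring used by the authors.

```python
def min_cost_alignment(reference, assembled, mismatch=1, gap=1):
    """Minimum total cost of substitutions and gaps needed to turn
    `reference` into `assembled` (global alignment)."""
    rows, cols = len(reference) + 1, len(assembled) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * gap  # delete all reference bases so far
    for j in range(1, cols):
        cost[0][j] = j * gap  # insert all assembled bases so far
    for i in range(1, rows):
        for j in range(1, cols):
            same = reference[i - 1] == assembled[j - 1]
            substitute = cost[i - 1][j - 1] + (0 if same else mismatch)
            delete = cost[i - 1][j] + gap
            insert = cost[i][j - 1] + gap
            cost[i][j] = min(substitute, delete, insert)
    return cost[-1][-1]


print(min_cost_alignment("GACTTAGATA", "GACTAGATA"))   # one deleted base
print(min_cost_alignment("GACTTAGATA", "GACTAAGATA"))  # one substitution
```

Tracing back through the cost table would give the individual variants; the sketch returns only the total cost.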
NA12878 SNP Detection vs GIAB 
Anchored Assembly only: 13,307
Genome in a Bottle only: 144,463
Both: 2,596,897
Sensitivity: 95%
Precision: 99.5%
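The reported sensitivity and precision follow from the three counts on this slide if the largest count (2,596,897) is read as the calls shared by both sets. That reading is an assumption, but the arithmetic below is consistent with it.

```python
def sensitivity_and_precision(shared, truth_only, calls_only):
    """Sensitivity = shared / all truth-set variants;
    precision = shared / all variants that were called."""
    sensitivity = shared / (shared + truth_only)
    precision = shared / (shared + calls_only)
    return sensitivity, precision


sens, prec = sensitivity_and_precision(
    shared=2_596_897,    # assumed: found by both call sets
    truth_only=144_463,  # Genome in a Bottle only
    calls_only=13_307,   # Anchored Assembly only
)
print(f"sensitivity = {sens:.1%}, precision = {prec:.1%}")
```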
NA12878 Indel Detection vs GIAB 
NA12878 SV Insertions 
Chr.  Mills  Pindel 50x  AA 50x  AA 200x
1 247579917 
2 2576951 n n 
2 78558069 n n n 
2 187143096 n 
2 191002548 n n n 
3 43972635 n n n 
3 100737223 n n n 
3 100868475 n n n 
3 195823764 n n n 
5 78035993 n n n 
7 1528948 n n n 
7 2089876 
8 22717662 n n n 
9 97387403 n 
9 137361862 n 
12 103954170 n n 
13 76345722 n n n 
13 113760939 
13 114103496 n n 
15 26060663 n n 
15 92686723 n 
17 39240782 
17 77134774 n 
18 74794821 n n 
18 76182038 n n n 
19 1278240 n n n 
19 2247173 n n n 
20 55992535 n n 
21 39080014 n n 
X 94894756 n n 
Mills et al. Eichler Lab, U. Washington, Sanger validated 
NA12878 SV Deletions 
How to describe SVs from breakpoints? 
As breakend records:
#CHROM  POS      ID     REF  ALT           QUAL  FILTER  INFO                                         FORMAT    SAMPLE
1       1500000  bnd_A  T    T[1:1501108[  100   PASS    DP=26;NS=1;SVTYPE=BND;MATEID=bnd_B;AID=1234  DP:ED:OV  26:72:89
1       1501108  bnd_B  G    ]1:1500000]G  100   PASS    DP=26;NS=1;SVTYPE=BND;MATEID=bnd_A;AID=1234  DP:ED:OV  26:72:89
As SV events:
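To show how a mated pair of breakend records could be collapsed into a single SV event, the sketch below parses the ALT fields using the bracket notation of the VCF breakend format and recognizes the simplest case, a deletion between two mates on the same chromosome. The function names and the returned tuple layout are assumptions; other orientations (inversions, translocations, duplications) are deliberately left unclassified.

```python
import re

# Breakend ALT forms in VCF: t[p[, t]p], ]p]t and [p[t
BND_ALT = re.compile(
    r"^(?P<pre>[ACGTN]*)(?P<open>[\[\]])(?P<chrom>[^:\[\]]+):(?P<pos>\d+)"
    r"(?P<close>[\[\]])(?P<post>[ACGTN]*)$"
)


def parse_bnd_alt(alt):
    """Split a breakend ALT string into mate location and orientation."""
    m = BND_ALT.match(alt)
    if m is None or m["open"] != m["close"]:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    return {
        "mate_chrom": m["chrom"],
        "mate_pos": int(m["pos"]),
        "bracket": m["open"],
        "ref_base_first": bool(m["pre"]),
    }


def deletion_from_pair(rec_a, rec_b):
    """Each record is (chrom, pos, alt). Returns
    (chrom, first_deleted, last_deleted, length) with 1-based inclusive
    coordinates, or None if the pair is not a simple deletion."""
    (chrom_a, pos_a, alt_a), (chrom_b, pos_b, alt_b) = rec_a, rec_b
    a, b = parse_bnd_alt(alt_a), parse_bnd_alt(alt_b)
    looks_like_deletion = (
        chrom_a == chrom_b == a["mate_chrom"] == b["mate_chrom"]
        and pos_a < pos_b
        and a["mate_pos"] == pos_b and b["mate_pos"] == pos_a
        and a["bracket"] == "[" and a["ref_base_first"]
        and b["bracket"] == "]" and not b["ref_base_first"]
    )
    if not looks_like_deletion:
        return None
    return (chrom_a, pos_a + 1, pos_b - 1, pos_b - pos_a - 1)


bnd_a = ("1", 1500000, "T[1:1501108[")
bnd_b = ("1", 1501108, "]1:1500000]G")
print(deletion_from_pair(bnd_a, bnd_b))
```

Under this reading, bnd_A and bnd_B together describe one deletion of 1,107 bases, so the same call can be reported either as two records or as one event.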
How to describe SVs from breakpoints? 
Assembled breakpoints can reveal variation that is hard to categorize 
• Different events can produce similar breakpoints 
• Multiple breakpoints can represent a single rearrangement event 
[Figure: chr 1 with breakends bnd_K, bnd_L, bnd_M and bnd_N; positions shown: 200000, 190000, 197000, 200231]
How to describe SVs from breakpoints? 
A single breakpoint can contain multiple sequence changes:
• Inserted sequence at deletion breakpoints
• Deleted or duplicated sequence at insertion breakpoints
• Deleted or duplicated sequence at inversion breakpoints
[Figure: chr 1, an inverted sequence with a deleted sequence and a duplicated sequence at its breakpoints; coordinates shown: 1700000, 1700100, 1704100, 1704250]
How to describe SVs from breakpoints? 
Many assemblies anchor to multiple genome locations
• Variation in duplicated genome regions
• Variation in repetitive elements
• Transposons
[Figure: chr 1, with labels "unique anchor", "Alu" and "anchors to multiple places"]
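The difference between a unique anchor and one that lands in repeated sequence can be illustrated by counting exact occurrences of the anchor in the reference. This is a toy sketch on a short string; the category names and `classify_anchor` are hypothetical, and real repeats such as Alu elements are near-identical copies rather than exact ones.

```python
def find_all(reference, seq):
    """All 0-based start positions of `seq` in `reference`,
    including overlapping occurrences."""
    positions = []
    start = reference.find(seq)
    while start != -1:
        positions.append(start)
        start = reference.find(seq, start + 1)
    return positions


def classify_anchor(reference, seq):
    """Label an anchor sequence by how many places it matches."""
    hits = find_all(reference, seq)
    if not hits:
        return "unanchored", hits
    return ("unique" if len(hits) == 1 else "multiple"), hits


reference = "AAGGCCTTAAGGCCTTCATG"
for anchor in ("CATG", "GGCC", "TTTT"):
    print(anchor, classify_anchor(reference, anchor))
```

An assembly with one unique anchor can still be placed, even when its other end matches many repeat copies.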
Contact 
• More information
• Trial on own data
becky@spiralgenetics.com
niranjan@spiralgenetics.com
info@spiralgenetics.com
Questions? 
Anchored Assembly SNP Distribution 
Anchored Assembly SV Distribution 
Spiral Genetics, Anchored Assembly (Aug 2014)
