Genome in a Bottle: So you’ve
sequenced a genome – how well did
you do?
February 2015
Justin Zook, Marc Salit, and the Genome
in a Bottle Consortium
Whole genome sequencing technologies
disagree about 100,000’s of variants
[Figure: Venn diagram of SNPs detected by Platform #1, Platform #2, and Platform #3. Region values, as # SNPs (% of SNPs detected by any platform): 3,198,316 (80.05%); 125,574 (3.14%); 230,311 (5.76%); 121,440 (3.04%); 208,038 (5.21%); 71,944 (1.80%); 39,604 (0.99%)]
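The percentages in the Venn diagram can be re-derived from the counts. The short Python sketch below does that arithmetic. It assumes the largest region (80.05%) is the one shared by all three platforms, which the extracted slide text does not state explicitly.

```python
# Counts for the seven Venn regions, as printed on the slide.
region_counts = [3_198_316, 125_574, 230_311, 121_440, 208_038, 71_944, 39_604]

total = sum(region_counts)            # SNPs detected by any platform
largest = max(region_counts)          # assumed here to be the three-way overlap
not_in_largest = total - largest      # SNPs outside that largest region

# Percent of "SNPs detected by any platform" for each region.
percents = [round(100 * c / total, 2) for c in region_counts]

print(total, percents)
print(not_in_largest, round(100 * not_in_largest / total, 2))
```

Under that assumption, roughly 800,000 SNPs (about 20%) are not called by all three platforms, which is consistent with the slide title.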
Bioinformatics programs also disagree
O’Rawe et al. Genome Medicine 2013, 5:28
NIST-hosted
Genome in a Bottle Consortium
• Infrastructure for performance
assessment of NGS
– support science-based regulatory
oversight
• No widely accepted set of metrics
to characterize the fidelity of
variant calls from NGS…
• Genome in a Bottle Consortium is
developing standards to address
this…
– well-characterized human genomes
as Reference Materials (RMs)
• characterized and disseminated by NIST
– tools and methods to use these RMs
• Global Alliance for Genomics and
Health Benchmarking Team
http://genomeinabottle.org
Genome in a Bottle
Consortium Development
• NIST met with sequencing
technology developers to assess
standards needs
– Stanford, June 2011
• Open, exploratory workshop
– ASHG, Montreal, Canada
– October 2011
• Small, invitational workshop at
NIST to develop consortium for
human genome reference
materials
– FDA, NCBI, NHGRI, NCI, CDC, Wash
U, Broad, technology developers,
clinical labs, CAP, PGP, Partners,
ABRF, others
– developed draft work plan
– April 2012
• Open, public meetings of GIAB
– August 2012 at NIST
– March 2013 at Xgen
– August 2013 at NIST
– January 2014 at Stanford
– August 2014 at NIST
– January 2015 at Stanford
• Website
– www.genomeinabottle.org
Others working in this space…
Well-characterized genomes
• Illumina Platinum Genomes
• CDC GeT-RM
• Korean Genome Project
• Human Longevity, Inc.
• Hydatidiform mole haploid
cell line
• Genome Reference
Consortium
Performance Metrics
• Global Alliance for
Genomics and Health
Benchmarking Team
• NCBI/CDC GeT-RM Browser
• GCAT website
NIST Plays a Role in the First FDA Authorization for
Next-Generation Sequencer
November 20, 2013
Measurement Process
Sample
gDNA isolation
Library Prep
Sequencing
Alignment/Mapping
Variant Calling
Confidence Estimates
Downstream Analysis
• gDNA reference
materials will be
developed to
characterize
performance of a part
of process
– materials will be
certified for their
variants against a
reference sequence,
with confidence
estimates
[Figure labels: generic measurement process; Pre-Analytical steps; Analytical steps; Clinical Interpretation]
Putting “Genomes” in Bottles
• NIST worked with GIAB
to select genomes
• Current genomes
– NA12878 HapMap
sample as Pilot sample
• part of 17-member
pedigree
– 2 trios from PGP
• Ashkenazim
• Asian
[Figure: CEPH Utah Pedigree 1463. Grandparents: 12889, 12890, 12891, 12892. Parents: 12877, 12878. 11 children: 12879, 12880, 12881, 12882, 12883, 12884, 12885, 12887, 12886, 12888, 12893]
NIST Human Genome RMs in the
pipeline
• All 10 µg samples of DNA
isolated from multistage large
growth cell cultures
– all are intended to act as stable,
homogeneous references
suitable for use in regulated
applications
– all genomes also available from
Coriell repository
• Pilot Genome
– ~8400 tubes
• Ashkenazim Jewish Trio
– ~10000 son; ~2500 each parent
• Asian Trio
– ~10000 son; parents not yet
planned as NIST RM
Goals for Data to Accompany RM
• ~0 false positive AND false negative calls in
confident regions
• Include as much of the genome as possible in
the confident regions (i.e., don’t just take the
intersection)
• Avoid bias towards any particular platform
– take advantage of strengths of each platform
• Avoid bias towards any particular
bioinformatics algorithms
Pilot Genome: Integrate 14
Datasets from 5 platforms
Integration Methods to Establish
Benchmark Variant Calls
• Candidate variants
• Concordant variants
• Find characteristics of bias
• Arbitrate using evidence of bias
• Confidence Level

[Figure: histograms of Annotation #1 (e.g., coverage) and Annotation #2 (e.g., strand bias) for Dataset #1, Dataset #2, and Dataset #3, with Site A, Site B, and Site C marked and a region of potential bias indicated]

Dataset | Site A | Site B | Site C
Dataset #1 | 0/0 | 0/0 | 1/1
Dataset #2 | 0/1 | 0/1 | 1/1
Dataset #3 | 0/0 | 0/1 | 1/1
Integration | 0/0 | 0/1 | Uncertain

Zook et al., Nature Biotechnology, 2014.
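The arbitration idea on this slide can be sketched in a few lines of Python. This is a toy illustration, not the published integration pipeline. The rule is to ignore datasets that show evidence of bias at a site and accept a genotype only if the remaining datasets agree. The genotypes come from the example table, but the per-site bias flags are hypothetical, chosen so that the toy rule reproduces the table's Integration row.

```python
def integrate_site(genotypes, biased):
    """Arbitrate one site.

    genotypes: dict mapping dataset name -> genotype string (e.g. "0/1")
    biased:    set of dataset names with evidence of bias at this site
               (hypothetical flags for illustration)
    """
    unbiased = [gt for name, gt in genotypes.items() if name not in biased]
    if not unbiased:
        return "Uncertain"          # every dataset shows evidence of bias
    if len(set(unbiased)) == 1:
        return unbiased[0]          # all unbiased datasets agree
    return "Uncertain"              # unbiased datasets still disagree


sites = {
    "Site A": ({"Dataset #1": "0/0", "Dataset #2": "0/1", "Dataset #3": "0/0"},
               {"Dataset #2"}),
    "Site B": ({"Dataset #1": "0/0", "Dataset #2": "0/1", "Dataset #3": "0/1"},
               {"Dataset #1"}),
    "Site C": ({"Dataset #1": "1/1", "Dataset #2": "1/1", "Dataset #3": "1/1"},
               {"Dataset #1", "Dataset #2", "Dataset #3"}),
}

integrated = {name: integrate_site(gts, biased)
              for name, (gts, biased) in sites.items()}
print(integrated)
```

Note that Site C stays uncertain even though all three datasets agree, because in this sketch no dataset is free of bias there.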
Assigning confidence to genotypes
High-confidence sites
• Sequencing/bioinformatics
methods agree or we
understand the biases
causing disagreement
• At least some methods have
no evidence of bias
• Inherited as expected
Less confident sites
• In a region known to be
difficult for current
technologies
• State reasons for lower
confidence
• If a site is near a low
confidence site, make it low
confidence
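The last rule (a site near a low-confidence site becomes low confidence) amounts to padding each low-confidence position by some window. The Python sketch below shows one way to express that for a single chromosome. The 50 bp window and the positions are made-up values, not parameters from the slides.

```python
from bisect import bisect_left


def near_low_confidence(pos, low_sorted, window):
    """True if any low-confidence position lies within `window` bp of `pos`.

    low_sorted must be sorted in ascending order.
    """
    i = bisect_left(low_sorted, pos - window)   # first low-conf site >= pos - window
    return i < len(low_sorted) and low_sorted[i] <= pos + window


low_sorted = sorted([1_000, 5_000])   # hypothetical low-confidence positions
calls = [900, 1_040, 3_000, 5_051]    # hypothetical call positions
window = 50                           # hypothetical padding in bp

labels = {p: ("low" if near_low_confidence(p, low_sorted, window) else "high")
          for p in calls}
print(labels)
```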
Challenges with assessing
performance
• Not all variant types are
equal
• Not all regions of the genome
are equal
• Labeling difficult variants
as uncertain leads to
higher apparent accuracy
when assessing
performance
• Genotypes fall in 3+
categories (not
positive/negative)
– standard diagnostic
accuracy measures not
well posed
Challenge in variant comparison: Complex
variants have multiple correct representations
[Figure: read alignments from BWA, ssaha2, CGTools, and Novoalign against the reference (Ref), with a “T insertion” and a “TCTCT insertion” labeled]
Comparison | FP SNPs | FP MNPs | FP indels
Traditional comparison | 0.38% (610) | 100% (915) | 6.5% (733)
Comparison with realignment | 0.15% (249) | 4.2% (38) | 2.6% (298)
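A small Python sketch shows why comparing variant records directly can overcount false positives. Two different VCF-style records can describe the same sequence change, so they compare as unequal even though the resulting haplotypes are identical. The reference string and both records below are invented for illustration. Comparing the applied haplotypes is a stand-in for the general idea, not the realignment procedure used for the table above.

```python
def apply_variant(ref, pos, ref_allele, alt_allele):
    """Apply one VCF-style variant (1-based pos) to a reference string."""
    start = pos - 1
    # The REF allele must match the reference at that position.
    assert ref[start:start + len(ref_allele)] == ref_allele, "REF allele mismatch"
    return ref[:start] + alt_allele + ref[start + len(ref_allele):]


ref = "ATCTCG"                 # toy reference containing a TC repeat
rep_1 = (1, "A", "ATC")        # "TC" inserted after the first base
rep_2 = (2, "T", "TCT")        # "CT" inserted after the second base

hap_1 = apply_variant(ref, *rep_1)
hap_2 = apply_variant(ref, *rep_2)

same_record = rep_1 == rep_2        # naive record-level comparison
same_haplotype = hap_1 == hap_2     # sequence-level comparison
print(hap_1, hap_2, same_record, same_haplotype)
```

A record-level comparison would count one of these calls as a false positive and the other as a false negative, while the sequence-level comparison treats them as the same call.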
Global Alliance for Genomics and Health
Benchmarking Task Team
• Formed June 2014 to develop
methods and tools for comparing
variant calls to a benchmark
• Developed standardized definitions
for performance metrics like TP, FP,
and FN.
• Initial focus on germline SNPs/indels
• Developing benchmarking tools
• Comparison engine
• Pluggable web interface with
modules for:
• Reporting/calculation of metrics
• Visualization/user interface
• Working with Genome in a Bottle
Consortium to host data and calls
from their well-characterized
genomes
www.bioplanet.com/gcat
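As a rough illustration of TP, FP, and FN counting against a benchmark, the Python sketch below compares a query call set to a truth set by exact site and genotype match. It is a simplified sketch, not the Benchmarking Task Team's comparison engine. It assumes both sets are already restricted to the confident regions and share one variant representation, and it counts a genotype mismatch as both an FP and an FN, which is one possible convention. The variant keys are made up.

```python
def benchmark_counts(truth, query):
    """Count TP, FP, FN by exact match of variant key and genotype.

    truth, query: dict mapping (chrom, pos, ref, alt) -> genotype string
    """
    tp = sum(1 for key, gt in query.items() if truth.get(key) == gt)
    fp = len(query) - tp                                   # extra or mismatched calls
    fn = sum(1 for key, gt in truth.items() if query.get(key) != gt)
    return tp, fp, fn


truth = {
    ("chr1", 100, "A", "G"): "0/1",
    ("chr1", 200, "C", "T"): "1/1",
    ("chr1", 300, "G", "GA"): "0/1",
    ("chr1", 400, "T", "C"): "0/1",
}
query = {
    ("chr1", 100, "A", "G"): "0/1",     # match
    ("chr1", 200, "C", "T"): "0/1",     # genotype mismatch
    ("chr1", 300, "G", "GA"): "0/1",    # match
    ("chr1", 500, "A", "C"): "0/1",     # not in truth
}

tp, fp, fn = benchmark_counts(truth, query)
sensitivity = tp / (tp + fn)
precision = tp / (tp + fp)
print(tp, fp, fn, sensitivity, precision)
```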
Example User Interface
Stratifying Performance
• Measure performance for
different types of variants in
different sequence contexts
– Types of variants
• SNPs
• indels of different sizes
• complex variants
• structural variants
– Sequence contexts
• Homopolymers,
• STRs
• Duplications
– Functional context
• Exome vs genome, etc
– Data characteristics
• Coverage
• Mapping quality
• Challenge of smaller gene
panels vs genome
sequencing
– one RM may not have a
sufficient number of
examples of different classes
of variants or sequence
contexts
– likely need more samples
with specific types of variants
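Stratification can be sketched as grouping benchmark variants by a label and computing a metric per group. The Python example below does this for sensitivity. The strata names and the detected/missed flags are invented, and a real analysis would derive strata from variant type and sequence context annotations.

```python
from collections import defaultdict


def sensitivity_by_stratum(records):
    """records: iterable of (stratum, detected) pairs for benchmark variants."""
    totals = defaultdict(int)
    found = defaultdict(int)
    for stratum, detected in records:
        totals[stratum] += 1
        found[stratum] += int(detected)
    return {s: found[s] / totals[s] for s in totals}


# Hypothetical benchmark variants: (stratum, was it detected by the pipeline?)
records = [
    ("SNP", True), ("SNP", True), ("SNP", True), ("SNP", False),
    ("indel in homopolymer", True), ("indel in homopolymer", False),
]

by_stratum = sensitivity_by_stratum(records)
overall = sum(detected for _, detected in records) / len(records)
print(by_stratum, overall)
```

The overall figure hides the weaker stratum, and a stratum with only two examples gives a very coarse estimate, which is the small-sample concern raised for gene panels above.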
NCBI/CDC GeT-RM Browser
• http://www.ncbi.nlm.nih.gov/variation/tools/get-rm/
• Allows visualization of questionable calls
Initial uses of high-confidence NIST-
GIAB genotypes for NA12878
• NIST has released
several versions of high-
confidence genotypes
for its pilot RM
• These data are
presently being used for
benchmarking
– prior to release of RMs
– SNPs & indels
• ~77% of the genome
Using Genome in a Bottle calls to
benchmark clinical exome sequencing
at Mount Sinai School of Medicine
“We evaluate a set of
NA12878 technical replicates
against GIAB for each new
pipeline version.”
Benchmarking somatic variant calling
at Qiagen
Implications of Technical Accuracy in
Medical Genome Sequencing
• Collaboration with Euan
Ashley group at Stanford
• What is accuracy for
functional variants?
• How much of the exome
falls in high confidence
regions?
• “Black list” in databases
• Sensitivity
– WExS (95%) < WGS (98%)
• especially splicing
– genome < nonsyn < syn
– Most exome FNs caused by
low coverage
– Most WGS FNs caused by
filtering
• Only 81% of ClinVar
pathogenic or likely
pathogenic SNPs fall in
high-confidence regions
– Lots of work to do!
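A figure like the 81% above comes from asking what fraction of a set of sites falls inside the high-confidence regions. The Python sketch below shows that calculation for one chromosome, using BED-style 0-based half-open intervals. The intervals and site positions are made up, and the regions are assumed to be sorted and non-overlapping.

```python
from bisect import bisect_right


def in_regions(pos, starts, ends):
    """True if 0-based `pos` lies in some half-open interval [start, end)."""
    i = bisect_right(starts, pos) - 1     # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


regions = [(100, 200), (300, 450)]        # hypothetical high-confidence regions
starts = [s for s, _ in regions]
ends = [e for _, e in regions]

sites = [150, 199, 200, 310, 500]         # hypothetical variant positions
covered = sum(in_regions(p, starts, ends) for p in sites)
fraction = covered / len(sites)
print(covered, fraction)
```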
Overview of NIST RM Development
Milestones by genome, Q4 2014 to Q4 2015

HG-001/NA12878 (“Pilot” Genome)
– Release NIST RM8398; Preliminary large deletions
– Refined Structural Variants

HG-002 to HG-004 (Ashkenazim trio)
– Q4 2014: Illumina, Complete Genomics, Ion, BioNano, homogeneity/stability
– Q1 2015: Preliminary SNPs/indels; 120x-150x PacBio data; “moleculo”; mate-pair; CG-LFR
– Q2 2015: Refined SNPs/indels; Preliminary SVs
– Q3 2015: Refined Structural Variants
– Q4 2015: NIST RMs 8391/8392 release

HG-005 (son in Asian trio)
– Q4 2014: Illumina, Complete Genomics, Ion, BioNano, homogeneity/stability
– Q1 2015: “moleculo”; mate-pair; CG-LFR
– Q2 2015: Preliminary SNPs/indels
– Q3 2015: Refined SNPs/indels; Refined Structural Variants
– Q4 2015: NIST RM8393 release
Ashkenazim Jewish PGP RM Trio
Dataset | Characteristics | Coverage | Availability | Good for…
Illumina Paired-end | 150x150bp | ~300x/individual | Fastq on ftp | SNPs/indels/some SVs
Illumina Long Mate pair | ~6000 bp insert | ~40x/individual | Feb-Mar 2015 | SVs
Illumina “moleculo” | Custom library | ~30x by long fragments | Feb-Mar 2015 | SVs/phasing/assembly
Complete Genomics | | 100x/individual | On ftp | SNPs/indels/some SVs
Complete Genomics | LFR | | ?? | SNPs/indels/phasing
Ion Proton | Exome | 1000x/individual | On SRA | SNPs/indels in exome
BioNano Genomics | | | Feb 2015 | SVs/assembly
PacBio | ~10kb reads | ~120-150x on AJ trio | Finished ~Mar 2015 | SVs/phasing/assembly/STRs
Asian PGP trio
• Similar sequencing to
Ashkenazim trio except
for PacBio
• Only son will be NIST
RM
Future Directions
Germline mutations
• Difficult regions/variants
– Long-read technologies
– Forming an analysis group
• Tools for assessing
performance
– How to stratify performance
and understand biases?
Somatic mutations
• Pilot interlaboratory study
to assess comparability of
spike-ins
• Commercial members
developing FFPE cell lines
• Participants interested in
mixing different RMs
How to get involved
• Use our integrated
SNP/indel genotypes for
NA12878 and give us
feedback
– Cells and DNA currently
available from Coriell
– NIST RM available April
2015
• Join our new Analysis
group
– Use Long-read
technologies
– Structural Variant calls
– De novo assembly
– Help create the best-ever
characterized trio
• Attend our biannual
workshops (January in CA,
August in MD)
• Develop tools/metrics
with Global Alliance for
Genomics and Health
Benchmarking Team
Acknowledgments
• FDA – Elizabeth Mansfield,
HPC staff
• HSPH
• GCAT - David Mittelman,
Jason Wang
• Francisco De La Vega
• Illumina - Mike Eberle
• Personalis - Deanna Church
• NCBI – Chunlin Xiao
• Celera - Andrew Grupe
• Genome in a Bottle
– www.genomeinabottle.org
– New members welcome!
– Sign up for email newsletters
– jzook@nist.gov

Más contenido relacionado

La actualidad más candente

Aug2015 horizon diagnostics
Aug2015 horizon diagnosticsAug2015 horizon diagnostics
Aug2015 horizon diagnosticsGenomeInABottle
 
GIAB GRC Workshop slides
GIAB GRC Workshop slidesGIAB GRC Workshop slides
GIAB GRC Workshop slidesGenomeInABottle
 
161115 precision fda giab
161115 precision fda giab161115 precision fda giab
161115 precision fda giabGenomeInABottle
 
Giab aug2015 intro and update 150821.pptx
Giab aug2015 intro and update 150821.pptxGiab aug2015 intro and update 150821.pptx
Giab aug2015 intro and update 150821.pptxGenomeInABottle
 
GIAB-GRC workshop oct2015 giab introduction 151005
GIAB-GRC workshop oct2015 giab introduction 151005GIAB-GRC workshop oct2015 giab introduction 151005
GIAB-GRC workshop oct2015 giab introduction 151005GenomeInABottle
 
2017 amp benchmarking_poster_justin
2017 amp benchmarking_poster_justin2017 amp benchmarking_poster_justin
2017 amp benchmarking_poster_justinGenomeInABottle
 
160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshopGenomeInABottle
 
Aug2013 illumina platinum genomes
Aug2013 illumina platinum genomesAug2013 illumina platinum genomes
Aug2013 illumina platinum genomesGenomeInABottle
 
ASHG 2015 Genome in a bottle
ASHG 2015 Genome in a bottleASHG 2015 Genome in a bottle
ASHG 2015 Genome in a bottleGenomeInABottle
 
GIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seqGIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seqGenomeInABottle
 
160628 giab for festival of genomics
160628 giab for festival of genomics160628 giab for festival of genomics
160628 giab for festival of genomicsGenomeInABottle
 
Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128GenomeInABottle
 
Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3
Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3
Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3GenomeInABottle
 
GIAB Sep2016 Lightning tera bowers horizon nipt
GIAB Sep2016 Lightning tera bowers horizon niptGIAB Sep2016 Lightning tera bowers horizon nipt
GIAB Sep2016 Lightning tera bowers horizon niptGenomeInABottle
 
Sept2016 plenary mercer_sequins
Sept2016 plenary mercer_sequinsSept2016 plenary mercer_sequins
Sept2016 plenary mercer_sequinsGenomeInABottle
 

La actualidad más candente (20)

Aug2015 horizon diagnostics
Aug2015 horizon diagnosticsAug2015 horizon diagnostics
Aug2015 horizon diagnostics
 
GIAB GRC Workshop slides
GIAB GRC Workshop slidesGIAB GRC Workshop slides
GIAB GRC Workshop slides
 
161115 precision fda giab
161115 precision fda giab161115 precision fda giab
161115 precision fda giab
 
Genome in a Bottle
Genome in a BottleGenome in a Bottle
Genome in a Bottle
 
Giab aug2015 intro and update 150821.pptx
Giab aug2015 intro and update 150821.pptxGiab aug2015 intro and update 150821.pptx
Giab aug2015 intro and update 150821.pptx
 
GIAB-GRC workshop oct2015 giab introduction 151005
GIAB-GRC workshop oct2015 giab introduction 151005GIAB-GRC workshop oct2015 giab introduction 151005
GIAB-GRC workshop oct2015 giab introduction 151005
 
2017 amp benchmarking_poster_justin
2017 amp benchmarking_poster_justin2017 amp benchmarking_poster_justin
2017 amp benchmarking_poster_justin
 
170326 giab abrf
170326 giab abrf170326 giab abrf
170326 giab abrf
 
160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
 
Aug2013 illumina platinum genomes
Aug2013 illumina platinum genomesAug2013 illumina platinum genomes
Aug2013 illumina platinum genomes
 
2016 ashg giab poster
2016 ashg giab poster2016 ashg giab poster
2016 ashg giab poster
 
ASHG 2015 Genome in a bottle
ASHG 2015 Genome in a bottleASHG 2015 Genome in a bottle
ASHG 2015 Genome in a bottle
 
Jan2016 horizon GIAB
Jan2016 horizon GIABJan2016 horizon GIAB
Jan2016 horizon GIAB
 
GIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seqGIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seq
 
160628 giab for festival of genomics
160628 giab for festival of genomics160628 giab for festival of genomics
160628 giab for festival of genomics
 
Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128
 
Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3
Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3
Aug2015 Ali Bashir and Jason Chin Pac bio giab_assembly_summary_ali3
 
GIAB Sep2016 Lightning tera bowers horizon nipt
GIAB Sep2016 Lightning tera bowers horizon niptGIAB Sep2016 Lightning tera bowers horizon nipt
GIAB Sep2016 Lightning tera bowers horizon nipt
 
Sept2016 plenary mercer_sequins
Sept2016 plenary mercer_sequinsSept2016 plenary mercer_sequins
Sept2016 plenary mercer_sequins
 
heb_lab_talk_2015
heb_lab_talk_2015heb_lab_talk_2015
heb_lab_talk_2015
 

Similar a 150224 giab 30 min generic slides

Genome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp LeidenGenome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp LeidenGenomeInABottle
 
GIAB Integrating multiple technologies to form benchmark SVs 180517
GIAB Integrating multiple technologies to form benchmark SVs 180517GIAB Integrating multiple technologies to form benchmark SVs 180517
GIAB Integrating multiple technologies to form benchmark SVs 180517GenomeInABottle
 
140128 use cases of giab RMs
140128 use cases of giab RMs140128 use cases of giab RMs
140128 use cases of giab RMsGenomeInABottle
 
GIAB update for GRC GIAB workshop 191015
GIAB update for GRC GIAB workshop 191015GIAB update for GRC GIAB workshop 191015
GIAB update for GRC GIAB workshop 191015GenomeInABottle
 
Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016GenomeInABottle
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030GenomeInABottle
 
150219 agbt giab_poster_marc
150219 agbt giab_poster_marc150219 agbt giab_poster_marc
150219 agbt giab_poster_marcGenomeInABottle
 
140127 GIAB update and NIST high-confidence calls
140127 GIAB update and NIST high-confidence calls140127 GIAB update and NIST high-confidence calls
140127 GIAB update and NIST high-confidence callsGenomeInABottle
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Nathan Olson
 
Genome in a bottle for next gen dx v2 180821
Genome in a bottle for next gen dx v2 180821Genome in a bottle for next gen dx v2 180821
Genome in a bottle for next gen dx v2 180821GenomeInABottle
 
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923GenomeInABottle
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...nist-spin
 
Giab for jax long read 190917
Giab for jax long read 190917Giab for jax long read 190917
Giab for jax long read 190917GenomeInABottle
 
The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...Integrated DNA Technologies
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GenomeInABottle
 
2014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 1402062014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 140206GenomeInABottle
 
Aug2013 reference material selection and design working group
Aug2013 reference material selection and design working groupAug2013 reference material selection and design working group
Aug2013 reference material selection and design working groupGenomeInABottle
 
Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Malachi Griffith
 
NGS in Clinical Research: Meet the NGS Experts Series Part 1
NGS in Clinical Research: Meet the NGS Experts Series Part 1NGS in Clinical Research: Meet the NGS Experts Series Part 1
NGS in Clinical Research: Meet the NGS Experts Series Part 1QIAGEN
 

Similar a 150224 giab 30 min generic slides (20)

171017 giab for giab grc workshop
171017 giab for giab grc workshop171017 giab for giab grc workshop
171017 giab for giab grc workshop
 
Genome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp LeidenGenome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp Leiden
 
GIAB Integrating multiple technologies to form benchmark SVs 180517
GIAB Integrating multiple technologies to form benchmark SVs 180517GIAB Integrating multiple technologies to form benchmark SVs 180517
GIAB Integrating multiple technologies to form benchmark SVs 180517
 
140128 use cases of giab RMs
140128 use cases of giab RMs140128 use cases of giab RMs
140128 use cases of giab RMs
 
GIAB update for GRC GIAB workshop 191015
GIAB update for GRC GIAB workshop 191015GIAB update for GRC GIAB workshop 191015
GIAB update for GRC GIAB workshop 191015
 
Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030
 
150219 agbt giab_poster_marc
150219 agbt giab_poster_marc150219 agbt giab_poster_marc
150219 agbt giab_poster_marc
 
140127 GIAB update and NIST high-confidence calls
140127 GIAB update and NIST high-confidence calls140127 GIAB update and NIST high-confidence calls
140127 GIAB update and NIST high-confidence calls
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
 
Genome in a bottle for next gen dx v2 180821
Genome in a bottle for next gen dx v2 180821Genome in a bottle for next gen dx v2 180821
Genome in a bottle for next gen dx v2 180821
 
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
 
Giab for jax long read 190917
Giab for jax long read 190917Giab for jax long read 190917
Giab for jax long read 190917
 
The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
 
2014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 1402062014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 140206
 
Aug2013 reference material selection and design working group
Aug2013 reference material selection and design working groupAug2013 reference material selection and design working group
Aug2013 reference material selection and design working group
 
Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...
 
NGS in Clinical Research: Meet the NGS Experts Series Part 1
NGS in Clinical Research: Meet the NGS Experts Series Part 1NGS in Clinical Research: Meet the NGS Experts Series Part 1
NGS in Clinical Research: Meet the NGS Experts Series Part 1
 

Más de GenomeInABottle

GIAB Tumor Normal ASHG 2023
GIAB Tumor Normal ASHG 2023GIAB Tumor Normal ASHG 2023
GIAB Tumor Normal ASHG 2023GenomeInABottle
 
GIAB_ASHG_JZook_2023.pdf
GIAB_ASHG_JZook_2023.pdfGIAB_ASHG_JZook_2023.pdf
GIAB_ASHG_JZook_2023.pdfGenomeInABottle
 
Benchmarking with GIAB 220907
Benchmarking with GIAB 220907Benchmarking with GIAB 220907
Benchmarking with GIAB 220907GenomeInABottle
 
Genome in a Bottle- reference materials to benchmark challenging variants and...
Genome in a Bottle- reference materials to benchmark challenging variants and...Genome in a Bottle- reference materials to benchmark challenging variants and...
Genome in a Bottle- reference materials to benchmark challenging variants and...GenomeInABottle
 
GIAB Technical Germline Benchmark roadmap discussion
GIAB Technical Germline Benchmark roadmap discussionGIAB Technical Germline Benchmark roadmap discussion
GIAB Technical Germline Benchmark roadmap discussionGenomeInABottle
 
Giab agbt small_var_2020
Giab agbt small_var_2020Giab agbt small_var_2020
Giab agbt small_var_2020GenomeInABottle
 
GIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM ForumGIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM ForumGenomeInABottle
 
Ga4gh 2019 - Assuring data quality with benchmarking tools from GIAB and GA4GH
Ga4gh 2019 - Assuring data quality with benchmarking tools from GIAB and GA4GHGa4gh 2019 - Assuring data quality with benchmarking tools from GIAB and GA4GH
Ga4gh 2019 - Assuring data quality with benchmarking tools from GIAB and GA4GHGenomeInABottle
 
GIAB ASHG 2019 Structural Variant poster
GIAB ASHG 2019 Structural Variant posterGIAB ASHG 2019 Structural Variant poster
GIAB ASHG 2019 Structural Variant posterGenomeInABottle
 
GIAB GRC Workshop ASHG 2019 Billy Rowell Evaluation of v4 with CCS GATK
GIAB GRC Workshop ASHG 2019 Billy Rowell Evaluation of v4 with CCS GATKGIAB GRC Workshop ASHG 2019 Billy Rowell Evaluation of v4 with CCS GATK
GIAB GRC Workshop ASHG 2019 Billy Rowell Evaluation of v4 with CCS GATKGenomeInABottle
 
GIAB ASHG 2019 Small Variant poster
GIAB ASHG 2019 Small Variant posterGIAB ASHG 2019 Small Variant poster
GIAB ASHG 2019 Small Variant posterGenomeInABottle
 
GRC GIAB Workshop ASHG 2019 Small Variant Benchmark
GRC GIAB Workshop ASHG 2019 Small Variant BenchmarkGRC GIAB Workshop ASHG 2019 Small Variant Benchmark
GRC GIAB Workshop ASHG 2019 Small Variant BenchmarkGenomeInABottle
 
Jason Chin MHC diploid assembly
Jason Chin MHC diploid assemblyJason Chin MHC diploid assembly
Jason Chin MHC diploid assemblyGenomeInABottle
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...GenomeInABottle
 
GIAB and long reads for bio it world 190417
GIAB and long reads for bio it world 190417GIAB and long reads for bio it world 190417
GIAB and long reads for bio it world 190417GenomeInABottle
 
New methods diploid assembly with graphs
New methods   diploid assembly with graphsNew methods   diploid assembly with graphs
New methods diploid assembly with graphsGenomeInABottle
 
How giab fits in the rest of the world seqc2 tumor normal
How giab fits in the rest of the world   seqc2 tumor normalHow giab fits in the rest of the world   seqc2 tumor normal
How giab fits in the rest of the world seqc2 tumor normalGenomeInABottle
 
New data from giab genomes pacbio ccs
New data from giab genomes   pacbio ccsNew data from giab genomes   pacbio ccs
New data from giab genomes pacbio ccsGenomeInABottle
 

Más de GenomeInABottle (20)

2023 GIAB AMP Update
2023 GIAB AMP Update2023 GIAB AMP Update
2023 GIAB AMP Update
 
GIAB Tumor Normal ASHG 2023
GIAB Tumor Normal ASHG 2023GIAB Tumor Normal ASHG 2023
GIAB Tumor Normal ASHG 2023
 
Stratomod ASHG 2023
Stratomod ASHG 2023Stratomod ASHG 2023
Stratomod ASHG 2023
 
GIAB_ASHG_JZook_2023.pdf
GIAB_ASHG_JZook_2023.pdfGIAB_ASHG_JZook_2023.pdf
GIAB_ASHG_JZook_2023.pdf
 
Benchmarking with GIAB 220907
Benchmarking with GIAB 220907Benchmarking with GIAB 220907
Benchmarking with GIAB 220907
 
Genome in a Bottle- reference materials to benchmark challenging variants and...
Genome in a Bottle- reference materials to benchmark challenging variants and...Genome in a Bottle- reference materials to benchmark challenging variants and...
Genome in a Bottle- reference materials to benchmark challenging variants and...
 
GIAB Technical Germline Benchmark roadmap discussion
GIAB Technical Germline Benchmark roadmap discussionGIAB Technical Germline Benchmark roadmap discussion
GIAB Technical Germline Benchmark roadmap discussion
 
Giab agbt small_var_2020
Giab agbt small_var_2020Giab agbt small_var_2020
Giab agbt small_var_2020
 
GIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM ForumGIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM Forum
 
Ga4gh 2019 - Assuring data quality with benchmarking tools from GIAB and GA4GH


150224 giab 30 min generic slides

  • 1. Genome in a Bottle: So you’ve sequenced a genome – how well did you do? February 2015 Justin Zook, Marc Salit, and the Genome in a Bottle Consortium
  • 2. Whole genome sequencing technologies disagree about 100,000’s of variants. [Venn diagram: SNPs called by Platform #1, Platform #2, and Platform #3. # SNPs (% of SNPs detected by any platform) in each region: 3,198,316 (80.05%); 125,574 (3.14%); 230,311 (5.76%); 121,440 (3.04%); 208,038 (5.21%); 71,944 (1.80%); 39,604 (0.99%)]
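Each percentage on this slide is a region's share of all SNPs detected by at least one platform. A minimal Python check of that arithmetic follows. It assumes the seven counts are the seven regions of a three-platform Venn diagram and that the largest count is the three-way overlap; the transcript does not say which count belongs to which region.

```python
# Counts of SNPs in the seven Venn regions, in the order listed on the slide.
region_counts = [3_198_316, 125_574, 230_311, 121_440, 208_038, 71_944, 39_604]

# SNPs detected by at least one platform (the denominator on the slide).
total = sum(region_counts)


def share(count: int) -> float:
    """Percentage of the union, rounded to two decimals as on the slide."""
    return round(100 * count / total, 2)


shares = [share(c) for c in region_counts]

# Assumption: the largest region is the set called by all three platforms,
# so everything else is a call on which the platforms disagree.
discordant = total - max(region_counts)

print(total, shares, discordant)
```

Under that assumption, the discordant count is in the hundreds of thousands, which matches the slide title.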
  • 3. Bioinformatics programs also disagree O’Rawe et al. Genome Medicine 2013, 5:28
  • 4. NIST-hosted Genome in a Bottle Consortium • Infrastructure for performance assessment of NGS – support science-based regulatory oversight • No widely accepted set of metrics to characterize the fidelity of variant calls from NGS… • Genome in a Bottle Consortium is developing standards to address this… – well-characterized human genomes as Reference Materials (RMs) • characterized and disseminated by NIST – tools and methods to use these RMs • Global Alliance for Genomics and Health Benchmarking Team http://genomeinabottle.org
  • 5. Genome in a Bottle Consortium Development • NIST met with sequencing technology developers to assess standards needs – Stanford, June 2011 • Open, exploratory workshop – ASHG, Montreal, Canada – October 2011 • Small, invitational workshop at NIST to develop consortium for human genome reference materials – FDA, NCBI, NHGRI, NCI, CDC, Wash U, Broad, technology developers, clinical labs, CAP, PGP, Partners, ABRF, others – developed draft work plan – April 2012 • Open, public meetings of GIAB – August 2012 at NIST – March 2013 at Xgen – August 2013 at NIST – January 2014 at Stanford – August 2014 at NIST – January 2015 at Stanford • Website – www.genomeinabottle.org
  • 6. Others working in this space… Well-characterized genomes: • Illumina Platinum Genomes • CDC GeT-RM • Korean Genome Project • Human Longevity, Inc. • Hydatidiform mole haploid cell line • Genome Reference Consortium. Performance Metrics: • Global Alliance for Genomics and Health Benchmarking Team • NCBI/CDC GeT-RM Browser • GCAT website
  • 7. NIST Plays a Role in the First FDA Authorization for Next-Generation Sequencer November 20, 2013
  • 8. Measurement Process [Figure: generic measurement process with the steps Sample, gDNA isolation, Library Prep, Sequencing, Alignment/Mapping, Variant Calling, Confidence Estimates, and Downstream Analysis, grouped into Pre-Analytical steps, Analytical steps, and Clinical Interpretation] • gDNA reference materials will be developed to characterize performance of a part of the process – materials will be certified for their variants against a reference sequence, with confidence estimates
  • 9. Putting “Genomes” in Bottles • NIST worked with GIAB to select genomes • Current genomes – NA12878 HapMap sample as Pilot sample • part of 17-member pedigree – 2 trios from PGP • Ashkenazim • Asian [Figure: CEPH Utah Pedigree 1463, members 12877 to 12893, 11 children]
  • 10. NIST Human Genome RMs in the pipeline • All 10 ug samples of DNA isolated from multistage large growth cell cultures – all are intended to act as stable, homogeneous references suitable for use in regulated applications – all genomes also available from Coriell repository • Pilot Genome – ~8400 tubes • Ashkenazim Jewish Trio – ~10000 son; ~2500 each parent • Asian Trio – ~10000 son; parents not yet planned as NIST RM
  • 11. Goals for Data to Accompany RM • ~0 false positive AND false negative calls in confident regions • Include as much of the genome as possible in the confident regions (i.e., don’t just take the intersection) • Avoid bias towards any particular platform – take advantage of strengths of each platform • Avoid bias towards any particular bioinformatics algorithm
  • 12. Pilot Genome: Integrate 14 Datasets from 5 platforms
  • 13. Integration Methods to Establish Benchmark Variant Calls. Steps: Candidate variants; Concordant variants; Find characteristics of bias; Arbitrate using evidence of bias; Confidence Level. [Figure: histograms of Annotation #1 (e.g., coverage) and Annotation #2 (e.g., strand bias) for Datasets #1 to #3, marking Sites A, B, and C and a region of potential bias]
    Genotypes at Site A, Site B, Site C:
    – Dataset #1: 0/0, 0/0, 1/1
    – Dataset #2: 0/1, 0/1, 1/1
    – Dataset #3: 0/0, 0/1, 1/1
    – Integration: 0/0, 0/1, Uncertain
  • 14. Integration Methods to Establish Benchmark Variant Calls Candidate variants Concordant variants Find characteristics of bias Arbitrate using evidence of bias Confidence Level Zook et al., Nature Biotechnology, 2014.
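The arbitration step on slides 13 and 14 can be illustrated with a small Python sketch. This is a deliberate simplification and not the consortium's implementation: it assumes bias detection has already happened elsewhere and is summarized as a set of dataset names per site. The function name `arbitrate`, the dataset labels, and the bias flags in the usage lines are all hypothetical, chosen so that the outcomes match the Integration row of the slide's table.

```python
from typing import Dict, Optional, Set


def arbitrate(calls: Dict[str, str], biased: Set[str]) -> Optional[str]:
    """Return an integrated genotype for one site, or None if uncertain.

    calls  maps a dataset name to its genotype call, e.g. "0/1".
    biased names the datasets whose annotations (coverage, strand bias, ...)
           look atypical at this site. Detecting bias is out of scope here.
    """
    # Keep only genotypes from datasets with no evidence of bias.
    trusted = {gt for name, gt in calls.items() if name not in biased}
    if len(trusted) == 1:
        return next(iter(trusted))
    # Either every dataset shows bias, or unbiased datasets still disagree.
    return None


# Genotypes from the slide's table; the bias flags are invented for illustration.
site_a = arbitrate({"d1": "0/0", "d2": "0/1", "d3": "0/0"}, biased={"d2"})
site_b = arbitrate({"d1": "0/0", "d2": "0/1", "d3": "0/1"}, biased={"d1"})
site_c = arbitrate({"d1": "1/1", "d2": "1/1", "d3": "1/1"}, biased={"d1", "d2", "d3"})
print(site_a, site_b, site_c)
```

Requiring at least one unbiased dataset means that even a concordant site can end up uncertain, which is one reading of the "At least some methods have no evidence of bias" criterion on slide 15.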
  • 15. Assigning confidence to genotypes High-confidence sites • Sequencing/bioinformatics methods agree or we understand the biases causing disagreement • At least some methods have no evidence of bias • Inherited as expected Less confident sites • In a region known to be difficult for current technologies • State reasons for lower confidence • If a site is near a low confidence site, make it low confidence
  • 16. Challenges with assessing performance • All variant types are not equal • All regions of the genome are not equal • Labeling difficult variants as uncertain leads to higher apparent accuracy when assessing performance • Genotypes fall in 3+ categories (not positive/negative) – standard diagnostic accuracy measures not well posed
  • 17. Challenge in variant comparison: Complex variants have multiple correct representations [Figure labels: BWA, ssaha2, CGTools, Novoalign; Ref; T insertion; TCTCT insertion]
    False positives by comparison method:
    – Traditional comparison: FP SNPs 0.38% (610); FP MNPs 100% (915); FP indels 6.5% (733)
    – Comparison with realignment: FP SNPs 0.15% (249); FP MNPs 4.2% (38); FP indels 2.6% (298)
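One way to see why record-by-record comparison overcounts false positives is to compare the sequences that two call sets imply rather than the records themselves. The Python sketch below is an illustration of that idea only, not the realignment procedure behind the slide's numbers. The helper names, the toy reference strings, and the 0-based coordinates are assumptions made for the example.

```python
from typing import List, Tuple

Variant = Tuple[int, str, str]  # (0-based position, ref allele, alt allele)


def apply_variants(reference: str, variants: List[Variant]) -> str:
    """Build the haplotype sequence implied by non-overlapping variants."""
    pieces = []
    cursor = 0
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError("overlapping variants")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"ref allele mismatch at {pos}")
        pieces.append(reference[cursor:pos])  # unchanged reference bases
        pieces.append(alt)                    # substituted allele
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


def same_haplotype(reference: str, a: List[Variant], b: List[Variant]) -> bool:
    """True if two call sets describe the same sequence, however they are written."""
    return apply_variants(reference, a) == apply_variants(reference, b)


# One MNP versus two adjacent SNPs: different records, same sequence.
mnp_vs_snps = same_haplotype("ACGTCTCTGA", [(1, "CG", "TA")],
                             [(1, "C", "T"), (2, "G", "A")])

# A 2-bp insertion placed at either end of a TC repeat: same sequence again.
shifted_indel = same_haplotype("AATCTCTCGG", [(1, "A", "ATC")], [(7, "C", "CTC")])
print(mnp_vs_snps, shifted_indel)
```

A naive comparison would score both pairs as mismatches, which is consistent with the 100% FP MNP rate the slide reports for traditional comparison.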
  • 18. Global Alliance for Genomics and Health Benchmarking Task Team • Formed June 2014 to develop methods and tools for comparing variant calls to a benchmark • Developed standardized definitions for performance metrics like TP, FP, and FN • Initial focus on germline SNPs/indels • Developing benchmarking tools • Comparison engine • Pluggable web interface with modules for: • Reporting/calculation of metrics • Visualization/user interface • Working with Genome in a Bottle Consortium to host data and calls from their well-characterized genomes [Screenshot: Example User Interface, www.bioplanet.com/gcat]
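To make TP, FP, and FN concrete, here is a minimal Python sketch that counts them for a query call set against a benchmark. It is not the Benchmarking Task Team's tooling or its exact definitions. It assumes both call sets are already restricted to the benchmark's confident regions, that variants are normalized to a common representation, and that genotype strings are written consistently; it also adopts the convention that a genotype mismatch counts as both an FP and an FN, since genotypes fall in more than two categories.

```python
from typing import Dict, Tuple

Site = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)
Calls = Dict[Site, str]           # site -> genotype, e.g. "0/1"


def count_outcomes(truth: Calls, query: Calls) -> Dict[str, int]:
    """Count TP/FP/FN, requiring both the variant and the genotype to match."""
    tp = sum(1 for site, gt in query.items() if truth.get(site) == gt)
    fp = len(query) - tp  # query calls with no exact match in the benchmark
    fn = len(truth) - tp  # benchmark calls the query missed or genotyped differently
    return {"TP": tp, "FP": fp, "FN": fn}


def precision(c: Dict[str, int]) -> float:
    called = c["TP"] + c["FP"]
    return c["TP"] / called if called else float("nan")


def sensitivity(c: Dict[str, int]) -> float:
    expected = c["TP"] + c["FN"]
    return c["TP"] / expected if expected else float("nan")


# Hypothetical toy data.
truth = {("1", 100, "A", "G"): "0/1", ("1", 200, "C", "T"): "1/1",
         ("1", 300, "G", "A"): "0/1"}
query = {("1", 100, "A", "G"): "0/1", ("1", 200, "C", "T"): "0/1",
         ("1", 400, "T", "C"): "0/1"}
counts = count_outcomes(truth, query)
print(counts, precision(counts), sensitivity(counts))
```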
  • 19. Stratifying Performance • Measure performance for different types of variants in different sequence contexts – Types of variants • SNPs • indels of different sizes • complex variants • structural variants – Sequence contexts • Homopolymers • STRs • Duplications – Functional context • Exome vs genome, etc. – Data characteristics • Coverage • Mapping quality • Challenge of smaller gene panels vs genome sequencing – one RM may not have a sufficient number of examples of different classes of variants or sequence contexts – likely need more samples with specific types of variants
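Stratified reporting amounts to computing the same metrics separately per variant type or sequence context. A minimal Python sketch, assuming each benchmarked call has already been labeled with a stratum and an outcome (TP, FP, or FN); the stratum names and counts below are hypothetical, not results from the talk.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def stratify(results: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Compute sensitivity and precision per stratum.

    results yields (stratum, outcome) pairs with outcome in {"TP", "FP", "FN"}.
    """
    counts = defaultdict(Counter)
    for stratum, outcome in results:
        counts[stratum][outcome] += 1

    table = {}
    for stratum, c in counts.items():
        tp, fp, fn = c["TP"], c["FP"], c["FN"]
        table[stratum] = {
            "n": tp + fp + fn,
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
        }
    return table


# Hypothetical outcomes for two strata.
results = (
    [("SNP", "TP")] * 98 + [("SNP", "FN")] * 2
    + [("indel in homopolymer", "TP")] * 6
    + [("indel in homopolymer", "FN")] * 4
    + [("indel in homopolymer", "FP")] * 2
)
table = stratify(results)
for name, row in table.items():
    print(name, row)
```

Reporting `n` alongside each metric makes the slide's last point visible: a stratum with only a handful of examples cannot support a confident estimate.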
  • 20. NCBI/CDC GeT-RM Browser • http://www.ncbi.nlm.nih.gov/variation/tools/get-rm/ • Allows visualization of questionable calls
  • 21. Initial uses of high-confidence NIST-GIAB genotypes for NA12878 • NIST has released several versions of high-confidence genotypes for its pilot RM • These data are presently being used for benchmarking – prior to release of RMs – SNPs & indels • ~77% of the genome
  • 22. Using Genome in a Bottle calls to benchmark clinical exome sequencing at Mount Sinai School of Medicine “We evaluate a set of NA12878 technical replicates against GIAB for each new pipeline version.”
  • 23. Benchmarking somatic variant calling at Qiagen
  • 24. Implications of Technical Accuracy in Medical Genome Sequencing • Collaboration with Euan Ashley group at Stanford • What is accuracy for functional variants? • How much of the exome falls in high-confidence regions? • “Black list” in databases • Sensitivity – WExS (95%) < WGS (98%) • especially splicing – genome < nonsyn < syn – Most exome FNs caused by low coverage – Most WGS FNs caused by filtering • Only 81% of ClinVar pathogenic or likely pathogenic SNPs fall in high-confidence regions – Lots of work to do!
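A figure such as the share of ClinVar SNPs inside high-confidence regions comes down to an interval lookup. The Python sketch below shows one way to compute such a fraction; it is not the analysis behind the slide. It assumes BED-style regions (0-based, half-open, sorted, and non-overlapping per chromosome), and the coordinates are made up.

```python
from bisect import bisect_right
from typing import Dict, Iterable, List, Tuple

# chrom -> sorted, non-overlapping [start, end) intervals, as in a BED file
Regions = Dict[str, List[Tuple[int, int]]]


def is_confident(regions: Regions, chrom: str, pos0: int) -> bool:
    """True if the 0-based position lies inside a high-confidence interval."""
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos0.
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def fraction_confident(regions: Regions, sites: Iterable[Tuple[str, int]]) -> float:
    """Fraction of sites that fall inside the confident regions."""
    sites = list(sites)
    if not sites:
        return float("nan")
    inside = sum(is_confident(regions, chrom, pos) for chrom, pos in sites)
    return inside / len(sites)


# Hypothetical regions and variant positions.
regions = {"1": [(100, 200), (300, 400)]}
sites = [("1", 150), ("1", 250), ("1", 399), ("2", 10)]
fraction = fraction_confident(regions, sites)
print(fraction)
```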
  • 25. Overview of NIST RM Development (timeline columns: Q4 2014, Q1 2015, Q2 2015, Q3 2015, Q4 2015; milestones listed in timeline order, cells separated by “|”)
    – HG-001/NA12878 (“Pilot” Genome): Release NIST RM8398; Preliminary large deletions | Refined Structural Variants
    – HG-002 to HG-004 (Ashkenazim trio): Illumina, Complete Genomics, Ion, BioNano, homogeneity/stability | Preliminary SNPs/indels; 120x-150x PacBio data; “moleculo”; mate-pair; CG-LFR | Refined SNPs/indels; Preliminary SVs | Refined Structural Variants | NIST RMs 8391/8392 release
    – HG-005 (son in Asian trio): Illumina, Complete Genomics, Ion, BioNano, homogeneity/stability | “moleculo”; mate-pair; CG-LFR | Preliminary SNPs/indels | Refined SNPs/indels; Refined Structural Variants | NIST RM8393 release
  • 26. Ashkenazim Jewish PGP RM Trio (table columns: Dataset; Characteristics; Coverage; Availability; Good for…; empty cells omitted)
    – Illumina Paired-end; 150x150bp; ~300x/individual; Fastq on ftp; SNPs/indels/some SVs
    – Illumina Long Mate pair; ~6000 bp insert; ~40x/individual; Feb-Mar 2015; SVs
    – Illumina “moleculo”; Custom library; ~30x by long fragments; Feb-Mar 2015; SVs/phasing/assembly
    – Complete Genomics; 100x/individual; On ftp; SNPs/indels/some SVs
    – Complete Genomics LFR; ??; SNPs/indels/phasing
    – Ion Proton; Exome; 1000x/individual; On SRA; SNPs/indels in exome
    – BioNano Genomics; Feb 2015; SVs/assembly
    – PacBio; ~10kb reads; ~120-150x on AJ trio; Finished ~Mar 2015; SVs/phasing/assembly/STRs
  • 27. Asian PGP trio • Similar sequencing to Ashkenazim trio except for PacBio • Only son will be NIST RM
  • 28. Future Directions Germline mutations • Difficult regions/variants – Long-read technologies – Forming an analysis group • Tools for assessing performance – How to stratify performance and understand biases? Somatic mutations • Pilot interlaboratory study to assess comparability of spike-ins • Commercial members developing FFPE cell lines • Participants interested in mixing different RMs
  • 29. How to get involved • Use our integrated SNP/indel genotypes for NA12878 and give us feedback – Cells and DNA currently available from Coriell – NIST RM available April 2015 • Join our new Analysis group – Use Long-read technologies – Structural Variant calls – De novo assembly – Help create the best-ever characterized trio • Attend our biannual workshops (January in CA, August in MD) • Develop tools/metrics with Global Alliance for Genomics and Health Benchmarking Team
  • 30. Acknowledgments • FDA – Elizabeth Mansfield, HPC staff • HSPH • GCAT - David Mittelman, Jason Wang • Francisco De La Vega • Illumina - Mike Eberle • Personalis - Deanna Church • NCBI – Chunlin Xiao • Celera - Andrew Grupe • Genome in a Bottle – www.genomeinabottle.org – New members welcome! – Sign up for email newsletters – jzook@nist.gov