"Metrology for Identity and Other Nominal Properties" presentation at the Standards for Pathogen Identification via NGS (SPIN) workshop hosted by the National Institute for Standards and Technology October 2014 by David Duewer, PhD from NIST.
Metrology for Identity and Other Nominal Properties
1. Metrology for Identity and Other Nominal Properties
David Lee Duewer
Chemical Sciences Division, Materials Measurement Laboratory
National Institute of Standards and Technology
Standards for Pathogen Identification via Next-Generation Sequencing Workshop
NIST, 20-Oct-2014
2. When I Say “We”… And we take ourselves very seriously…
Marc Salit: PhD 1985, Analytical chemist; 5 y Perkin-Elmer – Instrument Design/Development; 24 y NIST; “Innovator”; Leader, Genome Scale Measurements Group; Co-Director, NIST/Stanford U. Joint Initiative on Measurements in Biology
Dave Duewer: PhD 1976, Analytical chemist; 11 y Monsanto - process & biodiscovery; 23 y NIST; “Data Jock”
3. Metrology (Measurement Science)
• Metrology is the stuff needed so data can support informed decision making.
• in a good world, decisions are informed with data
• which are the results of measurements!
• Calculus of Confidence
• we posit that metrology is the ‘formal’ system that tells us how well we trust those data
4. Calculus of Confidence
• The tools of metrology:
  • Traceability
  • Uncertainty
  • Validation
• enable this calculus of confidence by which decisions are informed by measurement results with established confidence.
5. Craft
• Metrology is more a craft than a technology
• this doesn’t mean that 7 year apprenticeships are required!
• it does mean that two different skilled metrologists might take very different approaches to the same problem
• but they should both come to largely equivalent solutions!
• matter of style
• must be defensible
6. The “How Much” Worldview as seen by chemists/biochemists
7. Tools of the Trade
“GUM”: www.bipm.org/en/publications/guides/#gum
“VIM”: www.bipm.org/en/publications/guides/#vim
www.nist.gov/pml/pubs/sp811/
Workshop on DNA Methods for Quality Control of Botanical Products
USP, 23-Oct-2014
8. Metrological Traceability enables comparisons to be made over time and place
[Figure: traceability chain linking the SI unit (amount of substance) to the Result. Methods: primary methods, reference methods, routine methods. Steps: purity analysis, primary calibration, secondary calibration. Materials: high purity primary RM, CRM, RM, routine sample.]
9. Validation ensures measurement processes are well-understood
• “checks the measurement model”
• tests completeness
• tests assumptions
• helps establish an uncertainty budget
• identifies relevant parameters to keep under control
• tests scope
10. Metrological Uncertainty enables meaningful comparison of results
• “how much” results are only useful when compared
• different results in different places or measured at different times…
• “comparability over space-and-time”
• Are these results the same?
• is there significant bias?
• Is measurement precision fit-for-purpose?
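One common way to ask "are these results the same?" is the normalized error, E_n, which divides the difference between two results by the expanded uncertainty of that difference. The sketch below is only an illustration and is not taken from the slides: the function names, the coverage factor k = 2, the example values, and the assumption that the two results are uncorrelated are all choices made here.

```python
import math

def normalized_error(x1, u1, x2, u2, k=2.0):
    """E_n = (x1 - x2) / sqrt(U1^2 + U2^2), where U = k * u.

    u1 and u2 are standard uncertainties; k is the coverage factor
    (k = 2 is assumed here). The results are assumed uncorrelated.
    """
    big_u1, big_u2 = k * u1, k * u2
    return (x1 - x2) / math.sqrt(big_u1 ** 2 + big_u2 ** 2)

def results_agree(x1, u1, x2, u2, k=2.0):
    """True when |E_n| <= 1: the difference lies within its expanded uncertainty."""
    return abs(normalized_error(x1, u1, x2, u2, k)) <= 1.0

# Hypothetical results (value, standard uncertainty) from two laboratories
lab_a = (10.12, 0.05)
lab_b = (10.25, 0.04)
print(f"E_n = {normalized_error(*lab_a, *lab_b):.2f}")
print("agree:", results_agree(*lab_a, *lab_b))
```

Without the stated uncertainties the two values could not be compared at all; with them, the question of bias becomes a yes/no test against a declared criterion.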
11. Perhaps NIST’s Best Uncertainty Statement
Dr. C.H. Meyers, on his measurements of the heat capacity of ammonia (circa 1920):
“We think our reported value is good to 1 part in 10,000: we are willing to bet our own money at even odds that it is correct to 2 parts in 10,000. Furthermore, if by any chance our value is shown to be in error by more than 1 part in 1000, we are prepared to eat the apparatus and drink the ammonia.”
Quote from: Doiron T and Stoup J, Uncertainty and Dimensional Calibrations, JNIST 1997;102:647-676
http://dx.doi.org/10.6028/jres.102.044
12. The “What” Worldview
13. Several Different “What”s
• Identification
  • “Pure substance” Certified Reference Material (CRM)
  • Use/develop convincingly specific methods
    • Inclusion
    • exclusion
  • Define and certify unambiguous “barcode”
  • CRMs are expensive
• Verification
  • Secondary reference materials (RMs) and controls
  • Check “barcode” against CRM
  • Can be commercial or home-brew
• Recognition
  • Component of a mixture
  • Check “barcode” against library
14. Barcode of Life
http://www.barcodeoflife.org/content/about/what-dna-barcoding
Identification
Validation
Recognition
15. Metrological Traceability enables comparisons to be made over time and place
[Figure: traceability chain linking an Authority to the Result. Authority: chemical structure {CAS, IUPAC}, biological nomenclature {ICZN, ICN}. Methods: identification methods, verification methods, recognition methods. Materials: “pure” primary RM, QC and secondary RMs, routine samples.]
16. Taxonomic Hierarchy
Ginkgo biloba L.
Kingdom Plantae – plantes, Planta, Vegetal, plants
Subkingdom Viridaeplantae – green plants
Infrakingdom Streptophyta – land plants
Division Tracheophyta – vascular plants, tracheophytes
Subdivision Spermatophytina – spermatophytes, seed plants, phanérogames
Infradivision Gymnospermae – gymnosperms, gymnospermes, gimnosperma
Class Ginkgoopsida – ginkgo
Order Ginkgoales
Family Ginkgoaceae
Genus Ginkgo L. – ginkgo
Species Ginkgo biloba L. – maidenhair tree, common ginkgo
en.wikipedia.org/wiki/Ginkgo_biloba
http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=183269
17. Validation ensures measurement processes are well-understood
• “checks the measurement model”
• tests if identification criteria fit-for-purpose
• includes everything wanted
• excludes everything else
• (Ideally, this can be done in silico)
• tests if measurements consistent with identification criteria
18. Specificity Validation Design
Chloroplast DNA sequences from authenticated Ginkgo biloba samples are used to establish inclusivity
Chloroplast DNA sequences from close relatives are used to establish exclusivity
https://www-s.nist.gov/srmors/view_cert.cfm?srm=3246
Labudde, R.; Harnly, J.M.; Probability of identification (POI): A Statistical Model for the Validation of Qualitative Botanical Identification Methods
Official Methods of Analysis of AOAC International., Vol. 95, pp. 273–285, (2012).
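An in silico version of this design can be sketched as two simple proportions: the fraction of authenticated target sequences that the criterion accepts (inclusivity) and the fraction of close-relative sequences that it rejects (exclusivity). Everything in the sketch is an assumption made for illustration: the toy sequences, the per-position identity score on pre-aligned sequences, and the thresholds. It does not implement the statistical model of the cited POI paper.

```python
def identity(seq, ref):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(seq) != len(ref):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(a == b for a, b in zip(seq, ref)) / len(ref)

def is_identified(seq, ref, threshold):
    """Hypothetical identification criterion: identity to the reference >= threshold."""
    return identity(seq, ref) >= threshold

def specificity_summary(ref, targets, non_targets, threshold):
    """Return (inclusivity, exclusivity) as fractions of each panel."""
    included = sum(is_identified(s, ref, threshold) for s in targets)
    excluded = sum(not is_identified(s, ref, threshold) for s in non_targets)
    return included / len(targets), excluded / len(non_targets)

# Toy data, not real chloroplast sequences
ref = "ACGTACGTACGTACGTACGT"
targets = [ref, "ACGTACGTACGTACGTACGA"]          # authenticated samples
non_targets = ["ACGTACGTACGTACGTACCA",           # a close relative
               "TTTTACGTACGTACGTAAAA"]           # a more distant relative

print(specificity_summary(ref, targets, non_targets, threshold=0.92))
print(specificity_summary(ref, targets, non_targets, threshold=0.85))
```

With the looser threshold the close relative is accepted and exclusivity drops, which is the kind of failure a specificity validation is meant to expose before the criterion is used on real samples.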
20. Metrological Confidence enables meaningful interpretation of results
• “what” results are only useful when
• The same “things” can be compared
• “measurand” is the metrology-speak term
• Are these barcodes the same?
• how confident are you in the result?
• essential part of being able to compare!
21. Defining “Confidence”
“Where uncertainty is assessed qualitatively, it is characterised by providing a relative sense of the amount and quality of evidence (that is, information from theory, observations or models indicating whether a belief or proposition is true or valid) and the degree of agreement… This approach is used by WG III through a series of self-explanatory terms such as: high agreement, much evidence; high agreement, medium evidence; medium agreement, medium evidence; etc.”
Climate Change 2007: Synthesis Report
www.ipcc.ch/publications_and_data/ar4/syr/en/contents.html
22. “Confidence”: NIST’s Initial Definitions
DNA Sequence via Sanger sequencing
23. On Further Thought…
• Highest confidence
  • sufficient evidence
  • no ambiguities or contradictions
• Very confident
  • sufficient evidence
  • all ambiguities unambiguously resolved
• Confident
  • sufficient evidence
  • all ambiguities “understood”
  • but insufficient evidence to prove it
• Insufficient evidence to Certify
[Flowchart: Acquire Evidence, followed by the decision nodes Sufficient?, Unambiguous?, Resolved?, Understood?, each with Yes/No branches; outcomes are labeled Highest, Very, Confident, Maybe, and No (Confidence).]
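The confidence ladder above can be read as a short decision function. The sketch below follows the bullet list; the function name, argument names, and return strings are invented for illustration. The slide text does not spell out what happens when evidence is sufficient but the ambiguities are not even "understood", so that fallback is an assumption and is marked as such in the code.

```python
def confidence_level(sufficient, unambiguous=False, resolved=False, understood=False):
    """Map answers to the flowchart questions onto a confidence statement.

    sufficient  -- is there sufficient evidence?
    unambiguous -- are there no ambiguities or contradictions?
    resolved    -- have all ambiguities been unambiguously resolved?
    understood  -- are all ambiguities "understood", though not proven?
    """
    if not sufficient:
        return "insufficient evidence to certify"
    if unambiguous:
        return "highest confidence"
    if resolved:
        return "very confident"
    if understood:
        return "confident"
    # Assumption: unexplained ambiguities are treated like insufficient evidence.
    return "insufficient evidence to certify"

print(confidence_level(True, unambiguous=True))
print(confidence_level(True, resolved=True))
print(confidence_level(True, understood=True))
print(confidence_level(False))
```

The value of writing the ladder down this way is that each question must be answered explicitly, so two analysts applying it to the same evidence should reach the same statement.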
24. Who Defines “Sufficient”?
You! and the rest of the experts within your community
25. Criteria for Identification of Seized Drugs
SWGDRUG Recommendations:
If one technique from A, then one other (A, B, or C).
If no techniques from A, then three others (two from B).
Category A: Infrared Spectroscopy; Mass Spectrometry; Nuclear Magnetic Resonance Spectroscopy; Raman Spectroscopy; X-ray Diffractometry
Category B: Capillary Electrophoresis; Gas Chromatography; Ion Mobility Spectrometry; Liquid Chromatography; Microcrystalline Tests; Pharmaceutical Identifiers; Thin Layer Chromatography
Category C: Color Tests; Fluorescence Spectroscopy; Immunoassay; Melting Point; Ultraviolet Spectroscopy
http://www.swgdrug.org/approved.htm
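The two rules on this slide are concrete enough to check mechanically. The sketch below encodes only what the slide states; it reads "two from B" as "at least two Category B techniques" and ignores any further conditions in the full SWGDRUG recommendations. The function name and data layout are illustrative.

```python
from collections import Counter

# Technique categories as listed on the slide
CATEGORY = {
    "Infrared Spectroscopy": "A", "Mass Spectrometry": "A",
    "Nuclear Magnetic Resonance Spectroscopy": "A",
    "Raman Spectroscopy": "A", "X-ray Diffractometry": "A",
    "Capillary Electrophoresis": "B", "Gas Chromatography": "B",
    "Ion Mobility Spectrometry": "B", "Liquid Chromatography": "B",
    "Microcrystalline Tests": "B", "Pharmaceutical Identifiers": "B",
    "Thin Layer Chromatography": "B",
    "Color Tests": "C", "Fluorescence Spectroscopy": "C",
    "Immunoassay": "C", "Melting Point": "C", "Ultraviolet Spectroscopy": "C",
}

def meets_minimum(techniques):
    """Check a set of techniques against the two rules quoted on the slide."""
    used = set(techniques)
    unknown = used - set(CATEGORY)
    if unknown:
        raise ValueError(f"unknown technique(s): {sorted(unknown)}")
    counts = Counter(CATEGORY[t] for t in used)
    if counts["A"] >= 1:
        # One Category A technique plus at least one other of any category
        return len(used) >= 2
    # No Category A technique: at least three techniques, two of them from B
    return len(used) >= 3 and counts["B"] >= 2

print(meets_minimum(["Mass Spectrometry", "Gas Chromatography"]))
print(meets_minimum(["Color Tests", "Gas Chromatography", "Thin Layer Chromatography"]))
print(meets_minimum(["Color Tests", "Immunoassay", "Melting Point"]))
```

This is an example of a community defining "sufficient" in advance: the rule is about which combinations of evidence count, not about any single measurement.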
26. Barcode of Life: Standards and Guidelines
2.D.ii In November 2009, CBOL approved rbcL and matK as the barcode regions for vascular plants. They are defined relative to the Arabidopsis thaliana chloroplast NC_000932 sequence annotation as follows: the rbcL barcode region is at the 5' end of the rbcL gene between bp1-599 (27-579 excluding primer sequences); the matK barcode region is between bp205-1046 (227-1019 excluding primer sequences).
4.C In deciding whether a record will be repeatable and reliable for species identification, submitters should select as potential BARCODE records only those for which the contig was based on bi-directional coverage with non-N base calls at no less than 40% of the reported sequence. As described below (5D), CBOL can direct GenBank (or another INSDC member) to remove the BARCODE designation from records which have all required elements (1A-I) but have been shown to be unreliable for species identification due to low sequence quality and coverage.
www.barcodeoflife.org/content/resources/standards-and-guidelines
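The 40% criterion in 4.C can be sketched as a simple coverage check. The sketch makes several assumptions that are not in the guideline text: forward and reverse reads are represented as strings already aligned to the contig, positions without a base call are written as "N", and "bi-directional coverage" is read as both directions having a non-N call at a position. Function names and example reads are hypothetical.

```python
def bidirectional_fraction(forward, reverse):
    """Fraction of contig positions with a non-N base call in both read directions."""
    if not forward or len(forward) != len(reverse):
        raise ValueError("reads must be non-empty and aligned to the same contig length")
    both = sum(f.upper() != "N" and r.upper() != "N"
               for f, r in zip(forward, reverse))
    return both / len(forward)

def eligible_for_barcode(forward, reverse, minimum=0.40):
    """Apply the 'no less than 40%' rule under the assumptions stated above."""
    return bidirectional_fraction(forward, reverse) >= minimum

# Toy reads: the forward read drops out at the 3' end, the reverse at the 5' end
fwd = "ACGTACGTNN"
rev = "NNNACGTACG"
print(bidirectional_fraction(fwd, rev))
print(eligible_for_barcode(fwd, rev))
print(eligible_for_barcode("ACNNNNNNNN", "NNNNNNNNAC"))
```

In practice the per-direction calls would come from assembly or trace files rather than from strings like these; the point is only that the criterion is a countable property of the record.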
27. Recent Work in “What” Metrology
Chemical Identification and its Quality Assurance
Boris L. Milman
D.I. Mendeleyev Institute for Metrology, St. Petersburg, Russia
January 12, 2011
Springer, 281 pages, English
“Unlike analytical techniques for qualitative and quantitative determinations, well-presented in books and reviews, theoretical principles of identification and general experimental approaches to its implementation have not received comprehensive treatment in the literature.”
28. Thank you for your attention!