Worked examples of sampling uncertainty evaluation

https://consultglp.com

Introduction
• It is impossible to determine the analyte amount in a whole bulk (lot) of material (the population, or sampling target).
• Instead, a “good” sample is taken for analysis to infer the analyte property of the population.
• Slogan: “The test result is no better than the sample that it is based upon.”
• So, the sample taken for analysis should be as representative of the sampling target as possible.

Introduction
• Sampling cannot be considered a standalone activity because it is always associated with testing and calibration.
• It is essential to the whole testing and calibration process.
• Sampling uncertainty is recognized as an important contributor to the uncertainty associated with the reported results.
• Yet the uncertainty from sampling has long been neglected in the analytical world.
• It is important to know the between-sample variation in addition to the within-sample variation, which reflects the analytical standard deviation.

Sampling errors
• Most sampling errors, except the preparation errors, are due to material heterogeneity
• Two classes of material heterogeneity:
• Constituent heterogeneity - all natural materials are heterogeneous, i.e., they consist of different types of forms (metallic, molecular, ionic, grains, etc.).
• Distribution heterogeneity - arises if the material particles are not randomly distributed in the sampling target (lot, population) to be studied.

Other possible general sources of sampling
errors
• Contamination (extraneous material in sample)
• Losses (adsorption, condensation, precipitation, etc.)
• Alteration of chemical composition (No preservation) or
loss of volatiles
• Alteration of physical composition (agglomeration,
breaking of particles, moisture, etc.)
• Involuntary mistakes (mixed sample numbers, lack of
knowledge, negligence)
• Deliberate faults (deliberate errors in increment
delimitation, forgery, etc.)

Measurement uncertainty of a test result
• Measurement uncertainty (U) of a test result:
$U_{result} = \sqrt{U_{sampling}^2 + U_{analysis}^2}$
• Laboratory analysts are traditionally concerned with measurements within their own laboratory; the sampling process is often conducted by others, inside or outside their own organization
• Sub-sampling (secondary sampling) of the lab sample received is part of the analytical procedure.
• Increasingly, in-situ testing by the test laboratory is also required, e.g. field measurements with portable devices on polluted sites
• Understanding the complete process of sampling and measurement is therefore very important



Sampling uncertainty – principles and practices

Important note:
* Sampling uncertainty cannot be considered alone, without evaluating the contribution from the uncertainty of analysis.
* The confidence limit of the sample mean given by the Central Limit Theorem cannot be assumed to be the sampling uncertainty when only a single measurement is made on each sample; it refers to the sampling random error only.

The uncertainty and decision chain
[Figure: the uncertainty and decision chain, including sampling bias (if necessary) and analysis bias (if any)]
Note: Sampling uncertainty is expressed as sampling variance; analytical uncertainty is expressed as analytical variance.

Sample and Analysis Variances
Table 1 : Measurement Situations
Situation Variance Significant Not Significant
A Analysis variance
Sample variance
X
X
B Analysis variance
Sample variance X
X
C Analysis variance
Sample variance
X
X
D Analysis variance
Sample variance
X
X

Back to basics: a review of the Analysis of Variance (ANOVA) technique

Analysis of variance (ANOVA)
• ANOVA is a powerful statistical technique used to separate and estimate the different causes of variation.
• The objective of ANOVA is to analyze differences among the means of groups (factors) of continuous data, assuming a normal distribution.
• Between-group variation: variation between or among controlled or fixed factors or groups (e.g. methods 1 and 2, lab Nos. 1 and 2, analysts A and B, etc.)
• Within-group variation: the random error of each group (inherent variation)

The ANOVA technique is used to estimate both the sampling and the analytical uncertainty that make up the overall measurement uncertainty
• Types of ANOVA:
• One-way (or one-factor) ANOVA
• Two-way ANOVA with replication
• Two-way ANOVA without replication
• Two-way ANOVA with / without interaction

One-Way ANOVA
• In a situation where we consider only ONE factor on different
treatments with random error within each treatment, we use the
One-Way ANOVA technique
• Example:
• 7 laboratories conducting analyses on a similar sample (Factor:
Lab; treatment: different labs)
• Comparing the yield of fruits on different amounts of fertilizers
added (Factor: fertilizer; treatment: different amounts of
fertilizers).

One-way ANOVA
• We test:
• the null hypothesis H0 : µ1 = µ2 = … = µk
• against the alternative H1 that the k group means are not all the same (at least one mean differs from the others)
A significant one-way ANOVA result can arise when:
• one mean differs from all the others
• all the means differ from each other
• the means fall into 2 distinct groups

One-way ANOVA
• Partitioning the total variation in a one-way ANOVA model with k groups (say, k = 8 labs with 3 repeats each, i.e. n = 8 x 3 = 24 results in total):
Total variation (SST), d.f. = n - 1
= Between- or among-group variation (SSB), d.f. = k - 1
+ Within-group (inherent) variation (SSW), d.f. = n - k

One-way ANOVA
• In summary:
• Total variation represented by the sum of squares total
(SST).
• It consists of 2 parts:
• attributable to differences between the groups by sum
of squares among (SSB)
• being due to inherent variation within the group by
sum of squares within (SSW)

Notation: k = number of groups (samples); n = total number of replicate results over all groups.
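The partition summarized above can be written out in symbols. This is a sketch under assumed notation that the slides do not define explicitly: group i holds n_i results x_ij with mean \bar{x}_i and variance s_i^2, and \bar{\bar{x}} is the grand mean of all n results.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(x_{ij} - \bar{\bar{x}}\right)^2, \\
SS_B &= \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{\bar{x}}\right)^2, \\
SS_W &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2
      = \sum_{i=1}^{k} (n_i - 1)\, s_i^2, \\
SS_T &= SS_B + SS_W, \qquad (n - 1) = (k - 1) + (n - k).
\end{aligned}
```

Dividing SSB and SSW by their degrees of freedom gives the mean squares, and their ratio is the F statistic.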

Example 1: One-way ANOVA
Results of 6 repeat analyses by each of 3 analysts:
Repeat, n      Analyst A   Analyst B   Analyst C
1              10.2        11.6        8.1
2              8.5         12.0        9.0
3              8.4         9.2         10.7
4              10.5        10.3        9.1
5              9.0         9.9         10.5
6              8.1         12.5        9.5
Mean           9.12        10.92       9.48        (mean of means = 9.84)
Variance, s^2  1.006       1.702       0.962
Hypothesis testing
H0 : µA = µB = µC
H1 : not all the means are equal
Sum of squares within analysts:
$SS_w = \sum_{i=1}^{k} (n_i - 1)\, s_i^2 = 18.345$
Sum of squares between analysts:
$SS_b = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2 = 10.858$

Excel® Data Analysis ToolPak
• Loading the Data Analysis ToolPak in Excel for Windows
• Click the File tab, click Options, and then click the Add-Ins category
• In the Manage box, select Excel Add-ins and then click Go.
• If you're using Excel for Mac, in the file menu go to Tools > Excel Add-ins.
• In the Add-Ins box, check the Analysis ToolPak check box, and then click OK.

Example 1: One-way ANOVA (contd.)
Anova: Single Factor
SUMMARY
Groups      Count   Sum    Average   Variance
Analyst A   6       54.7   9.117     1.006
Analyst B   6       65.5   10.917    1.702
Analyst C   6       56.9   9.483     0.962
ANOVA
Source of Variation   SS       df   MS      F       P-value   F crit
Between Analysts      10.858   2    5.429   4.439   0.0306    3.682
Within Analysts       18.345   15   1.223
Total                 29.203   17
Since the calculated F value (4.439) exceeds the critical value (3.682), the between-analysts variation is significantly greater than the within-analysts variation at the 95% confidence level.
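For readers without Excel, the sums of squares and F value of Example 1 can be cross-checked with a short script. This is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the original slides, and the critical F value is not computed here.

```python
from statistics import mean

# Results from Example 1: 6 repeat analyses by each of 3 analysts
data = {
    "A": [10.2, 8.5, 8.4, 10.5, 9.0, 8.1],
    "B": [11.6, 12.0, 9.2, 10.3, 9.9, 12.5],
    "C": [8.1, 9.0, 10.7, 9.1, 10.5, 9.5],
}


def one_way_anova(groups):
    """Return (SSB, SSW, df_between, df_within, F) for a dict of groups.

    Hypothetical helper written for this illustration.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of results

    # Between-group sum of squares: n_i * (group mean - grand mean)^2
    ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group sum of squares: deviations from each group's own mean
    ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    df_b, df_w = k - 1, n - k
    f_value = (ssb / df_b) / (ssw / df_w)
    return ssb, ssw, df_b, df_w, f_value


ssb, ssw, df_b, df_w, f_value = one_way_anova(data)
print(f"SSB = {ssb:.3f} (df {df_b}), SSW = {ssw:.3f} (df {df_w}), F = {f_value:.3f}")
```

The printed values can be compared with the SS, df and F columns of the Excel output above.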

Other ANOVA Techniques
• Two-way ANOVA with/without replication
• Two-way ANOVA calculations are much more complicated; statistical software or an MS Excel spreadsheet is usually used.
• Two-way ANOVA results can show if there are interactions between groups of data
• Its principles are basically the same as the one-way ANOVA
• It is the simplest design of experiment (DOE)
• Example:
• The yield of a synthesized chemical product depends on factors such as :
• Temperature of reaction
• Duration of reflux for complete reaction
• Amounts of reactants, etc.

A top-down approach to estimating
field sampling uncertainty & analysis
Empirical approach (top down)
References:
* Eurachem/CITAC Guide (2nd Ed, 2019) “Measurement Uncertainty arising from Sampling”
* Nordtest Technical Report TR 604 (2007) “Uncertainty from Sampling”
$s_{measurement} = \sqrt{s_{sampling}^2 + s_{analysis}^2}$

The Empirical Methods (top down approach)
• The empirical approach is intended to obtain a reliable uncertainty
estimate without necessarily knowing any of the sources
individually
• It studies the overall reproducibility estimates (from in-house or inter-organizational sampling and measurement trials)
• Eurachem provides 4 general approaches by considering random and systematic (bias) effects arising from the sampling process or analytical process
• Sampling bias is not easy to study, but it is not impossible:
• Reference sampling target
• Inter-organizational sampling trial
• Or, it is assumed to be negligible

Four Eurachem empirical methods for estimating MU
Method  Method       Samplers   Sampling   Sampling component estimated           Analytical component estimated
#       description  (persons)  protocols  Precision / Bias                       Precision   Bias
A       Duplicates   Single     Single     Yes / No                               Yes         No
B       Protocols    Single     Multiple   Between protocols                      Yes         No
C       CTS          Multiple   Single     Between samplers                       Yes         Yes
D       SPT          Multiple   Multiple   Between protocols + between samplers   Yes         Yes
CTS: Collaborative trial in sampling; SPT: Sampling proficiency test

Empirical method A: The duplicate method
• The simplest and probably the most cost-effective of the four methods
• It is a study of sampling precision (bias not included)
• Based upon a single sampler duplicating a small proportion (i.e. 10%, but no fewer than 8 targets) of the primary samples
• If there is only one sampling target, all 8 duplicates can be taken from it at random through, say, a grid design
• The same design can be used with different samplers to incorporate the “between operator” contribution to uncertainty (equivalent to method C)
• Methods C and D offer a more complete estimate of sampling uncertainty, covering both random and systematic errors.

Empirical method 1: A simple split experimental design for taking multiple samples from a target for duplicate analysis
Suitable for a homogeneous or fairly homogeneous distribution of the targeted analyte in a population

A Worked Example
• The cadmium results (in ppm) of 10 random samples taken from a field (ref: Michael Thompson & Philip Lowthian, page 58)
Soil sample   1      2      3      4      5     6     7       8      9       10
Result 1      11.8   6.4    11.9   12.2   7.5   6.4   10.1    11.3   14.0    16.5
Result 2      9.8    6.3    10.3   10.2   7.3   6.4   10.0    9.9    12.5    15.1
Mean          10.8   6.35   11.1   11.2   7.4   6.4   10.05   10.6   13.25   15.8
Note: The mean values did not vary greatly between samples, so this approach is applicable.

Applying ANOVA
SUMMARY
Groups    Count   Sum    Average   Variance
Soil 1    2       21.6   10.8      2
Soil 2    2       12.7   6.35      0.005
Soil 3    2       22.2   11.1      1.28
Soil 4    2       22.4   11.2      2
Soil 5    2       14.8   7.4       0.02
Soil 6    2       12.8   6.4       0
Soil 7    2       20.1   10.05     0.005
Soil 8    2       21.2   10.6      0.98
Soil 9    2       26.5   13.25     1.125
Soil 10   2       31.6   15.8      0.98
ANOVA
Source of Variation   SS         df   MS        F         P-value       F crit
Between Samples       160.0545   9    17.7838   21.1838   2.26552E-05   3.020383
Within samples        8.395      10   0.8395
Total                 168.4495   19

Basic calculation of standard uncertainty (sampling)
• MS (between samples) covers two uncertainty components:
• 1. MS (analysis; within) i.e., variance (analysis)
• 2. MS (sampling; between) i.e., variance (sampling)
• Their relationship is:
• MS (between samples) = MS (analysis; within) + k x MS (sampling)
• where k = 2 (duplicate analysis)
• Therefore,
$\mathrm{Var}(\text{sampling}) = \mathrm{MS}(\text{sampling}) = \dfrac{\mathrm{MS}(\text{between}) - \mathrm{MS}(\text{within})}{k}$
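The reasoning behind this relationship can be sketched with expected mean squares. The symbols \sigma^2_{analysis} and \sigma^2_{sampling} are introduced here for illustration and stand for the true analytical and sampling variances; a balanced design with k analyses per sample is assumed.

```latex
\begin{aligned}
E[\mathrm{MS}(\text{within})]  &= \sigma^2_{\text{analysis}}, \\
E[\mathrm{MS}(\text{between})] &= \sigma^2_{\text{analysis}} + k\,\sigma^2_{\text{sampling}}, \\
\Rightarrow\quad \hat{\sigma}^2_{\text{sampling}}
  &= \frac{\mathrm{MS}(\text{between}) - \mathrm{MS}(\text{within})}{k}.
\end{aligned}
```

Each sample mean averages k analyses, which is why the sampling variance is weighted by k in the between-samples mean square. If MS(between) turns out smaller than MS(within), the estimate is negative and is commonly set to zero.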

From the above mean squares, we calculate the analytical standard deviation s_a as:
$s_a = \sqrt{MS_w} = \sqrt{0.8395} = 0.92$ ppm
The sampling standard deviation s_s is given by:
$s_s = \sqrt{\dfrac{MS_b - MS_w}{k}} = \sqrt{\dfrac{17.78 - 0.8395}{2}} = 2.91$ ppm
(Note that k = 2 because we did duplicate analysis.)
The total standard uncertainty, expressed as a standard deviation, for the combined operation of sampling (random) and analysis is:
$s_{Total} = \sqrt{s_s^2 + s_a^2} = \sqrt{2.91^2 + 0.92^2} = 3.05$ ppm
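The same calculation can be reproduced directly from the duplicate cadmium results. The following is a minimal sketch using only the Python standard library; the function name `duplicate_uncertainty` is hypothetical, and clamping a negative sampling variance to zero is an assumption added here, not something stated in the slides.

```python
from math import sqrt
from statistics import mean

# Duplicate cadmium results (ppm) for the 10 soil samples of the worked example
result_1 = [11.8, 6.4, 11.9, 12.2, 7.5, 6.4, 10.1, 11.3, 14.0, 16.5]
result_2 = [9.8, 6.3, 10.3, 10.2, 7.3, 6.4, 10.0, 9.9, 12.5, 15.1]


def duplicate_uncertainty(first, second):
    """Return (s_analysis, s_sampling, s_total) from duplicate analyses.

    Hypothetical helper: one-way ANOVA with samples as groups, k = 2.
    """
    pairs = list(zip(first, second))
    k = 2                                   # analyses per sample
    n_samples = len(pairs)
    sample_means = [mean(p) for p in pairs]
    grand_mean = mean(sample_means)

    # Within-sample SS: for a duplicate pair this equals (a - b)^2 / 2
    ss_within = sum((a - b) ** 2 / 2 for a, b in pairs)
    # Between-sample SS: k * (sample mean - grand mean)^2
    ss_between = k * sum((m - grand_mean) ** 2 for m in sample_means)

    ms_within = ss_within / (n_samples * (k - 1))
    ms_between = ss_between / (n_samples - 1)

    # Assumption: a negative variance estimate is set to zero
    var_sampling = max((ms_between - ms_within) / k, 0.0)

    s_analysis = sqrt(ms_within)
    s_sampling = sqrt(var_sampling)
    s_total = sqrt(s_analysis ** 2 + s_sampling ** 2)
    return s_analysis, s_sampling, s_total


s_a, s_s, s_total = duplicate_uncertainty(result_1, result_2)
print(f"s_a = {s_a:.2f} ppm, s_s = {s_s:.2f} ppm, s_total = {s_total:.2f} ppm")
```

Working from the raw results rather than from rounded mean squares avoids carrying rounding errors into the final standard deviations.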

Empirical method 2: The duplicate method by a single sampler on a lot (population) (a balanced nested design)
This method is suitable for a heterogeneous distribution of the targeted analyte in a population, e.g. a site with contaminated soil

Eurachem’s worked example: Nitrate in glasshouse-grown lettuce
• Samples were drawn from 8 target locations, with duplicate sampling at each target and duplicate analysis of each of the 16 samples for the concentration of nitrate (mg/kg) in glasshouse-grown lettuce. The best estimates of the results are as follows:

Basic calculations for sum of squares
For the first sample (S1) of target A, with duplicate analyses A1 and A2:
Target, i   S1A1       S1A2
            x(i,1,1)   x(i,1,2)
A           3898       4139
Mean i,1 = 4018.5
Sum of squares of deviation = (3898 - 4018.5)^2 + (4139 - 4018.5)^2 = 29040.5
In Excel, use =DEVSQ(3898,4139)
For all targets (the last two columns are the sums of squared deviations from each sample mean):
Target, i   S1A1       S1A2       S2A1       S2A2       S1         S2         S1          S2
            x(i,1,1)   x(i,1,2)   x(i,2,1)   x(i,2,2)   Mean i,1   Mean i,2   2*D(i1)^2   2*D(i2)^2
A           3898       4139       4466       4693       4018.5     4579.5     29040.5     25764.5
B           3910       3993       4201       4126       3951.5     4163.5     3444.5      2812.5
C           5708       5903       4061       3782       5805.5     3921.5     19012.5     38920.5
D           5028       4754       5450       5416       4891       5433       37538       578
E           4640       4401       4248       4191       4520.5     4219.5     28560.5     1624.5
F           5182       5023       4662       4839       5102.5     4750.5     12640.5     15664.5
G           3028       3224       3023       2901       3126       2962       19208       7442
H           3966       4283       4131       3788       4124.5     3959.5     50244.5     58824.5
Overall Mean = 4345.6          SSE(analysis) = 351320

Basic calculations for standard uncertainty (within-samples, i.e. analysis)
No. of targets, i = 8
No. of samples per target, j = 2
No. of analyses per sample, k = 2
Standard uncertainty of within-samples (analysis)
DF(analysis) = (i * j * k) – (i * j) = (8*2*2) – (8*2) = 16
MS(analysis) = SSE(analysis) / DF = 351320 / 16 = 21957.5
Std Dev (analysis) = SQRT(21957.5) = 148.18
%RSD(analysis) = 148.18*100/Overall Mean = 148.18*100/4345.6 = 3.41
Important Note:
Mean square (MS) is also known as variance (or the square of the standard deviation)

Basic calculations of deviation of sample means
For target A:
Target, i   S1 (Mean Target, i,1)   S2 (Mean Target, i,2)   Mean of 1,2   2*D(Target, i)^2
A           4018.5                  4579.5                  4299          157360.5
Overall mean of S1 and S2 in target A = 4299
Sum of squares of deviation = (4018.5 - 4299)^2 + (4579.5 - 4299)^2 = 157360.5
In Excel, use =DEVSQ(4018.5,4579.5)
For all targets:
Target, i   S1 (Mean Target, i,1)   S2 (Mean Target, i,2)   Mean of 1,2   2*D(Target, i)^2
A           4018.5                  4579.5                  4299.0        157360.5
B           3951.5                  4163.5                  4057.5        22472.0
C           5805.5                  3921.5                  4863.5        1774728.0
D           4891                    5433                    5162          146882.0
E           4520.5                  4219.5                  4370          45300.5
F           5102.5                  4750.5                  4926.5        61952.0
G           3126                    2962                    3044          13448
H           4124.5                  3959.5                  4042          13612.5
Sum of Squares(Target), summed over all targets = 2235755.5
TSS(bet samples) = No. of analyses per sample * Sum of Squares(Target) = 2 * 2235755.5 = 4471511
DF(bet samples) = i * j - i = (8*2) - 8 = 8
MS(bet samples) = 4471511 / 8 = 558939

• MS (between samples) covers two uncertainty components:
• 1. MS (analysis; within) i.e., variance (analysis)
• 2. MS (sampling; between) i.e., variance (sampling)
• Their relationship is:
• MS (between samples) = MS (analysis; within) + k x MS (sampling)
• where k = 2 (duplicate analysis)
• Therefore,
$\mathrm{Var}(\text{sampling}) = \dfrac{\mathrm{MS}(\text{between}) - \mathrm{MS}(\text{within})}{k}$

By calculation, we have:
Variance(sampling) = (558939 − 21957.5) / 2 = 268491
Std Dev (sampling) = SQRT(268491) = 518.161
%RSD(sampling) = 518.161*100/Overall Mean = 518.161*100/4345.6 = 11.92
Therefore, $\%RSD(\text{measurement}) = \sqrt{\%RSD(\text{sampling})^2 + \%RSD(\text{analysis})^2} = \sqrt{11.92^2 + 3.41^2} = 12.40$
Given a test result A, the combined standard uncertainty is u = 12.40% of A (u = 0.1240 x A). The expanded uncertainty is U = 2 x u (coverage factor of 2, approximately 95% confidence).
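The balanced nested calculation for the nitrate data can also be scripted end to end. This is a minimal sketch using classical (non-robust) ANOVA and only the Python standard library; the function name `nested_duplicate_rsd` and the data layout are illustrative, and setting a negative sampling variance to zero is an assumption added here.

```python
from math import sqrt
from statistics import mean

# Nitrate results (mg/kg): target -> (S1A1, S1A2, S2A1, S2A2)
nitrate = {
    "A": (3898, 4139, 4466, 4693),
    "B": (3910, 3993, 4201, 4126),
    "C": (5708, 5903, 4061, 3782),
    "D": (5028, 4754, 5450, 5416),
    "E": (4640, 4401, 4248, 4191),
    "F": (5182, 5023, 4662, 4839),
    "G": (3028, 3224, 3023, 2901),
    "H": (3966, 4283, 4131, 3788),
}


def nested_duplicate_rsd(data):
    """Return (%RSD analysis, %RSD sampling, %RSD measurement).

    Hypothetical helper for a design with 2 samples per target
    and 2 analyses per sample.
    """
    n_targets = len(data)
    ss_analysis = 0.0   # analyses about their sample mean
    ss_samples = 0.0    # sample means about their target mean
    all_values = []
    for s1a1, s1a2, s2a1, s2a2 in data.values():
        all_values += [s1a1, s1a2, s2a1, s2a2]
        m1 = mean((s1a1, s1a2))
        m2 = mean((s2a1, s2a2))
        target_mean = mean((m1, m2))
        ss_analysis += ((s1a1 - m1) ** 2 + (s1a2 - m1) ** 2
                        + (s2a1 - m2) ** 2 + (s2a2 - m2) ** 2)
        # Factor 2 = number of analyses behind each sample mean
        ss_samples += 2 * ((m1 - target_mean) ** 2 + (m2 - target_mean) ** 2)

    ms_analysis = ss_analysis / (n_targets * 2)   # DF = i*j*k - i*j
    ms_samples = ss_samples / n_targets           # DF = i*j - i
    # Assumption: a negative variance estimate is set to zero
    var_sampling = max((ms_samples - ms_analysis) / 2, 0.0)

    overall_mean = mean(all_values)
    rsd_analysis = 100 * sqrt(ms_analysis) / overall_mean
    rsd_sampling = 100 * sqrt(var_sampling) / overall_mean
    rsd_measurement = sqrt(rsd_analysis ** 2 + rsd_sampling ** 2)
    return rsd_analysis, rsd_sampling, rsd_measurement


rsd_a, rsd_s, rsd_m = nested_duplicate_rsd(nitrate)
print(f"%RSD analysis = {rsd_a:.2f}, sampling = {rsd_s:.2f}, measurement = {rsd_m:.2f}")
```

The script follows the spreadsheet steps one by one (DEVSQ per sample, DEVSQ per target, then the mean squares), so each intermediate quantity can be compared with the tables above.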

The Excel two-factor ANOVA gives the same results.

Conclusions
• The importance of taking a representative sample from a population (sampling target) cannot be overemphasized.
• Sampling uncertainty is expected to be very much greater than the analysis uncertainty, particularly when the test parameter(s) are not homogeneously distributed in the population being sampled.
• Without following a proper sampling protocol, it is impossible to estimate the uncertainty arising from sampling; this casts doubt on the final test result produced.
