(2013) A Trade-off Between Number of Impressions and Number of Interaction Attempts

The 8th International Conference on Information Technology and Applications (ICITA 2013)
Abstract--The amount of time taken to enroll or collect
data from a subject in a fingerprint recognition system is of
paramount importance. Time taken directly affects cost. A
trade-off between number of impressions collected and
number of interaction attempts allowed to submit those
impressions must be realized. In this experiment, data were
collected using an optical fingerprint sensor. Each subject
submitted six successful impressions with a maximum of 18
interaction attempts. The resulting images were analyzed
using three methods: the number of interaction attempts
per finger, quality differences from the first three
impressions to the last three impressions, and finally
matching performance from the first three impressions to
the last three impressions. The right middle finger appeared
to be the most difficult to collect, as it required the most
interaction attempts. The analysis showed no
significant differences in image quality or matching
performance. However, after further analysis, a steady
improvement was noticed from Group A (the first three
impressions) to Group B (the last three impressions) in both
image quality and matching performance.
Index Terms-- Biometrics, image quality, impression,
interaction, matching performance
J. Hasselgren is with the Technology, Leadership, and Innovation
Department of Purdue University, West Lafayette IN 47907 USA (telephone:
765-494-2311, e-mail: jahassel@purdue.edu).
S. Elliott is with the Technology, Leadership, and Innovation Department of
Purdue University, West Lafayette IN 47907 USA (telephone: 765-494-2311,
e-mail: elliott@purdue.edu).
J. Gue is a student in the Technology, Leadership, and Innovation
Department of Purdue University, West Lafayette IN 47907 USA (telephone:
765-494-2311, e-mail: gu66@purdue.edu).
ISBN: 978-0-9803267-5-8
I. INTRODUCTION
Many factors affect the performance of a biometric system,
including poor-quality data arising from ridge-valley
structure [1], skin conditions [2], human interaction with the
sensor [3], and the metadata attached to biometric
data [4]. Poor-quality data, in this case fingerprint images,
degrade the performance of a biometric system regardless of
their source [5–8] and can affect the operation
of the system. Test protocol designers face a series of
challenges when collecting data and minimizing error,
regardless of the cause. In [9], the development of the Human
Biometric Sensor Interaction model is discussed. That work
examined four fundamental issues: how users interact with
the biometric device, what errors users make, whether
there are any commonalities among these errors, and what
level of training (if any) a subject should be given
to successfully use a biometric device. Test protocol
designers can reference documents describing best practices
for designing a test protocol (for example, [10]). While
minimizing error is paramount in a test, so too are the
decisions relating to the number of test subjects and the time
they spend in the test center. Determining the number of test
subjects is an important task in developing the test protocol. Mansfield and
Wayman note that the ideal test would be to have as many
volunteers as is practically possible, each making a single
transaction. They provide an example whereby an evaluation
may have 200 subjects each enrolling and making three genuine
transactions, with two further revisits, providing 1200 genuine
attempts [10]. Test crews and the number of attempts vary
depending on the nature of the test as well as the allowable
expense related to test subject recruitment and administration of
the test. In their guidance, Mansfield and Wayman [10] state that
the test population should be “as large as practically possible”.
Test protocols in
the literature vary on the number of samples collected. One
study examined image quality and performance on a single
fingerprint sensor. Fifty subjects participated, providing three
samples of the index, middle, ring, and little fingers on both hands,
resulting in 1200 images [11]. Another study examined the
effects of scanner height on fingerprint capture, and collected
fingerprints from 75 different subjects at four different heights,
with five different attempts [12]. As another example, FVC 2000
collected 880 fingerprints in total, with 8 impressions per
finger [13]. Each of these studies examined very different topics
within fingerprint performance, but each test protocol designer
made the determination of the number of fingerprints to collect,
and the number of attempts that the subject would complete.
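The sample counts quoted above follow from simple multiplication. As a minimal sketch (the function and argument names are illustrative and do not come from the cited protocols), the arithmetic can be checked as follows:

```python
def total_samples(subjects: int, units_per_subject: int, samples_per_unit: int) -> int:
    """Return the number of samples a collection protocol yields.

    'units_per_subject' is a hypothetical generic name: it stands for visits
    in one protocol and for fingers in another.
    """
    return subjects * units_per_subject * samples_per_unit


# Example from [10]: 200 subjects, two revisits, three genuine transactions each.
best_practice_attempts = total_samples(200, 2, 3)

# Study [11]: 50 subjects, eight fingers (four per hand), three samples each.
single_sensor_images = total_samples(50, 8, 3)

print(best_practice_attempts, single_sensor_images)
```

Both calls give 1200, matching the figures reported for [10] and [11].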
II. MOTIVATION
In an operational setting, there is an inherent trade-off
between the number of samples collected, the number of
interaction attempts to collect the samples, and the cost of the
collection. For example, should the test personnel keep trying to
collect from an individual who has poor image quality in the
hope that the individual will provide better image quality as they
become accustomed to the device and improve their
presentation? Or, in this scenario, is it better to stop after the
first three attempts because the time taken to acquire the images
does not provide any additional value? The research questions
are as follows: does the quality improve with experience or
familiarity with the device? Does performance change across
different groups, such as the first three successfully acquired
samples, the last three, the top three image quality samples, and
for reference, the bottom three? All of these questions are
applicable in determining the best enrollment policy and will
impact the time that the subject is at the enrollment station.

Authors: Jacob A. Hasselgren, Stephen J. Elliott, and Jue Gue, Member, IEEE

III. METHODOLOGY
For the purposes of this study and the subsequent analysis, the
following definitions are used. A successfully acquired sample
(SAS) is recorded when the fingerprint sensor acquires a
sample. In these experiments, the fingerprint sensor acquired
a sample only when it met a low preset image quality threshold, which
required a minimum number of minutiae. The following fingers
were collected from each subject: right index, right middle, left
index and left middle. Fig. 1 shows the fingers used
during this collection.
Fig. 1. Representation of fingers used for collection
Six impressions that were determined to be SASs were taken
on each finger. Each SAS was given an impression number,
which in this case would always be a value between one and six.
When a subject attempted to present to the sensor, regardless of
whether a SAS occurred, or whether the presentation was good
or bad, the presentation was counted as an
interaction attempt. The subject was allowed a maximum of 18
interaction attempts. The sensor used was the Digital Persona
U.are.U 4500 sensor, which is commercially available. The data
used in these analyses were taken from an on-going aging study
in the BSPA Labs at Purdue University. Four fingerprint sensors
were used in the overall data collection, along with other
modalities. This particular sensor was the last sensor used in this
fingerprint station.
The test protocol and subsequent definitions are consistent with
the human biometric sensor interaction model as outlined in [3].
The schematic of interaction attempts and impressions is shown
in Fig. 2, which is only an example of the difference
between impression numbers and interaction attempt numbers.
Group A could consist of attempts higher in the order, and Group B
could consist of attempts 7, 8, and 9 or even 7, 11, and 16.
Fig. 2. Schematic of interaction attempts and impressions
Four different groups were established throughout these
analyses. Group A consisted of the first three successfully
acquired samples for a subject for each finger. Group B
consisted of the last three successfully acquired samples for
each subject for each finger. Group C included the images that
have the lowest quality scores while Group D consisted of the
highest image quality scores. Not all groups were used in every
analysis.
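The group definitions above can be sketched in code. This is only an illustration under stated assumptions: the `Sample` record, the `make_groups` helper, the quality values, and the tie-breaking rule (ties keep impression order) are hypothetical and are not described in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    impression: int  # 1..6, order among the successfully acquired samples
    attempt: int     # interaction attempt on which the sample was acquired
    quality: float   # quality score on a 0-100 scale


def make_groups(samples):
    """Split the six SAS of one finger into Groups A, B, C, and D."""
    ordered = sorted(samples, key=lambda s: s.impression)
    if len(ordered) != 6:
        raise ValueError("expected exactly six successfully acquired samples")
    by_quality = sorted(ordered, key=lambda s: s.quality)  # stable sort
    return {
        "A": ordered[:3],     # first three SAS
        "B": ordered[3:],     # last three SAS
        "C": by_quality[:3],  # lowest three quality scores
        "D": by_quality[3:],  # highest three quality scores
    }


# Hypothetical finger: the last three SAS needed attempts 7, 11, and 16.
finger = [
    Sample(1, 1, 70.0), Sample(2, 2, 65.0), Sample(3, 4, 80.0),
    Sample(4, 7, 72.0), Sample(5, 11, 85.0), Sample(6, 16, 60.0),
]
groups = make_groups(finger)
print([s.attempt for s in groups["B"]])
print(sorted(s.impression for s in groups["C"]))
```

Note that Groups A and B depend only on acquisition order, while Groups C and D depend only on quality, so the same image can belong to both A and D, for example.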
Four commercially available software packages were used.
Neurotechnology Megamatcher v4.3 was used for matching
performance while the Aware WSQ1000 quality tool was used
for image quality analysis. Oxford Wave graphing software was
used to plot and calculate the Equal Error Rates, and Minitab 14
was used to determine statistical measures and results.
IV. RESULTS
The results of the experiment are divided into three sections.
Table 1 provides a description of each analysis.
Table 1. Framework
Analysis                        Groupings            Description
Number of interaction attempts  Groups A and B       Differences in the number of interaction attempts based on finger location
Image Quality                   Group A vs. Group B  Differences in image quality from the first three SAS to the last three SAS
Image Quality                   Group C vs. Group D  Differences in image quality from the lowest three quality scoring SAS to the highest three quality scoring SAS
Matching Performance            Group A vs. Group B  Differences in matching performance from the first three SAS to the last three SAS
Matching Performance            Group C vs. Group D  Differences in matching performance from the lowest three quality scoring SAS to the highest three quality scoring SAS
The test subject population consisted of 49 males, 53 females,
and four subjects who did not disclose their demographic
information.

A. Number of interaction attempts
The results consist of those subjects who presented six
successfully acquired samples in 18 or fewer interaction attempts.
The results of the number of attempts are shown below for each
finger collected (right index, left index, right middle, and left
middle). There was no significant difference for interaction
attempts between Group A and Group B for any given finger.
In an ideal data collection scenario, the impression numbers
should match the interaction attempt numbers, as no additional
attempts would have been necessary. Group A’s impression
numbers were always one through three, but some subjects,
particularly in the right middle finger, needed as many as 12
attempts just to submit three SAS.
The majority of individuals achieved their samples in six
interaction attempts across all finger locations. However, there
are some fingers, notably the right middle, where the
distribution is more spread out. This is shown in Table 2.
Table 2. Variance of attempts for group per finger
Finger Location  Group  Variance
LI               A      1.0830
LI               B      2.0962
LM               A      1.0320
LM               B      1.3039
RI               A      1.3047
RI               B      2.2292
RM               A      2.0180
RM               B      2.8247
The right middle (RM) and the right index (RI) have a greater
variance in Groups A and B than the other fingers. This
difference in variance may be explained by the ordering in
which the fingers were collected. For this collection, the fingers
were collected in the following order: right index, right middle,
left index and left middle. The higher variance for
the right index and right middle fingers could be a result of the
subject becoming comfortable with the sensor. Since the right
index and right middle fingers are the first two fingers presented
to the sensor, a habituation factor may be
affecting the number of interaction attempts and their
variance. This could also simply be a case of hand dominance;
however, hand dominance data were not available for this paper.
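A per-finger, per-group variance of this kind can be computed as in the sketch below. The attempt counts are invented for illustration, and the use of the sample variance (denominator n - 1) is an assumption, since the paper does not state which estimator was used.

```python
from statistics import variance

# Hypothetical data: number of interaction attempts each subject needed to
# obtain the three SAS of a group, keyed by (finger, group).
attempts = {
    ("LM", "A"): [3, 3, 3, 4],
    ("RM", "A"): [3, 4, 5, 6],
}


def attempt_variance(records):
    """Return the sample variance of attempts for every (finger, group) key."""
    return {key: variance(values) for key, values in records.items()}


result = attempt_variance(attempts)
for (finger, group), var in sorted(result.items()):
    print(f"{finger} {group} {var:.4f}")
```

A larger value indicates that subjects differed more in how many attempts they needed, which is the pattern Table 2 reports for the right-hand fingers.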
B. Image Quality
It is well understood that image quality impacts performance.
In this section, we evaluate image quality across four groups:
Groups A and B (first three SAS and last three SAS,
respectively) and additionally Groups C (lowest three image
quality scores) and D (highest three image quality scores). The images were
processed using a commercial quality scoring algorithm, Aware
WSQ1000, which provided an aggregate quality score from 0-100.
The breakdown of these quality scores is as follows: good
ranges from 85-100, adequate from 75-84, marginal from
60-74, and poor from 0-59. The distribution of image quality
scores is shown in Fig. 3.
Fig. 3. Distribution of quality across groups A and B and
finger location.
Referring to Fig. 3, each finger’s mean quality is between 70
and 76, which is marginal quality or the low end of adequate.
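The quality bands given above can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and treating each band's lower bound as a threshold for fractional scores (so that 74.9 is still marginal) is an assumption, since the bands are stated for whole numbers.

```python
def quality_band(score: float) -> str:
    """Map a 0-100 aggregate quality score to the band names used in the text."""
    if not 0 <= score <= 100:
        raise ValueError("quality score must lie between 0 and 100")
    if score >= 85:
        return "good"
    if score >= 75:
        return "adequate"
    if score >= 60:
        return "marginal"
    return "poor"


# Group means reported in Table 3 for LI Group A and RM Group B.
print(quality_band(71.604))
print(quality_band(75.237))
```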
[Figure: "Quality in Groups"; y-axis: Quality (30 to 100); x-axis: Group (C, D) within Modality Subtype (LI, LM, RI, RM)]
Fig. 4. Distribution of quality across groups C and D and
finger location.
Fig. 4 shows the quality distribution for Groups C and D, the
lowest three quality scoring SAS and the highest three quality
scoring SAS, respectively.
Table 3. Basic quality statistics for groups per finger
Finger Location  Group  Mean    Std. Dev.  Variance
LI               A      71.604  9.397      88.313
LI               B      72.911  9.545      91.115
LI               C      68.785  9.442      89.149
LI               D      75.729  8.182      66.946
LM               A      73.327  9.992      99.843
LM               B      74.871  9.322      86.907
LM               C      70.459  10.059     101.176
LM               D      77.739  7.758      60.180
RI               A      72.139  10.253     105.131
RI               B      73.289  9.327      86.984
RI               C      69.014  10.104     102.082
RI               D      76.415  7.951      63.213
RM               A      74.683  9.296      86.418
RM               B      75.237  9.042      81.767
RM               C      71.720  9.501      90.276
RM               D      78.200  7.550      56.997
The variances of Group A were larger than Group B in all but
the left index finger. The means of quality for Group A and
Group B of each finger were compared in a one-way ANOVA
statistical test. There was no significant difference between
Group A and Group B for any given finger.
The means of quality for Group C and Group D of each finger
were compared in a one-way ANOVA statistical test. There was
a significant difference for all fingers (p<.001).
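For readers who want to see what the one-way ANOVA computes, the sketch below derives the F statistic from between-group and within-group sums of squares. The study itself used Minitab 14; this function, its name, and the sample scores are illustrative assumptions, and converting F to a p-value additionally requires the F distribution, which is not implemented here.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more scores than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variation is zero; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical quality scores: close means (like A vs. B) and distant means (like C vs. D).
f_close, _, _ = one_way_anova_f([70, 72, 74], [71, 73, 75])
f_far, _, _ = one_way_anova_f([60, 62, 64], [80, 82, 84])
print(f_close, f_far)
```

A small F, as in the first call, corresponds to the non-significant Group A vs. Group B comparison; a large F, as in the second, corresponds to the significant Group C vs. Group D comparison.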
C. Performance
To observe the differences in matching performance, the
SAS, in their respective groups, were enrolled into
minutiae-based matching software, Megamatcher 4.3. The
resulting equal error rates for these matching sequences are
presented in Table 4.
Table 4. Group A (first three) vs. Group B (last three)
Finger  Group A vs Group A  Group B vs Group B  Group A vs Group B
LI      0.0000              0.0000              0.0006
LM      0.3322              0.0000              0.1282
RI      0.0000              0.0000              0.0000
RM      0.0000              0.0000              0.0000
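The equal error rate is the operating point at which the false accept rate and the false reject rate coincide. The sketch below shows one simple way to estimate it from lists of genuine and impostor similarity scores; it is not the procedure used by Megamatcher or the Oxford Wave software, and the function name, the threshold sweep, and the scores are assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER as a fraction by sweeping thresholds over observed scores.

    A comparison is accepted when its similarity score is >= the threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        frr = sum(g < threshold for g in genuine) / len(genuine)     # genuine rejected
        far = sum(i >= threshold for i in impostor) / len(impostor)  # impostor accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Hypothetical scores: one set perfectly separated, one set overlapping.
eer_separated = equal_error_rate([0.8, 0.9, 0.95], [0.1, 0.2, 0.3])
eer_overlap = equal_error_rate([0.4, 0.6, 0.8, 0.9], [0.1, 0.2, 0.3, 0.5])
print(eer_separated, eer_overlap)
```

An EER of zero, as in most cells of Table 4, means some threshold separates every genuine comparison from every impostor comparison.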
No improvements in performance were noticed for any finger
except the left middle finger. When examining the
performance of Group A of the left middle finger, an Equal Error
Rate (EER) of 0.3322 was observed. When Group B of the same finger
was matched to itself, the EER improved to 0.0000.
The third matching procedure was an interoperable match, with
Group A being matched to Group B. At an EER of 0.1282, this also
improved on Group A matched to itself.
To observe the effect quality has on performance,
Groups C and D were also matched to themselves and to each other.
The equal error rates for Groups C and D (the lowest three image
quality scores and the highest three image quality scores,
respectively) are shown below.
Table 5. Group C (lowest three) vs. Group D (highest three)
Finger  Group C vs Group C  Group D vs Group D  Group C vs Group D
LI      0.0000              0.0000              0.0000
LM      0.2816              0.0506              0.1766
RI      0.0000              0.0000              0.0000
RM      0.0000              0.0000              0.0000
The left middle finger was the only finger that produced an
EER greater than 0.0000. When examining the performance of
Group C of the left middle finger, an EER of 0.2816 was
observed. In the second matching run, Group D was matched to
itself and the EER improved to 0.0506. The highest three scoring
SAS therefore improved the EER by 0.2310. The third matching
run was an interoperable match in which Group C was matched to
Group D. At an EER of 0.1766 (Table 5), this also improved on
Group C matched to itself. These results support the idea that
quality affects performance.
V. CONCLUSIONS AND RECOMMENDATIONS
It should be noted that the distribution of SAS does differ
from finger to finger. Subsequent work would be to examine
other sensors and draw conclusions from this. Furthermore,
there is additional work being conducted by O’Connor on the
development of a metric to determine whether the subject is
stable in their presentation; that is, it addresses the problem of
whether to take additional metrics given the prior knowledge of
the individual’s performance within a given dataset [14].
Further work can be leveraged which would also identify test
administrator error and provide an error-checking methodology
for test administrators in the number of interaction attempts and
impressions that are conducted.
While controlled laboratory style testing may not be impacted
by this preliminary work, these results will provide guidance to
operational data collections by answering the initial motivation
of the study. In this study, we can conclude that test personnel
would not benefit from collecting the additional fingerprints (4,
5 and 6) from LI, RI and RM, but would benefit marginally from
collecting all six images from LM. Furthermore, the quality metric may
provide an additional tool in answering this question. Recall that
the LM had the lowest group of quality images. Upon further
analysis, these impressions came from subjects 60, 77 and 88.
Perhaps these poor image quality metrics were caused by poor
placement or age. The subjects’ ages were 60, 66 and 23,
respectively.
It should also be noted that, overall, the right index required
more attempts to submit all six SASs. This is interesting, as it is
assumed that the right index would be the more controllable
finger for those with right-hand dominance; this needs
additional research.
Additionally, this study will be furthered by observing these
metrics over multiple visits to attempt to measure habituation.
Recall that both quality and performance improved from the
first three impressions collected to the last three. This
improvement could be an effect of using the device multiple
times and becoming comfortable with it. The study from which
these data were drawn is a multiple-visit study. Data will be
available to observe this effect over multiple visits as well as
multiple uses per visit.
REFERENCES
[1] T. P. Pang, J. Xirdong, and W. Y. Yao, “Fingerprint image quality
analysis,” in 2004 International Conference on Image Processing
(ICIP ’04), 2004, pp. 1253–1256.
[2] K. Ito, A. Morita, T. Aoki, T. Higuchi, H. Nakajima, and K. Kobayashi,
“A fingerprint recognition algorithm using phase-based image matching
for low-quality fingerprints,” in IEEE International Conference on
Image Processing 2005, 2005, pp. 33–36.
[3] E. Kukula, S. Elliott, and V. Duffy, “The effects of human interaction on
biometric system performance,” in First International Conference on
Digital Human Modeling (ICDHM 2007), Held as Part of HCI
International, 2007, pp. 904–914.

[4] A. Hicklin and R. Khanna, “The role of data quality in biometric
systems,” White Paper, Mitretek Systems, Feb. 2006.
[5] J. Fierrez-Aguilar, L. Munoz-Serrano, F. Alonso-Fernandez, and J.
Ortega-Garcia, “On the effects of image quality degradation on minutiae-
and ridge-based automatic fingerprint recognition,” in Proceedings 39th
Annual 2005 International Carnahan Conference on Security
Technology, 2005, pp. 79–82.
[6] S. K. Modi, S. J. Elliott, and H. Kim, “Statistical analysis of fingerprint
sensor interoperability performance,” in 2009 IEEE 3rd International
Conference on Biometrics: Theory, Applications, and Systems, 2009, pp.
1–6.
[7] C. Jin, H. Kim, X. Cui, E. Park, J. Kim, J. Hwang, and S. Elliott,
“Comparative Assessment of Fingerprint Sample Quality Measures
Based on Minutiae-Based Matching Performance,” in 2009 Second
International Symposium on Electronic Commerce and Security, 2009,
vol. 2, pp. 309–313.
[8] P. Grother and E. Tabassi, “Performance of biometric quality measures,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29,
no. 4, pp. 531–543, Apr. 2007.
[9] S. J. Elliott and E. P. Kukula, “A definitional framework for the
human/biometric sensor interaction model,” in Biometric Technology for
Human Identification VII, 2010, vol. 7667, no. 1, p. 76670H–8.
[10] A. J. Mansfield and J. L. Wayman, “Best Practices in Testing and
Reporting Performance of Biometric Devices ver 2.01,” Teddington,
2002.
[11] M. R. Young and S. J. Elliott, “Image Quality and Performance Based on
Henry Classification and Finger Location,” in 2007 IEEE Workshop on
Automatic Identification Advanced Technologies, 2007, pp. 51–56.
[12] M. Theofanos, S. Orandi, R. Micheals, B. Stanton, and N. Zhang,
“Effects of Scanner Height on Fingerprint Capture.” National Institute of
Standards and Technology, Gaithersburg, p. 58, 2006.
[13] R. Cappelli, D. Maio, D. Maltoni, J. L. Wayman, and A. K. Jain,
“Performance evaluation of fingerprint verification systems,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 1,
pp. 3–18, Jan. 2006.
[14] K.J. O'Connor, “Examination of stability in fingerprint recognition across
force levels,” M.S. thesis, Dept. Tech., Lead., and Innov., Purdue Univ.,
West Lafayette, IN, 2013.
