Adding human expertise to the quantitative analysis of fingerprints Busey and Chen
PROGRAM NARRATIVE
A. Research Question
Machine learning algorithms take a number of approaches to the quantitative analysis of
fingerprints. These include identifying and matching minutiae (refs), matching patterns of local
orientation based on dynamic masks (refs), and neural network approaches that attempt to learn
the structure of fingerprints (refs). While these techniques provide good results in biometric
applications and serve a screening role in forensic cases, they are less useful when applied to
severely degraded fingerprints, which must be matched by human experts. Indeed, statistical
approaches and human experts have different strengths. Despite the enormous computational
power available today for use by computer analysis systems, the human visual system remains
unequaled in its flexibility and pattern recognition abilities. Three possible reasons for this
success come from the expert's knowledge of where the most important regions are located on a
particular set of prints, the ability to tune their visual systems to specific features, and the
integration of information across different features. In the present project, we propose to
integrate the knowledge of experts into the quantitative analysis of fingerprints to a degree not
achieved by other approaches. There is much that fingerprint examiners can add to machine
learning algorithms and, as we describe below, many ways in which statistical learning
algorithms can assist human experts. Thus the central research question of this proposal is: How
can the integration of information derived from experts improve the quantitative analysis of
fingerprints?
B. Research goals and objectives
The goal of the present proposal is to integrate data from human experts with statistical
learning algorithms to improve the quantitative analysis of inked and latent prints. We introduce
a novel procedure developed by one investigator (Tom Busey) and use it to guide the input to
statistical learning algorithms developed and extended by our other investigator (Chen Yu). The
fundamental idea behind our approach is that the quantitative evaluation of the information
contained in latent and inked prints can be vastly improved by using elements of human
expertise to assist the statistical modeling, as well as to introduce a new dimension of time that is
not contained in the static latent print analysis. The main benefit, as we discuss in sections C.x.x,
is that the format of the data extracted from experts allows the application of novel quantitative
models that are adapted from related areas. To apply this knowledge derived from experts, we
will use our backgrounds in vision, perception, machine learning and behavioral testing to design
experiments that extract relevant information from experts and use this to improve the
quantitative analysis techniques applied to fingerprints by integrating the two sources of
information.
Our research interests differ somewhat from existing approaches and reflect the
adaptations that are necessary to incorporate human expert knowledge. Existing statistical
algorithms developed to match fingerprints fall into several different classes. Some
extract minutiae and other robust sources of information such as the number of ridges between
minutiae (refs). Others rely on the computation of local curvature of the ridges, and then partition
these into different classes (MASK refs). Virtually all approaches make reasoned and reasonable
guesses as to what the important sources of information might be, such as minutiae, local ridge
orientation or local ridge width (dgs paper). The present work takes a more agnostic
approach to what might be the important sources of information in fingerprints, and we will
develop statistical models that take advantage of the data derived from experts. However, a
major goal of the grant is to demonstrate how expert knowledge can be applied to any extant
model, and to suggest how this might be accomplished. Thus we will spend substantial time
documenting our application of expert knowledge for our statistical models. In addition, we will
make all of our expert data available for other researchers and practitioners. It is likely that the
data will have implications for training, although this is not the focus of the present proposal.
C. Research design and methods
At the heart of our approach is the idea that human expertise, properly represented, can improve
the quantitative analyses of fingerprints. In a later section we describe how we apply human
expert knowledge to various statistical analyses, but first we need to answer the question of
whether human experts can add something to the quantitative analyses of prints.
The answer to this question can be broken down into two parts. First, do human visual
systems in general possess attributes not captured by current statistical approaches, and second,
do human experts have additional capacities not shared by novices, capacities that could further
inform statistical approaches. Below we briefly summarize what the visual science literature tells
us about how humans recognize patterns, and then describe our own work that has addressed the
differences between experts and novices. As we will show, human experts have much to add to
quantitative approaches.
We should stress that while we will gather data from human experts to improve our
quantitative analyses of fingerprints, the goal of this grant is not to study human experts in order
to determine whether or how they differ from novices, nor are we interested in questions about
the reliability or accuracy of human experts. Instead, we will generalize our previous results that
demonstrate strong differences in the visual processing of fingerprints in experts, and apply this
expertise to our own statistical analyses. As a result, we will only gather data from human
experts (latent print examiners with at least 5 years of post-apprentice work in the field) under
the assumption that this will provide maximum improvement to our statistical methods. We can
demonstrate the effectiveness of this knowledge by simply re-running the statistical analyses
without the benefit of knowledge from experts. There are various metrics attached to each
analysis technique that demonstrate the superiority of expert-enhanced analyses, such as the
correct recognition/false recognition tradeoff graphs, or the dimensionality
reduction/reconstruction successes of data reduction techniques.
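Such a tradeoff graph can be computed directly from match scores. The sketch below is a minimal illustration, not the output of any particular AFIS system; the function name and score inputs are our own placeholders. It sweeps a decision threshold over genuine (mated-pair) and impostor (non-mated-pair) scores:

```python
import numpy as np

def tradeoff_curve(genuine_scores, impostor_scores):
    """Correct-recognition vs. false-recognition rates across thresholds.

    genuine_scores: match scores for truly mated print pairs
    impostor_scores: match scores for non-mated pairs
    Returns (false_rates, hit_rates), one point per candidate threshold.
    """
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    hit_rates = np.array([(genuine_scores >= t).mean() for t in thresholds])
    false_rates = np.array([(impostor_scores >= t).mean() for t in thresholds])
    return false_rates, hit_rates
```

Re-running an analysis with and without the expert-derived information produces two such curves; the expert-enhanced analysis should lie closer to the upper-left corner (high hits, few false recognitions) at every threshold.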
We will also apply novel approaches adapted from the related domain of language analyses.
It might seem odd to apply techniques developed for linguistic analyses to a visual domain such
as pattern recognition, but the principles that underlie both domains are very similar. Both
involve large numbers of features that have complex statistical relations. In the case of language,
the features are often words, phonemes or other acoustical signals. Fingerprints are defined by a
complex but very regular dictionary of features that also share a complex and meaningful
correlational structure. One of us (Chen) is a highly-published expert in the field of machine
learning algorithms as applied to multimodal data, and several papers included as appendices
detail this expertise. His work on multimodal applications between visual and auditory domains
makes him well-suited to address the relation between human data and machine learning
algorithms. Both linguistic and visual information contain highly-structured data that consist of
regularities that are extracted by perceivers, and this is not unlike the temporal sequence that
experts go through when they perform a latent print examination, as we describe in a later
section. First, however, we address how we might document the principles of human expertise.
Can we use elements of the human visual system to improve our statistical analyses?
The answer to this question is straightforward, in part because of the overwhelming evidence
that human-based recognition systems contain processes that are not captured by current
statistical approaches. One of us (Busey) has published many articles addressing different
aspects of human sensation, perception and cognition, and thus is well-suited to manage the
acquisition and application of human expertise to statistical approaches. Below we briefly
summarize the properties of the human visual system and in a later section we describe how we
plan to extract fundamental principles from this design in order to improve our statistical
analyses of fingerprints.
An analysis of the human visual system by vision scientists demonstrates that the recognition
process proceeds via a hierarchical series of stages, each with important non-linearities (nature
ref), that produce areas that respond to objects of greater and greater complexity. This process
also provides increasing spatial independence, allowing brain areas to integrate over larger and
larger regions. This will become important for holistic or configural processing, as discussed in a
later section.
A second benefit of this hierarchical approach is that objects achieve limited scale and
contrast invariance. Statistical approaches often deal with this through local contrast or
brightness normalization, but this is a separate process. Scale invariance is often achieved by
explicitly measuring the width of ridges (grayscale ref), again a separate process.
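The local contrast and brightness normalization mentioned above can be sketched in a few lines (an illustrative implementation; the neighborhood size is a placeholder, not a value from any cited system):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast_normalize(image, size=15, eps=1e-6):
    """Subtract the local mean and divide by the local standard deviation,
    so every neighborhood of the print has roughly zero mean brightness
    and unit contrast, regardless of the original ink density."""
    image = image.astype(float)
    local_mean = uniform_filter(image, size)
    local_sq = uniform_filter(image ** 2, size)
    local_std = np.sqrt(np.maximum(local_sq - local_mean ** 2, 0.0))
    return (image - local_mean) / (local_std + eps)
```

The point of the sketch is that this is an explicit, separate preprocessing stage, whereas in the human visual system limited contrast invariance falls out of the hierarchical architecture itself.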
A third strength of the human visual system is that it appears to have the ability to form new
feature templates through an analysis of the statistical information contained in the fingerprints.
This process, called unitization, will tend to improve feature detection in noisy environments as
is often found with latent prints.
Do forensic scientists have visual capabilities not shared by novices?
The prior summary of the elements of the human visual system suggests that current
statistical approaches can be improved by adapting some of the principles underlying the human
visual system. There are, however, other processes that are specifically developed by latent print
examiners that may also be profitably applied to statistical models. Below we summarize the
results of two empirical studies that have recently been published in the highly respected journal
Vision Research (Busey & Vanderkolk, 2005). The results demonstrate not only that experts are
better than novices, but suggest the nature of the processes that produce this superior
performance.
Visual expertise takes many forms. It could be different for different parts of the
identification process, and may not even be verbalizable by the expert since many elements of
perceptual expertise remain cognitively impenetrable (refs). A major focus of our research is to
capture elements of this expertise and use this as a training signal for our statistical learning
algorithms. What is novel to our approach is our ability to capture the expertise at a very deep
and rich level. In the next section we describe our prior work documenting the nature of the
processes that enable experts to perform at levels much superior to novices, and then in Section
C.2 we describe how we capture this expertise in a way that we can use it to improve our
statistical learning algorithms.
C.1. Documenting expertise in human latent print examiners
Initially, experts tend to focus on the entire print, which leads to benefits that we have
previously identified as configural processing (Busey & Vanderkolk, 2005). Configural
processing takes several forms, but the basic idea behind this process is that instead of focusing
on individual features or minutiae, the observer instead integrates information over a large
region, to identify important relations such as relative locations of features or curvature of ridge
flow. Fingerprint examiners often talk about 'viewing the image in its totality', which is different
language for the same process.
While configural processing reveals the overall structure of an image and selects important
regions for further inspection, the real work comes in comparing small regions in one print to
regions in the other. These regions may be selected on the basis of minutiae identified in the
print, or high-quality Level 3 detail. We know from related work on perceptual learning in the
visual system that one of the processes by which expertise develops is the development
of new feature detectors. Experts spend a great deal of time viewing prints, and this has the
potential to result in profound changes in how their visual systems process fingerprints. (config
processing refs)
One process by which experts could improve how they extract latent print information from
noisy prints is termed unitization, in which novel feature detectors are created through experience
(unitization refs). Fingerprints contain remarkable regularities and the human visual system
C.1.a. Do experts have information valuable to training networks or documenting the
quantitative nature of fingerprints?
Fingerprint examiners have received almost no attention in the perceptual learning or
expertise literatures, and thus the PI began a series of studies in consultation with John
Vanderkolk, of the Indiana State Police Forensic Sciences Laboratory in Fort Wayne, Indiana.
Our first study addressed the nature of the expertise effects in a behavioral experiment, and then
we followed up evidence for configural processing with an electrophysiological study. The
discussion below describes the experiments in some detail, in part because extensions of this
work are proposed in Section D, and a complete description here illustrates the technical rigor
and converging methods of our approach.
C.1.b. Behavioral evidence for
configural processing
In our first experiment, we abstracted
what we felt were the essential elements
of the fingerprint examination process
into an X-AB task that could be
accomplished in relatively short order.
This work is described in Busey and
Vanderkolk (2005), but we briefly
describe the methods here since they
illustrate how our approach seeks to find
a paradigm that is less time-consuming
than fully realistic forensic examinations
(which can take hours to days to
complete) yet still maintains enough
ecological validity to tap the expertise of the examiners. Figure 1 shows the stimuli used in the
experiment as well as a timeline of one trial. We cropped out fingerprint fragments from inked
prints, grouped them into pairs, and briefly presented one of the two for 1 second. This was
followed by a mask for either 200 or 5200 ms, and then the expert or novice subject made a
forced-choice response indicating which of the two test prints they believed was shown at study.
We introduced orientation and brightness jitter at study, and the construction of the pairs was
done to reduce the reliance on idiosyncratic features such as lint or blotches.
At test, we introduced two manipulations that we thought captured aspects of latent prints, as
shown in Figure 2. First, latent prints are often embedded in visual noise from the texture of the
surface, dust, and other sources. One expert, in describing how he approached latent prints,
stated that his job was to 'see through the noise.' To simulate at least elements of this noise, we
embedded half of our test prints in white visual noise. While this may have a spatial distribution
[Figure 1 timeline: Study 1 sec; Mask 200 ms; Test until response.]
Figure 1. Sequence of events in a behavioral experiment with fingerprint experts and novices. Note that the study image has a different orientation and is slightly brighter to reduce reliance on low-level cues.
that differs from the noise typically
encountered by experts, we hoped that it
would tap whatever facilities experts
may have developed to deal with noise.
The second manipulation was
motivated by the observation that latent
prints are rarely complete copies of their
inked counterparts. They often appear
patchy if made on an irregular surface,
and sections may be partially masked out. To simulate this, we created partially-masked
fingerprint fragments as shown in the upper-right panel of Figure 2. Note that the partially-
masked print and its complement each contain exactly half of the information of the full print
and the full print can be recovered by summing the two partial prints pixel-by-pixel. We use this
property to test for configural effects as described in a later section.
All three manipulations (delay between study and test, added noise and partial masking) were
fully crossed to create 8 conditions. The
data are shown in Figure 3, which shows
main effects for all three factors for
novices. Somewhat surprising is the
finding that while experts show effects
of added noise and partial masking, they
show no effect of delay, which suggests
that they are able to re-code their visual
information into a more durable store
resistant to decay, or have better visual
memories. Experts also show an
interaction between added noise and
Figure 2. Four types of test trials (clear vs. noise-embedded fragments, presented in full or partially-masked form).
[Figure 3: four panels (Experts and Novices crossed with Short and Long Delay) plotting Percent Correct (0.5 to 1.0) against Image Type (Full vs. Partial Image), each with No Noise and Noise Added conditions.]
Figure 3. Behavioral Experiment Data. Error bars represent one standard error of the mean (SEM).
partial masking, but novices do not. This interaction in the experts may reflect very
strong performance for full images embedded in noise, possibly arising from configural processes.
To test this in a scale-invariant manner, we developed a multinomial model which makes a
prediction for full-image performance given partial-image performance using principles similar
to probability summation. The complete results are found in Busey & Vanderkolk (2005), but to
summarize, when partial image performance is around 65%, the model predicts full image
performance to be about 75%, whereas observed performance is almost 90%, significantly above the probability
summation prediction. Thus it appears that when both halves of an image are present (as in the
full image) experts are much more efficient at extracting information from each half.
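The full analysis is the multinomial model of Busey & Vanderkolk (2005), but the quoted prediction can be reproduced with a simple high-threshold probability-summation calculation (a hedged sketch; `prob_summation_prediction` is our own illustrative name, not the published model):

```python
def prob_summation_prediction(p_partial, chance=0.5):
    """Predicted full-image 2AFC accuracy if the two image halves are
    used independently (probability summation with guessing correction).

    Accuracy relates to 'true detection' probability d via
    p = chance + (1 - chance) * d.
    """
    d = (p_partial - chance) / (1 - chance)   # detection prob. for one half
    d_both = 1 - (1 - d) ** 2                 # at least one half detected
    return chance + (1 - chance) * d_both

# prob_summation_prediction(0.65) -> 0.755, i.e. about 75%
```

Observed expert accuracy near 90% on full images therefore exceeds what independent use of the two halves predicts, which is the signature of configural integration.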
The results of this experiment lay the groundwork for a more complete investigation of
perceptual expertise in fingerprint examiners. From this work we have evidence that:
1) Experts perform much better than novices overall, despite the fact that the testing
conditions were time-limited and somewhat different than those found in a traditional latent print
examination.
2) Experts appear immune to longer delays between study and test images, suggesting better
information re-coding strategies and/or better visual memories
3) Experts may have adopted configural processing abilities over the course of their training
and practice. All observers have similar facilities for faces as a consequence of the ecological
importance of faces and our everyday exposure through social interactions. Experts may
have extended this ability to the domain of fingerprints, since configural processing is seen as
one mechanism underlying expertise (e.g. Gauthier & Tarr, 1997).
C.1.c. Electrophysiological evidence for configural processing
To provide converging evidence that fingerprint experts process full fingerprints
configurally, we turned to an electrophysiological paradigm based on work from the face
recognition literature. This experiment is described more fully in Busey and Vanderkolk (2005),
which is included as an appendix. However, these results support the prior conclusions described
above, and demonstrate that the configural processing observed with fingerprint examiners is a
result of profound and qualitative changes that occur in the very earliest stages of their
perceptual processing of fingerprints.
C.2. Elements of human expertise that could improve quantitative analyses
The two studies described above are important because they illustrate that configural
information is one process that could be adapted for use in the quantitative analyses of
fingerprints. Existing quantitative models of fingerprints incorporate some elements of the
expertise seen above, but many elements could be added that would improve the recognition
accuracy of existing programs. The two major approaches to fingerprint matching rely on local
features such as minutiae detection (refs), and more global approaches such as dynamic masks
applied to orientation computed at many locations on a grid overlaying the print (refs). Of these
two approaches, the dynamic mask approach comes closer to the idea of configural processing,
although it does not compute minutiae directly.
Neither approach takes advantage of the temporal information that expresses elements of
expertise in the human matching process. Quantitative information such as fingerprint data, when
represented in pixel form, has a high-dimensional structure. The two techniques described
above reduce this dimensionality by either extracting salient points such as minutiae, or
computing orientation only at discrete locations. Both of these approaches throw out a great deal
of information that could otherwise be used to train a statistical model on the elemental features
that allow for matches. Part of the reason this is necessary is that the high-dimensional space is
difficult to work in: all prints are more or less equally similar without this dimensionality
reduction, and by reducing the dimensionality computations such as similarity become tractable.
The key, then, is to reduce the dimensionality while preserving the essential features that allow
for discrimination among prints. One technique that has been explored in language acquisition is
the concept of "starting small" (Elman ref). In this procedure, machine learning approaches such
as neural network analyses are given very coarse information at first, which helps the network
find an appropriate starting point. Gradually, more and more detailed information is added, which
allows the network to make finer and finer discriminations.
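As an illustration of "starting small" applied to image data (our own sketch, not Elman's original simulations), a coarse-to-fine curriculum can be generated by training first on heavily smoothed prints and gradually restoring detail; the blur levels here are arbitrary placeholders:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def coarse_to_fine_curriculum(image, sigmas=(8.0, 4.0, 2.0, 0.0)):
    """Yield progressively finer versions of a print image, coarsest first.

    Training a learner on the heavily smoothed stages before the detailed
    ones mirrors 'starting small': coarse ridge-flow structure constrains
    the solution before fine detail (minutiae-level) is introduced.
    """
    for sigma in sigmas:
        img = image.astype(float)
        yield gaussian_filter(img, sigma) if sigma > 0 else img
```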
We discuss these ideas more fully in section X.Xx, but we mention them here to motivate the
empirical methods described next. Experts likely select which information they choose to
initially examine based on the need to organize their search processes. Thus they likely acquire
information that may not immediately lead to a definitive conclusion of confirmation or
rejection, but guides the later acquisition process. In the scene perception literature, this process
is known as 'gist acquisition' (refs), and suggests that the order in which a system (machine or
human) learns information matters. In the section below we describe how we acquire both spatial
and temporal information from experts, and then describe how this knowledge can be
incorporated into quantitative models.
C.3. Capturing the information acquisition process: The moving window paradigm
To identify the nature of the information used by experts, and the order in which it is
gathered, we have begun to use a technique called a moving window procedure. In the sections
below we describe this procedure and how it can be extended to address the role of configural or
gist information in human experts.
C.3.a. The moving window paradigm
The moving window paradigm is a software tool that simulates the relative acuity of the foveal
and peripheral visual systems. As we look around the world, there is a region of high acuity at
the location our eyes are currently pointing. Regions outside the foveal viewing cone are
represented less well. In the moving window paradigm we represent this state by slightly
blurring the image and reducing the contrast.
http://cognitrn.psych.indiana.edu/busey/FingerprintExample/
Figure 4 shows several frames of the moving window program, captured at different points in
time. The two images have been degraded by a blurring operation that somewhat mimics the
reduced representation of peripheral vision. The exception is a clear circle that responds in real
time to the movement of the mouse. This dynamic display forces the user to move the clear
window to regions of the display that warrant special interest. The blurred portions provide some
context for where to move the window. By recording the position of the mouse each time it is
Figure 4. The moving window paradigm allows the user to move the circle of interest around to different
locations on the two prints. This circle provides high-quality information, and allows the expert the
opportunity to demonstrate, in a procedure that is very similar to an actual latent print examination, which
sections of the prints they believe are most informative. This procedure also records the order in which
different sites are visited.
moved, we can reconstruct a complete record of the manner in which the user examined the
prints. This method has some drawbacks in that the eyes move faster than the mouse. However,
we find that with practice the experts report very few limitations with this procedure and it has
the benefit of precise spatial localization. A major benefit of this procedure is that it can be done
over the web, reaching dozens of experts and producing a massive dataset. Many related
information theoretic approaches such as latent semantic analysis find that a large corpus of data
is necessary in order to reveal the underlying structure of the representation of information, and a
web-based approach provides sufficient data.
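A single frame of such a display could be rendered as follows (an illustrative sketch only; the window radius, blur level, and contrast reduction are placeholder values, not the parameters of our actual software):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def moving_window_frame(image, cx, cy, radius=40, blur_sigma=3.0, contrast=0.6):
    """Render one frame of the moving-window display: the whole print is
    blurred and reduced in contrast (mimicking peripheral vision), except
    for a clear circle at the current mouse position (cx, cy)."""
    img = image.astype(float)
    # Degrade the periphery: blur, then pull contrast toward the mean.
    degraded = gaussian_filter(img, blur_sigma) * contrast \
        + img.mean() * (1 - contrast)
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    frame = degraded.copy()
    frame[inside] = img[inside]   # clear foveal window
    return frame
```

Logging (cx, cy) each millisecond as the frame is redrawn yields exactly the trace analyzed in the next sections.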
The data produced by this paradigm are vast: x/y coordinates for the clear window at each
millisecond. We have begun to analyze this data using several different techniques. The first
analysis we designed creates a mask that is black for regions the observer never visited and clear
for areas visited most often. Figure 5 shows an example of this kind of analysis. Areas visited
less often are somewhat darkened. The left panels of Figure 5 show two masked images, which
reveal not only where the experts visited, but also how long they spent inspecting each location. Thus
they represent a window into the regions the experts believed to be informative.
The right panels give a slightly different view, where unvisited areas are represented in red.
This illustrates that experts actually spend most of their time in relatively small regions of the
prints.
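The dwell-time mask itself is straightforward to compute from the recorded trace (again a sketch with illustrative names and a placeholder window radius):

```python
import numpy as np

def dwell_mask(trace, shape, radius=40):
    """Accumulate per-pixel dwell time from a mouse trace (one (x, y)
    sample per millisecond) into a normalized mask: 0 means the region
    was never inspected, 1 marks the most-inspected region."""
    mask = np.zeros(shape)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    for x, y in trace:
        mask[(xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2] += 1
    return mask / mask.max() if mask.max() > 0 else mask
```

Multiplying the print by this mask produces the darkened images of Figure 5; thresholding it at zero produces the red "never visited" overlay.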
As a first pass, the images in Figure 5 reveal where the experts believe the task-relevant
information resides. However, lost in such a representation is the order in which these sites were
visited. In addition, this information is very specific to a particular set of prints. Ultimately we
will produce a more general representation that characterizes both the fundamental set of features
(often described as the basis set) that experts rely on, as well as how they process these features.
We have begun to explore an information-theoretic approach to this problem that seeks to find a
set of visual features that is common to a number of experts and fingerprint pairs. This approach
is related to many of the dimensionality reduction techniques that have been applied to natural
images (e.g. Olshausen & Field, 1996). Later projects will extend this approach to incorporate
elements of configural processing or context-specific models. In the present proposal we discuss
several different ways we plan to analyze what is a very rich dataset.
Our experts report relatively little hindrance when using the mouse to move the window. The
latent and inked prints have their own window (only one is visible at any one time) and users
press a key to flip back and forth between the two prints. This flip is actually faster than an
eye movement and automatically serves as a landmark pointer for each print, making this
procedure almost as easy to use as free viewing of the two prints (which are often done under a
loupe with its own movement complexities). In addition, we also give users brief views of the
entire image to allow configural processes to work to establish the basic layout.
C.3.b. Measuring the role of configural processing in latent print examinations
behavioral experiment- blurred vs. very low contrast- qualitative changes across experts?
complete this section
Figure 5. Examples of masked images revealing where experts choose to acquire information in order to make
an identification. The black versions show only regions where the expert spent any time, and the mask is
clearer for regions in which the expert spent more time. The right-hand images show the same information, but
allow some of the uninspected information to show through. These images reveal that experts pay relatively
little attention to much of the image and only focus on regions they deem relevant for the identification. We
suggest that this element of expertise, learning to attend to relevant locations, is something that could benefit
quantitative analyses of fingerprints.
C.3.c. Verification with eye movement recording
complete this section
C.4. Extracting the fundamental features used when matching prints
Because latent and inked prints are rarely direct copies of each other, an expert must extract
invariants from each image that survive the degradations due to noise, smearing, and other
transformations. Once these invariants are extracted, the possibility of a match can be assessed.
This is similar in principle to the type of categorical perception observed in speech recognition,
in which the invariants of parts of speech are extracted from the voices of different talkers. This
suggests that there exists a set of fundamental building blocks, or basis functions, that experts
use to represent and even clean up degraded prints. The nature and existence of these features are
quite relevant for visual expertise, since in some sense these are the direct outcomes of any
perceptual system that tunes itself to the visual diet it experiences.
We propose to apply data reduction techniques to the output of the moving window
paradigm. These techniques have been applied successfully to derive the statistics of natural
images (Hyvarinen & Hoyer, 2000). They yield individual features that are localized in
space and resemble the response profiles of simple cells in primary visual cortex. Most of these
studies are performed on random samples of images and visual sequences, but the moving
window application provides an opportunity to use these techniques to recover the dimensions of
only the inspected regions, and to compare the dimensions recovered from experts with
representations based on random window locations.
The specifics of this technique are straightforward. For each position of the moving window,
we extract (say) a 12 x 12 patch of pixels. This is repeated at each location the subject
inspected, with each patch weighted by the amount of time spent at that location. The moving
window experiment yields tens of thousands of patches of pixels, which are submitted to a data
reduction technique (independent component analysis, or ICA). ICA is similar to principal
components analysis, with the exception that the components are independent, not just
uncorrelated. The linear decomposition generated by ICA has the property of sparseness, which
has been shown to be important for representational systems (Field, 1994; Olshausen & Field,
1996) and implies that a random variable (the basis function) is active only very rarely. In
practice, this sparse representation creates basis functions that are more localized in space than
those captured by PCA and are more representative of the receptive fields found in the early
areas of the visual system.
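As a concrete illustration, the patch decomposition described above can be sketched with an off-the-shelf ICA implementation. This is a minimal sketch, not our analysis pipeline: the random patches below stand in for the dwell-time-weighted windows collected from experts, and the patch and component counts are placeholders.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Stand-in for the moving-window data: in the real analysis the patches
# come from window positions an expert inspected, each weighted (e.g.
# replicated) in proportion to dwell time.
n_patches, patch_size = 5000, 12
patches = rng.standard_normal((n_patches, patch_size * patch_size))
patches -= patches.mean(axis=0)  # ICA assumes zero-mean data

# Estimate 64 independent components; each row of components_ can be
# reshaped into a 12 x 12 candidate basis function.
ica = FastICA(n_components=64, max_iter=500, random_state=0)
sources = ica.fit_transform(patches)
basis_functions = ica.components_.reshape(64, patch_size, patch_size)
print(basis_functions.shape)
```

On expert data, each reshaped component would be visualized as a candidate basis image, as in Figure 6.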
Large corpora of samples are required to extract invariants from noisy images, and at present
we have only pilot data from several experts. However, the results of this preliminary analysis
can be found in Figure 6. This figure shows features discovered using the ICA algorithm (Hurri
& Hyvarinen, 2003; Hyvarinen, Hoyer & Hurri, 2003). Each image represents a basis function
that, when linearly combined with the others, will reproduce the windows examined by experts.
Inspection of Figure 6 reveals that features such as ridge endings, y-branchings, and islands are
beginning to be represented. This analysis takes on greater value when applied to the entire
database we will gather, since it will combine across individual features to derive the invariant
stimulus features that provide the basis for fingerprint examinations by human experts.
The ICA analysis is very sensitive to spatial location, and while cells in V1 are likely also
highly position sensitive, the measured basis functions are properties of the entire visual stream,
not just the early stages. More recent advances in ICA techniques have addressed this issue in a
way similar to how the visual system solves the problem. In addition to performing data
Figure 6. ICA components from expert data.
reduction techniques to extract the fundamental basis sets, these extended ICA algorithms group
the recovered components based on their energy (squared outputs). This grouping has been shown to
produce classes of basis functions that are position invariant by virtue of the fact that they
include many different positions for each fundamental feature type. The examples shown in
Figure 7 were generated by this technique, which reduces the reliance on spatial location. It
groups the recovered features by class and accounts for the fact that nearby patches have similar
properties. Note that the features in Figure 7 are less localized than those
typically found with ICA decompositions, which may be due to the large correlational structure
inherent in fingerprints, although this remains an open question addressed by this proposal.
The development of ICA approaches is an active area of research, and we anticipate that the
results of the proposed research will help extend these models as we develop our own extensions
based on their application to fingerprint experts. There are several ways in which the recovered
components can be used to evaluate the choice of positions by experts (which, along with the
image, ultimately determine the basis functions). First, one can visually inspect the sets of
basis functions recovered from datasets produced by experts and compare them with sets
generated from random window locations.
A second technique can be used to demonstrate that experts do indeed possess a feature set
that differs from a random set. The data from random windows and experts can be combined to
Figure 7. ICA components from expert data, grouped by energy. This analysis allows the basis functions
to have partial spatial independence, at a slight cost to image quality. This latter issue is less relevant for larger
corpora, in which many similar features are combined by individual basis function groups.
produce a common set of components (basis functions). ICA is a linear technique, and thus the
original data for both experts and random windows can be recovered through weighted sums of
the components, with some error if only a subset of the components is retained. If experts share a
common set of features that is estimated by ICA, then their data should be recovered with less
error than that of the random windows. This would demonstrate that an important component of
expertise is the ability to take a high-dimensional dataset (as produced by noisy images) and
reduce it down to fundamental features. From this perspective, visual expertise is data reduction.
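The reconstruction-error comparison can be sketched as follows. This is a toy illustration under invented assumptions: "expert" patches are simulated with a shared low-dimensional structure, while "random" patches have none, so a common truncated ICA basis should reconstruct the former with less error.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
d = 144  # flattened 12 x 12 patches

# Invented stand-ins: "expert" patches share a 10-dimensional structure,
# "random" patches do not.
latent = rng.standard_normal((4000, 10))
expert = latent @ rng.standard_normal((10, d)) + 0.1 * rng.standard_normal((4000, d))
random_w = rng.standard_normal((4000, d))

# Fit one common, deliberately small basis on the pooled data.
ica = FastICA(n_components=10, max_iter=1000, random_state=0)
ica.fit(np.vstack([expert, random_w]))

def recon_error(X):
    # Project onto the shared components and reconstruct; the residual
    # measures how well the truncated basis captures the data.
    return np.mean((X - ica.inverse_transform(ica.transform(X))) ** 2)

print(recon_error(expert) < recon_error(random_w))  # expected: True
```

With real data, a lower reconstruction error for expert windows than for random windows would be the signature of a shared expert feature set.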
These kinds of data reduction techniques serve a separate purpose. Many of the experiments
described in other sections of this proposal depend on specifying particular features. While initial
estimates of the relevant features can be made on the basis of discussions with fingerprint
experts, we anticipate that the results of the ICA analysis will help refine our view of what
constitutes an important feature within the context of fingerprint matching.
The moving window procedure has the disadvantage of being a very localized procedure, due
to the nature of the small moving window. There is a fundamental tradeoff between the size of
the window and the spatial acuity of the procedure. If the window is made too large, we know
less about the regions from which the user is attempting to acquire information. To offset this,
we have provided the user the opportunity to view quick flashes of the full image, enough to
provide an overview of the prints, but not enough to allow matches of specific regions. We will
also conduct the studies using large and small windows to see whether the nature of the
recovered components changes with window size.
C.4. Starting Small: Guiding feature extraction with expert knowledge
We need to ask whether this is compelling, and cut it if it is not.
Feature extraction procedures attempt to take a high-dimensional space and use the
redundancies in that space to derive a lower-dimensional representation that combines across the
redundancies to provide a basis set. This basis set can be thought of as the fundamental feature
set, and the development of this set can be thought of as one mechanism underlying human
expertise. The difficulty with these high-dimensional spaces is that algorithms that attempt to
uncover the feature set through iterative procedures such as Independent Component Analysis or
neural networks may fall into local minima and fail to converge on a global solution. One
solution that has been proposed in the human developmental literature is that of starting small
(Elman, 1993). In this technique, programmers initially restrict the inputs to statistical models to
provide general kinds of information rather than specific information that would lead to learning
of specific instances. As the network matures, more specific information is added, which allows
the network to avoid falling into local minima that represent non-learned states. While the exact
nature of these effects is still being worked out (Rohde & Plaut, 1999), recent work has
provided empirical support in the visual domain (Conway, Ellefson & Christiansen, ref). This
suggests that we might use the temporal component of the data from experts in the moving
window paradigm to help guide the training of our networks.
As an expert views a print, they are likely initially to focus on broad, overall types of
information that give [need to finish this section if necessary]
C.5. Automatic detection of regions of interest using expert knowledge
In both fingerprint classification (e.g. Dass & Jain, 2004; Jain, Prabhakar & Hong, 1999;
Cappelli, Lumini, Maio & Maltoni, 1999) and fingerprint identification (e.g. Pankanti, Prabhakar
& Jain, 2002; Jain, Prabhakar & Pankanti, 2002) applications, an automatic system has two main
components: (1) feature extraction and (2) a matching algorithm to compare (or classify)
fingerprints based on the feature representation. Feature extraction is the first step, converting
raw images into feature representations. The goal is to find robust and invariant features that can
deal with the varied conditions of real-world applications, such as illumination, orientation, and
occlusion. Given a whole fingerprint image, most fingerprint recognition systems use the
location and direction of minutiae as features for pattern matching. In our preliminary study of
human expert behaviors, we observe that human experts focus on just parts of images (regions of
interest – ROIs) as shown in Figure XX, suggesting that it is not necessary for a human expert to
check through all minutiae in a fingerprint. A small subset of minutiae seems to be sufficient for
the human expert to make a judgment. What regions are useful for matching among all the
minutiae in a fingerprint? Is it possible to build an automatic ROI detection system that achieves
performance similar to that of a human expert? We attempt to answer these questions by
building a classification system based on training data captured from human experts. Given a
new image, the detection system will automatically detect and label regions of interest for
the matching purpose. We note that we expect most regions selected by our system
to be minutiae, but we also expect the system to discover structural
regularities in non-minutia regions that have been overlooked in previous studies. Unlike
previous studies of minutiae detection (e.g. Maio & Maltoni, 1997), our automatic detection
system will not simply detect minutiae in a fingerprint but will focus on detecting both a small set of
minutiae and other regions useful for the matching task. Considering the difficulties inherent in
fingerprint recognition, building this automatic detection system is challenging. However, we are
confident that the proposed research will take the first steps toward success and make important
contributions. This confidence rests on two factors that distinguish our work from
other studies: (1) we will record detailed behaviors of human experts (e.g. where they look in a
matching task) and recruit the knowledge extracted from human experts to build a pattern
recognition system; and (2) we will apply state-of-the-art machine learning techniques
to efficiently encode both expert knowledge and regularities in fingerprint data. The combination
Figure X. Overview of automatic detection of regions of interest. The red regions in the fingerprints
indicate where human experts focus during the pattern matching task.
of these two factors will enable us to carry out this research plan.
To build this kind of system, we need to develop a machine learning algorithm and estimate
its parameters from training data. Using the moving window paradigm (described in
C.3), we collect information about where a human expert looks from moment to moment
while performing a matching task. Hence, the expert's visual attention and behaviors (moving
the window) can be used as labels of regions of interest, providing the teaching signals for a
machine learning algorithm. In the proposed research, we will build an automatic detection
system that captures the expert's knowledge to guide the detection of useful regions in a
fingerprint for pattern matching.
We will use the data collected from C.X. Each circular area examined by the expert is filtered
by a bank of Gabor filters. Specifically, Gabor filters with three scales and five orientations
are applied to the segmented image. It is assumed that the local texture regions are spatially
homogeneous, and the mean and standard deviation of the magnitudes of the transform
coefficients are used to represent a region as a 48-dimensional feature vector. We reduce these
high-dimensional feature vectors to vectors of dimensionality 10 by principal component
analysis (PCA), which represents the data in a lower-dimensional subspace by pruning away
the dimensions with the least variance. We also randomly sample areas that the expert
does not attend to and code these areas with a Non-ROI label paired with the feature
vectors extracted from them. In total, the training data consist of two groups of labeled
features: ROI and Non-ROI.
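A sketch of this feature-extraction step, under simplifying assumptions: the Gabor kernels are hand-rolled with placeholder frequencies and bandwidths, and random arrays stand in for the expert-inspected regions. Note that a 3-scale x 5-orientation bank with two statistics per filter yields 30 dimensions, so the 48 quoted above presumably reflects a different bank configuration.

```python
import numpy as np
from scipy.signal import fftconvolve
from sklearn.decomposition import PCA

def gabor_kernel(freq, theta, size=15, sigma=3.0):
    # Complex Gabor: Gaussian envelope times a complex sinusoid whose
    # orientation is theta and spatial frequency is freq (cycles/pixel).
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.exp(2j * np.pi * freq * xr)

def gabor_features(patch, freqs=(0.1, 0.2, 0.4), n_orient=5):
    # Mean and std of response magnitudes: 3 scales x 5 orientations x 2 = 30.
    feats = []
    for f in freqs:
        for k in range(n_orient):
            mag = np.abs(fftconvolve(patch, gabor_kernel(f, k * np.pi / n_orient),
                                     mode="same"))
            feats += [mag.mean(), mag.std()]
    return np.array(feats)

rng = np.random.default_rng(0)
patches = [rng.random((40, 40)) for _ in range(50)]   # stand-ins for ROI windows
X = np.stack([gabor_features(p) for p in patches])    # (50, 30)
X10 = PCA(n_components=10).fit_transform(X)           # keep the top-variance dims
print(X.shape, X10.shape)
```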
Next, we will build a binary classifier based on support vector machines (SVMs). SVMs
have been applied successfully to many classification tasks (Vapnik, 1995; Burges, 1998). An SVM
trains a linear separating plane for classifying data by maximizing the margin between two
parallel planes on either side of the separating plane. The central idea is to nonlinearly map the
input vector into a high-dimensional feature space and then construct an optimal hyperplane for
separating the features. This decision hyperplane depends on only a subset of the training data
called support vectors.
For a set of n-dimensional training examples $X = \{x_i\}_{i=1}^{m}$ labeled by the expert's visual attention
$\{y_i\}_{i=1}^{m}$, and a mapping of the data into q-dimensional vectors $\phi(X) = \{\phi(x_i)\}_{i=1}^{m}$ by a kernel function,
where $q \gg n$, an SVM can be built on the mapped training data by solving the following
optimization problem:

Minimize over $(w, b, \xi_1, \ldots, \xi_m)$ the cost function

$$\frac{1}{2} w^T w + C \sum_{i=1}^{m} \xi_i$$

subject to

$$y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i \quad \text{for } i = 1, \ldots, m, \qquad \xi_i \ge 0 \text{ for all } i,$$

where $C$ is a user-specified constant controlling the penalty on the violation terms $\xi_i$. The $\xi_i$ are
called slack variables; they measure the deviation of a data point from the ideal condition of
pattern separability. After training, $w$ and $b$ constitute the classifier:

$$y = \operatorname{sign}\!\left( w^T \phi(x) + b \right)$$
Compared with other approaches used in fingerprint recognition, such as neural networks and
k-nearest neighbors, SVMs have proven more effective in many classification tasks. In
addition, we first transform the original features into a lower-dimensional space with PCA; the
purpose of this first step is to cope with the curse of dimensionality. We then map the data points
into another, higher-dimensional space so that they are linearly separable. In doing so, we
convert the original pattern recognition problem into a simpler one. This idea is much in line with
kernel-based nonlinear PCA (Scholkopf, Smola & Muller, 1998), which has been used successfully
in several fields (e.g. Wu, Su & Carpuat, 2004).
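A minimal sketch of the resulting classifier using scikit-learn's SVC, with synthetic stand-ins for the 10-dimensional post-PCA feature vectors; the class means and counts are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Invented 10-dimensional post-PCA feature vectors: +1 = ROI patches
# (expert-inspected), -1 = Non-ROI patches (randomly sampled).
roi = rng.normal(1.0, 1.0, size=(200, 10))
non_roi = rng.normal(-1.0, 1.0, size=(200, 10))
X = np.vstack([roi, non_roi])
y = np.array([1] * 200 + [-1] * 200)

# C penalizes the slack variables xi_i; the RBF kernel supplies the
# nonlinear map phi into a high-dimensional feature space.
clf = SVC(kernel="rbf", C=1.0).fit(X, y)
print(clf.score(X, y))
print(len(clf.support_))  # only the support vectors define the hyperplane
```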
Given a new test fingerprint, we will shift a 40 x 40 window over the image and classify the
patches at each location and scale. The system will first extract Gabor-based features from
local patches, which will be the input to the detector. The detector will label all regions as
either ROI or Non-ROI. We expect that most ROIs will be minutiae. Unlike methods
based on minutiae matching, however, we expect that only a small subset of minutiae is used by human
experts. Moreover, we expect the system to detect some areas that are not defined as minutiae
but that human experts nevertheless attend to during the matching task. Thus, the ROI detector we
develop will go beyond the standard approach in fingerprint recognition (minutiae extraction and
matching). By efficiently encoding the knowledge of human experts, the proposed system will
have the opportunity to discover statistical regularities in fingerprints that have been
overlooked in previous studies.
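The sliding-window detection step can be sketched as follows; the window geometry, the feature extractor, and the trivial threshold "classifier" below are placeholders for the Gabor features and trained SVM detector.

```python
import numpy as np

def detect_rois(image, clf, extract_features, window=40, stride=20):
    """Slide a window x window patch over the image and classify each
    location as ROI (1) or Non-ROI (0) with a trained classifier."""
    rois = []
    h, w = image.shape
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            patch = image[r:r + window, c:c + window]
            if clf.predict(extract_features(patch)[None, :])[0] == 1:
                rois.append((r, c))
    return rois

# Toy demonstration with a stand-in "classifier" that thresholds the
# first feature (here, the patch mean).
class MeanThreshold:
    def predict(self, X):
        return (X[:, 0] > 0.5).astype(int)

img = np.zeros((120, 120))
img[40:80, 40:80] = 1.0  # one bright square to find
found = detect_rois(img, MeanThreshold(), lambda p: np.array([p.mean()]))
print(found)  # → [(40, 40)]
```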
C.6. Using expert-identified correspondences to extract environmental models
In our moving window paradigm, a human expert moves the window back and forth between
inked and latent fingerprints to perform pattern matching. We propose that the dynamic
behaviors of the expert provide additional signals indicating one-to-one correspondences
between the two images. In light of this, our hypothesis is that an expert's decision is based on the
comparison of these one-to-one patches. Therefore, we propose that these expert-identified
correspondences can serve as additional information to find the regularities in fingerprints and
build the automatic detection system.
We propose to use this knowledge as a prior on the training data. We observe that not all the
attended regions in the latent print have corresponding regions in the inked print. Thus, it is
likely that the one-to-one pairs play a more important role in pattern matching than other
regions of interest. Based on this observation, we propose to maintain a set of weights over the
training data. More specifically, for each ROI in the latent image, we find the most likely pairing
patch in the inked image. Two constraints guide the search for the matching pair. The temporal
constraint is based on the expert's behaviors: for instance, the patch in the inked print that the
expert examines immediately after looking at an ROI in the latent image is more likely to be
associated with that ROI. The spatial constraint is to find the patch in the inked image with the
highest similarity to the patch in the latent image. In this way, each ROI in the
latent image can be assigned a weight indicating the probability of mapping that region to a
region in the other image. With this set of weighted training data, we will apply an SVM-based
algorithm (briefly described in C.5) that focuses on the paired samples (those with high weights)
in the training data. More specifically, we replace the constant C in the standard SVM with a set
of variables $c_i$, each of which corresponds to the weight of a data point. Accordingly, the new
objective function is

$$\frac{1}{2} w^T w + \sum_{i=1}^{m} c_i \xi_i.$$

Thus, the matching
regions receive larger penalties if they are nonseparable
points, while other regions receive less attention because
they are more likely to be irrelevant to the expert's
decision. Thus, the parameters of the SVM are tuned to
favor the regions in which human experts are especially
interested. By encoding this knowledge in a machine
learning algorithm, we expect this method to achieve
better performance by closely imitating the expert's
decision.
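In practice, the per-point penalties $c_i$ can be realized without modifying the solver: scikit-learn's SVC accepts per-sample weights in `fit` that rescale C for each example. A sketch with invented data and weights:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1, 1, (100, 10)), rng.normal(-1, 1, (100, 10))])
y = np.array([1] * 100 + [-1] * 100)

# Invented correspondence weights: large for ROIs with a confident
# latent-to-inked pairing, small for unpaired regions.
weights = rng.uniform(0.1, 1.0, size=200)
weights[:20] = 5.0  # strongly paired examples

# sample_weight rescales C per example, which is exactly the per-point
# penalty c_i in the modified objective.
clf = SVC(kernel="rbf", C=1.0).fit(X, y, sample_weight=weights)
print(clf.score(X, y))
```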
C.7. Dependencies between global and local information: The role of gist information
Fingerprints are categorized into several classes, such as whorl, right loop, left loop, arch,
and tented arch in the Henry classification system (Henry, 1900). In the literature, researchers
use only 4-7 classes in automatic classification systems, because the task of
determining a fingerprint's class can be difficult. For example, it is hard to find robust features
from raw images that aid classification while also exhibiting low variation within each class. In
C.5 and C.6, we discuss how to use expert knowledge to find useful features for pattern
matching. Taking a broader view of feature detection and fingerprint classification in this
section, we find that we need to deal with a chicken-and-egg problem: (1) useful local features
can predict fingerprint classes; and (2) a specific fingerprint class can predict what kinds of local
regions are likely to occur in that type of fingerprint. In contrast, standalone feature detection
algorithms (e.g. those in C.5 and C.6) usually look at local pieces of the image in isolation when
deciding whether a patch is a region of interest. In machine learning, Murphy, Torralba and
Freeman (2003) proposed a conditional random field for jointly solving the tasks of object
detection and scene classification. In light of this, we propose to use the whole image context as
an extra source of global information to guide the search for ROIs. In addition, a better set of
ROIs will also potentially make classification of the whole fingerprint more accurate. Thus, the
chicken-and-egg problem is tackled by a bootstrapping procedure in which local and global
pattern recognition systems interact with and boost each other.
We propose a machine learning system based on graphical models (Jordan, 1999), as shown in
Figure XX. We define the gist of an image as a feature vector extracted from the whole image by
treating it as a single patch; the gist is denoted $v_G$. We then introduce a latent variable $T$
describing the type of fingerprint. The central idea in our graphical model is that ROI presence is
conditionally independent given the type, and the type is determined by the gist of the image. Thus,
our approach encodes the contextual information on a per-image basis instead of extracting
detailed correlations between different kinds of ROIs (e.g. a fixed prior such as patch A always
occurring to the left of patch B), because of the complexity and variation of such detailed
descriptions. Next we need to classify fingerprint types. We will simply train a one-vs-all binary
SVM classifier to recognize each fingerprint type based on the gist. We will then normalize
the results:
$$p(T = t \mid v_G) = \frac{p(T_t = 1 \mid v_G)}{\sum_{t'} p(T_{t'} = 1 \mid v_G)}$$

where $p(T_t = 1 \mid v_G)$ is the output of the t-th one-vs-all classifier.
Once the fingerprint type is known, we can use this information to facilitate ROI
detection. Given the tree-structured graphical model shown in Figure XX, the
conditional joint density can be expressed as:
$$p(T, R_1, \ldots, R_N \mid v) = \frac{1}{z}\, p(T \mid v_G) \prod_i p(R_i \mid T, v_i),$$

and marginalizing over the type gives

$$p(R_1, \ldots, R_N \mid v) = \frac{1}{z} \sum_t p(T = t \mid v_G) \prod_i p(R_i \mid T = t, v_i).$$
Here $v_G$ and $v_i$ are the global and local features, respectively, and $R_i$ is the class of a local patch. In
the proposed research, we will investigate two types of $R$. One classification defines ROI
and Non-ROI types, the same as in C.5 and C.6. The other defines
several minutia types (plus Non-ROI), such as termination minutiae and bifurcation
minutiae. $z$ is a normalizing constant. Based on this graphical model, we will be able to
use contextual knowledge to facilitate the classification of a local image patch. We also plan to
develop a more advanced model that will use local information to facilitate the
fingerprint type classification. We expect that this kind of approach will lead to a more
effective automatic system that can perform both top-down inference (fingerprint types to
minutia types) and bottom-up inference (minutia types to fingerprint types).
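The top-down combination can be sketched numerically: normalized one-vs-all outputs give $p(T = t \mid v_G)$, which then reweights per-type ROI likelihoods. All numbers below are random placeholders, not fitted quantities.

```python
import numpy as np

rng = np.random.default_rng(0)
n_types, n_regions = 5, 8

# Raw one-vs-all gist classifier outputs, mapped to [0, 1] and then
# normalized to a distribution p(T = t | v_G) over fingerprint types.
raw = rng.random(n_types)
p_type = raw / raw.sum()

# Hypothetical per-type ROI likelihoods p(R_i = ROI | T = t, v_i):
# one row per fingerprint type, one column per candidate region.
p_roi_given_type = rng.random((n_types, n_regions))

# Top-down inference: marginalize over the type so that the global gist
# context reweights each local detection, giving p(R_i = ROI | v).
p_roi = p_type @ p_roi_given_type
print(np.round(p_roi, 3))
```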
C.8. Summary of quantitative approaches
(Tom writes)
General themes:
Incorporate expert knowledge
Links between global and local structure made possible by input from experts
Specification of elemental basis or feature set
Classifying informativeness of regions
Defining an intermediate level between low-level feature extractors and high-level gist or
configural information
D. Implications for knowledge and practice
The implications of the knowledge gained from the results of these studies and analyses fall
into four broad categories, each of which is discussed below.
D.1. Implications for the quantitative understanding of the information content of fingerprints
D.2. Implications for an understanding of the links between quantitative information content
and the latent print examination process
D.3. Implications for the classification and filtering of poor-quality latent prints
D.4. Implications for the development of software-based tools to assist human-based latent
print examinations and training
E. Management plan and organization
F. Dissemination plan for project deliverables
scientific articles, presentations at machine learning conferences and fingerprint conferences,
proof-of-concept Java-based applets.
(end of 30 pages)
G. Description of estimated costs
Personnel
The project will be co-directed by Thomas Busey and Chen Yu. We request 11 weeks of
summer support, during which time both will devote 100% of their efforts to the project.
Benefits are calculated at 19.81%. The salaries are incremented 3% per year.
Many of the simulations will be conducted by a graduate student, who will be hired
specifically for the purposes of this project. This student, likely an advanced computer science
student with a background in cognitive science, requires a stipend, a fee remission and health
insurance. The health insurance is incremented at 5% per year.
Subject coordination and database management will be coordinated by hourly students who
will work 20 hours/wk on the project. We will pay them $10/hr.
Consultant
John Vanderkolk, with whom Busey has worked for the past two years, has agreed to
serve as an unpaid consultant on this grant. He does require modest travel costs when he visits
Bloomington.
Travel
Money is requested to bring in four experts for testing using the eyemovement recording
equipment. These costs will total approximately $1500/yr.
Money is requested for three conferences a year. These will enable the investigators to travel
to conferences such as Neural Information Processing (NIPS) and forensic science conferences
such as the International Association for Identification (IAI) to interact with colleagues and share
the results of our analyses. These trips serve an important role in communicating the efforts of
this grant to a wider audience.
Other Costs
Equipment
This research is very computer-intensive, and thus we require a large UNIX-based server to
run simulations in parallel. In addition, we require three pc-based workstations to run Matlab and
other simulation programs. Finally, conferences such as IAI and local Society for Identification
meetings provide an ideal place to gather data from experts, and thus we require a portable
computer for such onsite data-gathering purposes. We anticipate that up to half of our data can
be collected on-site, and this approach is preferable because we retain control over the monitor
and software. Thus the laptop computer represents a good investment in the success of the
project.
Other costs
The graduate student line requires a fee remission each year. The fee remission is
incremented at 5% per year.
The results of our studies require resources to reach a wide audience, and thus we require
dissemination costs to cover the costs of publication and web-based dissemination.
This project is highly image-intensive, and we require money to purchase image-processing
software and upgrades. These include software packages such as Adobe Photoshop, as well as
new image processing packages as they become available.
We will test 80 subjects a year to obtain the necessary data for our statistical
applications. Each subject is paid $20 for the approximately 90-minute testing period.
The project will consume supplies of approximately $100/month, for items such as backups,
power supplies, etc.
Indirect Costs
The indirect rate negotiated between Indiana University and the federal government is set at
51.5%. This rate is assessed against all costs except the fee remission. This was negotiated with
DHHS on 5.14.04.
G. Staffing plan and Resources
Both Busey and Chen maintain laboratories in the Department of Psychology at Indiana
University, each containing approximately 700 sq. feet of space. These have subject running
rooms, offices, and space for servers. Chen's lab contains an eyemovement recording setup that
is sufficient for the eyemovement portion of the experiments. Both investigators have offices in
the Psychology department as well.
We will recruit a graduate student from the Computer Science or Psychology programs at
Indiana University. This student must have experience with machine learning algorithms at a
theoretical level, and must also be an expert programmer. They will work 20 hrs/wk. We will also
recruit two hourly undergraduate students to coordinate subject running, data analysis, and
server maintenance. They will also be responsible for managing the data repository site where
our data will be accessible to other researchers who wish to integrate human expert knowledge
into their networks.
The bulk of the theoretical work will be handled by Chen and Busey, while the graduate
student will work on implementation and model testing.
H. Timeline
This is a multi-year project that is designed to alternate between acquiring human data and
using it to refine the quantitative analyses of latent and inked prints.
Year 1: Acquire necessary fingerprint databases. Begin testing 80 experts on 72 different
latent/inked print pairs. Program Support Vector and Global Local models. Test 2 experts on the
eyemovement equipment using all 72 prints.
Year 2: Test an additional 80 experts on 72 new latent/inked prints. Begin model fitting and
refinement. Test 2 experts on the eyemovement equipment using all 72 prints. Compare results
from eyemovement studies and moving window studies.
Year 3: Test the final 80 experts on 72 new latent/inked prints. Develop new versions of
statistical models based on prior results. Put entire database online for use by other researchers.
Disseminate results to peer-reviewed journals.
We introduce a novel procedure developed by one investigator (Tom Busey) and use it to guide the input to statistical learning algorithms developed and extended by our other investigator (Chen Yu). The fundamental idea behind our approach is that the quantitative evaluation of the information contained in latent and inked prints can be vastly improved by using elements of human expertise to assist the statistical modeling, as well as to introduce a new dimension of time that is not contained in the static latent print analysis. The main benefit, as we discuss in sections C.x.x, is that the format of the data extracted from experts allows the application of novel quantitative models adapted from related areas. To apply this knowledge derived from experts, we will use our backgrounds in vision, perception, machine learning, and behavioral testing to design experiments that extract relevant information from experts, and we will use this information to improve the quantitative analysis techniques applied to fingerprints by integrating the two sources of information.

Our research interests differ somewhat from existing approaches and reflect the adaptations that are necessary to incorporate human expert knowledge. Existing statistical algorithms developed to match fingerprints rely on several different classes of techniques. Some extract minutiae and other robust sources of information, such as the number of ridges between minutiae (refs). Others rely on the computation of the local curvature of the ridges, and then partition these into different classes (MASK refs). Virtually all approaches make reasoned and reasonable guesses as to what the important sources of information might be, such as minutiae, local ridge orientation, or local ridge width (dgs paper). The present proposal takes a more agnostic approach to what might be the important sources of information in fingerprints, and we will develop statistical models that take advantage of the data derived from experts. However, a major goal of the grant is to demonstrate how expert knowledge can be applied to any extant model, and to suggest how this might be accomplished.
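Approaches based on local ridge curvature typically begin by estimating a local orientation field from image gradients. As a minimal illustration (the gradient-squared averaging scheme below is a textbook method, not any specific published system), the dominant ridge orientation in one block can be estimated as:

```python
import math

def ridge_orientation(block):
    """Estimate the dominant ridge orientation (degrees, 0-180) in a
    grayscale block using gradient-squared averaging.
    `block` is a list of rows of pixel intensities (floats)."""
    h, w = len(block), len(block[0])
    gxx = gyy = gxy = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference image gradients.
            gx = (block[y][x + 1] - block[y][x - 1]) / 2.0
            gy = (block[y + 1][x] - block[y - 1][x]) / 2.0
            gxx += gx * gx
            gyy += gy * gy
            gxy += gx * gy
    # Doubled-angle averaging avoids the 0/180-degree wrap-around.
    theta = 0.5 * math.atan2(2.0 * gxy, gxx - gyy)  # dominant gradient direction
    ridge = math.degrees(theta) + 90.0              # ridges run perpendicular to it
    return ridge % 180.0

# Synthetic vertical ridges: intensity varies only along x.
block = [[math.sin(2 * math.pi * x / 8.0) for x in range(16)] for y in range(16)]
print(round(ridge_orientation(block)))  # vertical ridges -> 90
```

Partitioning such block-wise orientations into classes is the kind of step the dynamic-mask algorithms cited above perform; the expert data we collect could indicate which blocks deserve the most weight.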
Thus we will spend substantial time documenting our application of expert knowledge to our statistical models. In addition, we will make all of our expert data available to other researchers and practitioners. It is likely that the data will have implications for training, although this is not the focus of the present proposal.

C. Research design and methods

At the heart of our approach is the idea that human expertise, properly represented, can improve the quantitative analysis of fingerprints. In a later section we describe how we apply human expert knowledge to various statistical analyses, but first we need to answer the question of whether human experts can add something to the quantitative analysis of prints. The answer to this question can be broken down into two parts. First, do human visual systems in general possess attributes not captured by current statistical approaches? Second, do human experts have additional capacities not shared by novices, capacities that could further inform statistical approaches? Below we briefly summarize what the visual science literature tells us about how humans recognize patterns, and then describe our own work that has addressed the differences between experts and novices. As we will show, human experts have much to add to quantitative approaches.

We should stress that while we will gather data from human experts to improve our quantitative analyses of fingerprints, the goal of this grant is not to study human experts in order to determine whether or how they differ from novices, nor are we interested in questions about the reliability or accuracy of human experts. Instead, we will generalize our previous results demonstrating strong differences in the visual processing of fingerprints by experts, and apply this expertise to our own statistical analyses. Accordingly, we will gather data only from human experts (latent print examiners with at least 5 years of post-apprentice work in the field), under the assumption that this will provide the maximum improvement to our statistical methods. We can demonstrate the effectiveness of this knowledge by simply re-running the statistical analyses without the benefit of the knowledge derived from experts.
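Such with-versus-without comparisons can be summarized by tracing how the correct-recognition rate trades off against the false-recognition rate across decision thresholds. A minimal sketch (the match scores below are made-up values, not data):

```python
def tradeoff_curve(mated_scores, nonmated_scores):
    """Correct-recognition vs. false-recognition rates at every threshold.

    `mated_scores` are match scores for same-finger pairs, `nonmated_scores`
    for different-finger pairs; higher scores mean a stronger match.
    Returns a list of (false_recognition_rate, correct_recognition_rate)."""
    points = []
    thresholds = sorted(set(mated_scores) | set(nonmated_scores))
    for t in thresholds:
        hits = sum(s >= t for s in mated_scores)         # mated pairs accepted
        fas = sum(s >= t for s in nonmated_scores)       # non-mated pairs accepted
        points.append((fas / len(nonmated_scores), hits / len(mated_scores)))
    return points

# Hypothetical scores for one matcher; an expert-enhanced version would be
# compared by plotting its curve against the baseline's on the same axes.
mated = [0.9, 0.8, 0.85, 0.7]
nonmated = [0.3, 0.4, 0.2, 0.5]
for far, crr in tradeoff_curve(mated, nonmated):
    print(f"FAR={far:.2f}  CRR={crr:.2f}")
```

A matcher whose curve lies closer to the upper-left corner (high correct recognition at low false recognition) dominates; this is one of the metrics referred to in the next paragraph.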
There are various metrics attached to each analysis technique that demonstrate the superiority of the expert-enhanced analyses, such as correct-recognition/false-recognition tradeoff graphs, or the dimensionality-reduction/reconstruction successes of data reduction techniques.

We will also apply novel approaches adapted from the related domain of language analysis. It might seem odd to apply techniques developed for linguistic analysis to a visual domain such as pattern recognition, but the principles that underlie the two domains are very similar. Both involve large numbers of features that have complex statistical relations. In the case of language, the features are often words, phonemes, or other acoustical signals. Fingerprints are defined by a complex but very regular dictionary of features that also share a complex and meaningful correlational structure. One of us (Chen) is a highly published expert in the field of machine learning algorithms as applied to multimodal data, and several papers included as appendices detail this expertise. His work on multimodal applications between the visual and auditory domains makes him well suited to address the relation between human data and machine learning algorithms. Both linguistic and visual information contain highly structured data consisting of regularities that are extracted by perceivers, and this is not unlike the temporal sequence that experts go through when they perform a latent print examination, as we describe in a later section. First, however, we address how we might document the principles of human expertise.

Can we use elements of the human visual system to improve our statistical analyses?

The answer to this question is straightforward, in part because of the overwhelming evidence that human-based recognition systems contain processes that are not captured by current statistical approaches. One of us (Busey) has published many articles addressing different aspects of human sensation, perception, and cognition, and thus is well suited to manage the acquisition and application of human expertise to statistical approaches. Below we briefly summarize the properties of the human visual system; in a later section we describe how we plan to extract fundamental principles from this design in order to improve our statistical analyses of fingerprints.

An analysis of the human visual system by vision scientists demonstrates that the recognition process proceeds via a hierarchical series of stages, each with important non-linearities (nature ref), that produce areas that respond to objects of greater and greater complexity.
This process also provides increasing spatial independence, allowing brain areas to integrate over larger and larger regions. This will become important for holistic or configural processing, as discussed in a later section. (also talk about feature-based attention) A second benefit of this hierarchical approach is that objects achieve limited scale and contrast invariance. Statistical approaches often deal with this through local contrast or brightness normalization, but this is a separate process. Scale invariance is often achieved by explicitly measuring the width of ridges (grayscale ref), again a separate process. A third strength of the human visual system is that it appears to have the ability to form new feature templates through an analysis of the statistical information contained in the fingerprints. This process, called unitization, will tend to improve feature detection in the noisy environments often found with latent prints.

Do forensic scientists have visual capabilities not shared by novices?

The prior summary of the elements of the human visual system suggests that current statistical approaches can be improved by adapting some of the principles underlying the human visual system. There are, however, other processes that are specifically developed by latent print examiners and that may also be profitably applied to statistical models. Below we summarize the results of two empirical studies recently published in the highly respected journal Vision Research (Busey & Vanderkolk, 2005). The results demonstrate not only that experts are better than novices, but also suggest the nature of the processes that produce this superior performance.

Visual expertise takes many forms. It could be different for different parts of the identification process, and may not even be verbalizable by the expert, since many elements of perceptual expertise remain cognitively impenetrable (refs). A major focus of our research is to capture elements of this expertise and use them as a training signal for our statistical learning algorithms. What is novel in our approach is our ability to capture this expertise at a very deep and rich level.
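The contrast or brightness normalization mentioned above as a separate pre-processing step in statistical approaches can be sketched minimally; the target mean and variance values below are illustrative, not prescribed:

```python
def normalize(pixels, target_mean=100.0, target_var=100.0):
    """Rescale pixel intensities to a prescribed mean and variance, in the
    spirit of the mean/variance normalization commonly used as a fingerprint
    pre-processing step (target values here are illustrative)."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    if var == 0:                      # flat patch: nothing to stretch
        return [target_mean] * n
    scale = (target_var / var) ** 0.5
    return [target_mean + scale * (p - mean) for p in pixels]

patch = [12, 40, 35, 90, 60, 75, 20, 55]   # a dim, low-contrast patch
out = normalize(patch)
m = sum(out) / len(out)
v = sum((p - m) ** 2 for p in out) / len(out)
print(round(m, 6), round(v, 6))  # -> 100.0 100.0
```

The human visual system achieves a comparable (if partial) invariance without such an explicit step, which is one of the properties we hope to carry over into the statistical models.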
In the next section we describe our prior work documenting the nature of the processes that enable experts to perform at levels much superior to novices, and then in Section C.2 we describe how we capture this expertise in a form that we can use to improve our statistical learning algorithms. C.1. Documenting expertise in human latent print examiners Initially, experts tend to focus on the entire print, which leads to benefits that we have previously identified as configural processing (Busey & Vanderkolk, 2005). Configural
processing takes several forms, but the basic idea behind this process is that instead of focusing on individual features or minutiae, the observer instead integrates information over a large region to identify important relations such as the relative locations of features or the curvature of ridge flow. Fingerprint examiners often talk about 'viewing the image in its totality', which is different language for the same process. While configural processing reveals the overall structure of an image and selects important regions for further inspection, the real work comes in comparing small regions in one print to regions in the other. These regions may be selected on the basis of minutiae identified in the print, or high-quality Level 3 detail. We know from related work on perceptual learning in the visual system that one of the processes by which expertise develops is the development of new feature detectors. Experts spend a great deal of time viewing prints, and this has the potential to result in profound changes in how their visual systems process fingerprints. (config processing refs) One process by which experts could improve how they extract latent print information from noisy prints is termed unitization, in which novel feature detectors are created through experience (unitization refs). Fingerprints contain remarkable regularities, and the human visual system appears well suited to discovering and exploiting such regularities through experience. C.1.a. Do experts have information valuable to training networks or documenting the quantitative nature of fingerprints? Fingerprint examiners have received almost no attention in the perceptual learning or expertise literatures, and thus the PI began a series of studies in consultation with John Vanderkolk, of the Indiana State Police Forensic Sciences Laboratory in Fort Wayne, Indiana.
Our first study addressed the nature of the expertise effects in a behavioral experiment, and then we followed up evidence for configural processing with an electrophysiological study. The discussion below describes the experiments in some detail, in part because extensions of this work are proposed in Section D, and a complete description here illustrates the technical rigor and converging methods of our approach.
C.1.b. Behavioral evidence for configural processing In our first experiment, we abstracted what we felt were the essential elements of the fingerprint examination process into an X-AB task that could be accomplished in relatively short order. This work is described in Busey and Vanderkolk (2005), but we briefly describe the methods here since they illustrate how our approach seeks to find a paradigm that is less time-consuming than fully realistic forensic examinations (which can take hours to days to complete) yet still maintains enough ecological validity to tap the expertise of the examiners. Figure 1 shows the stimuli used in the experiment as well as a timeline of one trial. We cropped out fingerprint fragments from inked prints, grouped them into pairs, and briefly presented one of the two for 1 second. This was followed by a mask for either 200 or 5200 ms, and then the expert or novice subject made a forced-choice response indicating which of the two test prints they believed was shown at study. We introduced orientation and brightness jitter at study, and the construction of the pairs was done to reduce the reliance on idiosyncratic features such as lint or blotches. At test, we introduced two manipulations that we thought captured aspects of latent prints, as shown in Figure 2. First, latent prints are often embedded in visual noise from the texture of the surface, dust, and other sources. One expert, in describing how he approached latent prints, stated that his job was to 'see through the noise.' To simulate at least elements of this noise, we embedded half of our test prints in white visual noise. While this may have a spatial distribution Figure 1. Sequence of events in a behavioral experiment with fingerprint experts and novices (study image for 1 sec, mask for 200 or 5200 ms, test display until response).
Note that the study image has a different orientation and is slightly brighter to reduce reliance on low-level cues.
that differs from the noise typically encountered by experts, we hoped that it would tap whatever facilities experts may have developed to deal with noise. The second manipulation was motivated by the observation that latent prints are rarely complete copies of their inked counterparts. They often appear patchy if made on an irregular surface, and sections may be partially masked out. To simulate this, we created partially-masked fingerprint fragments as shown in the upper-right panel of Figure 2. Note that the partially-masked print and its complement each contain exactly half of the information of the full print, and the full print can be recovered by summing the two partial prints pixel-by-pixel. We use this property to test for configural effects as described in a later section. All three manipulations (delay between study and test, added noise, and partial masking) were fully crossed to create 8 conditions. The data are shown in Figure 3, which shows main effects for all three factors for novices. Somewhat surprising is the finding that while experts show effects of added noise and partial masking, they show no effect of delay, which suggests that they are able to re-code their visual information into a more durable store resistant to decay, or have better visual memories. Experts also show an interaction between added noise and Figure 2. Four types of test trials (full or partially-presented fragments, presented clear or embedded in noise).
Figure 3. Behavioral experiment data: percent correct as a function of image type (full vs. partial), noise condition (no noise vs. noise added), and delay (short vs. long), plotted separately for experts and novices. Error bars represent one standard error of the mean (SEM).
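One benchmark for deciding whether full-image performance exceeds what independent use of the two image halves would allow is probability summation. The following is a toy version of that prediction, assuming each half independently supports a correct 2AFC choice after guessing correction; the multinomial model actually used in Busey & Vanderkolk (2005) is more elaborate, so treat this as illustrative.

```python
def prob_summation_prediction(p_partial, guess=0.5):
    """Predict full-image 2AFC accuracy from partial-image accuracy,
    assuming the two halves contribute independent chances of a correct
    choice (toy independence model with guessing correction)."""
    q = (p_partial - guess) / (1 - guess)  # guessing-corrected rate for one half
    q_full = 1 - (1 - q) ** 2              # correct if either half suffices
    return guess + (1 - guess) * q_full

# 65% partial-image accuracy predicts roughly 75% for the full image,
# so an observed full-image accuracy near 90% exceeds the benchmark
print(prob_summation_prediction(0.65))
```

Full-image performance above this prediction is the signature of super-additive (configural) integration of the two halves.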
partial masking, but novices do not. This interaction seen with the experts may reflect very strong performance for full images embedded in noise, and may result from configural processes. To test this in a scale-invariant manner, we developed a multinomial model which makes a prediction for full-image performance given partial-image performance, using principles similar to probability summation. The complete results are found in Busey & Vanderkolk (2005), but to summarize: when partial-image performance is around 65%, the model predicts full-image performance to be about 75%, yet observed full-image performance is almost 90%, significantly above the probability summation prediction. Thus it appears that when both halves of an image are present (as in the full image), experts are much more efficient at extracting information from each half. The results of this experiment lay the groundwork for a more complete investigation of perceptual expertise in fingerprint examiners. From this work we have evidence that: 1) Experts perform much better than novices overall, despite the fact that the testing conditions were time-limited and somewhat different than those found in a traditional latent print examination. 2) Experts appear immune to longer delays between study and test images, suggesting better information re-coding strategies and/or better visual memories. 3) Experts may have developed configural processing abilities over the course of their training and practice. All observers have similar facilities for faces as a consequence of the ecological importance of faces and our quotidian exposure as a result of social interactions. Experts may have extended this ability to the domain of fingerprints, since configural processing is seen as one mechanism underlying expertise (e.g. Gauthier & Tarr, 1997). C.1.c.
Electrophysiological evidence for configural processing To provide converging evidence that fingerprint experts process full fingerprints configurally, we turned to an electrophysiological paradigm based on work from the face recognition literature. This experiment is described more fully in Busey and Vanderkolk (2005), which is included as an appendix. However, these results support the prior conclusions described above, and demonstrate that the configural processing observed with fingerprint examiners is a
result of profound and qualitative changes that occur in the very earliest stages of their perceptual processing of fingerprints. C.2. Elements of human expertise that could improve quantitative analyses The two studies described above are important because they illustrate that configural processing is one process that could be adapted for use in the quantitative analyses of fingerprints. Existing quantitative models of fingerprints incorporate some elements of the expertise seen above, but many elements could be added that would improve the recognition accuracy of existing programs. The two major approaches to fingerprint matching rely on local features such as minutiae detection (refs), and more global approaches such as dynamic masks applied to orientation computed at many locations on a grid overlaying the print (refs). Of these two approaches, the dynamic mask approach comes closer to the idea of configural processing, although it does not compute minutiae directly. Neither approach takes advantage of the temporal information that expresses elements of expertise in the human matching process. Quantitative information such as fingerprint data, when represented in pixel form, has a high-dimensional structure. The two techniques described above reduce this dimensionality by either extracting salient points such as minutiae, or computing orientation only at discrete locations. Both of these approaches discard a great deal of information that could otherwise be used to train a statistical model on the elemental features that allow for matches. Part of the reason this is necessary is that the high-dimensional space is difficult to work in: without dimensionality reduction all prints are more or less equally similar, and reducing the dimensionality makes computations such as similarity tractable.
The key, then, is to reduce the dimensionality while preserving the essential features that allow for discrimination among prints. One technique that has been explored in language acquisition is the concept of "starting small" (Elman ref). In this procedure, machine learning approaches such as neural network analyses are given very coarse information at first, which helps the network find an appropriate starting point. Gradually, more and more detailed information is added, which allows the network to make finer and finer discriminations.
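As a concrete illustration of "starting small" for image-based training, one can present a network with progressively finer versions of each print. The sketch below uses a simple block-average blur and an illustrative level schedule; these are our stand-ins, not Elman's original manipulation or a committed design choice.

```python
import numpy as np

def coarse_to_fine_schedule(img, blur_levels=(8, 4, 2, 1)):
    """Yield progressively finer versions of an image for curriculum
    ('starting small') training: heavy smoothing first, full detail last.
    Blur is a block average; levels are illustrative, and the image size
    is assumed to be divisible by each level."""
    img = img.astype(float)
    h, w = img.shape
    for k in blur_levels:
        if k == 1:
            yield img.copy()        # final stage: full detail
            continue
        # average k x k blocks, then upsample back to the original size
        coarse = img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
        yield np.kron(coarse, np.ones((k, k)))
```

A network would be trained to convergence on each stage before the next, finer stage is introduced, mirroring the coarse-to-fine schedule described above.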
We discuss these ideas more fully in section X.Xx, but we mention them here to motivate the empirical methods described next. Experts likely select which information to examine first based on the need to organize their search processes. Thus they likely acquire information that may not immediately lead to a definitive conclusion of confirmation or rejection, but that guides the later acquisition process. In the scene perception literature, this process is known as 'gist acquisition' (refs), and it suggests that the order in which a system (machine or human) learns information matters. In the section below we describe how we acquire both spatial and temporal information from experts, and then describe how this knowledge can be incorporated into quantitative models. C.3. Capturing the information acquisition process: The moving window paradigm To identify the nature of the information used by experts, and the order in which it is gathered, we have begun to use a technique called the moving window procedure. In the sections below we describe this procedure and how it can be extended to address the role of configural or gist information in human experts. C.3.a. The moving window paradigm The moving window paradigm is a software tool that simulates the relative acuity of the foveal and peripheral visual systems. As we look around the world, there is a region of high acuity at the location where our eyes are currently pointed. Regions outside the foveal viewing cone are represented less well. In the moving window paradigm we represent this state by slightly blurring the image and reducing the contrast. http://cognitrn.psych.indiana.edu/busey/FingerprintExample/
Figure 4 shows several frames of the moving window program, captured at different points in time. The two images have been degraded by a blurring operation that somewhat mimics the reduced representation of peripheral vision. The exception is a clear circle that responds in real time to the movement of the mouse. This dynamic display forces the user to move the clear window to regions of the display that warrant special interest. The blurred portions provide some context for where to move the window. By recording the position of the mouse each time it is moved, we can reconstruct a complete record of the manner in which the user examined the prints.
Figure 4. The moving window paradigm allows the user to move the circle of interest around to different locations on the two prints. This circle provides high-quality information, and allows the expert the opportunity to demonstrate, in a procedure that is very similar to an actual latent print examination, which sections of the prints they believe are most informative. This procedure also records the order in which different sites are visited.
This method has some drawbacks in that the eyes move faster than the mouse. However, we find that with practice the experts report very few limitations with this procedure, and it has the benefit of precise spatial localization. A major benefit of this procedure is that it can be done over the web, reaching dozens of experts and producing a massive dataset. Many related information-theoretic approaches, such as latent semantic analysis, find that a large corpus of data is necessary in order to reveal the underlying structure of the representation of information, and a web-based approach provides sufficient data. The data produced by this paradigm are vast: x/y coordinates for the clear window at each millisecond. We have begun to analyze these data using several different techniques. The first analysis we designed creates a mask that is black for regions the observer never visited and clear for areas visited most often. Figure 5 shows an example of this kind of analysis. Areas visited less often are somewhat darkened. The left panels of Figure 5 show two masked images, which reveal not only where the experts visited, but how long they spent inspecting each location. Thus the mask represents a window into the regions the experts believed informative. The right panels give a slightly different view, where unvisited areas are represented in red. This illustrates that experts actually spend most of their time in relatively small regions of the prints. As a first pass, the images in Figure 5 reveal where the experts believe the task-relevant information resides. However, lost in such a representation is the order in which these sites were visited. In addition, this information is very specific to a particular set of prints.
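The mask analysis just described can be sketched directly from the recorded trace. In this sketch, `samples` as (time_ms, x, y) tuples and the circular window `radius` are our assumed data format, not the actual logging format of the tool.

```python
import numpy as np

def dwell_mask(samples, shape, radius=20):
    """Build a dwell-time map from (t_ms, x, y) mouse samples: each pixel
    accumulates the time the clear window covered it. Normalizing yields a
    mask like Figure 5: 0 (never visited) up to 1 (visited most)."""
    acc = np.zeros(shape, float)
    ys, xs = np.ogrid[:shape[0], :shape[1]]
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        dt = t1 - t0  # time spent before the window moved again
        acc[(xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2] += dt
    return acc / acc.max() if acc.max() > 0 else acc
```

Multiplying the print by this mask darkens uninspected regions, reproducing the left panels of Figure 5; thresholding it at zero gives the unvisited regions shown in red in the right panels.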
Ultimately we will produce a more general representation that characterizes both the fundamental set of features (often described as the basis set) that experts rely on, and how they process these features. We have begun to explore an information-theoretic approach to this problem that seeks to find a set of visual features that is common to a number of experts and fingerprint pairs. This approach is related to many of the dimensionality reduction techniques that have been applied to natural images (e.g. Olshausen & Field, 1996). Later projects will extend this approach to incorporate
elements of configural processing or context-specific models. In the present proposal we discuss several different ways we plan to analyze what is a very rich dataset. Our experts report relatively little hindrance when using the mouse to move the window. The latent and inked prints have their own windows (only one is visible at any one time) and users press a key to flip back and forth between the two prints. This flip is actually faster than an eye movement and automatically serves as a landmark pointer for each print, making this procedure almost as easy to use as free viewing of the two prints (which is often done under a loupe with its own movement complexities). In addition, we also give users brief views of the entire image to allow configural processes to establish the basic layout. C.3.b. Measuring the role of configural processing in latent print examinations We plan a behavioral experiment comparing blurred windows against very-low-contrast windows, asking whether there are qualitative changes across experts. (Details of this section to be completed.) Figure 5. Examples of masked images revealing where experts choose to acquire information in order to make an identification. The black versions show only regions where the expert spent any time, and the mask is clearer for regions in which the expert spent more time. The right-hand images show the same information, but allow some of the uninspected information to show through. These images reveal that experts pay relatively little attention to much of the image and focus only on regions they deem relevant for the identification. We suggest that this element of expertise, learning to attend to relevant locations, is something that could benefit quantitative analyses of fingerprints.
C.3.c. Verification with eye movement recording We will verify the moving window results against eye movement recordings. (Details of this section to be completed.) C.4. Extracting the fundamental features used when matching prints Because latent and inked prints are rarely direct copies of each other, an expert must extract invariants from each image that survive the degradations due to noise, smearing, and other transformations. Once these invariants are extracted, the possibility of a match can be assessed. This is similar in principle to the type of categorical perception observed in speech recognition, in which the invariants of parts of speech are extracted from the voices of different talkers. This suggests that there exists a set of fundamental building blocks, or basis functions, that experts use to represent and even clean up degraded prints. The nature and existence of these features are quite relevant for visual expertise, since in some sense they are the direct outcomes of any perceptual system that tunes itself to the visual diet it experiences. We propose to perform data reduction techniques on the output of the moving window paradigm. These techniques have been successfully applied to derive the statistics of natural images (Hyvarinen & Hoyer, 2000). The results provided individual features that are localized in space and resemble the response profiles of simple cells in primary visual cortex. Many of these studies are performed on random samplings of images and visual sequences, but the moving window application provides an opportunity to use these techniques to recover the dimensions of only the inspected regions, and to compare the dimensions recovered from experts with representations based on random window locations. The specifics of this technique are straightforward. For each position of the moving window, extract out (say) a 12 x 12 patch of pixels.
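This patch-and-decompose step can be sketched with an off-the-shelf ICA implementation. Here scikit-learn's FastICA stands in for the algorithms of Hyvarinen and colleagues; the patch size, component count, and uniform weighting of patches are illustrative simplifications (the time-weighting of patches described below is omitted).

```python
import numpy as np
from sklearn.decomposition import FastICA

def patch_basis(image, positions, patch=12, n_components=16, seed=0):
    """Extract patch x patch pixel windows at the inspected (x, y)
    positions and estimate independent components: candidate basis
    functions for the inspected regions. Sizes are illustrative."""
    half = patch // 2
    X = np.array([image[y - half:y + half, x - half:x + half].ravel()
                  for x, y in positions])
    X = X - X.mean(axis=0)  # center (FastICA also whitens internally)
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    ica.fit(X)
    # columns of the mixing matrix, reshaped back into image patches,
    # are the recovered basis functions
    return ica.mixing_.T.reshape(n_components, patch, patch)
```

Each returned patch is one candidate basis function; running the same routine on random window positions yields the control basis set against which the expert-derived set can be compared.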
This is repeated at each location that was inspected by the subject, with each patch weighted by the amount of time spent at that location. The moving window experiment produces tens of thousands of patches of pixels, which are submitted to a data reduction technique (independent component analysis, or ICA), which is similar to principal components analysis, with the exception that the components are independent, not just
uncorrelated. The linear decomposition generated by ICA has the property of sparseness, which has been shown to be important for representational systems (Field, 1994; Olshausen & Field, 1996) and implies that a random variable (the basis function) is active only very rarely. In practice, this sparse representation creates basis functions that are more localized in space than those captured by PCA and are more representative of the receptive fields found in the early areas of the visual system. Huge corpora of samples are required to extract invariants from noisy images, and at present we have only pilot data from several experts. However, the results of this preliminary analysis can be found in Figure 6. This figure shows features discovered using the ICA algorithm (Hurri & Hyvarinen, 2003; Hyvarinen, Hoyer & Hurri, 2003). Each image represents a basis function that when linearly combined will reproduce the windows examined by experts. Inspection of Figure 6 reveals that features such as ridge endings, y-branchings and islands are beginning to become represented. This analysis takes on greater value when applied to the entire database we will gather, since it will combine across individual features to derive the invariant stimulus features that provide the basis for fingerprint examinations done by human experts. The ICA analysis is very sensitive to spatial location, and while cells in V1 are likely also highly position sensitive, the measured basis functions are properties of the entire visual stream, not just the early stages. More recent advances in ICA techniques have addressed this issue in a way similar to how the visual system has chosen to solve the problem. Figure 6. ICA components from expert data. In addition to performing data
reduction techniques to extract the fundamental basis sets, these extended ICA algorithms group the recovered components based on their energy (squared outputs). This grouping has been shown to produce classes of basis functions that are position invariant by virtue of the fact that they include many different positions for each fundamental feature type. The examples shown in Figure 7 were generated by this technique, which reduces the reliance on spatial location. It groups the recovered features by class and accounts for the fact that similar features at nearby positions have similar properties. Note that the features in Figure 7 are less localized than those typically found with ICA decompositions, which may be due to the large correlational structure inherent in fingerprints, although this remains an open question addressed by this proposal. The development of ICA approaches is an ongoing field, and we anticipate that the results of the proposed research will help extend these models as we develop our own extensions based on applications to fingerprint experts. There are several ways in which the recovered components can be used to evaluate the choice of positions by experts (which, along with the image, ultimately determine the basis functions). First, one can visually inspect the sets of basis functions recovered from datasets produced by experts, and compare them with a set generated from random window locations. A second technique can be used to demonstrate that experts do indeed possess a feature set that differs from a random set. The data from random windows and experts can be combined to Figure 7. ICA components from expert data, grouped by energy. This analysis allows the basis functions to have partial spatial independence, at a slight cost to image quality.
This latter issue is less relevant for larger corpora, when many similar features are combined within individual basis function groups.
produce a common set of components (basis functions). ICA is a linear technique, and thus the original data for both experts and random windows can be recovered through weighted sums of the components, with some error if only some of the components are retained. If experts share a common set of features that is estimated by ICA, then their data should be recovered with less error than that of the random windows. This would demonstrate that an important component of expertise is the ability to take a high-dimensional dataset (as produced by noisy images) and reduce it down to fundamental features. From this perspective, visual expertise is data reduction. These kinds of data reduction techniques serve a separate purpose. Many of the experiments described in other sections of this proposal depend on specifying particular features. While initial estimates of the relevant features can be made on the basis of discussions with fingerprint experts, we anticipate that the results of the ICA analysis will help refine our view of what constitutes an important feature within the context of fingerprint matching. The moving window procedure has the disadvantage of being a very localized procedure, due to the nature of the small moving window. There is a fundamental tradeoff between the size of the window and the spatial acuity of the procedure. If the window is made too large, we know less about the regions from which the user is attempting to acquire information. To offset this, we have provided the user the opportunity to view quick flashes of the full image, enough to provide an overview of the prints, but not enough to allow matches of specific regions. We will also conduct the studies using large and small windows to see whether the nature of the recovered components changes with window size. C.4.
Starting Small: Guiding feature extraction with expert knowledge Feature extraction procedures attempt to take a high-dimensional space and use the redundancies in this space to derive a lower-dimensional representation that combines across the redundancies to provide a basis set. This basis set can be thought of as the fundamental feature set, and the development of this set can be thought of as one mechanism underlying human expertise. The difficulty with these high-dimensional spaces is that algorithms that attempt to
uncover the feature set through iterative procedures like Independent Component Analysis or neural networks may fall into local minima and fail to converge upon a global solution. One solution that has been proposed in the human developmental literature is one of starting small (Elman, 1993). In this technique, programmers initially restrict the inputs to statistical models to provide general kinds of information rather than specific information that would lead to learning of specific instances. As a network matures, more specific information is added, which allows the network to avoid falling into local minima that represent non-learned states. While the exact nature of these effects is still being worked out (Rohde & Plaut, 1999), recent work has provided empirical support in the visual domain (Conway, Ellefson & Christiansen, ref). This suggests that we might use the temporal component of the data from experts in the moving window paradigm to help guide the training of our networks. As an expert views a print, they are initially likely to focus on broad, overall types of information that convey the gist of the print, and this coarse-to-fine ordering could provide a natural schedule for network training. C.5. Automatic detection of regions of interest using expert knowledge In both fingerprint classification (e.g. Dass & Jain, 2004; Jain, Prabhakar & Hong, 1999; Cappelli, Lumini, Maio & Maltoni, 1999) and fingerprint identification (e.g. Pankanti, Prabhakar & Jain, 2002; Jain, Prabhakar & Pankanti, 2002) applications, there are two main components of an automatic system: (1) feature extraction and (2) a matching algorithm to compare (or classify) fingerprints based on the feature representation. Feature extraction is the first step, converting raw images into feature representations. The goal is to find robust and invariant features to deal with various conditions in real-world applications, such as illumination, orientation and occlusion.
Given a whole fingerprint image, most fingerprint recognition systems utilize the location and direction of minutiae as features for pattern matching. In our preliminary study of human expert behaviors, we observe that human experts focus on just parts of images (regions of interest – ROIs), as shown in Figure XX, suggesting that it is not necessary for a human expert to check through all minutiae in a fingerprint. A small subset of minutiae seems to be sufficient for the human expert to make a judgment. What regions are useful for matching among all the
minutiae in a fingerprint? Is it possible to build an automatic ROI detection system that can achieve performance similar to that of a human expert? We attempt to answer these questions by building a classification system based on training data captured from human experts. Given a new image, the detection system is able to automatically detect and label regions of interest for the matching purpose. We expect that most regions selected by our system will be minutiae, but we also expect that the system will discover structural regularities in non-minutia regions that have been overlooked in previous studies. Unlike previous studies of minutiae detection (e.g. Maio & Maltoni, 1997), our automatic detection system will not simply detect minutiae in a fingerprint, but will focus on detecting both a small set of minutiae and other regions useful for the matching task. Considering the difficulties in fingerprint recognition, building this automatic detection system is challenging. However, we are confident that the proposed research will take the first steps toward success and make important contributions. This confidence rests on two factors that make our work different from other studies: (1) we will record detailed behaviors of human experts (e.g. where they look in a matching task) and recruit the knowledge extracted from human experts to build a pattern recognition system; and (2) we will apply state-of-the-art machine learning techniques to efficiently encode both expert knowledge and regularities in fingerprint data. The combination Figure X. Overview of automatic detection of regions of interest. The red regions in the fingerprints indicate where human experts focus in the pattern matching task.
of these two factors will allow us to achieve this research plan. To build such a system, we need to develop a machine learning algorithm and estimate its parameters from training data. Using the moving window paradigm (described in C.3), we collect moment-to-moment information about where a human expert looks while performing a matching task. The expert's visual attention and behavior (moving the window) can therefore be used as labels for regions of interest, providing the teaching signal for a machine learning algorithm. In the proposed research, we will build an automatic detection system that captures the expert's knowledge to guide the detection of regions in a fingerprint that are useful for pattern matching. We will use the data collected in C.X. Each circular area examined by the expert is filtered by a bank of Gabor filters. Specifically, Gabor filters at four scales and six orientations are applied to the segmented image. Assuming that the local texture regions are spatially homogeneous, the mean and standard deviation of the magnitudes of the transform coefficients represent each region as a 48-dimensional feature vector. We then reduce these high-dimensional feature vectors to 10 dimensions by principal component analysis (PCA), which represents the data in a lower-dimensional subspace by pruning away the dimensions with the least variance. We also randomly sample areas that the expert does not attend to and pair the feature vectors extracted from these areas with a Non-ROI label. In total, the training data consist of two groups of labeled features: ROI and Non-ROI. Next, we will build a binary classifier based on support vector machines (SVMs). SVMs have been successfully applied to many classification tasks (Vapnik, 1995; Burges, 1998).
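As a concrete illustration of the feature pipeline just described (Gabor filtering followed by PCA), the following is a minimal sketch. The kernel size, wavelengths, and Gabor envelope parameters are illustrative assumptions, not values specified in this proposal; only the filter-bank layout (four scales, six orientations, mean and standard deviation of the magnitudes, 48 dimensions reduced to 10 by PCA) comes from the text.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, wavelength, theta, sigma=4.0, gamma=0.5):
    """Real and imaginary Gabor kernels at one scale (wavelength) and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return (envelope * np.cos(2 * np.pi * xr / wavelength),
            envelope * np.sin(2 * np.pi * xr / wavelength))

def gabor_features(patch, n_scales=4, n_orients=6):
    """Mean and std of the Gabor magnitude per scale/orientation: 2*4*6 = 48 dims."""
    feats = []
    for s in range(n_scales):
        for o in range(n_orients):
            real, imag = gabor_kernel(15, 4.0 * 2**s, np.pi * o / n_orients)
            mag = np.hypot(fftconvolve(patch, real, mode='same'),
                           fftconvolve(patch, imag, mode='same'))
            feats += [mag.mean(), mag.std()]
    return np.asarray(feats)

def pca_reduce(X, k=10):
    """Project feature vectors onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

A 40x40 window would be passed to `gabor_features`, and `pca_reduce` applied once to the stacked feature matrix of all training patches.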
An SVM learns a linear separating hyperplane for classifying data by maximizing the margin between two parallel planes on either side of it. The central idea is to nonlinearly map the input vectors into a high-dimensional feature space and then construct an optimal hyperplane that separates the mapped features. This decision hyperplane depends on only a subset of the training data, called the support vectors.
For a set of n-dimensional training examples $X = \{x_i\}_{i=1}^{m}$, labeled by the expert's visual attention as $\{y_i\}_{i=1}^{m}$, and a mapping of the data into q-dimensional vectors $\phi(X) = \{\phi(x_i)\}_{i=1}^{m}$ by a kernel function, where $q \gg n$, an SVM can be built on the mapped training data by solving the following optimization problem. Minimize over $(w, b, \xi_1, \ldots, \xi_m)$ the cost function

$$\frac{1}{2} w^T w + C \sum_{i=1}^{m} \xi_i$$

subject to

$$y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i \quad \text{and} \quad \xi_i \ge 0 \quad \text{for all } i,$$

where $C$ is a user-specified constant controlling the penalty on the violation terms $\xi_i$. The $\xi_i$ are called slack variables; they measure the deviation of a data point from the ideal condition of pattern separability. After training, $w$ and $b$ constitute the classifier

$$y = \mathrm{sign}(w^T \phi(x) + b).$$

Compared with other approaches used in fingerprint recognition, such as neural networks and k-nearest neighbors, SVMs have proven more effective in many classification tasks. In addition, we first transform the original features into a lower-dimensional space using PCA; the purpose of this first step is to mitigate the curse of dimensionality. We then map the data points into another, higher-dimensional space in which they are linearly separable. In doing so, we convert the original pattern recognition problem into a simpler one. This idea is closely in line with kernel-based nonlinear PCA (Scholkopf, Smola & Muller, 1998), which has been successfully used in several fields (e.g., Wu, Su & Carpuat, 2004). Given a new test fingerprint, we will shift a 40x40 window over the image and classify the patch at each location and scale. The system will first extract Gabor-based features from each local patch, which are then input to the detector. The detector labels every region as either ROI or Non-ROI. We expect that most ROIs will be minutiae.
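The training and classification steps just described could be sketched as follows. This is a minimal illustration, not the proposal's actual pipeline: scikit-learn's `SVC` is one off-the-shelf solver for the soft-margin objective above, with its RBF kernel playing the role of the nonlinear map $\phi$, and the data here are random stand-ins for the 10-dimensional PCA-reduced Gabor features.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in training data: 10-dim feature vectors labeled from expert fixations
# (+1 = ROI, -1 = Non-ROI).  Real data would come from the moving-window task.
X_roi = rng.normal(loc=1.0, size=(100, 10))
X_bg = rng.normal(loc=-1.0, size=(100, 10))
X = np.vstack([X_roi, X_bg])
y = np.concatenate([np.ones(100), -np.ones(100)])

# C is the penalty constant on the slack variables xi_i in the objective
# (1/2) w^T w + C * sum_i xi_i.
clf = SVC(kernel='rbf', C=1.0).fit(X, y)

# At test time, each 40x40 window yields a feature vector; the sign of the
# decision function w^T phi(x) + b labels it ROI (+1) or Non-ROI (-1).
pred = clf.predict(rng.normal(loc=1.0, size=(5, 10)))
```

In the full system, `predict` would be applied to the feature vector of every window position and scale over the test fingerprint.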
Unlike methods based on minutiae matching, we also expect that only a small subset of the minutiae is used by human experts. Moreover, we expect the system to detect some areas that are not defined as minutiae but that human experts nevertheless attend to during the matching task. Thus, the ROI detector we develop will go beyond the standard approach in fingerprint recognition (minutiae extraction and
matching). By efficiently encoding the knowledge of human experts, the proposed system will have the opportunity to discover statistical regularities in fingerprints that have been overlooked in previous studies.

C.6. Using expert-identified correspondences to extract environmental models

In our moving window paradigm, a human expert moves the window back and forth between inked and latent fingerprints to perform pattern matching. We propose that the expert's dynamic behavior provides additional signals indicating one-to-one correspondences between the two images. In light of this, our hypothesis is that an expert's decision is based on the comparison of these one-to-one patches. Therefore, we propose that these expert-identified correspondences can serve as additional information for finding regularities in fingerprints and for building the automatic detection system. We propose to use this knowledge as a prior over the training data. We observe that not all attended regions in the latent print have corresponding regions in the inked print; it is therefore likely that the one-to-one pairs play a more important role in pattern matching than other regions of interest. Based on this observation, we propose to maintain a set of weights over the training data. More specifically, for each ROI in the latent image, we find the most likely paired patch in the inked image. Two constraints guide the search for this matching pair. The temporal constraint is based on the expert's behavior: the patch in the inked print that the expert examines immediately after looking at an ROI in the latent image is more likely to be associated with that ROI. The spatial constraint selects the patch in the inked image with the highest similarity to the patch in the latent image.
In this way, each ROI in the latent image can be assigned a weight indicating the probability of mapping that region to a region in the other image. Given this set of weighted training data, we will apply an SVM-based algorithm (briefly described in C.5) that focuses on the highly weighted paired samples in the training data. More specifically, we replace the constant $C$ in the standard SVM with a set of variables $c_i$, one per data point, corresponding to that point's weight. Accordingly, the new
objective function is

$$\frac{1}{2} w^T w + \sum_{i=1}^{m} c_i \xi_i.$$

Thus, the matching regions receive larger penalties if they are nonseparable points, while other regions receive less attention because they are more likely to be irrelevant to the expert's decision. In this way, the parameters of the SVM are tuned to favor the regions in which human experts are especially interested. By encoding this knowledge in a machine learning algorithm, we expect this method to achieve better performance by closely imitating the expert's decisions.

C.7. Dependencies between global and local information: The role of gist information

Fingerprints are categorized into several classes, such as whorl, right loop, left loop, arch, and tented arch in the Henry classification system (Henry, 1900). In the literature, researchers use only 4-7 classes in automatic classification systems because determining a fingerprint's class can be difficult; for example, it is hard to find features from raw images that aid classification while exhibiting low variation within each class. In C.5 and C.6, we discussed how to use expert knowledge to find features useful for pattern matching. Taking a broader view of feature detection and fingerprint classification in this section, we find that we must deal with a chicken-and-egg problem: (1) useful local features can predict the fingerprint class; and (2) a specific fingerprint class can predict what kinds of local regions are likely to occur in that type of fingerprint. In contrast, stand-alone feature detection algorithms (e.g., those in C.5 and C.6) usually look at local pieces of the image in isolation when deciding whether a patch is a region of interest. In machine learning, Murphy, Torralba and Freeman (2003) proposed a conditional random field for jointly solving the tasks of object detection and scene classification.
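The per-example weighting introduced in C.6 (replacing $C$ with $c_i$) maps directly onto per-sample weights in off-the-shelf SVM solvers. A minimal sketch with scikit-learn follows; the data and the correspondence weights are random stand-ins invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = np.where(X[:, 0] > 0, 1, -1)

# Hypothetical correspondence weights: ROIs with a confident latent-to-inked
# pairing get weight near 1, unpaired regions get weight near 0.2.
w = rng.uniform(0.2, 1.0, size=200)

# sample_weight scales the per-example penalty, so the effective objective is
# (1/2) w^T w + sum_i (C * w_i) * xi_i  --  the c_i variant described in C.6.
clf = SVC(kernel='rbf', C=1.0).fit(X, y, sample_weight=w)
```

Violations at highly weighted (well-paired) points are thus penalized more than violations at unpaired points, steering the decision boundary toward the regions experts actually compared.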
In light of this, we propose to use the whole-image context as an extra source of global information to guide the search for ROIs. In addition, a better set of
ROIs will also potentially make classification of the whole fingerprint more accurate. Thus, the chicken-and-egg problem is tackled by a bootstrapping procedure in which the local and global pattern recognition systems interact with and boost each other. We propose a machine learning system based on graphical models (Jordan, 1999), as shown in Figure XX. We define the gist of an image as a feature vector extracted from the whole image treated as a single patch, denoted $v_G$. We then introduce a latent variable $T$ describing the type of the fingerprint. The central idea of our graphical model is that ROI presence is conditionally independent given the type, and the type is determined by the gist of the image. Thus, our approach encodes contextual information on a per-image basis instead of extracting detailed correlations between different kinds of ROIs (e.g., a fixed prior such as "patch A always occurs to the left of patch B"), because of the complexity and variability of such detailed descriptions. Next, we need to classify fingerprint types. We will simply train a one-vs-all binary SVM classifier to recognize each fingerprint type from the gist, and then normalize the results:

$$p(T = t \mid v_G) = \frac{p(T_t = 1 \mid v_G)}{\sum_{t'} p(T_{t'} = 1 \mid v_G)},$$

where $p(T_t = 1 \mid v_G)$ is the output of the t-th one-vs-all classifier. Once the fingerprint type is known, we can use this information to facilitate ROI detection. For the tree-structured graphical model shown in Figure XX, the conditional joint density can be expressed as

$$p(T, R_1, \ldots, R_N \mid v) = \frac{1}{z}\, p(T \mid v_G) \prod_i p(R_i \mid T, v_i),$$

and marginalizing over the type gives the label posterior for each patch, $p(R_i \mid v) \propto \sum_t p(T = t \mid v_G)\, p(R_i \mid T = t, v_i),$
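The two-step computation just given (normalize the one-vs-all type scores, then marginalize over the type when classifying a patch) can be sketched numerically as follows. The scores and conditional probabilities here are invented stand-ins; in the proposed system they would come from the gist SVMs and the type-conditioned local classifiers.

```python
import numpy as np

def type_posterior(scores):
    """Normalize one-vs-all outputs p(T_t = 1 | v_G) into p(T = t | v_G)."""
    scores = np.asarray(scores, dtype=float)
    return scores / scores.sum()

def roi_posterior(type_probs, p_roi_given_type):
    """Marginalize over the type: sum_t p(T = t | v_G) * p(R_i | T = t, v_i)."""
    return float(np.dot(type_probs, p_roi_given_type))

# Hypothetical numbers for five Henry classes (whorl, right loop, left loop,
# arch, tented arch) and a single local patch.
gist_scores = [0.9, 0.3, 0.2, 0.1, 0.1]   # one-vs-all SVM outputs
p_t = type_posterior(gist_scores)          # sums to 1
local = [0.8, 0.4, 0.4, 0.2, 0.2]          # p(ROI | T = t, v_i) for each type
p_roi = roi_posterior(p_t, local)          # type-aware ROI probability
```

With these stand-in numbers, a patch that looks strongly ROI-like under the most probable type (whorl) keeps a high ROI posterior even though it is ambiguous under the other types.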
where $v_G$ and $v_i$ are the global and local features, respectively, $R_i$ is the class of local patch $i$, and $z$ is a normalizing constant. In the proposed research we will investigate two definitions of $R$. The first defines ROI and Non-ROI types, as in C.5 and C.6. The second defines several minutia types (plus Non-ROI), such as termination and bifurcation minutiae. Based on this graphical model, we will be able to use contextual knowledge to facilitate the classification of local image patches. We also plan to develop a more advanced model that uses local information to facilitate fingerprint type classification. We expect this approach to yield a more effective automatic system that can perform both top-down inference (from fingerprint types to minutia types) and bottom-up inference (from minutia types to fingerprint types).

C.8. Summary of quantitative approaches (Tom writes)

General themes:
- Incorporate expert knowledge
- Links between global and local structure made possible by input from experts
- Specification of an elemental basis or feature set
- Classifying the informativeness of regions
- Defining an intermediate level between low-level feature extractors and high-level gist or configural information

D. Implications for knowledge and practice

The implications of the knowledge gained from the results of these studies and analyses fall into four broad categories, each discussed below.

D.1. Implications for quantitative understanding of the information content of fingerprints
D.2. Implications for an understanding of the links between quantitative information content and the latent print examination process

D.3. Implications for the classification and filtering of poor-quality latent prints

D.4. Implications for the development of software-based tools to assist human-based latent print examination and training

E. Management plan and organization

F. Dissemination plan for project deliverables

Scientific articles, presentations at machine learning conferences and fingerprint conferences, and proof-of-concept Java-based applets.
G. Description of estimated costs

Personnel

The project will be co-directed by Thomas Busey and Chen Yu. We request 11 weeks of summer support, during which time both will devote 100% of their effort to the project. Benefits are calculated at 19.81%. Salaries are incremented 3% per year. Many of the simulations will be conducted by a graduate student hired specifically for this project. This student, likely an advanced computer science student with a background in cognitive science, requires a stipend, a fee remission, and health insurance. The health insurance is incremented at 5% per year. Subject coordination and database management will be handled by hourly students who will work 20 hours/wk on the project at $10/hr.

Consultant

John Vanderkolk, with whom Busey has worked for the past two years, has agreed to serve as an unpaid consultant on this grant. He does require modest travel costs when he visits Bloomington.

Travel

Funds are requested to bring in four experts for testing with the eye movement recording equipment; these costs will total approximately $1500/yr. Funds are also requested for three conferences a year. These will enable the investigators to travel to conferences such as Neural Information Processing Systems (NIPS) and forensic science conferences such as the International Association for Identification (IAI) to interact with colleagues and share the results of our analyses. These trips serve an important role in communicating the efforts of this grant to a wider audience.

Other Costs

Equipment

This research is very computer-intensive, and thus we require a large UNIX-based server to run simulations in parallel. In addition, we require three PC-based workstations to run Matlab and
other simulation programs. Finally, conferences such as IAI and local Society for Identification meetings provide an ideal place to gather data from experts, so we require a portable computer for such on-site data gathering. We anticipate that up to half of our data can be collected using these on-site techniques, and this approach is preferable because we retain control over the monitor and software. Thus the laptop computer represents a good investment in the success of the project.

Other costs

The graduate student line requires a fee remission each year, incremented at 5% per year. The results of our studies require resources to reach a wide audience, so we request dissemination costs to cover publication and web-based dissemination. This project is highly image-intensive, and we require funds to purchase image-processing software and upgrades, including packages such as Adobe Photoshop as well as new image-processing packages as they become available. We will test 80 subjects a year to obtain the necessary data for use in our statistical applications; each subject is paid $20 for the approximately 90-minute testing session. The project will consume supplies of approximately $100/month for items such as backups, power supplies, etc.

Indirect Costs

The indirect rate negotiated between Indiana University and the federal government is set at 51.5%. This rate is assessed against all costs except the fee remission. The rate was negotiated with DHHS on 5/14/04.

G. Staffing plan and Resources

Both Busey and Chen maintain laboratories in the Department of Psychology at Indiana University, each containing approximately 700 sq. feet of space. These include subject running rooms, offices, and space for servers.
Chen's lab contains an eye movement recording setup that is sufficient for the eye movement portion of the experiments. Both investigators have offices in
the Psychology department as well. We will recruit a graduate student from the Computer Science or Psychology programs at Indiana University. This student must have experience with machine learning algorithms at a theoretical level and must also be an expert programmer. They will work 20 hrs/wk. We will also recruit two hourly undergraduate students to coordinate subject running, data analysis, and server maintenance. They will also be responsible for managing the data repository site where our data will be accessible to other researchers who wish to integrate human expert knowledge into their networks. The bulk of the theoretical work will be handled by Chen and Busey, while the graduate student will work on implementation and model testing.

H. Timeline

This is a multi-year project designed to alternate between acquiring human data and using it to refine the quantitative analyses of latent and inked prints.

Year 1: Acquire the necessary fingerprint databases. Begin testing 80 experts on 72 different latent/inked print pairs. Program the Support Vector and Global-Local models. Test 2 experts on the eye movement equipment using all 72 prints.

Year 2: Test an additional 80 experts on 72 new latent/inked prints. Begin model fitting and refinement. Test 2 experts on the eye movement equipment using all 72 prints. Compare results from the eye movement studies and the moving window studies.

Year 3: Test the final 80 experts on 72 new latent/inked prints. Develop new versions of the statistical models based on prior results. Put the entire database online for use by other researchers. Disseminate results to peer-reviewed journals.