Canosa Saliency Based Decision Support

Saliency-Based Decision Support
Roxanne L. Canosa∗
Rochester Institute of Technology

Figure 1: Examples of types of information collected from eye-tracking data. A large open circle indicates a lesion location, an ’X’ indicates
a mouse click, and a small (open or filled) circle indicates a fixation. A large circle without an ’X’ indicates a false negative (search error);
a small, unfilled circle indicates a fixation less than 350 msec (recognition error); a small filled circle indicates a fixation greater than 350
msec (decision error). Left, participant correctly located all three lesions. Right, four search errors and one decision error.

Abstract

A model of visual saliency is often used to highlight interesting or perceptually significant features in an image. If a specific task is imposed upon the viewer, then the image features that disambiguate task-related objects from non-task-related locations should be incorporated into the saliency determination as top-down information. For this study, viewers were given the task of locating potentially cancerous lesions in synthetically-generated medical images. An ensemble of saliency maps was created to model the target versus error features that attract attention. For MRI images, lesions are most reliably modeled by luminance features and errors are mostly modeled by color features, depending upon the type of error (search, recognition, or decision). Other imaging modalities showed similar differences between the target and error features that contribute to top-down saliency. This study provides evidence that image-derived saliency is task-dependent and may be used to predict target or error locations in complex images.

1 Introduction

A saliency map is a computational model of human visual perception that defines a relationship between the components of a scene and the relative importance of those components to the viewer [Koch and Ullman 1985]. A saliency map includes a priority rating of each of the components and a gating mechanism whereby selected regions are processed and non-selected regions are inhibited. According to the theory, the visual system performs an initial low-frequency parsing of the environment to identify potential regions of interest, and assigns to each region a weight according to the computed saliency. For example, bright colors, high luminance areas, edges, and corners may rate highly in visual saliency and thus would be assigned a higher probability of fixation. If image features of a particular target are known in advance, these can be used to modulate the relative saliency for enhanced target discernability.

The saliency map used for this study is an adaptation of a well-known saliency model [Itti et al. 1998]. Map generation took approximately 90 seconds using MATLAB on a 1.8 GHz Intel dual core processor. The map consists of three essential features: color, luminance, and oriented edges. The color map is further separated into two color-opponent process features: the red/green component and the blue/yellow component. The final map is constructed as the summation, at each pixel, of the contribution from each feature at that pixel location. Equation 1 shows how the feature maps are combined to produce the saliency map used for target location. 'C' indicates the color map, 'I' indicates the luminance map, 'E' indicates the oriented-edge map, and 'P' is a high-level proto-object map that locates potential objects from highly textured regions in the image. Essentially, the only features that are used for the saliency map are color, luminance, and textured edges; it is the relative weight of each feature according to the target type that determines the final contribution of each individual feature to the final saliency map. The relative weights of the features are determined using the technique described in Section 3.

$$\text{saliency map} = \frac{C\,w_1 + I\,w_2 + E\,w_3 + P\,w_4}{4} \qquad (1)$$
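As a rough illustration of Equation 1, the per-pixel weighted sum can be sketched in a few lines of Python. This is not the study's MATLAB implementation: the function name `combine_feature_maps`, the list-of-lists map representation, and the toy values are assumptions made only for this example.

```python
from typing import List, Sequence

# Assumed representation: a 2-D grid of per-pixel feature responses.
Map = List[List[float]]


def combine_feature_maps(maps: Sequence[Map], weights: Sequence[float]) -> Map:
    """Per-pixel weighted sum of the feature maps, divided by the number of maps."""
    if len(maps) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    rows, cols = len(maps[0]), len(maps[0][0])
    n = len(maps)
    return [
        [sum(w * m[r][c] for m, w in zip(maps, weights)) / n for c in range(cols)]
        for r in range(rows)
    ]


# Toy 1x2 "images" standing in for the color (C), luminance (I),
# oriented-edge (E), and proto-object (P) maps; the values are made up.
C = [[0.2, 0.8]]
I = [[0.4, 0.4]]
E = [[0.0, 1.0]]
P = [[0.6, 0.2]]

# Equal weights correspond to the unweighted (naive) map.
saliency = combine_feature_maps([C, I, E, P], [1.0, 1.0, 1.0, 1.0])
```

Raising one weight relative to the others makes the corresponding feature dominate the combined map, which is the degree of freedom the weight search exploits.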

Target saliency was derived from the image features at the known lesion locations. Error saliency was derived from mouse clicks and fixation locations recorded during a target search task. Errors were classified as false positives (mouse click on a non-target location) and three categories of false negatives: search error (target never fixated), recognition error (target fixated less than 350 milliseconds), and decision error (target fixated greater than 350 milliseconds) [Krupinski 2000].

∗ e-mail: rlc@cs.rit.edu

Copyright © 2010 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org.
ETRA 2010, Austin, TX, March 22 – 24, 2010.
© 2010 ACM 978-1-60558-994-7/10/0003 $10.00
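The false-negative taxonomy used in the Introduction (search, recognition, and decision errors, separated by a 350 ms fixation cutoff) can be written as a small decision rule. The sketch below is illustrative only: the function name, the use of `None` for "never fixated", and the treatment of a fixation of exactly 350 ms as a decision error are assumptions, since the text does not specify them.

```python
from typing import Optional

# Fixation-duration cutoff stated in the text, in milliseconds.
THRESHOLD_MS = 350.0


def classify_missed_target(fixation_ms: Optional[float]) -> str:
    """Label a false negative, i.e. a lesion the viewer never clicked.

    fixation_ms is the fixation duration on the lesion, or None when the
    lesion was never fixated (the None convention is an assumption).
    """
    if fixation_ms is None:
        return "search error"
    if fixation_ms < THRESHOLD_MS:
        return "recognition error"
    # Assumption: exactly 350 ms is grouped with the longer fixations.
    return "decision error"


labels = [classify_missed_target(d) for d in (None, 200.0, 500.0)]
```

False positives (a mouse click on a non-target location) are a separate category and would need a click-to-lesion distance test that the text does not describe.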


2 Method

An ASL Model 504 remote eye-tracker was used for this experiment, along with the ASL Eye-trac 6 User Interface Software and Control Unit. Nineteen participants from the campus community (nine males and ten females between the ages of 18 and 58) were recruited, all naïve with respect to the purpose of the experiment, and with no prior experience locating lesions in radiological images. All participants were screened for normal color vision and normal or corrected-to-normal vision and were allowed an unlimited amount of time to detect as many targets in each image as possible. The eye-tracking session lasted approximately 30 minutes per participant, including calibration time. Prior to the start of the experiment, each participant was given an instruction sheet with information about how the experiment would proceed. The instructions stated, in general, that a feature would appear as a circular spot in the image and could be located anywhere within the anatomical portion of the image (i.e., a feature would never be located on the image border). Since the participants used in this study were not radiologists, the results are not directly applicable to a clinical setting; however, untrained observers might still provide useful information about the target and error features that attract attention during search in complex imagery.

The experiment consisted of monitoring and recording participants' fixation locations, fixation durations, and mouse clicks as they viewed eleven sets of six simulated brain images (66 images total). Simulated lesions with known size, shape, contrast, and location were inserted into the images at random locations. Each image had between zero and five lesions. The images were generated from single-mode PET and MRI phantoms and multi-mode fused PET/MRI images. Three sets of fused images were used, each set using a different color look-up table for displaying the mixed modes. The fused images were sub-divided into three categories depending upon whether the lesions were embedded in the PET image, the MRI image, or both. Figure 1 shows examples of the fused PET/MRI images and the types of information that were collected during the experiment.

3 Determining Feature Weights

A naïve saliency map weights each of the low-level feature maps (color, luminance, and orientation) equally in the final summation step. An optimally weighted map would take into account the relative importance of any feature for the target type. To determine the optimal feature weights, a metric was developed to "score" a map, given a specific weight vector. A map score is defined simply as the ratio of the mean target saliency, $S_t$, at some pre-defined locations to the mean saliency of the entire map, $S_m$.

$$\text{Score} = S_t / S_m$$

Mean target saliency $S_t$ is found by first generating a saliency map using a random set of weights for a particular input image. Next, the x,y-coordinates of a set of target locations are determined from the eye-tracking data, ground-truth data, or from a record of observer responses (mouse clicks). For each target location, the x,y-coordinate is used as an index into the saliency map, and the saliency value at that location is extracted. A 7x7 pixel window (corresponding to 1/4° visual angle at the viewing distance of 52 cm) is centered on the location, and all saliency values falling within the window are averaged together. This procedure is repeated for every target location in the map and the average of those values is used as the mean target saliency, $S_t$. The mean map saliency, $S_m$, is the average saliency over all locations in the map (target and non-target). The score of a map is then simply the ratio between the mean target saliency and the mean map saliency.

The map score is used to determine how well a saliency map models attention. If the score is close to one, then the map is not a good model of attention: because $S_t$ is nearly equal to $S_m$, any random location would do just as well at predicting the response. If, on the other hand, the score is greater than one, then the map is a good (better than random) model of attention because the target locations tend to be on regions of the image that the model has computed as being highly salient.

The scoring procedure is repeated with a different set of weights to produce another candidate map, and stops when the highest possible score is produced. Since an exhaustive search across the entire weight space is computationally prohibitive, a genetic algorithm was developed to find approximately optimal weights, using the scoring metric described above as the fitness criterion. The genetic algorithm was initialized with random weights for each feature map, and then over each generation (300 total) the two highest-scoring weight vectors were selected to randomly exchange their weights, with crossovers and mutations allowed according to established parameters. A total of 2,400 trials were run before a solution converged.

Figure 2 shows an example saliency map generated using only low-level features (color, luminance, and oriented edges) without any task- or target-related information (as in the standard model [Itti et al. 1998]). Figure 3 shows the same image with the saliency map generated using weights learned from the genetic algorithm and applied to the low-level feature maps. For this example the targets are lesions, with locations indicated on the image by a white square. Figure 4 shows the weighted saliency map found for an MRI image with five lesions applied to an MRI image with three lesions. The weighted map is able to correctly predict lesion locations in this different test image.

Figure 2: Unweighted saliency map using only low-level features of color, luminance, and oriented edges (left) and after thresholding at 0.45 (right). Lesions are shown surrounded by a white square.

Figure 3: Weighted saliency map with weights determined using a genetic algorithm optimized for target type (left) and after thresholding at 0.45 (right). Lesions are shown surrounded by a white square.

4 Results

An ensemble of (approximately) optimally-weighted saliency maps was created, one for each of the different target types: lesion locations, false positives, search errors, recognition errors, and decision errors. The map feature weights for lesion locations are frequently different from the map feature weights for errors. For example, Figure 5 shows that the highest weighted feature for lesions in the MRI images is luminance; however, for all of the MRI errors, the highest weighted feature is the blue-yellow color-opponent feature. Other imaging modalities also showed significant differences between feature weights for target and error locations. This may be an indication that visual search, recognition, and decision errors arise from specific attentional characteristics that differ from those for correct detection in a search task. This information might be useful in a decision-support or computer-aided detection (CAD) system, to highlight or otherwise flag locations in the image that have a high probability of incorrect classification.

Figure 5: Relative weights of the low-level feature maps that are combined (summed) together to create the saliency map for the MRI images. Note that low-level features of the search target (lesions) are dominated by luminance information, whereas the low-level features that attract attention for each of the four error types are dominated by the blue-yellow color feature.
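The map score of Section 3 (mean saliency in 7x7 windows around the targets, divided by the mean saliency of the whole map) and the genetic search over feature weights can be sketched as follows. This is a minimal illustration, not the author's implementation: the population size, mutation rate, uniform crossover, retention of the two parents, clipping of windows at the image border, and all function names are assumptions, because the text only states that the two highest-scoring candidates exchange weights over 300 generations.

```python
import random
from typing import Callable, List, Sequence, Tuple

Map = List[List[float]]  # assumed representation: rows of saliency values


def mean_window(sal: Map, x: int, y: int, half: int = 3) -> float:
    """Mean saliency in a (2*half+1)-pixel square centred on column x, row y.

    half=3 gives the 7x7 window; clipping at the border is an assumption.
    """
    rows, cols = len(sal), len(sal[0])
    vals = [
        sal[r][c]
        for r in range(max(0, y - half), min(rows, y + half + 1))
        for c in range(max(0, x - half), min(cols, x + half + 1))
    ]
    return sum(vals) / len(vals)


def map_score(sal: Map, targets: Sequence[Tuple[int, int]]) -> float:
    """Score = S_t / S_m (assumes the map is not identically zero)."""
    s_t = sum(mean_window(sal, x, y) for x, y in targets) / len(targets)
    s_m = sum(sum(row) for row in sal) / (len(sal) * len(sal[0]))
    return s_t / s_m


def evolve_weights(
    fitness: Callable[[List[float]], float],
    n_weights: int = 4,
    pop_size: int = 8,           # assumption, not stated in the text
    generations: int = 300,      # generation count given in the text
    mutation_rate: float = 0.1,  # assumption
    seed: int = 0,
) -> List[float]:
    """Keep the two fittest weight vectors each generation and breed the rest."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_weights)] for _ in range(pop_size)]
    for _ in range(generations):
        first, second = sorted(pop, key=fitness, reverse=True)[:2]
        children = []
        while len(children) < pop_size - 2:
            # Uniform crossover: each weight comes from one parent at random.
            child = [rng.choice(pair) for pair in zip(first, second)]
            # Mutation: occasionally replace a weight with a fresh random value.
            child = [rng.random() if rng.random() < mutation_rate else w for w in child]
            children.append(child)
        pop = [first, second] + children  # parents survive unchanged
    return max(pop, key=fitness)


# Toy demo: only the "bright" map responds at the target in the centre.
bright = [[0.1] * 9 for _ in range(9)]
bright[4][4] = 1.0
flat = [[0.5] * 9 for _ in range(9)]
targets = [(4, 4)]


def fitness(w: List[float]) -> float:
    sal = [[w[0] * bright[r][c] + w[1] * flat[r][c] for c in range(9)] for r in range(9)]
    return map_score(sal, targets)


best = evolve_weights(fitness, n_weights=2, generations=30)
```

In the toy demo the search should drift toward a large weight on `bright` relative to `flat`, since only `bright` raises the target saliency above the map average. Because the two parents are carried over unchanged, the best score in the population never decreases from one generation to the next.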

5 Conclusion
Low-level features such as luminance, color, and edges can attract the attention of the human visual system during a search task, and those features are specific to certain types of targets. More research into the nature of decision-making at the level just below that of conscious awareness, such as is enabled by eye-tracking experiments, will help to uncover the pre-conscious biases and strategies that contribute to image interpretation, as well as image mis-interpretation.

Acknowledgements
Thanks to Karl Baum for generation of the MRI images.

Figure 4: Weighted saliency map on a different image, thresholded at 0.45. Lesion locations are correctly predicted by the saliency map.

References

ITTI, L., KOCH, C., AND NIEBUR, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 11, 1254–1259.
KOCH, C., AND ULLMAN, S. 1985. Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology 4, 219–227.
KRUPINSKI, E. A. 2000. The importance of perception research in medical imaging. Radiation Medicine 18, 6, 329–334.
