Gaze visualizations represent an effective way of gaining fast insights into eye tracking data. Current approaches do not adequately support eye tracking studies for three-dimensional (3D) virtual environments. Hence, we propose a set of advanced gaze visualization techniques for supporting gaze behavior analysis in such environments. Similar to commonly used gaze visualizations for two-dimensional stimuli (e.g., images and websites), we contribute advanced 3D scan paths and 3D attentional maps. In addition, we introduce a models of interest timeline depicting viewed models, which can be used for displaying scan paths in a selected time segment. We also discuss a prototype toolkit that implements our proposed techniques. Their potential for facilitating eye tracking studies in virtual environments was supported by a user study among eye tracking and visualization experts.
2.2 Models of Interest Timeline

A timeline visualization maps data against time and facilitates finding events before, after, or during a given time interval. Thus, it can be used to answer several questions, such as: Has object x been observed repeatedly? In what order and for how long are objects looked at? In this regard, the Object vs. Time View [Lessing and Linge 2002] is a series of timelines for depicting Areas of Interest (AOIs). Each AOI is assigned a time row, resulting in an AOI matrix. With an increasing number of AOIs and shorter fixation times, the table size increases and may hinder a good overview.

Figure 1: Two alternative fixation representations, spheres (a) and cones (b), are presented for 3D scan paths.

Therefore, we propose a space-filling models of interest (MOI) timeline (see Figure 3) for a compact illustration that effectively utilizes the assigned screen space by omitting voids. We suggest the term Model of Interest to describe distinct 3D objects. The MOI timeline gives an overview of a user's gaze distribution based on viewed models. Each model is labeled with a specific color, which can be manually adapted. By assigning the same color to different objects, semantic groups can be defined, for example, for grouping similar-looking models, object hierarchies, or closely arranged items. A legend provides an overview of the assigned colors (see the left area of Figure 5).
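The data behind such a timeline can be sketched as follows (Python for illustration; the actual implementation is C#/XNA-based, and the names here are hypothetical). Consecutive gaze samples that hit the same model are merged into one colored segment, and samples without a model hit are dropped, which is what makes the timeline space-filling:

```python
def build_moi_timeline(samples):
    """Merge consecutive gaze samples that hit the same model into
    timeline segments. samples: time-ordered (t_ms, model_id) pairs,
    where model_id is None when the gaze ray hit no model. Void
    samples are dropped, keeping the timeline space-filling."""
    segments = []
    for t, model in samples:
        if segments and segments[-1]["model"] == model:
            segments[-1]["end"] = t  # extend the current segment
        else:
            segments.append({"model": model, "start": t, "end": t})
    return [s for s in segments if s["model"] is not None]

# Example: two looks at "vase" separated by a void and a glance at "chair".
log = [(0, "vase"), (20, "vase"), (40, None),
       (60, "chair"), (80, "chair"), (100, "vase")]
timeline = build_moi_timeline(log)
```

Mapping several model identifiers to one shared color key before merging would yield the semantic groups described above.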
Figure 2: Two examples of camera paths (viewpoints and viewing directions) and fixations.

Zooming and selection techniques may aid in making data visible which would otherwise be suppressed due to limited display sizes. Based on TimeZoom [Dachselt and Weiland 2006], the MOI timeline can be dragged to a desired position to offer horizontal scrolling, and it supports continuous zooming (e.g., via a zoom slider, see Figure 3). For this purpose, the viewing area is divided into a large detail area for zooming (see Figure 3) and a small overview area displaying the entire collection. Displayed scan and camera paths can be filtered with respect to a selected period. For this purpose, navigation markers appear if the path visualizations are employed (see Figure 3); otherwise, the markers are hidden. The user can simply click and drag the markers to define intervals of interest within the timeline. While dragging the timeline, the user can play back data of interest as defined by this fixed interval. Finally, additional details for a timeline entry (e.g., object identifiers, start and finish times) can be displayed when hovering over it with the mouse cursor. Context menus are triggered by right-clicking on items, for instance, for changing their colors.

…mation into meaningful units (fixations). While fixations indicate aspects attracting an observer's attention, saccades provide information about how fixations are related to each other. Thereby, fixations are usually displayed as circles varying in size depending on the fixation duration, and saccades as straight lines.

Scan paths are frequently used for static (e.g., images and texts) and dynamic (e.g., videos) 2D stimuli. For other stimuli, superimposed fixation plots are usually applied to recorded videos of the presented content. This often implies a time-consuming frame-by-frame video data analysis. One solution proposed by Ramloll et al. [2004] is a fixation net for dynamic 3D non-stereoscopic content, for which gaze positions are mapped onto flattened objects. The representation of binocular scan paths in 3D VR was presented by Duchowski et al. [2000], who depict 2D gaze positions and gaze depths (gaze depth paths).
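Such scan-path representations presuppose that the raw gaze samples have already been grouped into fixations. A minimal sketch of the common dispersion-threshold (I-DT) approach follows, in Python for illustration; the paper does not specify its detection method, and the threshold values here are only placeholders:

```python
def dispersion(window):
    """Spatial spread of a window of (t, x, y) gaze samples."""
    xs = [x for _, x, _ in window]
    ys = [y for _, _, y in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, max_dispersion=25.0, min_duration=100.0):
    """samples: time-ordered (t_ms, x, y) gaze points.
    Returns fixations with centroid, start time, and duration."""
    fixations, i, n = [], 0, len(samples)
    while i < n:
        j = i
        # grow the window until it spans at least min_duration
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break
        if dispersion(samples[i:j + 1]) > max_dispersion:
            i += 1  # no fixation starts here; slide the window onward
            continue
        # extend the window while the points stay spatially compact
        while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion:
            j += 1
        window = samples[i:j + 1]
        fixations.append({
            "x": sum(x for _, x, _ in window) / len(window),
            "y": sum(y for _, _, y in window) / len(window),
            "start": samples[i][0],
            "duration": samples[j][0] - samples[i][0],
        })
        i = j + 1
    return fixations
```

The resulting durations are what drive the varying circle sizes mentioned above.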
In contrast to Duchowski et al., we propose a monocular scan path depicting 3D gaze positions as intersections of a gaze ray and a viewed model. Besides traditional spherical representations (see Figure 1a), we used conical fixation representations pointing at a corresponding gaze position (see Figure 1b), since they may integrate additional information about varying camera positions. Cones could thus represent gaze positions (apex), fixation durations (cone's base), viewing directions (cone's orientation), and viewing distances (cone's height) within one representation. The traditional saccade representation can cause problems in 3D, because saccade lines may cross through surfaces. A simple solution we propose is to maintain the traditional saccade representation with the possibility of adapting the rendering options of 3D models (e.g., wireframe models or hiding objects) for determining linked fixations.

It is also important to visualize how the locations and viewing directions of the virtual camera have changed during observation. This may aid in finding out, for example, whether a scene was observed from diverse locations. We propose to visualize the camera path with traces pointing at the respective gaze positions. Figure 2 shows an example of a camera path for which the camera locations are depicted as red lines and viewing directions as straight blue lines. Displayed scan and camera paths can be filtered by means of the models of interest timeline to provide a better overview.

Figure 3: An example of the models of interest timeline with selection markers for confining displayed scan and camera paths.

2.3 Three-dimensional Attentional Maps

An attentional map (or heatmap) is an aggregated representation depicting areas of visual attention for primarily static 2D stimuli. This is substantiated by the fact that an attentional map usually has the same dimensions (width and height) as the underlying stimulus [Wooding 2002]. Since fixations are merely accumulated, attentional maps do not provide information about gaze sequences, but they are suitable for indicating areas attracting visual attention over a certain period of time. Visualizing gaze data directly in the 3D VE enables an aggregated representation of longer observations (in contrast to the traditional frame-by-frame evaluation).

We propose 3D attentional maps as superimposed aggregated visualizations for virtual 3D stimuli: projected, object-based, and surface-based attentional representations (see Figure 4). A projected attentional map is a 2D representation of 3D gaze distributions for selected arbitrary views (e.g., a top view as in Figure 4a). For an object-based attentional representation, one color is mapped to the surface of each model based on its received visual attention (see Figure 4b). This allows for quickly evaluating an object's visual attractiveness while maintaining information about the spatial relationship with other models (e.g., two adjacent models have received high visual attention). Surface-based attentional maps display aggregated fixation data as heatmaps directly on a model's surface using a vertex-based mapping (see Figure 4c). This provides detailed insights into gaze distributions across models' surfaces, allowing quick conclusions about which model aspects attracted visual interest.

Figure 4: Advanced attentional maps for the application in three-dimensional virtual environments: (a) projected (top view), (b) object-based, and (c) surface-based.

Figure 5: A screenshot from SVEETER illustrating multiple views
at a scene. The MOI timeline is shown in the lower area with its legend displayed to the upper left.

A combination of these techniques offers different levels of detail for data investigation. While a projected heatmap may give an overview of the gaze distribution across a scene, an object-based attentional map can provide information about the models' visual attractiveness when zooming in. Finally, the surface-based technique allows for close examinations of viewed models.

2.4 Implementation

For the implementation of the presented gaze visualization techniques, a virtual 3D scene was required, for which we used Microsoft's XNA Game Studio 3.0 (based on C#). Our system was confined to static 3D VEs without any transparent phenomena (e.g., smoke or semi-transparent textures). Users can freely explore the scene by moving their camera viewpoints via mouse and keyboard controls. In addition, an integration of the Tobii 1750 Eye Tracker and XNA allowed for logging 3D gaze target positions on models' surfaces. For this purpose, a 3D collision ray needed to be determined based on the 2D screen-based gaze positions supplied by the eye tracker. The gaze ray was used to calculate and log its intersection with virtual objects at the precision level of a polygonal triangle on a model [Möller and Trumbore 2005]. The processed data were stored in log files for post-analysis.

The presented visualization techniques were implemented in a prototype of a gaze analysis software tool: SVEETER (see Figure 5). It is based on XNA (for 3D scenes) and Windows Forms to benefit from existing interface elements (e.g., buttons and menus). SVEETER offers a coherent framework for loading 3D scenes and corresponding gaze data logs, as well as deploying the adapted gaze visualization techniques. Multiple views are integrated to look at a scene from different viewpoints. Context menus offer the possibility to apply certain options to each view, such as adapted rendering options. The figures used throughout this paper to illustrate the different gaze visualization techniques were created with SVEETER.

3 User Study

We have conducted a user study with eye tracking and visualization experts to assess the usefulness of the presented techniques. The method and results are discussed in the following paragraphs.

Participants. Group 1 consisted of 20 eye tracking professionals and researchers, aged between 23 and 52 years (Mean [M] = 34.50). Participants in this group rated their general eye tracking knowledge¹ higher than average (M = 3.85, Standard Deviation [SD] = 0.96). Group 2 included 8 local data visualization and computer graphics experts, aged between 25 and 35 years (M = 28.25). This group rated their eye tracking knowledge below average (M = 1.25, SD = 1.09), but their expertise in computational visualization above average (M = 3.63, SD = 1.22).

Measures. The usefulness of the gaze visualization techniques was investigated in an online survey². Each technique was briefly described with screenshots. Respondents were asked to rate their agreement with statements such as "Cone visualizations are useful for representing fixations in virtual environments." The qualitative part of the survey collected comments about usefulness, improvements, and possible applications of the techniques.

¹ Rated on a Likert scale from 1 (poor) to 5 (excellent).
² The survey included 19 Likert scales, from 1 (do not agree at all) to 5 (extremely agree), and 6 qualitative open questions, using LimeSurvey (Version 1.80+).

Procedure and Design. Group 1 had to answer the questions based on brief textual and pictorial descriptions of the gaze visualization techniques provided in the online survey. For a better understanding of the new gaze visualizations, group 2 could use the SVEETER tool at our university. After welcoming each partici-
pant of group 2, a short introduction to visual gaze analysis was provided and the tool was briefly presented. Following this, sample gaze data for a 3D scene and a set of 9 predefined tasks were given to each individual. Thereby, the aim was systematic acquaintance with the techniques, not an efficiency evaluation. Tasks included, for example, finding out whether particular parts of an object received high visual attention and whether observers changed their viewpoints. After completing the tasks and familiarizing themselves with the different techniques, the visualization experts were asked to fill out the online survey. On average, each session took about 45 minutes.

Results. Means and standard deviation values are omitted in this paper, since none of the subjective results showed significant differences between group 1 and group 2. In general, participants agreed that gaze visualization techniques facilitate eye tracking analysis. The detailed results are shown in Figure 6.

Figure 6: Agreement ratings from both groups with the corresponding standard deviations.

Scan paths are generally useful for studying gaze sequences. In this context, depicting camera paths is regarded as important as well. Thus, a combination of camera and scan paths was found useful by both groups. In contrast, cones were considered only moderately suitable for representing fixations. Instead, a simple combination of the spherical fixation representations with viewing directions and positions was preferred. While the motivation for individually depicting fixations and saccades was not evident to everybody, it was regarded as useful in the overall rating.

Besides limiting scan and camera paths temporally via the MOI timeline, additional filtering functionality was requested. Participants agreed that the MOI timeline helps to detect gaze patterns. Group 2 showed great interest in the MOI timeline when testing SVEETER. Both groups agreed that a zoomable user interface is convenient for the timeline. Filtering objects of interest in the timeline was rated as beneficial. Suggested improvements for the timeline included substituting color identification with iconic images and improving suitability for scenes with many objects by grouping them dynamically or individually.

The SVEETER tool, combining scan paths, attentional maps, and the MOI timeline, was considered highly useful. Thus, interviewees could imagine using SVEETER for evaluating eye tracking studies in 3D VEs. Multiple views were rated as practical, together with the ability to assign different visualizations to each view. We observed that users testing SVEETER (group 2) often employed a combination of different techniques. Both scan paths and surface-based maps were, for example, used to investigate observation patterns on a model. Participants frequently faded out objects to gain a better overview. In addition, we noticed that the MOI timeline was often used instead of scan paths.

4 Conclusion and Future Work

In this paper, we presented a set of advanced gaze visualization techniques for investigating visual attention in 3D VEs. Three-dimensional scan paths assist in examining sequences of fixations and saccades. Thereby, camera paths provide valuable information about how a user navigated through a scene and from which locations objects have been observed. The models of interest timeline may help answer, for example, whether an object was viewed at a certain point in time or whether any cyclic viewing behavior existed. Finally, three types of attentional maps were discussed to examine how visual attention is distributed across a scene (projected heatmap), among 3D models (object-based attentional map), and across a model's surface (surface-based heatmap). A combination of these techniques was integrated into a toolkit for visual analysis of gaze data (SVEETER). The survey results from eye tracking and visualization experts indicate its potential usefulness for various application areas.

Since visual gaze analysis for 3D VEs is still at an early stage, our techniques may serve as a solid foundation and provide a basis for further development. This includes representing data from multiple users as well as developing and testing alternative techniques.

References

DACHSELT, R., AND WEILAND, M. 2006. TimeZoom: a flexible detail and context timeline. In CHI '06: Extended Abstracts on Human Factors in Computing Systems, ACM, 682–687.

DUCHOWSKI, A. T., SHIVASHANKARAIAH, V., RAWLS, T., GRAMOPADHYE, A. K., MELLOY, B., AND KANKI, B. 2000. Binocular eye tracking in virtual reality for inspection training. In ETRA '00, ACM, 89–96.

LESSING, S., AND LINGE, L. 2002. IICap: A new environment for eye tracking data analysis. Master's thesis, University of Lund, Sweden.

MÖLLER, T., AND TRUMBORE, B. 2005. Fast, minimum storage ray/triangle intersection. In SIGGRAPH '05: ACM SIGGRAPH 2005 Courses, ACM.

RAMLOLL, R., TREPAGNIER, C., SEBRECHTS, M., AND BEEDASY, J. 2004. Gaze data visualization tools: opportunities and challenges. In Eighth International Conference on Information Visualisation (July), 173–180.

WOODING, D. S. 2002. Fixation maps: quantifying eye-movement traces. In ETRA '02, ACM, 31–36.