2. What is Augmented Reality?
A combination of a real scene viewed by a user and a virtual scene generated by a computer that augments the scene with additional information.
3. What is the Goal of AR?
To enhance a person's performance and perception of the world
But what is the ultimate goal?
4. The Ultimate Goal of AR
Create a system such that the user cannot tell the difference between the real world and the virtual augmentation of it.
5. Augmented Reality vs. Virtual Reality
Augmented Reality:
System augments the real world scene
User maintains a sense of presence in the real world
Needs a mechanism to combine virtual and real worlds
Virtual Reality:
Totally immersive environment
Visual senses are under control of the system (sometimes aural and proprioceptive senses too)
6. Milgram's Reality-Virtuality Continuum
Mixed Reality (MR) spans the continuum between the two extremes:
Real Environment -> Augmented Reality (AR) -> Augmented Virtuality (AV) -> Virtual Environment
Milgram coined the term "Augmented Virtuality" to identify systems which are mostly synthetic with some real world imagery added, such as texture mapping video onto virtual objects.
7. Milgram's Taxonomy for Mixed Reality Displays
Reproduction Fidelity – quality of computer generated imagery
Extent of Presence Metaphor – level of immersion of the user within the displayed scene
Extent of World Knowledge – knowledge of the relationship between frames of reference for the real world, the camera viewing it, and the user
8. Combining the Real and Virtual Worlds
We need:
Precise models
Locations and optical properties of the viewer (or camera) and the display
Calibration of all devices
To combine all local coordinate systems centered on the devices and the objects in the scene in a global coordinate system
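The coordinate-system bookkeeping above can be sketched with 4x4 homogeneous transforms: calibration yields each device's and object's pose in the global frame, and points are mapped between frames by matrix multiplication. The poses and the point below are hypothetical stand-ins for calibration output.

```python
# Sketch: expressing a point from an object's local frame in the global
# frame via calibrated 4x4 homogeneous transforms (all values hypothetical).

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a homogeneous 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def mat_mul(a, b):
    """Multiply two 4x4 matrices (compose two transforms)."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translation(tx, ty, tz):
    """Rigid transform that only translates (rotation = identity)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# Calibration gives each device/object a pose in the global frame.
object_to_global = translation(2.0, 0.0, 0.0)   # object sits 2 m along x
camera_to_global = translation(0.0, 0.0, 5.0)   # camera sits 5 m along z

point_in_object = [0.5, 0.0, 0.0, 1.0]          # homogeneous coordinates
point_in_global = mat_vec(object_to_global, point_in_object)
# point_in_global[:3] == [2.5, 0.0, 0.0]
```

Chaining `mat_mul` over device poses is what "combining all local coordinate systems" amounts to in practice.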
9. Combining the Real and Virtual Worlds (cont.)
Register models of all 3D objects of interest with their counterparts in the scene
Track the objects over time when the user moves and interacts with the scene
10. Realistic Merging
Requires:
Objects to behave in a physically plausible manner when manipulated
Occlusion
Collision detection
Shadows
**All of this requires a very detailed description of the physical scene
12. Research Activities
Develop methods to register the two distinct sets of images and keep them registered in real-time
– New work in this area has started to use computer vision techniques
Develop new display technologies for merging the two images
13. Performance Issues
Augmented Reality systems are expected to:
Run in real-time so that the user can move around freely in the environment
Show a properly rendered augmented image
Therefore, two performance criteria are placed on the system:
Update rate for generating the augmenting image
Accuracy of the registration of the real and virtual images
14. Limitations for Updating the Generated Images
Updates must occur at least 10 times/second
More photorealistic graphics rendering is desirable, but current technology does not support fully lit, shaded and ray-traced images of complex scenes
15. Failures in Registration
Failures in registration due to:
– Noise
• Position and pose of the camera with respect to the real scene
• Fluctuations of values while the system is running
– Time delays
• In calculating the camera position
• In calculating the correct alignment of the graphics camera
20. Video Composition for Video see-through HMD
Chroma-keying
– Used for special effects
– Background of computer graphics images is set to a specific color
– Combining step replaces all colored areas with corresponding parts from video
Depth Information
– Combine real and virtual images by a pixel-by-pixel depth comparison
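Both composition strategies reduce to a per-pixel rule. A minimal sketch, using toy RGB tuples and a hypothetical key color:

```python
# Sketch: the two video-compositing rules, applied per pixel.
CHROMA = (0, 255, 0)  # reserved background key color (hypothetical choice;
                      # it cannot appear anywhere in the virtual objects)

def chroma_key(virtual_px, real_px):
    """Keep the virtual pixel unless it is the reserved key color,
    in which case the corresponding video pixel shows through."""
    return real_px if virtual_px == CHROMA else virtual_px

def depth_composite(virtual_px, virtual_z, real_px, real_z):
    """Keep whichever pixel is closer to the camera (smaller depth);
    this is what gives correct occlusion of real by virtual and vice versa."""
    return virtual_px if virtual_z < real_z else real_px

# chroma_key((0, 255, 0), (10, 10, 10))  -> (10, 10, 10): key color exposes video
# depth_composite((200, 0, 0), 1.0, (10, 10, 10), 2.0) -> (200, 0, 0):
#   the virtual object is nearer, so it occludes the real pixel
```

In a real system these rules would run over every pixel of each frame, typically in hardware.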
21. Advantages of Video see-through HMD
Flexibility in composition strategies
Wide field of view
Real and virtual view delays can be matched
23. Applications
Medical
Entertainment
Military Training
Engineering Design
Robotics and Telerobotics
Manufacturing, Maintenance, and Repair
Consumer Design
Hazard Detection
Audio
To the user of this system, it would appear that he is looking at a single scene.
Show image a few slides back:
Shows the merging and correct registration of pre-operative data and the patient's head.
Providing this view to a surgeon in the operating room would:
Enhance the surgeon's performance
Possibly eliminate the need for any other calibration fixtures during the operation
Computer generated virtual objects must be accurately registered with the real world in all dimensions.
Errors in registration prevent the real and virtual images from being seen as fused.
Registration must be maintained while the user moves around in the augmented environment.
Changes in registration can be distracting or physically disturbing.
Augmented reality is closest to the real world because the user is mainly perceiving the real world, with just a little computer generated data added.
This distinction will probably fade as technology improves.
Reproduction Fidelity
Quality ranges from wireframe approximations to completely photorealistic renderings.
Real-time constraints force AR to be on the low end of this axis.
Extent of Presence Metaphor
AR is on the low end of this axis too.
Closely related to display technology - each gives a different sense of immersion.
Extent of World Knowledge
In some domains this information is well known - which makes AR easier.
Need these things to seamlessly combine the real and virtual worlds.
Need precise models of the user’s environment and how it is sensed.
1st picture - real world
2nd picture - real world with virtual objects, inter-reflections, and virtual shading
Imaging device is a video camera in this case.
Camera performs perspective projections of the 3D world onto the 2D image plane, using the focal length, lens distortion, position, and pose of the device to determine exactly what is projected onto the image plane.
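The projection step can be sketched as a pinhole camera model (lens distortion omitted); the focal length and principal point below are hypothetical values, not parameters of any particular camera.

```python
# Sketch: pinhole perspective projection of a camera-frame 3D point onto
# the 2D image plane (hypothetical intrinsics; distortion ignored).

def project(point_cam, f=800.0, cx=320.0, cy=240.0):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v).
    f is the focal length in pixels; (cx, cy) is the principal point."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    u = f * x / z + cx
    v = f * y / z + cy
    return u, v

# project((0.1, 0.0, 2.0)) == (360.0, 240.0)
```

The graphics system must render virtual objects with these same parameters, or the virtual imagery will not line up with the video.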
Virtual objects are modeled in an object reference frame and generated by a standard computer graphics system.
Graphics system requires information about the imaging of the real scene so that it can correctly render these objects.
Virtual image is merged with the real scene image to form the augmented reality image.
Want to render virtual parts without any visible jumps (real-time constraints).
Standard rule of thumb - the graphics system must be able to render the virtual scene at least 10 times/sec - this is well within current capabilities for moderately complex graphics scenes.
There are a lot of Augmented Reality applications whose virtual parts are not very complex and do not require high levels of photorealism.
AR systems are sensitive to visual errors - a virtual object may not be stationary in the real scene, or it may be in the wrong place.
Misregistration of a single pixel can be detected under certain conditions.
Time delays lead to the augmented image lagging behind motions in the real scene.
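The size of the lag-induced error can be estimated with a one-line calculation; every number below is hypothetical, chosen only to illustrate the scale of the problem.

```python
# Sketch: back-of-the-envelope registration error caused by end-to-end
# system latency during head motion (all numbers hypothetical).

def lag_error_pixels(head_speed_deg_s, latency_s, pixels_per_degree):
    """Angular error accumulated during the system delay, in screen pixels."""
    return head_speed_deg_s * latency_s * pixels_per_degree

# A 50 deg/s head turn with 100 ms latency on a 15 pixel/degree display
# yields roughly 75 pixels of misregistration - far beyond the one-pixel
# threshold at which errors can become noticeable.
```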
Sometimes referred to as “Windows on the World” or “Fish Tank VR”
Other display technologies are used to increase the sense of presence.
Works by placing optical combiners in front of the user's eyes.
Combiners are partially transmissive - so the user can look directly through them and see the real world.
Combiners are partially reflective - so the user can also see virtual images bounced off the combiners from head-mounted monitors.
Similar to Head-Up Displays (HUDs) commonly used in military aircraft.
Can see through the display even if the power is turned off.
Works by combining a closed-view HMD with one or two head-mounted video cameras.
Video cameras provide the user's view of the real world.
Video from the cameras is combined with graphics images by the scene generator to blend the two worlds.
Result is sent to the monitors in front of the user's eyes in the closed-view HMD.
User has no direct view of the real world.
If power is off, the user is "blind."
The selected key color cannot be used in the virtual objects.
Depth information allows for occlusion.
Flexibility in composition strategies
Basic problem with optical is that the virtual objects do not completely obscure real-world objects, because the combiners allow light from both the virtual and real sources.
Virtual objects appear ghost-like and semi-transparent, damaging the illusion of reality, because occlusion is a strong depth cue.
Video see-through is much more flexible about how it merges real and virtual - they are both in digitized form, so compositors can do a pixel-by-pixel comparison.
Produces more compelling environments.
Wide Field of View
Distortions in optical systems are a function of the radial distance away from the optical axis - the further you look away from the center of the view, the more distorted it gets.
A digitized image taken through a distorted optical system can be undistorted by applying image processing techniques to unwarp the image.
This requires significant amounts of computation - but this constraint will lessen as computers become faster.
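The unwarping step can be sketched with the common single-coefficient radial distortion model; the coefficient value and the fixed-point inversion below are illustrative assumptions, not the method of any particular system.

```python
# Sketch: undoing simple radial lens distortion for one normalized image
# point, assuming the one-coefficient polynomial model
#     x_d = x_u * (1 + k1 * r^2),   r^2 = x_u^2 + y_u^2
# (k1 = -0.2 is a hypothetical value). The forward model has no closed-form
# inverse, so we invert it by fixed-point iteration.

def undistort_point(xd, yd, k1=-0.2):
    """Map a distorted normalized point (xd, yd) back toward its ideal
    (undistorted) position; distortion grows with radius r from the axis."""
    xu, yu = xd, yd                 # initial guess: no distortion
    for _ in range(5):              # a few refinement steps suffice here
        r2 = xu * xu + yu * yu
        factor = 1.0 + k1 * r2
        xu, yu = xd / factor, yd / factor
    return xu, yu
```

Running this over every pixel of a frame is the computational cost the note above refers to; shipping systems typically precompute the mapping as a lookup table.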
It is harder to build wide FOV displays with optical see-through constraints.
Distortions of the user's view of the real world could be corrected optically, but complex optics are expensive and make the HMD heavier.
Real and virtual delays can be matched
Delay the video of the real world to match the delay in the virtual image stream. (This can't be done in optical because it gives the user a direct view of the real world.)
Disadvantage: eliminating dynamic error comes at the cost of delaying both the real and virtual scenes - the user sees everything lagging behind.
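Matching the real-world delay to the virtual stream amounts to buffering camera frames. A minimal sketch, assuming the delay is a whole number of frames:

```python
from collections import deque

# Sketch: delaying the real-world video stream by N frames so it lines up
# with the slower virtual stream before compositing (N is hypothetical).

class DelayLine:
    def __init__(self, delay_frames):
        self.buf = deque()
        self.delay = delay_frames

    def push(self, frame):
        """Insert the newest camera frame; return the frame from
        delay_frames ago, or None while the buffer is still filling."""
        self.buf.append(frame)
        if len(self.buf) > self.delay:
            return self.buf.popleft()
        return None

# With delay_frames=2, pushing f0, f1, f2 returns None, None, f0 -
# every camera frame comes out two frames late, matching the virtual lag.
```

This is exactly the trade-off noted above: the buffer removes the relative (dynamic) error but makes the entire view lag behind the real world.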
Simplicity
Optical has only one stream of video to worry about (the graphics images), while video has separate streams for real and virtual images.
Optical - the real world is seen directly through the combiners, with a time delay of a few nanoseconds, while both delays in video are in the tens of milliseconds.
Optical HMDs with narrow field of view combiners offer views of the real world that are basically undistorted, while video has distortion that must be compensated for.
Resolution
Video limits the resolution of what the user sees (both real and virtual) to the resolution of the display devices.
Optical limits the virtual resolution to the resolution of the display devices, but not the user's view of the real world.
No Eye Offset
Video - the user's view of the real world is provided by video cameras that are not necessarily located at the exact positions of the user's eyes, creating an offset between the cameras and the real eyes.
The video problem can be avoided by using mirrors to create a set of optical paths that mimic the direct path to the user's eyes; however, this adds complexity to the HMD design.
Offset is not generally a problem for optical.