COMPUTER VISION
Introduction
Computer vision is the study and application of methods which allow computers to
"understand" image content or content of multidimensional data in general. The term
"understand" means here that specific information is being extracted from the image data
for a specific purpose: either for presenting it to a human operator (e. g., if cancerous
cells have been detected in a microscopy image), or for controlling some process (e. g.,
an industry robot or an autonomous vehicle). The image data that is fed into a computer
vision system is often a digital gray-scale or colour image, but can also be in the form of
two or more such images (e. g., from a stereo camera pair), a video sequence, or a 3D
volume (e. g., from a tomography device). In most practical computer vision applications,
the computers are pre-programmed to solve a particular task, but methods based on
learning are now becoming increasingly common. Computer vision can also be described
as the complement (but not necessarily the opposite) of biological vision. In biological
vision and visual perception, the real vision systems of humans and various animals are
studied, resulting in models of how these systems are implemented in terms of neural
processing at various levels.
State Of The Art
Relation between Computer vision and various other fields
The field of computer vision can be characterized as immature and diverse. Even though
earlier work exists, it was not until the late 1970s that a more focused study of the field
started when computers could manage the processing of large data sets such as images.
However, these studies usually originated from various other fields, and consequently
there is no standard formulation of the "computer vision problem". Also, and to an even
larger extent, there is no standard formulation of how computer vision problems should
be solved. Instead, there exists an abundance of methods for solving various well-defined
computer vision tasks, where the methods often are very task specific and seldom can be
generalized over a wide range of applications. Many of the methods and applications are
still in the state of basic research, but more and more methods have found their way into
commercial products, where they often constitute a part of a larger system which can
solve complex tasks (e.g., in the area of medical images, or quality control and
measurements in industrial processes).
A significant part of artificial intelligence deals with planning or deliberation for systems
that can perform mechanical actions such as moving a robot through some
environment. This type of processing typically needs input data provided by a computer
vision system, acting as a vision sensor and providing high-level information about the
environment and the robot. Other areas that are sometimes described as belonging to
artificial intelligence and that are used in relation to computer vision are pattern
recognition and learning techniques. As a consequence, computer vision is sometimes
seen as a part of the artificial intelligence field.
Since a camera can be seen as a light sensor, there are various methods in computer
vision based on correspondences between a physical phenomenon related to light and
images of that phenomenon. For example, it is possible to extract information about
motion in fluids and about waves by analyzing images of these phenomena. Also, a
subfield within computer vision deals with the physical process which, given a scene of
objects, light sources, and camera lenses, forms the image in a camera. Consequently,
computer vision can also be seen as an extension of physics. A third field which plays an
important role is neurobiology, specifically the study of the biological vision system.
Over the last century, there has been an extensive study of eyes, neurons, and the brain
structures devoted to processing of visual stimuli in both humans and various animals.
This has led to a coarse, yet complicated, description of how "real" vision systems
operate in order to solve certain vision related tasks. These results have led to a subfield
within computer vision where artificial systems are designed to mimic the processing and
behaviour of biological systems, at different levels of complexity. Also, some of the
learning-based methods developed within computer vision have their background in
biology.
Yet another field related to computer vision is signal processing. Many existing methods
for processing of one-variable signals, typically temporal signals, can be extended in a
natural way to processing of two-variable signals or multi-variable signals in computer
vision. However, because of the specific nature of images there are many methods
developed within computer vision which have no counterpart in the processing of one-
variable signals. A distinct character of these methods is the fact that they are non-linear
which, together with the multi-dimensionality of the signal, defines a subfield in signal
processing as a part of computer vision.
Besides the above-mentioned views on computer vision, many of the related research
topics can also be studied from a purely mathematical point of view. For example, many
methods in computer vision are based on statistics, optimization or geometry. Finally, a
significant part of the field is devoted to the implementation aspect of computer vision;
how existing methods can be realized in various combinations of software and hardware,
or how these methods can be modified in order to gain processing speed without losing
too much performance.
Related Fields
Computer vision, Image processing, Image analysis, Robot vision and Machine vision are
closely related fields. If you look inside textbooks which have any of these names in the
title, there is a significant overlap in terms of the techniques and applications they
cover. This implies that the basic techniques used and developed in these fields
are more or less identical, which can be interpreted as meaning that there is really only
one field with different names. On the other hand, it appears to be necessary for research groups,
scientific journals, conferences and companies to present or market themselves as
belonging specifically to one of these fields and, hence, various characterizations which
distinguish each of the fields from the others have been presented. The following
characterizations appear relevant but should not be taken as universally accepted.
Image processing and Image analysis tend to focus on 2D images, how to transform one
image to another, e.g., by pixel-wise operations such as contrast enhancement, local
operations such as edge extraction or noise removal, or geometrical transformations such
as rotating the image. This characterization implies that image processing/analysis neither
requires assumptions about, nor produces interpretations of, the image content.
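As an illustration of a pixel-wise operation, the following sketch linearly stretches an image's contrast to fill the full 8-bit range. This is a minimal NumPy sketch; the function name and the toy input values are our own, for illustration only.

```python
import numpy as np

def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly map the image's intensity range onto [out_min, out_max]."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:  # flat image: nothing to stretch
        return np.full_like(image, out_min, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo) * (out_max - out_min) + out_min
    return scaled.astype(np.uint8)

# A dim 8-bit image occupying only [50, 100] is stretched to the full range.
dim = np.array([[50, 75], [100, 60]], dtype=np.uint8)
bright = stretch_contrast(dim)
```

After stretching, the darkest input pixel (50) maps to 0 and the brightest (100) to 255.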
Computer vision tends to focus on the 3D scene projected onto one or several images,
e.g., how to reconstruct structure or other information about the 3D scene from one or
several images. Computer vision often relies on more or less complex assumptions about
the scene depicted in an image.
Machine vision tends to focus on applications, mainly in industry, e.g., vision based
autonomous robots and systems for vision based inspection or measurement. This implies
that image sensor technologies and control theory often are integrated with the processing
of image data to control a robot and that real-time processing is emphasized by means of
efficient implementations in hardware and software. There is also a field called Imaging,
which primarily focuses on the process of producing images, but sometimes also deals with
the processing and analysis of images. For example, Medical imaging includes substantial
work on the analysis of image data in medical applications.
Finally, pattern recognition is a field which uses various methods to extract information
from signals in general, mainly based on statistical approaches. A significant part of this
field is devoted to applying these methods to image data. A consequence of this state of
affairs is that you can work in a lab related to one of these fields, apply methods
from a second field to solve a problem in a third field, and present the result at a
conference related to a fourth field!
Typical Tasks Of Computer Vision
Each of the application areas described above employs a range of computer vision tasks:
more or less well-defined measurement or processing problems, which can be
solved using a variety of methods. Some examples of typical computer vision tasks are
presented below.
Recognition
The classical problem in computer vision, image processing and machine vision is that of
determining whether or not the image data contains some specific object, feature, or
activity. This task can normally be solved robustly and without effort by a human, but is
still not satisfactorily solved in computer vision for the general case: arbitrary objects in
arbitrary situations. The existing methods for dealing with this problem can at best solve
it only for specific objects, such as simple geometric objects (e.g., polyhedrons), human
faces, printed or hand-written characters, or vehicles, and in specific situations, typically
described in terms of well-defined illumination, background, and pose of the object
relative to the camera.
Different varieties of the recognition problem are described in the literature:
• Recognition: one or several pre-specified or learned objects or object classes can
be recognized, usually together with their 2D positions in the image or 3D poses
in the scene.
• Identification: an individual instance of an object is recognized. Examples:
identification of a specific person's face or fingerprint, or identification of a specific
vehicle.
• Detection: the image data is scanned for a specific condition. Examples: detection
of possible abnormal cells or tissues in medical images or detection of a vehicle in
an automatic road toll system. Detection based on relatively simple and fast
computations is sometimes used for finding smaller regions of interesting image
data which can be further analyzed by more computationally demanding
techniques to produce a correct interpretation.

Several specialized tasks based on recognition exist, such as:
• Content-based image retrieval: find all images which have a specific content in a
larger set or database of images.
• Pose estimation: estimation of the position and orientation of a specific object
relative to the camera. Example: allowing a robot arm to pick up objects from a
conveyor belt.
• Optical character recognition (OCR): images of printed or handwritten text
are converted to computer-readable text such as ASCII or Unicode.
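To make the detection idea concrete, here is a minimal NumPy sketch of brute-force template matching, one of the simplest detection techniques. The function and the toy 2x2 pattern are illustrative assumptions, not a production method.

```python
import numpy as np

def detect_template(image, template):
    """Slide the template over the image and return the (row, col) of the
    window with the smallest sum of squared differences (SSD)."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw].astype(np.int64)
            ssd = np.sum((window - template.astype(np.int64)) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos

# Plant a 2x2 pattern inside a blank image and recover its position.
img = np.zeros((6, 6), dtype=np.uint8)
patch = np.array([[200, 10], [10, 200]], dtype=np.uint8)
img[3:5, 2:4] = patch
```

Calling `detect_template(img, patch)` recovers the planted position (3, 2); real detectors replace this exhaustive scan with features that tolerate lighting and pose changes.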
Motion
Several tasks relate to motion estimation in which an image sequence is processed to
produce an estimate of the local image velocity at each point. Examples of such tasks are:
• Egomotion: determine the 3D rigid motion of the camera.
• Tracking of one or several objects (e.g. vehicles or humans) through the image
sequence.
• Surveillance: detection of possible activities based on motion.
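A crude form of motion estimation can be sketched as a brute-force search over integer translations. This is a toy global block-matching scheme in NumPy (the search range and test frames are invented); real optical-flow estimators work per-pixel and at sub-pixel precision.

```python
import numpy as np

def estimate_shift(frame0, frame1, max_disp=3):
    """Brute-force search for the integer (dy, dx) translation that best
    aligns frame1 back onto frame0: a crude global motion estimate."""
    best, best_shift = None, (0, 0)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            shifted = np.roll(np.roll(frame1, -dy, axis=0), -dx, axis=1)
            err = np.sum((frame0.astype(np.int64) - shifted.astype(np.int64)) ** 2)
            if best is None or err < best:
                best, best_shift = err, (dy, dx)
    return best_shift

# Frame 1 is frame 0 shifted down 1 row and right 2 columns.
f0 = np.zeros((8, 8), dtype=np.uint8)
f0[2:4, 2:4] = 255
f1 = np.roll(np.roll(f0, 1, axis=0), 2, axis=1)
```

Here `estimate_shift(f0, f1)` returns the planted displacement (1, 2).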
Scene Reconstruction
Given two or more images of a scene, or a video, scene reconstruction aims at computing
a 3D model of the scene. In the simplest case the model can be a set of 3D points. More
sophisticated methods produce a complete 3D surface model.
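In the two-view (stereo) case, once the disparity of a point has been measured, its depth follows from pinhole geometry as Z = f * B / d. A minimal sketch; the focal length, baseline and disparity values below are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo: depth Z = f * B / d, with focal length f in pixels,
    baseline B in metres, and disparity d in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, 12 cm baseline, 20 px disparity.
z = depth_from_disparity(20, focal_px=700, baseline_m=0.12)  # 4.2 m
```

Repeating this for every matched point yields the set of 3D points mentioned above.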
Image Restoration
Given an image, an image sequence, or a 3D volume, which has been degraded by noise,
image restoration aims at producing the image data without the noise. Examples of noise
processes which are considered are sensor noise (e.g., ultrasonic images) and motion blur
(e.g., because of a moving camera or moving objects in the scene).
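One classic restoration operation is the median filter, which removes impulse ("salt-and-pepper") noise while preserving edges better than simple averaging. A minimal NumPy sketch; the window size and border handling are illustrative choices.

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood
    (edges handled by replicating the border pixels)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.empty_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.median(padded[r:r + size, c:c + size])
    return out

# A single salt-noise pixel in a flat region is removed entirely.
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255
clean = median_filter(img)
```

Because the outlier never forms a majority in any 3x3 window, the filtered image is uniformly 10.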
Computer Vision Systems
A typical computer vision system can be divided into the following subsystems:
Image acquisition
The image or image sequence is acquired with an imaging system
(camera, radar, lidar, tomography system). Often the imaging system has to be calibrated
before being used.
Preprocessing
In the preprocessing step, the image is treated with "low-level" operations. The aim
of this step is to reduce noise in the image (i.e. to dissociate the signal from the
noise) and to reduce the overall amount of data. This is typically done by
applying digital image processing methods such as:
1. Downsampling the image.
2. Applying digital filters.
3. Computing the x- and y-gradients (possibly also the time gradient).
4. Segmenting the image.
a. Pixelwise thresholding.
5. Performing an eigentransform on the image.
a. Fourier transform.
6. Estimating motion for local regions of the image (also known as optical flow
estimation).
7. Estimating disparity in stereo images.
8. Multiresolution analysis.
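Steps 1 and 4a above can be sketched in a few lines of NumPy; the function names, block-averaging scheme and toy image are illustrative only.

```python
import numpy as np

def downsample(image, factor=2):
    """Reduce resolution by averaging factor x factor blocks."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor  # crop to a multiple of factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def threshold(image, t):
    """Pixelwise thresholding: 1 where the pixel exceeds t, else 0."""
    return (image > t).astype(np.uint8)

img = np.array([[0, 0, 200, 200],
                [0, 0, 200, 200],
                [90, 90, 10, 10],
                [90, 90, 10, 10]], dtype=np.uint8)
small = downsample(img)        # 2x2 image of block means
binary = threshold(small, 50)  # segments the bright blocks
```

Downsampling reduces the data volume fourfold here, and thresholding reduces each remaining pixel to a single bit.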
Feature extraction
The aim of feature extraction is to further reduce the data to a set of features, which ought
to be invariant to disturbances such as lighting conditions, camera position, noise and
distortion. Examples of feature extraction are:
1. Performing edge detection or estimation of local orientation.
2. Extracting corner features.
3. Detecting blob features.
4. Extracting spin images from depth maps.
5. Extracting geons or other three-dimensional primitives, such as superquadrics.
6. Acquiring contour lines and maybe curvature zero crossings.
7. Generating features with the Scale-invariant feature transform.
8. Calculating the Co-occurrence matrix of the image or sub-images to measure
texture.
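Item 8 above, the grey-level co-occurrence matrix, can be sketched as follows. The offset and toy image are invented for illustration; real GLCM-based texture measures then derive statistics such as contrast and entropy from this matrix.

```python
import numpy as np

def cooccurrence(image, levels, dr=0, dc=1):
    """Grey-level co-occurrence matrix for the pixel offset (dr, dc):
    counts how often grey level i appears next to grey level j."""
    glcm = np.zeros((levels, levels), dtype=np.int64)
    h, w = image.shape
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < h and 0 <= c2 < w:
                glcm[image[r, c], image[r2, c2]] += 1
    return glcm

# A tiny two-level image; the horizontal neighbour pairs are counted.
img = np.array([[0, 0, 1],
                [1, 1, 0]])
glcm = cooccurrence(img, levels=2)
```

Each valid horizontal pixel pair contributes one count, so the matrix entries sum to the number of such pairs.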
Registration
The aim of the registration step is to establish correspondence between the features in the
acquired set and the features of known objects in a model-database and/or the features of
the preceding image. The registration step must arrive at a final hypothesis. To name a
few methods:
1. Least squares estimation
2. Hough transform in many variations
3. Geometric hashing
4. Particle filtering
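The least-squares idea can be illustrated with the simplest possible registration model: a pure translation between corresponding point sets, whose least-squares optimum is simply the difference of the centroids. The point sets below are hypothetical.

```python
import numpy as np

def fit_translation(src, dst):
    """Least-squares translation aligning src points to dst points.
    For a translation-only model the optimum is the centroid difference."""
    return dst.mean(axis=0) - src.mean(axis=0)

# Corresponding feature points, displaced by a known translation.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = src + np.array([2.0, -1.0])
t = fit_translation(src, dst)  # recovers (2, -1)
```

Richer models (rotation, affine, projective) replace the centroid formula with a linear or nonlinear least-squares solve, but the principle of minimising squared residuals between corresponding features is the same.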
Applications Of Computer Vision
The following is an incomplete list of applications which are studied in computer vision.
In this category, the term application should be interpreted as a high level function which
solves a problem at a higher level of complexity. Typically, the various technical
problems related to an application can be solved and implemented in different ways.
Facial Recognition System
A facial recognition system is a computer-driven application for automatically
identifying a person from a digital image. It does so by comparing selected facial
features from the live image against a facial database. It is typically used in security systems
and can be compared to other biometrics such as fingerprint or eye iris recognition
systems.
Popular recognition algorithms include eigenface, fisherface, the hidden Markov model,
and the neurally motivated Dynamic Link Matching. A newly emerging trend, claimed to
achieve previously unseen accuracies, is three-dimensional face recognition. Another
emerging trend uses the visual details of the skin, as captured in standard digital or
scanned images. Tests on the FERET database, the widely used industry benchmark,
showed that this approach is substantially more reliable than previous algorithms.
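The eigenface approach mentioned above is, at its core, principal component analysis on flattened face images. A minimal NumPy sketch, in which the tiny random "faces" merely stand in for a real training set:

```python
import numpy as np

def eigenfaces(faces, k):
    """PCA on flattened face images: returns the mean face and the top-k
    principal components ('eigenfaces') of the training set."""
    X = faces.reshape(len(faces), -1).astype(np.float64)
    mean = X.mean(axis=0)
    # SVD of the centred data; each row of Vt is one eigenface.
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

# Toy training set: four 4x4 "faces" of random intensities.
rng = np.random.default_rng(0)
faces = rng.integers(0, 256, size=(4, 4, 4))
mean_face, components = eigenfaces(faces, k=2)
# Each face is then represented by just k coefficients.
projection = (faces.reshape(4, -1) - mean_face) @ components.T
```

Recognition then compares these low-dimensional coefficient vectors instead of raw pixels, which is what makes the method practical.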
Polly (robot)
Polly was a robot created at the MIT Artificial Intelligence Laboratory by Ian Horswill
for his PhD, which was published in 1993 as a technical report. It was the first mobile
robot to move at animal-like speeds (1 m per second) using computer vision for its
navigation. It was an example of behavior based robotics. For a few years, Polly was able
to give tours of the AI laboratory's seventh floor, using canned speech to point out
landmarks such as Anita Flynn's office. The Polly algorithm is a way to navigate in a
cluttered space using very low resolution vision to find uncluttered areas to move forward
into, assuming that the pixels at the bottom of the frame (the closest to the robot) show an
example of an uncluttered area. Since this could be done 60 times a second, the algorithm
only needed to discriminate three categories: telling the robot at each instant to go
straight, towards the right or towards the left.
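The Polly algorithm described above can be caricatured in a few lines: measure "clutter" in the left, centre and right thirds of the bottom image rows and steer toward the emptiest third. The threshold, frame size and clutter measure below are invented for illustration and are far simpler than Horswill's actual system.

```python
import numpy as np

def steer(frame, clutter_thresh=100):
    """Polly-style decision from a low-resolution frame: count 'cluttered'
    (bright) pixels in the left, centre and right thirds of the bottom rows
    and head for the least cluttered third."""
    bottom = frame[-2:, :]  # rows nearest the robot, assumed uncluttered ground
    thirds = np.array_split(bottom, 3, axis=1)
    clutter = [np.sum(t > clutter_thresh) for t in thirds]
    return ("left", "straight", "right")[int(np.argmin(clutter))]

# Obstacles on both sides; the centre third is free.
frame = np.zeros((8, 9), dtype=np.uint8)
frame[-2:, 0:3] = 255   # obstacle on the left
frame[-2:, 6:9] = 255   # obstacle on the right
```

With obstacles left and right, `steer(frame)` chooses "straight"; because the decision is only three-valued, it can easily run many times per second.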
Mobile robot
Mobile Robots are automatic machines that are capable of movement in a given
environment. Robots generally fall into two classes, linked manipulators (or Industrial
robots) and mobile robots. Mobile robots have the capability to move around in their
environment and are not fixed to one physical location. In contrast, industrial
manipulators usually consist of a jointed arm and gripper assembly (or end effector) that
is attached to a fixed surface.
The most common class of mobile robots is wheeled robots. A second class of mobile
robots includes legged robots while a third smaller class includes aerial robots, usually
referred to as unmanned aerial vehicles (UAVs). Mobile robots are the focus of a great
deal of current research, and almost every major university has one or more labs that
focus on mobile robot research. Mobile robots are also found in industry, military and
security environments, and appear as consumer products.
Robot
A humanoid robot manufactured by Toyota "playing" a trumpet
The word robot is used to refer to a wide range of machines, the common feature of
which is that they are all capable of movement and can be used to perform physical tasks.
Robots take on many different forms, ranging from humanoid, which mimic the human
form and way of moving, to industrial, whose appearance is dictated by the function they
are to perform. Robots can be grouped generally as mobile robots (e.g., autonomous
vehicles), manipulator robots (e.g., industrial robots) and self-reconfigurable robots, which
can conform themselves to the task at hand.
Robots may be controlled directly by a human, such as remotely-controlled bomb-
disposal robots, robotic arms, or shuttles, or may act according to their own decision
making ability, provided by artificial intelligence. However, the majority of robots fall in-
between these extremes, being controlled by pre-programmed computers. Such robots
may include feedback loops such that they can interact with their environment, but do not
display actual intelligence.
The word "robot" is also used in a general sense to mean any machine which mimics the
actions of a human (biomimicry), in the physical sense or in the mental sense. It comes
from the Czech and Slovak word robota, meaning labour or work (also used in the sense of a serf's labour).
The word robot first appeared in Karel Čapek's science fiction play R.U.R. (Rossum's
Universal Robots) in 1921.
History
A Soviet-made robot of the 1970s; it could move, reproduce pre-recorded sounds, imitate
conversation using a built-in radio station, and show films on a built-in screen. It was
used in various shows.

The word robot was introduced by the Czech writer Karel Čapek in his play R.U.R.
(Rossum's Universal Robots), which was written in 1920 (see also Robots in literature for
details of the play). However, the verb robotovat, meaning "to work" or "to slave", and
the noun robota (meaning corvée), used in the Czech and Slovak languages, have been
in use since the early 10th century. It has been suggested that the word robot was coined
by Karel Čapek's brother, the painter and writer Josef Čapek.
An early automaton was built in 1738 by Jacques de Vaucanson: a mechanical duck that
was able to eat grain, flap its wings, and excrete.
The first human to be killed by a robot was 37-year-old Kenji Urada, a Japanese factory
worker, in 1981. According to Economist.com, Urada "climbed over a safety fence at a
Kawasaki plant to carry out some maintenance work on a robot. In his haste, he failed to
switch the robot off properly. Unable to sense him, the robot's powerful hydraulic arm
kept on working and accidentally pushed the engineer into a grinding machine."
Smart Camera
A smart camera is an integrated machine vision system which, in addition to image
capture circuitry, includes a processor that can extract information from images without
the need for an external processing unit, as well as interface devices used to make the
results available to other devices.
A smart camera or "intelligent camera" is a self-contained, standalone vision system
with a built-in image sensor in the housing of an industrial video camera. It contains all
necessary communication interfaces, e.g. Ethernet. It is not necessarily larger than an
industrial or surveillance camera. This architecture has the advantage of a more compact
volume compared to PC-based vision systems and often achieves lower cost, at the
expense of a somewhat simpler (or missing altogether) user interface.
Early smart camera (ca. 1985, in red) with an 8 MHz Z80, compared to a modern device
featuring Texas Instruments' C64 at 1 GHz.

A smart camera usually consists of several (but not necessarily all) of the following
components:
1. Image sensor (matrix or linear, CCD or CMOS)
2. Image digitization circuitry
3. Image memory
4. Communication interface (RS232, Ethernet)
5. I/O lines (often optoisolated)
6. Lens holder or built-in lens (usually C- or CS-mount)
Examples Of Applications For Computer Vision
Another way to describe computer vision is in terms of application areas. One of the
most prominent application fields is medical computer vision or medical image
processing. This area is characterized by the extraction of information from image data
for the purpose of making a medical diagnosis of a patient. Typically image data is in the
form of microscopy images, X-ray images, angiography images, ultrasonic images, and
tomography images. An example of information which can be extracted from such image
data is the detection of tumours, arteriosclerosis or other malignant changes. It can also be
measurements of organ dimensions, blood flow, etc. This application area also supports
medical research by providing new information, e.g., about the structure of the brain, or
about the quality of medical treatments.
A second application area in computer vision is in industry. Here, information is
extracted for the purpose of supporting a manufacturing process. One example is quality
control, where parts or final products are automatically inspected in order to find
defects. Another example is measurement of the position and orientation of parts to be
picked up by a robot arm. See the article on machine vision for more details on this area.
Military applications are probably one of the largest areas for computer vision, even
though only a small part of this work is open to the public. The obvious examples are
detection of enemy soldiers or vehicles and guidance of missiles to a designated target.
More advanced systems for missile guidance send the missile to an area rather than a
specific target, and target selection is made when the missile reaches the area based on
locally acquired image data. Modern military concepts, such as "battlefield
awareness," imply that various sensors, including image sensors, provide a rich set of
information about a combat scene which can be used to support strategic decisions. In
this case, automatic processing of the data is used to reduce complexity and to fuse
information from multiple sensors to increase reliability.
Artist's concept of a rover on Mars; note the stereo cameras mounted on top of the
rover (credit: Maas Digital LLC).

One of the newer application areas is autonomous
vehicles, which include submersibles, land-based vehicles (small robots with wheels, cars
or trucks), and aerial vehicles. An unmanned aerial vehicle is often denoted UAV. The
level of autonomy ranges from fully autonomous (unmanned) vehicles to vehicles where
computer vision based systems support a driver or a pilot in various situations. Fully
autonomous vehicles typically use computer vision for navigation, e. g., a UAV looking
for forest fires. Examples of supporting systems are obstacle warning systems in cars and
systems for autonomous landing of aircraft. Several car manufacturers have demonstrated
systems for autonomous driving of cars, but this technology has still not reached a level
where it can be put on the market.
Software For Computer Vision
Animal
Animal (first implementation: 1988 - revised: 2004) is an interactive environment for
Image processing that is oriented toward the rapid prototyping, testing, and modification
of algorithms. To create ANIMAL (AN IMage ALgebra), David Betz's XLISP was
extended with some new types: sockets, arrays, images, masks, and drawables. The
theoretical framework and the implementation of the working environment is described
in the paper ANIMAL: AN IMage ALgebra. In the theoretical framework of ANIMAL, a
digital image is a boundless matrix. However, in the implementation it is bounded by a
rectangular region in the discrete plane and the elements outside the region have a
constant value. The size and position of the region in the plane (focus) is defined by the
coordinates of the rectangle. In this way all the pixels, including those on the border, have
the same number of neighbors (useful in local operators, such as digital filters).
Furthermore, pixelwise commutative operations remain commutative at the image level,
independently of the focus.
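The "boundless matrix" idea can be sketched as a small wrapper class: pixels inside the focus rectangle are stored, and any access outside it returns a constant value, so every pixel has a full set of neighbours. The class and parameter names are our own, not ANIMAL's actual API.

```python
import numpy as np

class BoundlessImage:
    """Sketch of ANIMAL's 'boundless matrix': pixels inside the focus
    rectangle are stored; any pixel outside it reads as a constant."""
    def __init__(self, data, origin=(0, 0), outside=0):
        self.data = np.asarray(data)
        self.origin = origin      # plane coordinates of the focus rectangle
        self.outside = outside    # constant value everywhere else

    def __getitem__(self, rc):
        r, c = rc[0] - self.origin[0], rc[1] - self.origin[1]
        h, w = self.data.shape
        if 0 <= r < h and 0 <= c < w:
            return self.data[r, c]
        return self.outside  # every pixel has well-defined neighbours

# A 2x2 focus rectangle positioned at (10, 10) in the plane.
img = BoundlessImage([[1, 2], [3, 4]], origin=(10, 10), outside=0)
```

With this convention, a local operator such as a digital filter can read the neighbours of a border pixel without any special-case code.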
OpenCV
OpenCV is an open source computer vision library developed by Intel. The library is
cross-platform, and runs on both Windows and Linux. It is focused mainly on real-time
image processing. Its application areas include:
1. Human-Computer Interface (HCI)
2. Object Identification
3. Segmentation and Recognition
4. Face Recognition
5. Gesture Recognition
6. Motion Tracking
Visualization Toolkit (VTK)
Visualization Toolkit (VTK) is an open source, freely available software system for 3D
computer graphics, image processing, and visualization used by thousands of researchers
and developers around the world. VTK consists of a C++ class library, and several
interpreted interface layers including Tcl/Tk, Java, and Python. Professional support and
products for VTK are provided by Kitware, Inc. VTK supports a wide variety
of visualization algorithms including scalar, vector, tensor, texture, and volumetric
methods; and advanced modeling techniques such as implicit modelling, polygon
reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.
Commercial Computer Vision Systems
Automatix Inc., founded in January 1980, was the first company to market industrial
robots with built-in machine vision. Its founders were Victor Scheinman, inventor of the
Stanford arm; Philippe Villers, Michael Cronin, and Arnold Reinhold of
Computervision; Jake Dias and Dan Nigro of Data General; Gordon VanderBrug, of NBS
and Norman Wittels of Clark University.
Automatix robots at the Robots 1985 show in Detroit, Michigan. Clockwise from lower
left: AID 600, AID 900 Seamtracker, Yaskawa Motoman.

Automatix mostly used robot mechanisms imported from Hitachi at first, and later from
Yaskawa and KUKA. It did
design and manufacture a Cartesian robot called the AID-600. The 600 was intended for
use in precision assembly but was adapted for welding use, particularly tungsten inert
gas (TIG) welding, which demands high accuracy and immunity from the intense
electromagnetic interference that the TIG process creates. Automatix was the first
company to market a vision-guided welding robot called Seamtracker. Structured laser
light and monochromatic filters were used to allow an image to be seen in the presence of
the welding arc. Another concept, invented by Mr. Scheinman, was RobotWorld, a
system of cooperating small modules suspended from a 2-D linear motor. The product
line was later sold to Yaskawa.
Automatix raised large amounts of venture capital, and went public in 1983, but was not
profitable until the early 1990s. In 1994, Automatix merged with another machine vision
company, Itran Corp., to form Acuity Imaging, Inc. Acuity was acquired by Robotics
Vision Systems Inc. (RVSI) in September 1995. As of 2004, RVSI still supported the
evolved Automatix machine vision package under the PowerVision brand.
RapidEye is a commercial multispectral remote sensing satellite mission being designed
and implemented by MDA for RapidEye AG. The RapidEye sensor images five optical
bands in the 400-850 nm range and provides a 5 m pixel size at nadir. Rapid delivery and
short revisit times are provided through the use of a five-satellite constellation.
Scantron is the name of a United States company that makes and sells Scantron exam
answer sheets and the machines to grade them. The Scantron system usually takes the
form of a "multiple choice, fill-in-the-circle/square/rectangle" form of varying length and
width, from single column 50 answer tests, to multiple 8.5" x 11" page forms used in
standardized testing such as the SAT and ACT. The forms are sensed optically, using
optical mark recognition to detect markings in each place, in a "Scantron Machine" that
tabulates and can automatically grade results. Earlier versions were sensed electrically.
A typical 100-answer Scantron answer sheet (only the front side is shown).

Commonly, there are two sides to Scantron answer sheets.
They can contain 50 answer blanks, 100 answer blanks, and so on. There is even a
smaller form called a "Quiz Strip" that contains only about 20 answer boxes to bubble-in.
On the larger sheets, there is a space on the back where answers to separate questions
can be written in manually, if the test giver requires them. The full-sized 8.5" x 11" form
may contain a larger area for working out math formulas, writing short answers, etc.
Answers "A" and "B" are commonly used for "True" and "False" questions, as shown in
the image to the right on the top of each row.
Grading of Scantron sheets is performed first by creating an answer key. The answer key
is simply a standard Scantron answer sheet with all of the correct answers filled in, along
with the "key" rectangle at the top of the sheet. Once the answer key is ready, the
Scantron machine is powered on and the answer key is fed through. This stores the
answer key in the machine's memory, and any further sheets that are fed
through are graded and marked according to the key in memory. Switching off the
Scantron machine stops the paper feed and clears the memory.
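The grading logic amounts to comparing each sheet against the stored key. A toy sketch in Python; the letter encoding of answers is a hypothetical simplification of the optically sensed marks.

```python
def grade(key, answers):
    """Compare a student's answer sheet against the stored key and
    return the number of correct responses."""
    if len(key) != len(answers):
        raise ValueError("sheet length does not match the key")
    return sum(k == a for k, a in zip(key, answers))

# Hypothetical 5-question key and one student's marked answers.
key = ["A", "C", "B", "D", "A"]
student = ["A", "C", "C", "D", "B"]
score = grade(key, student)  # 3 of 5 correct
```

Feeding the key sheet first corresponds to initialising `key`; every subsequent sheet is one `grade` call.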
Conclusion
Computer vision, unlike for example factory machine vision, happens in unconstrained
environments, potentially with changing cameras and changing lighting and camera
views. Also, some “objects” such as roads, rivers, bushes, etc. are just difficult to
describe. In these situations, engineering a model a priori can be difficult. With learning-
based vision, one just "points" the algorithm at the data, and useful models for detection,
segmentation, and identification can often be formed. Learning can often easily fuse or
incorporate other sensing modalities such as sound, vibration, or heat. Since cameras and
sensors are becoming cheap and powerful, and learning algorithms have a vast appetite
for computation, Intel has a strong interest in enabling geometric and learning-based
vision routines in its OpenCV library: such routines are vast consumers of
computational power.
20
21

Computer vision

  • 1. COMPUTER VISION Introduction Computer vision is the study and application of methods which allow computers to "understand" image content or content of multidimensional data in general. The term "understand" means here that specific information is being extracted from the image data for a specific purpose: either for presenting it to a human operator (e. g., if cancerous cells have been detected in a microscopy image), or for controlling some process (e. g., an industry robot or an autonomous vehicle). The image data that is fed into a computer vision system is often a digital gray-scale or colour image, but can also be in the form of two or more such images (e. g., from a stereo camera pair), a video sequence, or a 3D volume (e. g., from a tomography device). In most practical computer vision applications, the computers are pre-programmed to solve a particular task, but methods based on learning are now becoming increasingly common. Computer vision can also be described as the complement (but not necessary the opposite) of biological vision. In biological vision and visual perception real vision systems of humans and various animals are studied, resulting in models of how these systems are implemented in terms of neural processing at various levels. State Of The Art Relation between Computer vision and various other fields The field of computer vision can be characterized as immature and diverse. Even though earlier work exists, it was not until the late 1970's that a more focused study of the field 1
  • 2. started when computers could manage the processing of large data sets such as images. However, these studies usually originated from various other fields, and consequently there is no standard formulation of the "computer vision problem". Also, and to an even larger extent, there is no standard formulation of how computer vision problems should be solved. Instead, there exists an abundance of methods for solving various well-defined computer vision tasks, where the methods often are very task specific and seldom can be generalized over a wide range of applications. Many of the methods and applications are still in the state of basic research, but more and more methods have found their way into commercial products, where they often constitute a part of a larger system which can solve complex tasks (e.g., in the area of medical images, or quality control and measurements in industrial processes). A significant part of artificial intelligence deals with planning or deliberation for system which can perform mechanical actions such as moving a robot through some environment. This type of processing typically needs input data provided by a computer vision system, acting as a vision sensor and providing high-level information about the environment and the robot. Other parts which sometimes are described as belonging to artificial intelligence and which are used in relation to computer vision is pattern recognition and learning techniques. As a consequence, computer vision is sometimes seen as a part of the artificial intelligence field. Since a camera can be seen as a light sensor, there are various methods in computer vision based on correspondences between a physical phenomenon related to light and images of that phenomenon. For example, it is possible to extract information about motion in fluids and about waves by analyzing images of these phenomena. 
Also, a subfield within computer vision deals with the physical process which given a scene of objects, light sources, and camera lenses forms the image in a camera. Consequently, computer vision can also be seen as an extension of physics.A third field which plays an important role is neurobiology, specifically the study of the biological vision system. Over the last century, there has been an extensive study of eyes, neurons, and the brain structures devoted to processing of visual stimuli in both humans and various animals. This has led to a coarse, yet complicated, description of how "real" vision systems 2
  • 3. operate in order to solve certain vision related tasks. These results have led to a subfield within computer vision where artificial systems are designed to mimic the processing and behaviour of biological systems, at different levels of complexity. Also, some of the learning-based methods developed within computer vision have their background in biology. Yet another field related to computer vision is signal processing. Many existing methods for processing of one-variable signals, typically temporal signals, can be extended in a natural way to processing of two-variable signals or multi-variable signals in computer vision. However, because of the specific nature of images there are many methods developed within computer vision which have no counterpart in the processing of one- variable signals. A distinct character of these methods is the fact that they are non-linear which, together with the multi-dimensionality of the signal, defines a subfield in signal processing as a part of computer vision. Beside the above mentioned views on computer vision, many of the related research topics can also be studied from a purely mathematical point of view. For example, many methods in computer vision are based on statistics, optimization or geometry. Finally, a significant part of the field is devoted to the implementation aspect of computer vision; how existing methods can be realized in various combinations of software and hardware, or how these methods can be modified in order to gain processing speed without losing too much performance. Related Fields Computer vision, Image processing, Image analysis, Robot vision and Machine vision are closely related fields. If you look inside text books which have either of these names in the title there is a significant overlap in terms of what techniques and applications they cover. 
This implies that the basic techniques that are used and developed in these fields are more or less identical, something which can be interpreted as there is only one field with different names. On the other hand, it appears to be necessary for research groups, scientific journals, conferences and companies to present or market themselves as 3
  • 4. belonging specifically to one of these fields and, hence, various characterizations which distinguish each of the fields from the others have been presented. The following characterizations appear relevant but should not be taken as universally accepted. Image processing and Image analysis tend to focus on 2D images, how to transform one image to another, e.g., by pixel-wise operations such as contrast enhancement, local operations such as edge extraction or noise removal, or geometrical transformations such as rotating the image. This characterization implies that image processing/analysis neither require assumptions nor produce interpretations about the image content. Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images. Computer vision often relies on more or less complex assumptions about the scene depicted in an image. Machine vision tends to focus on applications, mainly in industry, e.g., vision based autonomous robots and systems for vision based inspection or measurement. This implies that image sensor technologies and control theory often are integrated with the processing of image data to control a robot and that real-time processing is emphasized by means of efficient implementations in hardware and software. There is also a field called Imaging which primarily focus on the process of producing images, but sometimes also deals with processing and analysis of images. For example, Medical imaging contains lots of work on the analysis of image data in medical applications. Finally, pattern recognition is a field which uses various methods to extract information from signals in general, mainly based on statistical approaches. 
A significant part of this field is devoted to applying these methods to image data.A consequence of this state of affairs is that you can be working in a lab related to one of these fields, apply methods from a second field to solve a problem in a third field and present the result at a conference related to a fourth field! Typical Tasks Of Computer Vision 4
  • 5. Each of the application areas described above employ a range of computer vision tasks; more or less well-defined measurement problems or processing problems, which can be solved using a variety of methods. Some examples of typical computer vision tasks are presented below. Recognition The classical problem in computer vision, image processing and machine vision is that of determining whether or not the image data contains some specific object, feature, or activity. This task can normally be solved robustly and without effort by a human, but is still not satisfactory solved in computer vision for the general case: arbitrary objects in arbitrary situations. The existing methods for dealing with this problem can at best solve it only for specific objects, such as simple geometric objects (e.g., polyhedrons), human faces, printed or hand-written characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera. Different varieties of the recognition problem are described in the literature: • Recognition: one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene. • Identification: An individual instance of an object is recognized. Examples: identification of a specific person face or fingerprint, or identification of a specific vehicle. • Detection: the image data is scanned for a specific condition. Examples: detection of possible abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data which can be further analyzed by more computationally demanding techniques to produce a correct interpretation. Several specialized tasks based on recognition exist, such as: 5
  • 6. • Content-based image retrieval: find all images which has a specific content in a larger set or database of images. • Pose estimation: estimation of the position and orientation of specific object relative to the camera. Example: to allow a robot arm to pick up the objects from the belt. • Optical character recognition (or OCR): images of printed or handwritten text are converted to computer readable text such as ASCII or Unicode. Motion Several tasks relate to motion estimation in which an image sequence is processed to produce an estimate of the local image velocity at each point. Examples of such tasks are • Egomotion: determine the 3D rigid motion of the camera. • Tracking of one or several objects (e.g. vehicles or humans) through the image sequence. • Surveillance: detection of possible activities based on motion. Scene Reconstruction Given two or more images of a scene, or a video, scene reconstruction aims at computing a 3D model of the scene. In the simplest case the model can be a set of 3D points. More sophisticated methods produce a complete 3D surface model. Image Restoration Given an image, an image sequence, or a 3D volume, which has been degraded by noise, image restoration aims at producing the image data without the noise. Examples of noise processes which are considered are sensor noise (e.g., ultrasonic images) and motion blur (e.g., because of a moving camera or moving objects in the scene). Computer Vision Systems 6
A typical computer vision system can be divided into the following subsystems:

Image acquisition

The image or image sequence is acquired with an imaging system (camera, radar, lidar, tomography system). Often the imaging system has to be calibrated before being used.

Preprocessing

In the preprocessing step, the image is treated with "low-level" operations. The aim of this step is to reduce noise in the image (i.e. to dissociate the signal from the noise) and to reduce the overall amount of data. This is typically done by employing different (digital) image processing methods, such as:

1. Downsampling the image.
2. Applying digital filters.
3. Computing the x- and y-gradients (possibly also the time-gradient).
4. Segmenting the image.
   a. Pixelwise thresholding.
5. Performing an eigentransform on the image.
   a. Fourier transform.
6. Doing motion estimation for local regions of the image (also known as optical flow estimation).
7. Estimating disparity in stereo images.
8. Multiresolution analysis.

Feature extraction
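A few of the preprocessing operations listed above (downsampling, pixelwise thresholding, and x/y-gradient computation) can be sketched in a few lines of NumPy. The function name, factor, and threshold are illustrative assumptions, not values from any particular system:

```python
import numpy as np

def preprocess(image, factor=2, threshold=0.5):
    """Sketch of three low-level preprocessing steps:
    downsampling, pixelwise thresholding, and gradient computation."""
    small = image[::factor, ::factor]               # 1. downsample by striding
    binary = (small > threshold).astype(np.uint8)   # 4a. pixelwise threshold
    gy, gx = np.gradient(small)                     # 3. y- and x-gradients
    return small, binary, gx, gy

# A 4x4 ramp image with intensities from 0 to 1.
img = np.linspace(0.0, 1.0, 16).reshape(4, 4)
small, binary, gx, gy = preprocess(img)
print(small.shape)  # (2, 2)
```

Striding (`[::factor, ::factor]`) is the crudest form of downsampling; real systems usually low-pass filter first to avoid aliasing.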
The aim of feature extraction is to further reduce the data to a set of features which ought to be invariant to disturbances such as lighting conditions, camera position, noise and distortion. Examples of feature extraction are:

1. Performing edge detection or estimation of local orientation.
2. Extracting corner features.
3. Detecting blob features.
4. Extracting spin images from depth maps.
5. Extracting geons or other three-dimensional primitives, such as superquadrics.
6. Acquiring contour lines and possibly curvature zero crossings.
7. Generating features with the scale-invariant feature transform (SIFT).
8. Calculating the co-occurrence matrix of the image or sub-images to measure texture.

Registration

The aim of the registration step is to establish correspondence between the features in the acquired set and the features of known objects in a model database and/or the features of the preceding image. The registration step must produce a final hypothesis. To name a few methods:

1. Least squares estimation
2. Hough transform, in many variations
3. Geometric hashing
4. Particle filtering

Applications Of Computer Vision

The following is an incomplete list of applications which are studied in computer vision. In this category, the term application should be interpreted as a high-level function which solves a problem at a higher level of complexity. Typically, the various technical problems related to an application can be solved and implemented in different ways.
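As a minimal sketch of the first registration method listed above, least squares estimation, the snippet below recovers a 2D translation between corresponding model and image features. For a pure translation, the least-squares solution is simply the mean of the point-to-point displacements; the names and points are illustrative assumptions:

```python
import numpy as np

def estimate_translation(model_pts, image_pts):
    """Least-squares estimate of the 2D translation mapping model
    features onto corresponding image features.  For a translation-only
    model, minimizing sum ||image - (model + t)||^2 gives t = mean of
    the displacements."""
    return np.mean(image_pts - model_pts, axis=0)

model = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
observed = model + np.array([3.0, -2.0])   # noise-free for clarity
print(estimate_translation(model, observed))  # [ 3. -2.]
```

With noisy correspondences the same formula still gives the least-squares answer; richer transforms (rotation, affine) need a slightly larger linear system or Procrustes analysis.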
A facial recognition system is a computer-driven application for automatically identifying a person from a digital image. It does so by comparing selected facial features from the live image against a facial database. It is typically used in security systems and can be compared to other biometrics such as fingerprint or iris recognition systems. Popular recognition algorithms include eigenfaces, Fisherfaces, the hidden Markov model, and the neurally motivated dynamic link matching. A newly emerging trend, claimed to achieve previously unseen accuracy, is three-dimensional face recognition. Another emerging trend uses the visual details of the skin, as captured in standard digital or scanned images. Tests on the FERET database, the widely used industry benchmark, showed that this approach is substantially more reliable than previous algorithms.

Polly (robot)

Polly was a robot created at the MIT Artificial Intelligence Laboratory by Ian Horswill for his PhD, which was published in 1993 as a technical report. It was the first mobile robot to move at animal-like speeds (1 m per second) using computer vision for its navigation. It was an example of behavior-based robotics. For a few years, Polly was able to give tours of the AI laboratory's seventh floor, using canned speech to point out landmarks such as Anita Flynn's office.

The Polly algorithm is a way to navigate a cluttered space using very low-resolution vision to find uncluttered areas to move forward into, assuming that the pixels at the bottom of the frame (those closest to the robot) show an example of an uncluttered area. Since this could be done 60 times a second, the algorithm only needed to discriminate three categories, telling the robot at each instant to go straight, towards the right or towards the left.

Mobile robot
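The Polly navigation idea described above (treat the bottom of a low-resolution frame as known-clear floor, measure how far up each column still looks like floor, and steer toward the freest third of the image) can be sketched roughly as follows. The function, tolerance, and toy frame are assumptions for illustration, not Horswill's actual code:

```python
import numpy as np

def polly_steer(frame, tol=10):
    """Rough sketch of the Polly idea: the bottom row is assumed to show
    uncluttered floor; for each column, count how far up the image the
    pixels still look like that floor, then steer toward the image third
    with the most free space."""
    h, w = frame.shape
    floor = frame[-1]                       # bottom row = floor reference
    free = np.zeros(w, dtype=int)
    for x in range(w):
        for y in range(h - 1, -1, -1):      # scan each column bottom-up
            if abs(int(frame[y, x]) - int(floor[x])) > tol:
                break                       # hit something non-floor-like
            free[x] += 1
    thirds = np.array_split(free, 3)        # left / center / right
    scores = [t.mean() for t in thirds]
    return ("left", "straight", "right")[int(np.argmax(scores))]

# Toy 6x6 grayscale frame: an obstacle (value 200) blocks the right side.
frame = np.full((6, 6), 50, dtype=np.uint8)
frame[0:4, 4:6] = 200
print(polly_steer(frame))  # "left"
```

Because the per-frame output is just one of three commands, the decision can run at full frame rate even on very modest hardware, which is exactly what made Polly fast.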
Mobile robots are automatic machines that are capable of movement in a given environment. Robots generally fall into two classes: linked manipulators (or industrial robots) and mobile robots. Mobile robots have the capability to move around in their environment and are not fixed to one physical location. In contrast, industrial manipulators usually consist of a jointed arm and gripper assembly (or end effector) attached to a fixed surface. The most common class of mobile robots is wheeled robots. A second class of mobile robots includes legged robots, while a third, smaller class includes aerial robots, usually referred to as unmanned aerial vehicles (UAVs). Mobile robots are the focus of a great deal of current research, and almost every major university has one or more labs that focus on mobile robot research. Mobile robots are also found in industry, military and security environments, and appear as consumer products.

Robot

A humanoid robot manufactured by Toyota "playing" a trumpet

The word robot is used to refer to a wide range of machines, the common feature of which is that they are all capable of movement and can be used to perform physical tasks. Robots take on many different forms, ranging from humanoid, which mimic the human form and way of moving, to industrial, whose appearance is dictated by the function they are to perform. Robots can be grouped generally as mobile robots (e.g. autonomous vehicles), manipulator robots (e.g. industrial robots) and self-reconfigurable robots, which can conform themselves to the task at hand.

Robots may be controlled directly by a human, such as remotely controlled bomb-disposal robots, robotic arms, or shuttles, or may act according to their own decision-making ability, provided by artificial intelligence. However, the majority of robots fall in between these extremes, being controlled by pre-programmed computers. Such robots may include feedback loops such that they can interact with their environment, but do not display actual intelligence.
The word "robot" is also used in a general sense to mean any machine which mimics the actions of a human (biomimicry), in the physical sense or in the mental sense. It comes from the Czech and Slovak word robota, meaning labour or work (also used in the sense of a serf's labour). The word robot first appeared in Karel Čapek's science fiction play R.U.R. (Rossum's Universal Robots) in 1921.

History

The construction of a Soviet-made robot of the 1970s. The robot was able to move, reproduce pre-recorded sounds, imitate conversation using a built-in radio station and show movies on a built-in screen. It was used in various shows.

The word robot was introduced by Czech writer Karel Čapek in his play R.U.R. (Rossum's Universal Robots), which was written in 1920 (see also Robots in literature for details of the play). However, the verb robotovat, meaning "to work" or "to slave", and the noun robota (meaning corvée), used in the Czech and Slovak languages, have been in use since the early 10th century. It has been suggested that the word robot was coined by Karel Čapek's brother, the painter and writer Josef Čapek.

An early automaton was created in 1738 by Jacques de Vaucanson, who made a mechanical duck that was able to eat grain, flap its wings, and excrete.

The first human to be killed by a robot was 37-year-old Kenji Urada, a Japanese factory worker, in 1981. According to Economist.com, Urada "climbed over a safety fence at a Kawasaki plant to carry out some maintenance work on a robot. In his haste, he failed to switch the robot off properly. Unable to sense him, the robot's powerful hydraulic arm kept on working and accidentally pushed the engineer into a grinding machine."

Smart Camera

A smart camera is an integrated machine vision system which, in addition to image capture circuitry, includes a processor that can extract information from images without need for an external processing unit, and interface devices used to make results available to other devices. A smart camera, or "intelligent camera", is a self-contained, standalone vision system with a built-in image sensor in the housing of an industrial video camera. It contains all necessary communication interfaces, e.g. Ethernet. It is not necessarily larger than an
industrial or surveillance camera. This architecture has the advantage of a more compact volume compared to PC-based vision systems and often achieves lower cost, at the expense of a somewhat simpler (or altogether missing) user interface.

Early smart camera (ca. 1985, in red) with an 8 MHz Z80, compared to a modern device featuring Texas Instruments' C64 at 1 GHz.

A smart camera usually consists of several (but not necessarily all) of the following components:

1. Image sensor (matrix or linear, CCD or CMOS)
2. Image digitization circuitry
3. Image memory
4. Communication interface (RS232, Ethernet)
5. I/O lines (often optoisolated)
6. Lens holder or built-in lens (usually C- or CS-mount)

Examples Of Applications For Computer Vision

Another way to describe computer vision is in terms of application areas. One of the most prominent application fields is medical computer vision, or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. Typically, image data is in the form of microscopy images, X-ray images, angiography images, ultrasonic images, and tomography images. An example of information which can be extracted from such image data is the detection of tumours, arteriosclerosis or other malign changes. It can also be measurements of organ dimensions, blood flow, etc. This application area also supports
medical research by providing new information, e.g. about the structure of the brain, or about the quality of medical treatments.

A second application area in computer vision is in industry. Here, information is extracted for the purpose of supporting a manufacturing process. One example is quality control, where details or final products are automatically inspected in order to find defects. Another example is measurement of the position and orientation of details to be picked up by a robot arm. See the article on machine vision for more details on this area.

Military applications are probably one of the largest areas for computer vision, even though only a small part of this work is open to the public. The obvious examples are detection of enemy soldiers or vehicles and guidance of missiles to a designated target. More advanced systems for missile guidance send the missile to an area rather than a specific target, and target selection is made when the missile reaches the area, based on locally acquired image data. Modern military concepts, such as "battlefield awareness," imply that various sensors, including image sensors, provide a rich set of information about a combat scene which can be used to support strategic decisions. In this case, automatic processing of the data is used to reduce complexity and to fuse information from multiple sensors to increase reliability.

Artist's concept of a rover on Mars. Notice the stereo cameras mounted on top of the rover. (Credit: Maas Digital LLC)

One of the newer application areas is autonomous vehicles, which include submersibles, land-based vehicles (small robots with wheels, cars or trucks), and aerial vehicles. An unmanned aerial vehicle is often denoted UAV. The level of autonomy ranges from fully autonomous (unmanned) vehicles to vehicles where computer vision based systems support a driver or a pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, e.g. a UAV looking for forest fires. Examples of supporting systems are obstacle warning systems in cars and systems for autonomous landing of aircraft. Several car manufacturers have demonstrated systems for autonomous driving of cars, but this technology has still not reached a level where it can be put on the market.
Software For Computer Vision

Animal

Animal (first implementation: 1988; revised: 2004) is an interactive environment for image processing oriented toward the rapid prototyping, testing, and modification of algorithms. To create ANIMAL (AN IMage ALgebra), David Betz's XLISP was extended with some new types: sockets, arrays, images, masks, and drawables. The theoretical framework and the implementation of the working environment are described in the paper ANIMAL: AN IMage ALgebra.

In the theoretical framework of ANIMAL, a digital image is a boundless matrix. In the implementation, however, it is bounded by a rectangular region in the discrete plane, and the elements outside the region have a constant value. The size and position of the region in the plane (the focus) are defined by the coordinates of the rectangle. In this way all the pixels, including those on the border, have the same number of neighbors (useful in local operators, such as digital filters). Furthermore, pixelwise commutative operations remain commutative at the image level, independently of the focus.

OpenCV

OpenCV is an open source computer vision library developed by Intel. The library is cross-platform, and runs on both Windows and Linux. It focuses mainly on real-time image processing. The application areas include:

1. Human-computer interaction (HCI)
2. Object identification
3. Segmentation and recognition
4. Face recognition
5. Gesture recognition
6. Motion tracking

Visualization Toolkit (VTK)
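ANIMAL's convention described above, a constant value outside the image region so that every pixel, border included, has a full neighborhood, is easy to mimic with padding. The following sketch (my own illustration, not ANIMAL code) applies a 3x3 local mean under that convention:

```python
import numpy as np

def local_mean(image, outside=0.0):
    """Mimic ANIMAL's boundless-matrix convention: pixels outside the
    image region take a constant value, so border pixels have the same
    number of 3x3 neighbors as interior pixels."""
    padded = np.pad(image, 1, mode="constant", constant_values=outside)
    h, w = image.shape
    out = np.empty((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + 3, x:x + 3].mean()
    return out

img = np.ones((3, 3))
print(local_mean(img)[1, 1])  # 1.0, interior pixel, full neighborhood
print(local_mean(img)[0, 0])  # corner averages 9 values too: 4 ones, 5 zeros
```

The point of the convention is that the border needs no special-case code: every window has exactly nine values, just as the paragraph above explains.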
Visualization Toolkit (VTK) is an open source, freely available software system for 3D computer graphics, image processing, and visualization, used by thousands of researchers and developers around the world. VTK consists of a C++ class library and several interpreted interface layers, including Tcl/Tk, Java, and Python. Professional support and products for VTK are provided by Kitware, Inc. VTK supports a wide variety of visualization algorithms, including scalar, vector, tensor, texture, and volumetric methods, and advanced modeling techniques such as implicit modelling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.

Commercial Computer Vision Systems

Automatix Inc., founded in January 1980, was the first company to market industrial robots with built-in machine vision. Its founders were Victor Scheinman, inventor of the Stanford arm; Phillippe Villers, Michael Cronin, and Arnold Reinhold of Computervision; Jake Dias and Dan Nigro of Data General; Gordon VanderBrug of NBS; and Norman Wittels of Clark University.

Automatix robots at the Robots 1985 show in Detroit, Michigan. Clockwise from lower left: AID 600, AID 900 Seamtracker, Yaskawa Motoman.

Automatix mostly used robot mechanisms imported from Hitachi at first, and later from Yaskawa and KUKA. It did design and manufacture a Cartesian robot called the AID-600. The 600 was intended for use in precision assembly but was adapted for welding use, particularly tungsten inert gas (TIG) welding, which demands high accuracy and immunity from the intense electromagnetic interference that the TIG process creates. Automatix was the first company to market a vision-guided welding robot, called Seamtracker. Structured laser light and monochromatic filters were used to allow an image to be seen in the presence of the welding arc. Another concept, invented by Mr. Scheinman, was RobotWorld, a system of cooperating small modules suspended from a 2-D linear motor. The product line was later sold to Yaskawa.

Automatix raised large amounts of venture capital and went public in 1983, but was not profitable until the early 1990s. In 1994, Automatix merged with another machine vision company, Itran Corp., to form Acuity Imaging, Inc. Acuity was acquired by Robotics Vision Systems Inc. (RVSI) in September 1995. As of 2004, RVSI still supported the evolved Automatix machine vision package under the PowerVision brand.

RapidEye is a commercial multispectral remote sensing satellite mission being designed and implemented by MDA for RapidEye AG. The RapidEye sensor images five optical bands in the 400–850 nm range and provides 5 m pixel size at nadir. Rapid delivery and short revisit times are provided through the use of a five-satellite constellation.

Scantron is the name of a United States company that makes and sells Scantron exam answer sheets and the machines to grade them. The Scantron system usually takes the form of a "multiple choice, fill-in-the-circle/square/rectangle" form of varying length and width, from single-column 50-answer tests to multiple 8.5" x 11" page forms used in standardized testing such as the SAT and ACT. The forms are sensed optically, using optical mark recognition to detect markings in each place, in a "Scantron machine" that tabulates and can automatically grade results. Earlier versions were sensed electrically.
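Optical mark recognition of the kind just described boils down to checking how dark each answer bubble's region is. Here is a minimal sketch; the grid layout, fill threshold, and array shapes are my own assumptions for illustration, not Scantron's actual form format:

```python
import numpy as np

def read_marks(sheet, rows, cols, fill_threshold=0.5):
    """Split a grayscale sheet (0 = white, 1 = dark) into a rows x cols
    grid of bubble cells and report which cells are filled in."""
    h, w = sheet.shape
    ch, cw = h // rows, w // cols
    marks = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            cell = sheet[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
            marks[r, c] = cell.mean() > fill_threshold  # dark enough = marked
    return marks

# Toy sheet: 2 questions x 4 choices; question 0 marked "B", question 1 "D".
sheet = np.zeros((4, 8))
sheet[0:2, 2:4] = 1.0   # question 0, choice index 1
sheet[2:4, 6:8] = 1.0   # question 1, choice index 3
answers = ["ABCD"[int(np.argmax(row))] for row in read_marks(sheet, 2, 4)]
print(answers)  # ['B', 'D']
```

Thresholding the mean darkness per cell is why pencil marks must be filled in firmly: a faint mark may fall below the threshold and be read as blank.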
A typical 100-answer Scantron answer sheet. This is only half of it (the front side), with the back side not being shown.

Commonly, there are two sides to Scantron answer sheets. They can contain 50 answer blanks, 100 answer blanks, and so on. There is even a smaller form, called a "Quiz Strip", that contains only about 20 answer boxes to bubble in. On the larger sheets there is a space on the back where answers can be manually written in for separate questions, if a test giver issues them. The full-sized 8.5" x 11" form may contain a larger area for working out math formulas, writing short answers, etc. Answers "A" and "B" are commonly used for "True" and "False" questions, as shown in the image to the right at the top of each row.

Grading of Scantron sheets is performed by first creating an answer key. The answer key is simply a standard Scantron answer sheet with all of the correct answers filled in, along with the "key" rectangle marked at the top of the sheet. Once the answer key is ready, the Scantron machine is powered on and the answer key is fed through. This stores the answer key in the memory of the Scantron machine, and any further sheets that are fed through will be graded and marked according to the key in memory. Switching off the Scantron machine stops the paper feed and clears the memory.
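The grading workflow above (feed the key first, then score every later sheet against the key held in memory) reduces to a simple comparison once the marks have been read. A hypothetical sketch in pure Python, with names of my own choosing:

```python
def grade_sheets(key, sheets):
    """Mimic the Scantron workflow: the answer key is stored first, then
    each sheet is scored by counting answers that match the key."""
    results = []
    for sheet in sheets:
        correct = sum(a == k for a, k in zip(sheet, key))
        results.append(correct)
    return results

key = ["A", "C", "B", "D", "A"]
sheets = [["A", "C", "B", "D", "A"],   # perfect score
          ["A", "B", "B", "D", "C"]]   # two wrong
print(grade_sheets(key, sheets))  # [5, 3]
```

Switching the machine off corresponds to discarding `key`: the next session must feed a key sheet through again before any grading can happen.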
Conclusion

Computer vision, unlike for example factory machine vision, happens in unconstrained environments, potentially with changing cameras, changing lighting and changing camera views. Also, some "objects", such as roads, rivers and bushes, are simply difficult to describe. In these situations, engineering a model a priori can be difficult. With learning-based vision, one just "points" the algorithm at the data, and useful models for detection, segmentation, and identification can often be formed. Learning can also easily fuse or incorporate other sensing modalities, such as sound, vibration, or heat. Since cameras and sensors are becoming cheap and powerful, and learning algorithms have a vast appetite for computational threads, Intel is very interested in enabling geometric and learning-based vision routines in its OpenCV library, since such routines are vast consumers of computational power.