This document discusses computer vision and its capabilities. It notes that computer vision is ready for mainstream use, evolving faster than expected, and a gateway for AI applications. Computer vision allows extracting useful information from visual data, and examples of its applications include vehicle navigation, farming, visual search, and monitoring brand exposure. It is changing everything by unlocking the visual world for computers. Computer vision also creates richer experiences by understanding facial expressions and emotions. Getting started with computer vision is easier than expected using commercial cloud APIs that require no training data.
2. ● It’s ready for prime time
● It’s evolving faster than you think
● It’s a gateway capability
● It’s changing everything
● It creates a richer experience
● It’s easier to use than you think
COMPUTER VISION
OUR JOURNEY
12. IMAGE SIZING
VISION API FEATURE       RECOMMENDED SIZE   NOTES
FACE_DETECTION           1600 x 1200        Distance between eyes is most important
LANDMARK_DETECTION       640 x 480
LOGO_DETECTION           640 x 480
LABEL_DETECTION          640 x 480
TEXT_DETECTION           1024 x 768         OCR requires more resolution to detect characters
SAFE_SEARCH_DETECTION    640 x 480
cloud.google.com/vision/docs/supported-files
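The sizing table above can be turned into a quick pre-flight check before an image is sent to the API. This is only a minimal sketch: the sizes are copied from the table, while the function names and the "at least this wide and this tall" rule are assumptions made for illustration, not something the Vision API enforces.

```python
# Recommended sizes (width, height in pixels), copied from the slide's table.
RECOMMENDED_SIZES = {
    "FACE_DETECTION": (1600, 1200),
    "LANDMARK_DETECTION": (640, 480),
    "LOGO_DETECTION": (640, 480),
    "LABEL_DETECTION": (640, 480),
    "TEXT_DETECTION": (1024, 768),
    "SAFE_SEARCH_DETECTION": (640, 480),
}


def meets_recommended_size(feature, width, height):
    """Return True if the image is at least the recommended size.

    Assumption: "recommended" is treated as a minimum in both dimensions
    for a landscape image. This helper is hypothetical, not part of any SDK.
    """
    rec_width, rec_height = RECOMMENDED_SIZES[feature]
    return width >= rec_width and height >= rec_height


def undersized_features(width, height):
    """List the features for which this image is smaller than recommended."""
    return sorted(
        feature
        for feature in RECOMMENDED_SIZES
        if not meets_recommended_size(feature, width, height)
    )


if __name__ == "__main__":
    # An 800 x 600 image is fine for labels but small for faces and text.
    print(undersized_features(800, 600))
```

A check like this is cheap to run on upload, so an undersized photo can be flagged before it produces a weak face or OCR result.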
13. WHERE IS THE A.I.?
In the system’s ability to determine objects and
context within each image.
Compared to having people view and judge images
manually.
Remember: A.I. is subjective
14. WHAT IS COMPUTER VISION?
And just to clarify…
Visual recognition — exploring face detection,
emotion recognition, text extraction, damage
identification, context awareness, and more.
15. EXAMPLES IN USE TODAY
Manufacturing
● Ensure products are positioned correctly on an
assembly line
Visual auditing
● Monitor for compliance or deterioration in fleet of
trucks, planes, or windmills
● Train classifiers to understand what defects look
like
Insurance
● Quickly process claims by classifying images of
claims
Social listening
● Track buzz about your company on social media
Security
● Monitor for activity, instantly classify objects as
threat or not
Social commerce
● Use an image of a food dish to find out which
restaurant serves it and find reviews
● Use a travel photo to find vacation suggestions
based on similar experiences
● Use a house image to find similar homes that are
for sale
Retail
● Take a photo of a favorite outfit to find stores with
those clothes in stock or on sale
● Use a travel image to find retail suggestions in that
area
Education
● Create image-based applications to educate about
taxonomies
● Use pictures to find educational material on similar
subjects
RESOURCES
16. Computers have already matched — and exceeded —
human capabilities when it comes to understanding
visual information.
Now it’s just a question of finding uses and applying
this new capability in productive ways.
READY FOR PRIME TIME
RECAP
18. ALREADY BETTER THAN HUMANS
February, 2015 — “...researchers say their system
achieved a 4.94% error rate... In previous experiments,
humans have achieved an estimated 5.1% error rate.”
microsoft.com/en-us/research/blog/microsoft-researchers-algorithm-sets-imagenet-challenge-milestone
19. AND COMING SOON...CREATE
Write a text description of an
existing image.
Given a text description,
generate an image from scratch.
petapixel.com/2016/09/23/googles-image-captioning-ai-can-describe-photos-94-accuracy
youtu.be/rAbhypxs1qQ?t=5s
[ 2016 ]
21. Computer vision is already better than humans at
identifying objects within images.
And computers have only just started learning how to create
images from scratch, yet they are already pretty good at it.
FASTER THAN YOU THINK
RECAP
23. If you’re looking for a place to begin your A.I. journey,
this is a good starting point...
24. THE PROBLEM
We live in a visual world, yet capturing useful
information from images has historically required
human vision — which can be slow and costly.
25. THE GOAL
But if we could extract that useful information
through computer vision, it could provide
invaluable insight for business.
26. THE SOLUTION
An intelligent visual recognition service that
automatically analyzes and identifies objects and
scenes in image files (video, etc.).
27. A HIGH PROFILE EXAMPLE
Facial recognition
systems are coming on
strong and being used
in a wide variety of
applications.
youtu.be/K4u4Dpl6NKk?t=1m9s
28. Visual recognition is 1 of 2 critical capabilities that will
allow artificial intelligence to integrate into and
empower our world in ways we’ve only dreamed of.
It allows computers to interact with humans on our
own terms — to integrate into our daily lives.
GATEWAY CAPABILITY
RECAP
Speech is the other one.
37. IT’S EVEN CHANGING ‘PEOPLE
WATCHING’
medium.com/homeland-security/no-longer-just-another-face-in-the-crowd-15e1c74fe24
38. Giving computers access to the visual world will
empower our work and lives.
Amplifying human productivity in endless ways.
IT’S CHANGING EVERYTHING
RECAP
40. “Customer
experience is the
new battlefield.”
~ Gartner, 2015
accenture.com/us-en/insight-artificial-intelligence-ui
EXPERIENCE ABOVE ALL
41. VIP TREATMENT
A top customer walks into your store.
The system instantly recognizes them, issues a
personalized greeting, and alerts an attendant.
And/or…
Allows you to track store visits just as we track web
visits in Google Analytics (faces vs. IP addresses).
44. A RICHER EXPERIENCE
Computers can now
see, read — and act
upon — the full
spectrum of our
communication.
Text Only: language sentiment analysis (7%)
Text + Speech: adding voice tone and voice inflection (38%)
Text + Speech + Vision: adding facial expressions and body language (55%)
en.wikipedia.org/wiki/Albert_Mehrabian
RECAP
45. It’s often said that the verbal and audible elements of
communication only make up 45% of what is being
said.
A RICHER EXPERIENCE
47. PICK YOUR STARTING POINT
A.I. Maturity
No training data required!
Purpose-Built Platform, Their Training Data
Commercial Platform, Their Training Data
Commercial Platform, Your Training Data
In-House Platform, Your Training Data
48. NOTHING NEW HERE
Most of the major cloud providers
have purpose-built A.I.-powered APIs.
Many have a free tier.
And they work exactly like
every other API you’re already using.
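To make the "just another API" point concrete, here is a rough sketch of a label-detection call against the Google Cloud Vision REST endpoint using only the Python standard library. The endpoint and request shape follow the public `images:annotate` format as I understand it; the helper names, the API-key query parameter, and the default of five results are assumptions for illustration, so check the current Vision documentation before relying on them.

```python
import base64
import json
import urllib.request

# Public REST endpoint for Cloud Vision image annotation.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_type="LABEL_DETECTION", max_results=5):
    """Build the JSON body for one image and one feature.

    Hypothetical helper: the image is sent inline as base64 text.
    """
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type, "maxResults": max_results}],
            }
        ]
    }


def annotate(image_bytes, api_key, feature_type="LABEL_DETECTION"):
    """POST the request and return the parsed JSON response.

    Assumes authentication with an API key passed as a query parameter.
    """
    body = json.dumps(build_annotate_request(image_bytes, feature_type)).encode("utf-8")
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Swapping `feature_type` for another value from the sizing table (for example `TEXT_DETECTION`) is the only change needed to ask a different question about the same image, which is most of what "works like every other API" means in practice.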
49. LOTS OF COMMERCIAL OPTIONS
RESOURCES
Jumping back to our demo above, here are some
alternative commercial APIs…
● IBM Watson Visual Recognition
● Microsoft Computer Vision API
● Clarifai
52. Start with a pre-trained cloud API (no training data
required).
Most cloud providers offer a free tier. So start thinking
about the different ways to use computer vision.
Then just start testing and see what you can do.
EASIER THAN YOU THINK
RECAP
Don't get me wrong, there's an insane amount of complexity behind the scenes.
But fortunately, you don't need to know that stuff to take advantage of A.I.
54. HOW-TO GUIDES
● Building Voice-Enabled Products With Amazon Alexa
● Cognitive Customer Engagement Using IBM Watson
● Harnessing Visual Data Using Google Cloud
● Building a Recommendation Engine Using Microsoft Azure
● Predicting Marketing Campaign Response Using Amazon Machine Learning
● Unleashing A.I.-Powered Conversation With IBM Watson
● Get into the Mind of Your Customer Using Google’s Sentiment Analysis Tools
● Discover Your Customers’ Deepest Feelings Using Microsoft Facial Recognition
● Give Your Products the Power of Speech Using Amazon Polly
● Computers Are Opening Their Eyes — and They’re Already Better at Seeing Than We Are
● How to Predict When You’re Going to Lose a Subscriber
● The Future of Business is a Digital Spokesperson — Let’s Build a Preview Using Microsoft’s Bot
Framework
● Predicting Personality Traits from Content Using IBM Watson
RESOURCES
How to build the demo
app in this session
55. ● Computer vision is ready for prime time
● It’s coming faster than you think
● It’s a gateway capability
● It’s changing everything
● It creates a richer experience
● It’s easier to use than you think
JOURNEY’S END
RECAP
56. COMING UP...
Laying the foundation
● Cutting Through the Hype
2 A.I. Technologies that will have the greatest impact
● Computer Speech
● Computer Vision
2 A.I. Applications with the quickest R.O.I.
● Predictive Engagement
● Predictive Personalization
STAY TUNED
57. QUESTIONS OR COMMENTS?
Gigaom A.I. Team: ai@gigaom.com
Workshop Facilitator: chris.mohritz@10xeffect.com
CONTACT