SlideShare una empresa de Scribd logo
1 de 39
Descargar para leer sin conexión
© 2011 IBM Corporation
IBM Research
What is Watson – An Overview
Michael Perrone
Mgr., Multicore Computing
IBM TJ Watson Research Center
1
With special thanks to the
IBM Research DeepQA Team!
© 2011 IBM Corporation
IBM Research
Informed Decision Making: Search vs. Expert Q&A
Decision Maker
Search Engine
Finds Documents containing Keywords
Delivers Documents based on Popularity
Has Question
Distills to 2-3 Keywords
Reads Documents, Finds
Answers
Finds & Analyzes Evidence
Expert
Understands Question
Produces Possible Answers & Evidence
Delivers Response, Evidence & Confidence
Analyzes Evidence, Computes Confidence
Asks NL Question
Considers Answer & Evidence
Decision Maker
© 2011 IBM Corporation
IBM Research
Want to Play Chess or Just Chat?
Chess
– A finite, mathematically well-defined search space
– Limited number of moves and states
– All the symbols are completely grounded in the mathematical rules of the game
Human Language
– Words by themselves have no meaning
– Only grounded in human cognition
– Words navigate, align and communicate an infinite space of intended meaning
– Computers can not ground words to human experiences to derive meaning
© 2011 IBM Corporation
IBM Research
Easy Questions?
4 IBM Confidential
(ln(12,546,798 * π)) ^ 2 / 34,567.46 =
Owner Serial Number
David Jones 45322190-AK
Serial Number Type Invoice #
45322190-AK LapTop INV10895
Invoice # Vendor Payment
INV10895 MyBuy $104.56
David Jones
David Jones =
0.00885
Select Payment where Owner=“David Jones” and Type(Product)=“Laptop”,
Dave Jones
David Jones
≠
© 2011 IBM Corporation
IBM Research
Hard Questions?
Computer programs are natively explicit, fast and exacting in their calculation over
numbers and symbols….But Natural Language is implicit, highly contextual,
ambiguous and often imprecise.
Where was X born?
One day, from among his city views of Ulm, Otto chose a water color to
send to Albert Einstein as a remembrance of Einstein´s birthplace.
X ran this?
If leadership is an art then surely Jack Welch has proved himself a
master painter during his tenure at GE.
Person Birth Place
A. Einstein ULM
Person Organization
J. Welch GE
Structured
Unstructured
© 2011 IBM Corporation
IBM Research
The Jeopardy! Challenge: A compelling and notable way to drive and
measure the technology of automatic Question Answering along 5 Key Dimensions
$600
In cell division, mitosis
splits the nucleus &
cytokinesis splits this
liquid cushioning the
nucleus
$200
If you're standing, it's the
direction you should
look to check out the
wainscoting.	
  
$2000
Of the 4 countries in the
world that the U.S. does
not have diplomatic
relations with, the one
that’s farthest north
$1000
The first person
mentioned by name in
‘The Man in the Iron
Mask’ is this hero of a
previous book by the
same author.
© 2011 IBM Corporation
IBM Research
Basic Game Play
Technology Classics The Great
Outdoors
Speak of
the Dickens
Mind Your
Manners
Before and
After
$200 $200 $200 $200 $200 $200
$400 $400 $400 $400 $400 $400
$600 $600 $600 $600 $600 $600
$800 $800 $800 $800 $800 $800
$1000 $1000 $1000 $1000 $1000 $1000
6 Categories
5 Levels of
Difficulty
1 of 3 Players Selects a Clue
Host reads Clue out loud
ALL POLICEMEN CAN THANK
STEPHANIE KWOLEK FOR
HER INVENTION OF THIS
POLYMER FIBER, 5 TIMES
TOUGHER THAN STEEL
TECHNOLOGY
 All Players compete to answer
 1st to buzz-in gets to answer
 IF correct
earns $ value
selects Next Clue
 IF wrong
 loses $ value
 other players buzz again
(rebounds)
 Two Rounds Per Game + Final Question
 ONE Daily Double in First Round, TWO in 2nd Round
© 2011 IBM Corporation
IBM Research
8
Broad Domain
0.00%
0.50%
1.00%
1.50%
2.00%
2.50%
3.00%
he
film
group
capital
woman
song
singer
show
composer
title
fruit
planet
there
person
language
holiday
color
place
son
tree
line
product
birds
animals
site
lady
province
dog
substance
insect
way
founder
senator
form
disease
someone
maker
father
words
object
writer
novelist
heroine
dish
post
month
vegetable
sign
countries
hat
bay
Our Focus is on reusable NLP technology for analyzing volumes of as-is text.
Structured sources (DBs and KBs) are used to help interpret the text.
We do NOT attempt to anticipate all questions and build specialized databases.
In a random sample of 20,000 questions we found
2,500 distinct types*. The most frequent occurring <3% of the time.
The distribution has a very long tail.
And for each these types 1000’s of different things may be asked.
*13% are non-distinct (e.g., it, this, these or NA)
Even going for the head of the tail will
barely make a dent
© 2011 IBM Corporation
IBM Research
Automatic Learning From “Reading”
Officials Submit Resignations (.7)
People earn degrees at schools (0.9)
Inventors patent inventions (.8)
Volumes of Text Syntactic Frames Semantic Frames
Vessels Sink (0.7)
People sink 8-balls (0.5) (in pool/0.8)
subject verb
object
Sentence
Parsing
Generalization &
Statistical Aggregation
Fluid is a liquid (.6)
Liquid is a fluid (.5)
© 2011 IBM Corporation
IBM Research
Evaluating Possibilities and Their Evidence
Is(“Cytoplasm”, “liquid”) = 0.2
Is(“organelle”, “liquid”) = 0.1
In cell division, mitosis splits the nucleus & cytokinesis
splits this liquid cushioning the nucleus.
Is(“vacuole”, “liquid”) = 0.2
Is(“plasma”, “liquid”) = 0.7
“Cytoplasm is a fluid surrounding the nucleus…”
Wordnet  Is_a(Fluid, Liquid)  ?
Learned  Is_a(Fluid, Liquid)  yes.
↑
 Organelle
 Vacuole
 Cytoplasm
 Plasma
 Mitochondria
 Blood …
Many candidate answers (CAs) are generated from many different searches
Each possibility is evaluated according to different dimensions of evidence.
Just One piece of evidence is if the CA is of the right type. In this case a “liquid”.
© 2011 IBM Corporation
IBM Research Keyword Evidence
IBM Confidential
celebrated
India
In May
1898
400th
anniversary
arrival in
Portugal
India
In May
Gary
explorer
celebrated
anniversary
in Portugal
Keyword Matching
Keyword Matching
Keyword Matching
Keyword Matching
Keyword Matching
11
arrived in
In May, Gary arrived in
India after he celebrated
his anniversary in Portugal.
In May 1898 Portugal celebrated
the 400th anniversary of this
explorer’s arrival in India.
This evidence
suggests “Gary” is
the answer BUT the
system must learn
that keyword
matching may be
weak relative to other
types of evidence
© 2011 IBM Corporation
IBM Research
celebrated
May 1898 400th anniversary
arrival
in
In May 1898 Portugal celebrated
the 400th anniversary of this
explorer’s arrival in India.
Portugal
landed in
27th May 1498
Vasco da Gama
Temporal
Reasoning
Statistical
Paraphrasing
GeoSpatial
Reasoning
explorer
On the 27th of May 1498, Vasco da
Gama landed in Kappad Beach
Kappad Beach
Para-
phrases
Geo-KB
Date
Math
12
India
Stronger
evidence can
be much
harder to find
and score. The evidence is still not 100% certain.
Search Far and Wide
Explore many hypotheses
Find Judge Evidence
Many inference algorithms
Deeper Evidence
© 2011 IBM Corporation
IBM Research
Not Just for Fun
A long, tiresome speech delivered by a frothy pie topping
Answer:
Meringue
Harangue
Harangue
Meringue
.
Diatribe
.
.
.
.
Whipped Cream
.
.
.
.
.
Category: Edible Rhyme Time
13
Some Questions require
Decomposition and Synthesis
© 2011 IBM Corporation
IBM Research
In 1968
When "60 Minutes" premiered this man was U.S. president.
this man was U.S. president.
Lyndon B Johnson
?
The DeepQA architecture attempts different decompositions and
recursively applies the QA algorithms
14
Divide and Conquer
(Typical in Final Jeopardy!) Must identify and solve
sub-questions from different
sources to answer
the top level question
© 2011 IBM Corporation
IBM Research
The Missing Link
On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.
TV remote controls,
Buttons
Shirts, Telephones
Mt
Everest
He was first
Edmund
Hillary
© 2011 IBM Corporation
IBM Research
16
The Best Human Performance: Our Analysis Reveals the Winner’s Cloud
Winning Human
Performance
2007 QA Computer System
Grand Champion
Human Performance
Top human
players are
remarkably
good.
Each dot represents an actual historical human Jeopardy! game
More Confident Less Confident
Computers?
© 2011 IBM Corporation
IBM Research DeepQA: The Technology Behind Watson
Massively Parallel Probabilistic Evidence-Based Architecture
Generates and scores many hypotheses using a combination of 1000’s Natural Language
Processing, Information Retrieval, Machine Learning and Reasoning Algorithms.
These gather, evaluate, weigh and balance different types of evidence to deliver the answer with
the best support it can find.
. . .
Answer
Scoring
Models
Answer &
Confidence
Question
Evidence
Sources
Models
Models
Models
Models
Models
Primary
Search
Candidate
Answer
Generation
Hypothesis
Generation
Hypothesis and
Evidence Scoring
Final Confidence
Merging &
Ranking
Synthesis
Answer
Sources
Question &
Topic
Analysis
Evidence
Retrieval
Deep
Evidence
Scoring
Learned Models
help combine and
weigh the Evidence
Hypothesis
Generation
Hypothesis and Evidence
Scoring
Question
Decomposition
1000’s of
Pieces of Evidence
Multiple
Interpretations
100,000’s Scores from
many Deep Analysis
Algorithms
100’s
sources
100’s Possible
Answers
Balance
& Combine
© 2011 IBM Corporation
IBM Research
Some Growing Pains (early answers)
THE AMERICAN DREAM
Decades before Lincoln, Daniel Webster spoke of government "made
for", "made by" & "answerable to" them
NEW YORK TIMES HEADLINES
An exclamation point was warranted for the "end of" this! In 1918
MILESTONES
In 1994, 25 years after this event, 1 participant said, "For one crowning
moment, we were creatures of the cosmic ocean”
THE QUEEN'S ENGLISH
Give a Brit a tinkle when you get into town & you've done this
The Big Bang
Urinate
No One
A sentence
Apollo 11 moon landing
The People
WW I
Call on the phone
© 2011 IBM Corporation
IBM Research
Some Growing Pains (early answers)
WORLD FACTS
The Denmark Strait separates these 2 islands by about 200 miles
Greenland and Taiwan
Greenland and Iceland
BOXINGS TERMS
Rhyming Term for a hit below the belt
Wang bang
Low blow
FATHERLY NICKNAMES
This Frenchman was "The Father of Bacteriology" How Tasty Was My
Little Frenchman
Louis Pasteur
(no category)
What to grasshoppers eat?
Kosher
???
© 2011 IBM Corporation
IBM Research
Grouping Features to produce Evidence Profiles
Clue: Chile shares its longest land border with this country.
-0.2
0
0.2
0.4
0.6
0.8
1 Argentina Bolivia Bolivia is more
Popular due to a
commonly
discussed
border dispute..
Positive Evidence
Negative Evidence
© 2011 IBM Corporation
IBM Research
Evidence: Time, Popularity, Source, Classification etc.
Clue: You’ll find Bethel College and a Seminary in this “holy” Minnesota city.
Saint Paul
South Bend
There’s a Bethel College and a Seminary in both
cities. System is not weighing location evidence
high enough to give St. Paul the edge.
© 2011 IBM Corporation
IBM Research
Evidence: Puns
Clue: You’ll find Bethel College and a Seminary in this “holy” Minnesota city.
Saint Paul
South Bend
Humans may get this based on the pun since St.
Paul since is a “holy” city. We added a Pun Scorer
that discovers and scores Pun relationships.
© 2011 IBM Corporation
IBM Research
Categories are not as simple as the seem
Watson uses statistical machine learning to discover that Jeopardy!
categories are only weak indicators of the answer type.
St. Petersburg is home to
Florida's annual
tournament in this game
popular on shipdecks
(Shuffleboard)
U.S. CITIES
Rochester, New York
grew because of its
location on this
(the Erie Canal)
From India, the shashpar
was a multi-bladed
version of this spiked club
(a mace)
Country Clubs
A French riot policeman
may wield this, simply the
French word for "stick“
(a baton)
Archibald MacLeish?
based his verse play "J.B."
on this book of the Bible
(Job)
Authors
In 1928 Elie Wiesel was
born in Sighet, a
Transylvanian village in
this country
(Romania)
© 2011 IBM Corporation
IBM Research
In-Category Learning CELEBRATIONS
OF THE MONTH
	
  What	
  the	
  Jeopardy!	
  Clue	
  is	
  asking	
  for	
  is	
  NOT	
  always	
  obvious	
  
	
  Watson	
  can	
  try	
  to	
  infer	
  the	
  type	
  of	
  thing	
  being	
  asked	
  for	
  from	
  the	
  previous	
  answers.	
  
	
  In	
  this	
  example	
  aBer	
  seeing	
  2	
  correct	
  answers	
  Watson	
  starts	
  to	
  dynamically	
  learn	
  a	
  confidence	
  
that	
  the	
  quesFon	
  is	
  asking	
  for	
  something	
  that	
  it	
  can	
  classify	
  as	
  a	
  “month”.	
  
Clue Type Watson’s Answer Correct Answer
D-DAY ANNIVERSARY & MAGNA
CARTA DAY
day Runnymede June
NATIONAL PHILANTHROPY DAY &
ALL SOULS' DAY
day Day of the
Dead
November
NATIONAL TEACHER DAY &
KENTUCKY DERBY DAY
Day/month(.2) Churchill
Downs
May
ADMINISTRATIVE PROFESSIONALS
DAY & NATIONAL CPAS GOOF-OFF
DAY
day / month(.6) April April
NATIONAL MAGIC DAY & NEVADA
ADMISSION DAY
day / month(.8) October October
© 2011 IBM Corporation
IBM Research
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Precision
% Answered
Baseline 12/06
v0.1 12/07
v0.3 08/08
v0.5 05/09
v0.6 10/09
v0.8 11/10
v0.4 12/08
DeepQA: Incremental Progress in Answering Precision
on the Jeopardy Challenge: 6/2007-11/2010
v0.2 05/08
IBM Watson
Playing in the Winners Cloud
V0.7 04/10
© 2011 IBM Corporation
IBM Research
Betting
Game Strategy
Managing the Luck of the Draw
IBM Confidential
Dynamic
Confidence
Clue
Selection
Risk
Management
Modulate
Aggressiveness
Direct Impact
on Earnings
DDs & FJ
Learn at lower
risk
Prior
Probabilities
Performance
Relative to
Competition
Common
Betting
Patterns
Bankroll Bet
© 2011 IBM Corporation
IBM Research
One Jeopardy! question can take 2 hours on a single 2.6Ghz Core
Optimized & Scaled out on 2,880-Core Power750 using UIMA-AS,
Watson is answering in 2-6 seconds.
Question
100s Possible
Answers
1000’s of
Pieces of Evidence
Multiple
Interpretations
100,000’s scores from many simultaneous
Text Analysis Algorithms
100s sources
. . .
Hypothesis
Generation
Hypothesis and
Evidence Scoring
Final Confidence
Merging &
Ranking
Synthesis
Question &
Topic
Analysis
Question
Decomposition
Hypothesis
Generation
Hypothesis and Evidence
Scoring
Answer &
Confidence
© 2011 IBM Corporation
IBM Research
Precision, Confidence & Speed
Deep Analytics – Combining many analytics in a
novel architecture, we achieved very high levels of
Precision and Confidence over a huge variety of as-is
content.
Speed – By optimizing Watson’s computation for
Jeopardy! on over 2,800 POWER7 processing cores
we went from 2 hours per question on a single
CPU to an average of just 3 seconds.
Results – in 55 real-time sparring games against
former Tournament of Champion Players last year,
Watson put on a very competitive performance in all
games -- placing 1st in 71% of the them!
© 2011 IBM Corporation
IBM Research
IBM Confidential
Next Steps
 Demonstrate Watson’s Champion-Level Performance
 Broader Scientific Research: Open-Advancement of QA
–Publish Detailed Experimental Results – What worked, What failed and why
–Facilitate Broader Collaborative Research with Universities
–Advance a common architecture and platform for intelligent QA systems
 Business Applications
–Work with clients to understand and demonstrate how to impact real-world
Business Applications with DeepQA technology
29
© 2011 IBM Corporation
IBM Research
Potential Business Applications
Tech Support: Help-desk, Contact Centers
Healthcare / Life Sciences: Diagnostic Assistance, Evidenced-
Based, Collaborative Medicine
Enterprise Knowledge Management and Business
Intelligence
Government: Improved Information Sharing
and Security
• Each	
  dimension	
  contributes	
  to	
  supporFng	
  or	
  refuFng	
  hypotheses	
  based	
  on	
  
– Strength	
  of	
  evidence	
  	
  
– Importance	
  of	
  dimension	
  for	
  diagnosis	
  (learned	
  from	
  training	
  data)	
  
• Evidence	
  dimensions	
  are	
  combined	
  to	
  produce	
  an	
  overall	
  confidence	
  
Evidence	
  Profiles	
  from	
  disparate	
  data	
  is	
  a	
  powerful	
  idea	
  
PosiFve	
  
Evidence	
  
NegaFve	
  
Evidence	
  
Overall Confidence
© 2011 IBM Corporation
IBM Research
Watson and Power Systems
 Watson represents the future of systems design, where workload optimized
systems are tailored to fit the requirements of a new era of smarter solutions.
 Watson is a workload optimized system designed for complex analytics, made
possible by integrating massively parallel POWER7 processors, Linux and
DeepQA software to answer Jeopardy! questions in under three seconds.
 Building Watson on commercially available POWER7 systems and Linux
ensures the acceleration of businesses adopting workload optimized systems
in industries where knowledge acquisition and analytics are important.
– Rice University uses the Power 755 with Linux for the massive analytics
processing behind its cancer research program.
– GHY International, a provider of U.S. and Canadian customs brokerage
services, uses the Power 750 with AIX, IBM i and Linux to deploy new
business services in as little as five minutes.
Power 750
© 2011 IBM Corporation
IBM Research
 90 x IBM Power 7501 servers
 2880 POWER7 cores
 POWER7 3.55 GHz chip
 500 GB per sec on-chip bandwidth
 10 Gb Ethernet network
 15 Terabytes of memory
 20 Terabytes of disk, clustered
 Can operate at 80 Teraflops
 Runs IBM DeepQA software
 Scales out with and searches vast amounts of unstructured
information with UIMA & Hadoop open source components
 SUSE Linux provides a cost-effective open platform which is
performance-optimized to exploit POWER 7 systems
 10 racks include servers, networking, shared disk system,
cluster controllers
Watson Workload Optimized System (Power 750)
1 Note that the Power 750 featuring POWER7 is a commercially available
server that runs AIX, IBM i and Linux and has been in market since Feb 2010
© 2011 IBM Corporation
IBM Research
In summary, Watson is good but…
Watson
– 10 compute racks
– 80kW of power
– 20 tons of cooling
IBM Confidential
Human
– 1 brain - fits in a shoe box
– Can run on a tuna-fish sandwich
– Can be cooled with a hand-held paper fan.
© 2011 IBM Corporation
IBM Research
Questions?
IBM Confidential
http://w3.ibm.com/ibm/resource/res_watson_grand_challenge.html
More info:
© 2011 IBM Corporation
IBM Research
BACK UPS
IBM Confidential
© 2011 IBM Corporation
IBM Research
Watson’s
QA Engine
2,880
IBM Power750
Compute
Cores
15 TB of
Memory
Strategy
Text-to-Speech
Jeopardy!
Game
Control
System
Human Player
2
Clue Grid
Watson’s
Game
Controller
Real-Time Game Configuration
Used in Sparring and Exhibition Games
Clues, Scores & Other Game Data
Insulated and
Self-Contained
Answers &
Confidences
Human Player
1
Clue &
Category
Decisions to
Buzz and Bet
Analysis of natural language content
equivalent to 1 Million Books
© 2011 IBM Corporation
IBM Research
Watson Does and Does NOT
Does
 Speaks Clue Selections
 Speaks Responses
 Physically Presses the Button
 Gets the clue electronically
when you see it
Does NOT
 Hear
– Can answer with same wrong answer
 See
 Process Audi/Visual Clues
– These are excluded from the contest
 Buzz Perfectly
– Watson is quick but not the quickest
– Variable time to compute
confidences and answers
– Confidence-Weighed Buzzing
– Not always fast or confident enough
© 2011 IBM Corporation
IBM Research
ABSTRACT
The TV quiz show Jeopardy! is famous for providing contestants
answers to which they must supply the correct questions in order to
win. Contestants must be fast, have an almost encyclopedic
knowledge of the world, and they must be able to handle clues that
are vague, involve double-meanings and frequently rely on puns.
Early this year, Jeopardy! aired a match involving the two all-time
most successful Jeopardy! contestants and Watson, a artificial
intelligence system designed by IBM. Watson won the Jeopardy!
match by a wide margin and in doing so, brought the leading edge
of computer technology a little closer to human abilities. This
presentation will describe the supercomputer implementation of
Watson use for the Jeopardy! match and the challenges overcome
to create a computer capable of accurately answering open-ended
natural language questions in real-time - typically in under 3
seconds.
IBM Confidential

Más contenido relacionado

Similar a What is Watson – An Overvie.pdf

ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of WatsonESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
eswcsummerschool
 
Sis sat 1000 josh dreller
Sis sat 1000 josh drellerSis sat 1000 josh dreller
Sis sat 1000 josh dreller
MediaPost
 
Sp14 cs188 lecture 1 - introduction
Sp14 cs188 lecture 1  - introductionSp14 cs188 lecture 1  - introduction
Sp14 cs188 lecture 1 - introduction
Amer Noureddin
 
AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018
AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018
AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018
CodeOps Technologies LLP
 

Similar a What is Watson – An Overvie.pdf (20)

Speech Recognition: Art of the possible - DigiFest 2022
Speech Recognition: Art of the possible - DigiFest 2022Speech Recognition: Art of the possible - DigiFest 2022
Speech Recognition: Art of the possible - DigiFest 2022
 
The Math and Tech Quiz 2015 (Prelim+Final+Answers)
The Math and Tech Quiz 2015 (Prelim+Final+Answers)The Math and Tech Quiz 2015 (Prelim+Final+Answers)
The Math and Tech Quiz 2015 (Prelim+Final+Answers)
 
GenAI by Google Cloud Leading the Way to Tomorrow - GDSC GHRCEM GCSJ Session 2
GenAI by Google Cloud Leading the Way to Tomorrow - GDSC GHRCEM GCSJ Session 2GenAI by Google Cloud Leading the Way to Tomorrow - GDSC GHRCEM GCSJ Session 2
GenAI by Google Cloud Leading the Way to Tomorrow - GDSC GHRCEM GCSJ Session 2
 
IT Quiz 2019 - Prelims
IT Quiz 2019 - PrelimsIT Quiz 2019 - Prelims
IT Quiz 2019 - Prelims
 
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICSBig Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
 
Nipa
NipaNipa
Nipa
 
ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of WatsonESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6
 
Sis sat 1000 josh dreller
Sis sat 1000 josh drellerSis sat 1000 josh dreller
Sis sat 1000 josh dreller
 
From Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial IntelligenceFrom Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial Intelligence
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
NLP 101 + Chatbots
NLP 101 + ChatbotsNLP 101 + Chatbots
NLP 101 + Chatbots
 
Psychology of Usability
Psychology of UsabilityPsychology of Usability
Psychology of Usability
 
Data Day Seattle, From NLP to AI
Data Day Seattle, From NLP to AIData Day Seattle, From NLP to AI
Data Day Seattle, From NLP to AI
 
Masterminds [IT Quiz]
Masterminds [IT Quiz]Masterminds [IT Quiz]
Masterminds [IT Quiz]
 
Sp14 cs188 lecture 1 - introduction
Sp14 cs188 lecture 1  - introductionSp14 cs188 lecture 1  - introduction
Sp14 cs188 lecture 1 - introduction
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
Artificial Intelligence for Undergrads
Artificial Intelligence for UndergradsArtificial Intelligence for Undergrads
Artificial Intelligence for Undergrads
 
AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018
AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018
AI: Feats, Limits and Caveats - Monojit - Opening Keynote AI Dev Days 2018
 

Último

Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 

Último (20)

Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 

What is Watson – An Overvie.pdf

  • 1. © 2011 IBM Corporation IBM Research What is Watson – An Overview Michael Perrone Mgr., Multicore Computing IBM TJ Watson Research Center 1 With special thanks to the IBM Research DeepQA Team!
  • 2. © 2011 IBM Corporation IBM Research Informed Decision Making: Search vs. Expert Q&A Decision Maker Search Engine Finds Documents containing Keywords Delivers Documents based on Popularity Has Question Distills to 2-3 Keywords Reads Documents, Finds Answers Finds & Analyzes Evidence Expert Understands Question Produces Possible Answers & Evidence Delivers Response, Evidence & Confidence Analyzes Evidence, Computes Confidence Asks NL Question Considers Answer & Evidence Decision Maker
  • 3. © 2011 IBM Corporation IBM Research Want to Play Chess or Just Chat? Chess – A finite, mathematically well-defined search space – Limited number of moves and states – All the symbols are completely grounded in the mathematical rules of the game Human Language – Words by themselves have no meaning – Only grounded in human cognition – Words navigate, align and communicate an infinite space of intended meaning – Computers can not ground words to human experiences to derive meaning
  • 4. © 2011 IBM Corporation IBM Research Easy Questions? 4 IBM Confidential (ln(12,546,798 * π)) ^ 2 / 34,567.46 = Owner Serial Number David Jones 45322190-AK Serial Number Type Invoice # 45322190-AK LapTop INV10895 Invoice # Vendor Payment INV10895 MyBuy $104.56 David Jones David Jones = 0.00885 Select Payment where Owner=“David Jones” and Type(Product)=“Laptop”, Dave Jones David Jones ≠
  • 5. © 2011 IBM Corporation IBM Research Hard Questions? Computer programs are natively explicit, fast and exacting in their calculation over numbers and symbols….But Natural Language is implicit, highly contextual, ambiguous and often imprecise. Where was X born? One day, from among his city views of Ulm, Otto chose a water color to send to Albert Einstein as a remembrance of Einstein´s birthplace. X ran this? If leadership is an art then surely Jack Welch has proved himself a master painter during his tenure at GE. Person Birth Place A. Einstein ULM Person Organization J. Welch GE Structured Unstructured
  • 6. © 2011 IBM Corporation IBM Research The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Question Answering along 5 Key Dimensions $600 In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus $200 If you're standing, it's the direction you should look to check out the wainscoting.   $2000 Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north $1000 The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
  • 7. © 2011 IBM Corporation IBM Research Basic Game Play Technology Classics The Great Outdoors Speak of the Dickens Mind Your Manners Before and After $200 $200 $200 $200 $200 $200 $400 $400 $400 $400 $400 $400 $600 $600 $600 $600 $600 $600 $800 $800 $800 $800 $800 $800 $1000 $1000 $1000 $1000 $1000 $1000 6 Categories 5 Levels of Difficulty 1 of 3 Players Selects a Clue Host reads Clue out loud ALL POLICEMEN CAN THANK STEPHANIE KWOLEK FOR HER INVENTION OF THIS POLYMER FIBER, 5 TIMES TOUGHER THAN STEEL TECHNOLOGY  All Players compete to answer  1st to buzz-in gets to answer  IF correct earns $ value selects Next Clue  IF wrong  loses $ value  other players buzz again (rebounds)  Two Rounds Per Game + Final Question  ONE Daily Double in First Round, TWO in 2nd Round
  • 8. © 2011 IBM Corporation IBM Research 8 Broad Domain 0.00% 0.50% 1.00% 1.50% 2.00% 2.50% 3.00% he film group capital woman song singer show composer title fruit planet there person language holiday color place son tree line product birds animals site lady province dog substance insect way founder senator form disease someone maker father words object writer novelist heroine dish post month vegetable sign countries hat bay Our Focus is on reusable NLP technology for analyzing volumes of as-is text. Structured sources (DBs and KBs) are used to help interpret the text. We do NOT attempt to anticipate all questions and build specialized databases. In a random sample of 20,000 questions we found 2,500 distinct types*. The most frequent occurring <3% of the time. The distribution has a very long tail. And for each these types 1000’s of different things may be asked. *13% are non-distinct (e.g., it, this, these or NA) Even going for the head of the tail will barely make a dent
  • 9. © 2011 IBM Corporation IBM Research Automatic Learning From “Reading” Officials Submit Resignations (.7) People earn degrees at schools (0.9) Inventors patent inventions (.8) Volumes of Text Syntactic Frames Semantic Frames Vessels Sink (0.7) People sink 8-balls (0.5) (in pool/0.8) subject verb object Sentence Parsing Generalization & Statistical Aggregation Fluid is a liquid (.6) Liquid is a fluid (.5)
  • 10. © 2011 IBM Corporation IBM Research Evaluating Possibilities and Their Evidence Is(“Cytoplasm”, “liquid”) = 0.2 Is(“organelle”, “liquid”) = 0.1 In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus. Is(“vacuole”, “liquid”) = 0.2 Is(“plasma”, “liquid”) = 0.7 “Cytoplasm is a fluid surrounding the nucleus…” Wordnet  Is_a(Fluid, Liquid)  ? Learned  Is_a(Fluid, Liquid)  yes. ↑  Organelle  Vacuole  Cytoplasm  Plasma  Mitochondria  Blood … Many candidate answers (CAs) are generated from many different searches Each possibility is evaluated according to different dimensions of evidence. Just One piece of evidence is if the CA is of the right type. In this case a “liquid”.
  • 11. © 2011 IBM Corporation IBM Research Keyword Evidence IBM Confidential celebrated India In May 1898 400th anniversary arrival in Portugal India In May Gary explorer celebrated anniversary in Portugal Keyword Matching Keyword Matching Keyword Matching Keyword Matching Keyword Matching 11 arrived in In May, Gary arrived in India after he celebrated his anniversary in Portugal. In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India. This evidence suggests “Gary” is the answer BUT the system must learn that keyword matching may be weak relative to other types of evidence
  • 12. © 2011 IBM Corporation IBM Research celebrated May 1898 400th anniversary arrival in In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India. Portugal landed in 27th May 1498 Vasco da Gama Temporal Reasoning Statistical Paraphrasing GeoSpatial Reasoning explorer On the 27th of May 1498, Vasco da Gama landed in Kappad Beach Kappad Beach Para- phrases Geo-KB Date Math 12 India Stronger evidence can be much harder to find and score. The evidence is still not 100% certain. Search Far and Wide Explore many hypotheses Find Judge Evidence Many inference algorithms Deeper Evidence
  • 13. © 2011 IBM Corporation IBM Research Not Just for Fun A long, tiresome speech delivered by a frothy pie topping Answer: Meringue Harangue Harangue Meringue . Diatribe . . . . Whipped Cream . . . . . Category: Edible Rhyme Time 13 Some Questions require Decomposition and Synthesis
  • 14. © 2011 IBM Corporation IBM Research In 1968 When "60 Minutes" premiered this man was U.S. president. this man was U.S. president. Lyndon B Johnson ? The DeepQA architecture attempts different decompositions and recursively applies the QA algorithms 14 Divide and Conquer (Typical in Final Jeopardy!) Must identify and solve sub-questions from different sources to answer the top level question
  • 15. © 2011 IBM Corporation IBM Research The Missing Link On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first. TV remote controls, Buttons Shirts, Telephones Mt Everest He was first Edmund Hillary
  • 16. © 2011 IBM Corporation IBM Research 16 The Best Human Performance: Our Analysis Reveals the Winner’s Cloud Winning Human Performance 2007 QA Computer System Grand Champion Human Performance Top human players are remarkably good. Each dot represents an actual historical human Jeopardy! game More Confident Less Confident Computers?
  • 17. © 2011 IBM Corporation IBM Research DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Generates and scores many hypotheses using a combination of 1000’s Natural Language Processing, Information Retrieval, Machine Learning and Reasoning Algorithms. These gather, evaluate, weigh and balance different types of evidence to deliver the answer with the best support it can find. . . . Answer Scoring Models Answer & Confidence Question Evidence Sources Models Models Models Models Models Primary Search Candidate Answer Generation Hypothesis Generation Hypothesis and Evidence Scoring Final Confidence Merging & Ranking Synthesis Answer Sources Question & Topic Analysis Evidence Retrieval Deep Evidence Scoring Learned Models help combine and weigh the Evidence Hypothesis Generation Hypothesis and Evidence Scoring Question Decomposition 1000’s of Pieces of Evidence Multiple Interpretations 100,000’s Scores from many Deep Analysis Algorithms 100’s sources 100’s Possible Answers Balance & Combine
  • 18. © 2011 IBM Corporation IBM Research Some Growing Pains (early answers) THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean” THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this The Big Bang Urinate No One A sentence Apollo 11 moon landing The People WW I Call on the phone
  • 19. © 2011 IBM Corporation IBM Research Some Growing Pains (early answers) WORLD FACTS The Denmark Strait separates these 2 islands by about 200 miles Greenland and Taiwan Greenland and Iceland BOXINGS TERMS Rhyming Term for a hit below the belt Wang bang Low blow FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology" How Tasty Was My Little Frenchman Louis Pasteur (no category) What to grasshoppers eat? Kosher ???
  • 20. © 2011 IBM Corporation IBM Research Grouping Features to produce Evidence Profiles Clue: Chile shares its longest land border with this country. -0.2 0 0.2 0.4 0.6 0.8 1 Argentina Bolivia Bolivia is more Popular due to a commonly discussed border dispute.. Positive Evidence Negative Evidence
  • 21. © 2011 IBM Corporation IBM Research Evidence: Time, Popularity, Source, Classification etc. Clue: You’ll find Bethel College and a Seminary in this “holy” Minnesota city. Saint Paul South Bend There’s a Bethel College and a Seminary in both cities. System is not weighing location evidence high enough to give St. Paul the edge.
  • 22. © 2011 IBM Corporation IBM Research Evidence: Puns Clue: You’ll find Bethel College and a Seminary in this “holy” Minnesota city. Saint Paul South Bend Humans may get this based on the pun since St. Paul since is a “holy” city. We added a Pun Scorer that discovers and scores Pun relationships.
  • 23. © 2011 IBM Corporation IBM Research Categories are not as simple as the seem Watson uses statistical machine learning to discover that Jeopardy! categories are only weak indicators of the answer type. St. Petersburg is home to Florida's annual tournament in this game popular on shipdecks (Shuffleboard) U.S. CITIES Rochester, New York grew because of its location on this (the Erie Canal) From India, the shashpar was a multi-bladed version of this spiked club (a mace) Country Clubs A French riot policeman may wield this, simply the French word for "stick“ (a baton) Archibald MacLeish? based his verse play "J.B." on this book of the Bible (Job) Authors In 1928 Elie Wiesel was born in Sighet, a Transylvanian village in this country (Romania)
  • 24. © 2011 IBM Corporation IBM Research In-Category Learning CELEBRATIONS OF THE MONTH   What  the  Jeopardy!  Clue  is  asking  for  is  NOT  always  obvious     Watson  can  try  to  infer  the  type  of  thing  being  asked  for  from  the  previous  answers.     In  this  example  aBer  seeing  2  correct  answers  Watson  starts  to  dynamically  learn  a  confidence   that  the  quesFon  is  asking  for  something  that  it  can  classify  as  a  “month”.   Clue Type Watson’s Answer Correct Answer D-DAY ANNIVERSARY & MAGNA CARTA DAY day Runnymede June NATIONAL PHILANTHROPY DAY & ALL SOULS' DAY day Day of the Dead November NATIONAL TEACHER DAY & KENTUCKY DERBY DAY Day/month(.2) Churchill Downs May ADMINISTRATIVE PROFESSIONALS DAY & NATIONAL CPAS GOOF-OFF DAY day / month(.6) April April NATIONAL MAGIC DAY & NEVADA ADMISSION DAY day / month(.8) October October
  • 25. © 2011 IBM Corporation IBM Research 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Precision % Answered Baseline 12/06 v0.1 12/07 v0.3 08/08 v0.5 05/09 v0.6 10/09 v0.8 11/10 v0.4 12/08 DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/2007-11/2010 v0.2 05/08 IBM Watson Playing in the Winners Cloud V0.7 04/10
  • 26. © 2011 IBM Corporation IBM Research Betting Game Strategy Managing the Luck of the Draw IBM Confidential Dynamic Confidence Clue Selection Risk Management Modulate Aggressiveness Direct Impact on Earnings DDs & FJ Learn at lower risk Prior Probabilities Performance Relative to Competition Common Betting Patterns Bankroll Bet
  • 27. © 2011 IBM Corporation IBM Research One Jeopardy! question can take 2 hours on a single 2.6Ghz Core Optimized & Scaled out on 2,880-Core Power750 using UIMA-AS, Watson is answering in 2-6 seconds. Question 100s Possible Answers 1000’s of Pieces of Evidence Multiple Interpretations 100,000’s scores from many simultaneous Text Analysis Algorithms 100s sources . . . Hypothesis Generation Hypothesis and Evidence Scoring Final Confidence Merging & Ranking Synthesis Question & Topic Analysis Question Decomposition Hypothesis Generation Hypothesis and Evidence Scoring Answer & Confidence
  • 28. © 2011 IBM Corporation IBM Research Precision, Confidence & Speed Deep Analytics – Combining many analytics in a novel architecture, we achieved very high levels of Precision and Confidence over a huge variety of as-is content. Speed – By optimizing Watson’s computation for Jeopardy! on over 2,800 POWER7 processing cores we went from 2 hours per question on a single CPU to an average of just 3 seconds. Results – in 55 real-time sparring games against former Tournament of Champion Players last year, Watson put on a very competitive performance in all games -- placing 1st in 71% of the them!
  • 29. © 2011 IBM Corporation IBM Research IBM Confidential Next Steps  Demonstrate Watson’s Champion-Level Performance  Broader Scientific Research: Open-Advancement of QA –Publish Detailed Experimental Results – What worked, What failed and why –Facilitate Broader Collaborative Research with Universities –Advance a common architecture and platform for intelligent QA systems  Business Applications –Work with clients to understand and demonstrate how to impact real-world Business Applications with DeepQA technology 29
  • 30. © 2011 IBM Corporation IBM Research Potential Business Applications Tech Support: Help-desk, Contact Centers Healthcare / Life Sciences: Diagnostic Assistance, Evidenced- Based, Collaborative Medicine Enterprise Knowledge Management and Business Intelligence Government: Improved Information Sharing and Security
  • 31. • Each  dimension  contributes  to  supporFng  or  refuFng  hypotheses  based  on   – Strength  of  evidence     – Importance  of  dimension  for  diagnosis  (learned  from  training  data)   • Evidence  dimensions  are  combined  to  produce  an  overall  confidence   Evidence  Profiles  from  disparate  data  is  a  powerful  idea   PosiFve   Evidence   NegaFve   Evidence   Overall Confidence
  • 32. © 2011 IBM Corporation IBM Research Watson and Power Systems  Watson represents the future of systems design, where workload optimized systems are tailored to fit the requirements of a new era of smarter solutions.  Watson is a workload optimized system designed for complex analytics, made possible by integrating massively parallel POWER7 processors, Linux and DeepQA software to answer Jeopardy! questions in under three seconds.  Building Watson on commercially available POWER7 systems and Linux ensures the acceleration of businesses adopting workload optimized systems in industries where knowledge acquisition and analytics are important. – Rice University uses the Power 755 with Linux for the massive analytics processing behind its cancer research program. – GHY International, a provider of U.S. and Canadian customs brokerage services, uses the Power 750 with AIX, IBM i and Linux to deploy new business services in as little as five minutes. Power 750
  • 33. © 2011 IBM Corporation IBM Research  90 x IBM Power 7501 servers  2880 POWER7 cores  POWER7 3.55 GHz chip  500 GB per sec on-chip bandwidth  10 Gb Ethernet network  15 Terabytes of memory  20 Terabytes of disk, clustered  Can operate at 80 Teraflops  Runs IBM DeepQA software  Scales out with and searches vast amounts of unstructured information with UIMA & Hadoop open source components  SUSE Linux provides a cost-effective open platform which is performance-optimized to exploit POWER 7 systems  10 racks include servers, networking, shared disk system, cluster controllers Watson Workload Optimized System (Power 750) 1 Note that the Power 750 featuring POWER7 is a commercially available server that runs AIX, IBM i and Linux and has been in market since Feb 2010
  • 34. © 2011 IBM Corporation IBM Research In summary, Watson is good but… Watson – 10 compute racks – 80kW of power – 20 tons of cooling IBM Confidential Human – 1 brain - fits in a shoe box – Can run on a tuna-fish sandwich – Can be cooled with a hand-held paper fan.
  • 35. © 2011 IBM Corporation IBM Research Questions? IBM Confidential http://w3.ibm.com/ibm/resource/res_watson_grand_challenge.html More info:
  • 36. © 2011 IBM Corporation IBM Research BACK UPS IBM Confidential
  • 37. © 2011 IBM Corporation IBM Research Watson’s QA Engine 2,880 IBM Power750 Compute Cores 15 TB of Memory Strategy Text-to-Speech Jeopardy! Game Control System Human Player 2 Clue Grid Watson’s Game Controller Real-Time Game Configuration Used in Sparring and Exhibition Games Clues, Scores & Other Game Data Insulated and Self-Contained Answers & Confidences Human Player 1 Clue & Category Decisions to Buzz and Bet Analysis of natural language content equivalent to 1 Million Books
  • 38. © 2011 IBM Corporation IBM Research Watson Does and Does NOT Does  Speaks Clue Selections  Speaks Responses  Physically Presses the Button  Gets the clue electronically when you see it Does NOT  Hear – Can answer with same wrong answer  See  Process Audi/Visual Clues – These are excluded from the contest  Buzz Perfectly – Watson is quick but not the quickest – Variable time to compute confidences and answers – Confidence-Weighed Buzzing – Not always fast or confident enough
  • 39. © 2011 IBM Corporation IBM Research ABSTRACT The TV quiz show Jeopardy! is famous for providing contestants answers to which they must supply the correct questions in order to win. Contestants must be fast, have an almost encyclopedic knowledge of the world, and they must be able to handle clues that are vague, involve double-meanings and frequently rely on puns. Early this year, Jeopardy! aired a match involving the two all-time most successful Jeopardy! contestants and Watson, a artificial intelligence system designed by IBM. Watson won the Jeopardy! match by a wide margin and in doing so, brought the leading edge of computer technology a little closer to human abilities. This presentation will describe the supercomputer implementation of Watson use for the Jeopardy! match and the challenges overcome to create a computer capable of accurately answering open-ended natural language questions in real-time - typically in under 3 seconds. IBM Confidential