SlideShare una empresa de Scribd logo
1 de 52
Machine Learning for Information Extraction: An Overview Kamal Nigam Google Pittsburgh With input, slides and suggestions from William Cohen, Andrew McCallum and Ion Muslea
Example: A Problem Genomics job Mt. Baker,  the school district Baker Hostetler , the company Baker,  a job opening
Example: A Solution
Job Openings: Category =  Food Services Keyword =  Baker   Location =  Continental  U.S.
Extracting Job Openings from the Web  Title:  Ice Cream Guru Description:  If you dream of cold creamy… Contact:   [email_address] Category:  Travel/Hospitality Function:  Food Services
Potential Enabler of Faceted Search
Lots of Structured Information in Text
IE from Research Papers
What is Information Extraction? ,[object Object]
What is Information Extraction? ,[object Object],[object Object]
What is Information Extraction? ,[object Object],[object Object],[object Object]
What is Information Extraction? ,[object Object],[object Object],[object Object],[object Object]
What is Information Extraction? ,[object Object],[object Object],[object Object],[object Object],[object Object]
IE Posed as a Machine Learning Task ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],00  :  pm  Place  :  Wean  Hall  Rm  5409  Speaker  :  Sebastian  Thrun prefix contents suffix … …
Good Features for Information Extraction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],begins-with-number begins-with-ordinal begins-with-punctuation begins-with-question-word begins-with-subject blank contains-alphanum contains-bracketed-number contains-http contains-non-space contains-number contains-pipe contains-question-mark contains-question-word ends-with-question-mark first-alpha-is-capitalized indented indented-1-to-4 indented-5-to-10 more-than-one-third-space only-punctuation prev-is-blank prev-begins-with-ordinal shorter-than-30 Creativity and Domain Knowledge Required!
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Good Features for Information Extraction Is Capitalized  Is Mixed Caps  Is All Caps  Initial Cap Contains Digit All lowercase  Is Initial Punctuation Period Comma Apostrophe Dash Preceded by HTML tag Character n-gram classifier  says string is a person  name (80% accurate) In stopword list (the, of, their, etc) In honorific list (Mr, Mrs, Dr, Sen, etc) In person suffix list (Jr, Sr, PhD, etc) In name particle list  (de, la, van, der, etc) In Census lastname list; segmented by P(name) In Census firstname list; segmented by P(name) In locations lists (states, cities, countries) In company name list (“J. C. Penny”) In list of company suffixes (Inc, & Associates, Foundation) Creativity and Domain Knowledge Required!
IE History ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Landscape of ML Techniques for IE: Any of these models can be used to capture words, formatting or both. Classify Candidates Abraham Lincoln  was born in  Kentucky . Classifier which class? Sliding Window Abraham Lincoln was born in Kentucky. Classifier which class? Try alternate window sizes: Boundary Models Abraham Lincoln was born in Kentucky. Classifier which class? BEGIN END BEGIN END BEGIN Finite State Machines Abraham Lincoln was born in Kentucky. Most likely state sequence? Wrapper Induction <b><i> Abraham Lincoln </i></b> was born in Kentucky. Learn and apply pattern for a website <b> <i> PersonName
Sliding Windows & Boundary Detection
Information Extraction by Sliding Windows GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
Information Extraction by Sliding Windows GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
Information Extraction by Sliding Window GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
Information Extraction by Sliding Window GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
Information Extraction by Sliding Window GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
Information Extraction with Sliding Windows [Freitag 97, 98; Soderland 97; Califf 98] 00  :  pm  Place  :  Wean  Hall  Rm  5409  Speaker  :  Sebastian  Thrun w  t-m w  t-1 w  t w  t+n w  t+n+1 w  t+n+m prefix contents suffix … … ,[object Object],[object Object],[object Object],[object Object],[object Object],courseNumber(X)  :-   tokenLength(X,=,2),  every(X, inTitle, false),  some(X, A, <previousToken>, inTitle, true), some(X, B, <>. tripleton, true)
Rule-learning approaches to sliding-window classification: Summary ,[object Object],[object Object],[object Object]
IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s.  As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
BWI:  Learning to detect boundaries ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Freitag & Kushmerick, AAAI 2000]
BWI:  Learning to detect boundaries ,[object Object],[object Object],[object Object],[object Object]
BWI: Learning to detect boundaries Field F1  Person Name: 30% Location: 61% Start Time: 98%
Problems with Sliding Windows  and Boundary Finders ,[object Object],[object Object],[object Object],[object Object]
Finite State Machines
Hidden Markov Models S t - 1 S t O t S t+1 O t +1 O t - 1 ... ... Finite state model Graphical model Parameters: for all states  S={s 1 ,s 2 ,…} Start state probabilities:  P(s t  ) Transition probabilities:  P(s t |s t-1  ) Observation (emission) probabilities:  P(o t |s t  ) Training: Maximize probability of training observations (w/ prior) HMMs are the standard sequence modeling tool in  genomics, music, speech, NLP, … ... transitions observations o 1   o 2   o 3   o 4   o 5   o 6   o 7   o 8 Generates: State sequence Observation sequence Usually a multinomial over  atomic, fixed alphabet
IE with Hidden Markov Models Yesterday Lawrence Saul spoke this example sentence. Yesterday   Lawrence Saul   spoke this example sentence. Person name:  Lawrence Saul   Given a sequence of observations: and a trained HMM: Find the most likely state sequence:  (Viterbi) Any words said to be generated by the designated “person name” state extract as a person name:
Generative Extraction with HMMs ,[object Object],[object Object],[McCallum, Nigam, Seymore & Rennie ‘00]
HMM Example: “Nymble” Other examples of HMMs in IE:  [Leek ’97; Freitag & McCallum ’99; Seymore et al. 99] Task: Named Entity Extraction Train on 450k words of news wire text. Case  Language  F1  . Mixed  English 93% Upper English 91% Mixed Spanish 90% [Bikel, et al 97] Person Org Other (Five other name classes) start-of-sentence end-of-sentence Transition probabilities Observation probabilities P(s t  | s t-1 ,  o t-1   ) P(o t  | s t  ,  s t-1   ) Back-off to: Back-off to: P(s t  | s t-1  ) P(s t  ) P(o t  | s t  ,  o t-1   ) P(o t  | s t  ) P(o t  ) or Results:
Regrets from Atomic View of Tokens Would like richer representation of text:  multiple overlapping features, whole chunks of text. ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Problems with Richer Representation and a Generative Model ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Conditional Sequence Models ,[object Object],[object Object],[object Object],[object Object],[object Object]
Conditional Markov Models S t - 1 S t O t S t+1 O t +1 O t - 1 ... ... Generative (traditional HMM) ... transitions observations S t - 1 S t O t S t+1 O t +1 O t - 1 ... ... Conditional ... transitions observations Standard belief propagation: forward-backward procedure. Viterbi and Baum-Welch follow naturally. Maximum Entropy Markov Models  [McCallum, Freitag & Pereira, 2000] MaxEnt POS Tagger  [Ratnaparkhi, 1996] SNoW-based Markov Model  [Punyakanok & Roth, 2000]
Exponential Form  for “Next State” Function Capture dependency on s t-1  with |S|  independent functions, P s t-1 (s t |o t ). Each state contains a “next-state classifier” that, given the next observation, produces a  probability of the next state, P s t-1 (s t |o t ). s t-1 s t Recipe: - Labeled data is assigned to transitions. - Train each state’s exponential model by maximum entropy weight feature
Label Bias Problem ,[object Object],Pr(0123|rob) = Pr(1|0,r)/Z1 * Pr(2|1,o)/Z2 * Pr(3|2,b)/Z3 = 0.5 * 1 * 1 Pr(0453|rib) = Pr(4|0,r)/Z1’ * Pr(5|4,i)/Z2’ * Pr(3|5,b)/Z3’ = 0.5 * 1 *1 Pr(0123|rib)=1 Pr(0453|rob)=1
From HMMs to MEMMs to CRFs HMM MEMM CRF S t-1 S t O t S t+1 O t+1 O t-1 S t-1 S t O t S t+1 O t+1 O t-1 S t-1 S t O t S t+1 O t+1 O t-1 ... ... ... ... ... ... (A special case of MEMMs and CRFs.) Conditional Random Fields  (CRFs) [Lafferty, McCallum, Pereira ‘2001]
Conditional Random Fields (CRFs) S t S t+1 S t+2 O = O t , O t+1 , O t+2 , O t+3 , O t+4 S t+3 S t+4 Markov on  s , conditional dependency on  o. Hammersley-Clifford-Besag theorem stipulates that the CRF has this form—an exponential function of the cliques in the graph. Assuming that the dependency structure of the states is tree-shaped (linear chain is a trivial tree), inference can be done by dynamic programming in time O(|o| |S| 2 )—just like HMMs.
Training CRFs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Sha & Pereira 2002] & [Malouf 2002]
Sample IE Applications of CRFs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Examples of Recent CRF Research ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Further Reading about CRFs ,[object Object],[object Object]

Más contenido relacionado

Destacado

CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...butest
 
Azure Machine Learning 101
Azure Machine Learning 101Azure Machine Learning 101
Azure Machine Learning 101Renato Jovic
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye viewRoelof Pieters
 
How to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningHow to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningSri Ambati
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 

Destacado (7)

CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
 
Azure Machine Learning 101
Azure Machine Learning 101Azure Machine Learning 101
Azure Machine Learning 101
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye view
 
How to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningHow to win data science competitions with Deep Learning
How to win data science competitions with Deep Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 

Similar a mlas06_nigam_tie_01.ppt

DB-IR-ranking
DB-IR-rankingDB-IR-ranking
DB-IR-rankingFELIX75
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptbutest
 
Kwan History of Computing 2011
Kwan History of Computing 2011Kwan History of Computing 2011
Kwan History of Computing 2011Stephen Kwan
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Semantic Web
Semantic WebSemantic Web
Semantic Webbutest
 
Big, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near YouBig, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near YouBiplav Srivastava
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud ComputingRahul Pola
 
Cloud computing
Cloud computingCloud computing
Cloud computingBasil John
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.butest
 
AI-SDV 2021: Francisco Webber - Efficiency is the New Precision
AI-SDV 2021: Francisco Webber - Efficiency is the New PrecisionAI-SDV 2021: Francisco Webber - Efficiency is the New Precision
AI-SDV 2021: Francisco Webber - Efficiency is the New PrecisionDr. Haxel Consult
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search ComponentMario Flecha
 
Ellen König - Machine learning for the curious but scared - Codemotion Berlin...
Ellen König - Machine learning for the curious but scared - Codemotion Berlin...Ellen König - Machine learning for the curious but scared - Codemotion Berlin...
Ellen König - Machine learning for the curious but scared - Codemotion Berlin...Codemotion
 
Knowledge Graphs - The Power of Graph-Based Search
Knowledge Graphs - The Power of Graph-Based SearchKnowledge Graphs - The Power of Graph-Based Search
Knowledge Graphs - The Power of Graph-Based SearchNeo4j
 
Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Jan Sifra
 
Computational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding RegionsComputational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding Regionsbutest
 

Similar a mlas06_nigam_tie_01.ppt (20)

ppt
pptppt
ppt
 
DB-IR-ranking
DB-IR-rankingDB-IR-ranking
DB-IR-ranking
 
DB and IR Integration
DB and IR IntegrationDB and IR Integration
DB and IR Integration
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.ppt
 
Kwan History of Computing 2011
Kwan History of Computing 2011Kwan History of Computing 2011
Kwan History of Computing 2011
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Semantic Web
Semantic WebSemantic Web
Semantic Web
 
Big, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near YouBig, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near You
 
Session1
Session1Session1
Session1
 
Session1
Session1Session1
Session1
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.
 
AI-SDV 2021: Francisco Webber - Efficiency is the New Precision
AI-SDV 2021: Francisco Webber - Efficiency is the New PrecisionAI-SDV 2021: Francisco Webber - Efficiency is the New Precision
AI-SDV 2021: Francisco Webber - Efficiency is the New Precision
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search Component
 
Ellen König - Machine learning for the curious but scared - Codemotion Berlin...
Ellen König - Machine learning for the curious but scared - Codemotion Berlin...Ellen König - Machine learning for the curious but scared - Codemotion Berlin...
Ellen König - Machine learning for the curious but scared - Codemotion Berlin...
 
Knowledge Graphs - The Power of Graph-Based Search
Knowledge Graphs - The Power of Graph-Based SearchKnowledge Graphs - The Power of Graph-Based Search
Knowledge Graphs - The Power of Graph-Based Search
 
Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)
 
Computational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding RegionsComputational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding Regions
 

Más de butest

1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 
Download
DownloadDownload
Downloadbutest
 

Más de butest (20)

1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 
Download
DownloadDownload
Download
 

mlas06_nigam_tie_01.ppt

  • 1. Machine Learning for Information Extraction: An Overview Kamal Nigam Google Pittsburgh With input, slides and suggestions from William Cohen, Andrew McCallum and Ion Muslea
  • 2. Example: A Problem Genomics job Mt. Baker, the school district Baker Hostetler , the company Baker, a job opening
  • 4. Job Openings: Category = Food Services Keyword = Baker Location = Continental U.S.
  • 5. Extracting Job Openings from the Web Title: Ice Cream Guru Description: If you dream of cold creamy… Contact: [email_address] Category: Travel/Hospitality Function: Food Services
  • 6. Potential Enabler of Faceted Search
  • 7. Lots of Structured Information in Text
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18. Landscape of ML Techniques for IE: Any of these models can be used to capture words, formatting or both. Classify Candidates Abraham Lincoln was born in Kentucky . Classifier which class? Sliding Window Abraham Lincoln was born in Kentucky. Classifier which class? Try alternate window sizes: Boundary Models Abraham Lincoln was born in Kentucky. Classifier which class? BEGIN END BEGIN END BEGIN Finite State Machines Abraham Lincoln was born in Kentucky. Most likely state sequence? Wrapper Induction <b><i> Abraham Lincoln </i></b> was born in Kentucky. Learn and apply pattern for a website <b> <i> PersonName
  • 19. Sliding Windows & Boundary Detection
  • 20. Information Extraction by Sliding Windows GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 21. Information Extraction by Sliding Windows GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 22. Information Extraction by Sliding Window GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 23. Information Extraction by Sliding Window GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 24. Information Extraction by Sliding Window GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 25.
  • 26.
  • 27. IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 28. IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 29. IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 30. IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 31. IE by Boundary Detection GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. CMU UseNet Seminar Announcement
  • 32.
  • 33.
  • 34. BWI: Learning to detect boundaries Field F1 Person Name: 30% Location: 61% Start Time: 98%
  • 35.
  • 37. Hidden Markov Models S t - 1 S t O t S t+1 O t +1 O t - 1 ... ... Finite state model Graphical model Parameters: for all states S={s 1 ,s 2 ,…} Start state probabilities: P(s t ) Transition probabilities: P(s t |s t-1 ) Observation (emission) probabilities: P(o t |s t ) Training: Maximize probability of training observations (w/ prior) HMMs are the standard sequence modeling tool in genomics, music, speech, NLP, … ... transitions observations o 1 o 2 o 3 o 4 o 5 o 6 o 7 o 8 Generates: State sequence Observation sequence Usually a multinomial over atomic, fixed alphabet
  • 38. IE with Hidden Markov Models Yesterday Lawrence Saul spoke this example sentence. Yesterday Lawrence Saul spoke this example sentence. Person name: Lawrence Saul Given a sequence of observations: and a trained HMM: Find the most likely state sequence: (Viterbi) Any words said to be generated by the designated “person name” state extract as a person name:
  • 39.
  • 40. HMM Example: “Nymble” Other examples of HMMs in IE: [Leek ’97; Freitag & McCallum ’99; Seymore et al. 99] Task: Named Entity Extraction Train on 450k words of news wire text. Case Language F1 . Mixed English 93% Upper English 91% Mixed Spanish 90% [Bikel, et al 97] Person Org Other (Five other name classes) start-of-sentence end-of-sentence Transition probabilities Observation probabilities P(s t | s t-1 , o t-1 ) P(o t | s t , s t-1 ) Back-off to: Back-off to: P(s t | s t-1 ) P(s t ) P(o t | s t , o t-1 ) P(o t | s t ) P(o t ) or Results:
  • 41.
  • 42.
  • 43.
  • 44. Conditional Markov Models S t - 1 S t O t S t+1 O t +1 O t - 1 ... ... Generative (traditional HMM) ... transitions observations S t - 1 S t O t S t+1 O t +1 O t - 1 ... ... Conditional ... transitions observations Standard belief propagation: forward-backward procedure. Viterbi and Baum-Welch follow naturally. Maximum Entropy Markov Models [McCallum, Freitag & Pereira, 2000] MaxEnt POS Tagger [Ratnaparkhi, 1996] SNoW-based Markov Model [Punyakanok & Roth, 2000]
  • 45. Exponential Form for “Next State” Function Capture dependency on s t-1 with |S| independent functions, P s t-1 (s t |o t ). Each state contains a “next-state classifier” that, given the next observation, produces a probability of the next state, P s t-1 (s t |o t ). s t-1 s t Recipe: - Labeled data is assigned to transitions. - Train each state’s exponential model by maximum entropy weight feature
  • 46.
  • 47. From HMMs to MEMMs to CRFs HMM MEMM CRF S t-1 S t O t S t+1 O t+1 O t-1 S t-1 S t O t S t+1 O t+1 O t-1 S t-1 S t O t S t+1 O t+1 O t-1 ... ... ... ... ... ... (A special case of MEMMs and CRFs.) Conditional Random Fields (CRFs) [Lafferty, McCallum, Pereira ‘2001]
  • 48. Conditional Random Fields (CRFs) S t S t+1 S t+2 O = O t , O t+1 , O t+2 , O t+3 , O t+4 S t+3 S t+4 Markov on s , conditional dependency on o. Hammersley-Clifford-Besag theorem stipulates that the CRF has this form—an exponential function of the cliques in the graph. Assuming that the dependency structure of the states is tree-shaped (linear chain is a trivial tree), inference can be done by dynamic programming in time O(|o| |S| 2 )—just like HMMs.
  • 49.
  • 50.
  • 51.
  • 52.

Notas del editor

  1. Not only did this support extremely useful and targeted search and data navigation, with highly summarized display
  2. CiteSeer, which we all know and love, is made possible by the automatic extraction of title and author information from research paper headers and references in order to build the reference graph that we use to find related work.
  3. ML… although this is an area where ML has not yet trounced the hand-built systems. In some of the latest evaluations, hand-built shared 1 st place with a ML. Now many companies making a business from IE (from the Web): WasBang, Inxight, Intelliseek, ClearForest.
  4. Technical approach? At CMU I worked w/ HMMs. HMMs: standard tool…, It is a model that makes certain Markov and independence assumptions as depicted in this Graphical Model. Pr certain states are the val of this random variable… = this product of independent pr. Emissions are traditionally atomic entities, like words. Big part of power is the context of finite state: Viterbi We say this is generative model because it has parameters for probability of producing observed emissions (given state)
  5. Both in academia and in industry, FSMs are the dominant method of information extraction. HMM acts as a generative model of a research paper header. It has trad. parameters for… Emissions are multinomial distributions over words Params set to maximize likelihood of emissions in training data. HMMs work better than several alternative methods, but...
  6. 1. Prefer richer… one that allows for multiple, arbitrary... In many cases, for example names and other large vocabulary fields, the identity of word alone is not very predictive because you probably have never seen it before. For example…”Wisneiwski” … not city In many other cases, such as FAQ seg, the relevant features are not really the words at all, but attributes of the line or para as a whole… emissions=whole lines of text at a time
  7. Technical approach? At CMU I worked w/ HMMs. HMMs: standard tool…, It is a model that makes certain Markov and independence assumptions as depicted in this Graphical Model. Pr certain states are the val of this random variable… = this product of independent pr. Emissions are traditionally atomic entities, like words. Big part of power is the context of finite state: Viterbi We say this is generative model because it has parameters for probability of producing observed emissions (given state)
  8. 04/26/10
  9. MEMMs do have one technical problem: biased towards states with few siblings. No time to explain intricasies now. CRFs = generalization of MEMMs: move normalizer Don’t let all the equations scare you: this slide is easier than it looks. Here is the old, traditional HMM: Prob of state and obs sequence is product of each state trans, and observation emission in the sequence. MEMMs: as we discussed, replaces with conditional, where we no longer generate the obs. We write this as an exponential function Only difference between MEMMs and CRFs is that we move the normalizer outside the product. Inference in MRFs usually requires nasty Gibbs sampling, but due to special structure, clever DP comes to the rescue again.
  10. Graphical model on previous slide was a special case. In general, observations don’t have to be chopped up, and feature functions can ask arbitrary questions about any range of the observation sequence. Not going to get into method for learning lambda weights from training data, but simply a matter of maximum likelihood: This is what trying to maximize, differentiate, and solve with Conjugate Gradient. Again, DP makes it efficient.