Active Learning on Question Answering with Dialogues
Shen Gao
Content
● Question Answering
● Data Collection
● Active Learning
● User Interaction
● System Architecture
● Results & Future Work
● “Question Answering”
Question Answering
Question Answering is a computer science discipline focused on building
automated systems that can answer questions posed by humans in
natural language.
Question Answering
[Diagram: Passage and Question are given to the Model, which produces the Answer]
Question Answering Data Sets
Data Set    Text Source                      QA Source
Quasar-T    Search Engine (Google / Bing)    Trivia
SearchQA    Search Engine (Google / Bing)    Jeopardy!
SQuAD       Wikipedia Articles               Annotation
Why Dialogue?
● Natural
● Machine User Interaction
● Availability
○ Transcripts
○ Texting
● Little previous work
Source: Statistics Brain
Question Answering in Dialogue
● TV Series Friends
● 10 Seasons
● 236 Episodes
● 3000 + Scenes
● Datasets from Character Mining
○ JSON formatted data
○ Tokenized
○ Season - Episode - Scene - Utterance
○ Plots available for 44% scenes
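The Season - Episode - Scene - Utterance nesting listed above could be walked roughly as follows. This is only a sketch: every key name (`episodes`, `scenes`, `utterances`, and the id fields) and the sample record are assumptions made for illustration, not the documented Character Mining schema.

```python
import json

# Placeholder record; keys and values are hypothetical, chosen to mirror
# the Season - Episode - Scene - Utterance hierarchy named on the slide.
SAMPLE_SEASON = json.loads("""
{
  "season_id": "s01",
  "episodes": [
    {
      "episode_id": "s01_e01",
      "scenes": [
        {
          "scene_id": "s01_e01_c01",
          "utterances": [
            {"utterance_id": "s01_e01_c01_u001",
             "speakers": ["Speaker A"],
             "tokens": [["Hi", "!"]]}
          ]
        }
      ]
    }
  ]
}
""")


def iter_utterances(season):
    """Yield (scene_id, utterance) pairs in document order."""
    for episode in season["episodes"]:
        for scene in episode["scenes"]:
            for utterance in scene["utterances"]:
                yield scene["scene_id"], utterance


for scene_id, utterance in iter_utterances(SAMPLE_SEASON):
    print(scene_id, utterance["speakers"], utterance["tokens"])
```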
Classification on Question Types
● Based on type of answer:
○ Categorical (Multichoice) - Binary (Polar)
○ Continuous (Span of text)
● Based on Inference
○ Explicit
○ Implicit
● Based on answerability (newly introduced in SQuAD 2.0):
○ Unanswerable
○ Answerable
Explicit Questions
Q1: Does the job interview include cooking a salad?
Implicit Questions
Q2: Is the interviewer picky?
Explicit vs Implicit
● The contextual similarity between question and answer
● The amount of inference needed to resolve
● Q1: Explicit; Abundant Similarity; Little Inference
● Q2: Implicit; Little / No Similarity; Substantial Inference
Annotation Tool
Annotation
● Annotation Phases:
○ Experimental Phase - Small Data Chunk
○ Production Phase - All Data
● Tasks per phase:
○ Question & Answer Generation
○ Verification - Inter-Annotator Agreement
[Diagram: Experimental -> Revision on Template -> Stable, High IAT -> Production]
Challenges on Annotation
● Ambiguous Pronouns:
○ Example: In a scene having Chandler and Joey: Is he excited about the date?
● Exact wording from the original text
● Low Agreement measured
● Attempted Resolution:
○ Update instructions
○ Integrate Plots in Scene
○ Reduce the number of Questions
Evaluation Metric
● Binary Questions - Exact Match (EM)
● Continuous Questions - F1 Score
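The two metrics above can be sketched as follows. This is a minimal version assuming lowercase, whitespace-tokenized comparison in the style commonly used for span-based QA; the deck does not state its exact normalization, so the function names and normalization steps here are assumptions.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted span and a gold span."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Two empty spans count as a match; one empty span counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Binary (yes/no) answers would go through `exact_match`, while span answers would go through `token_f1`, which gives partial credit when the predicted span only partly overlaps the gold span.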
Results from Annotation
● Second Round:
○ Added Plot
○ Updated Instructions
● Third Round: Dropped # of Questions
● Random guess would give 50%!
● Cannot obtain high quality data
Change in Path
[Diagram labels: Dialogue QA; Continuous; Binary; Annotation, Model Dev, Analysis (shown twice); Active Learning; System Dev; Online Production; Analysis]
Active Learning
● Active Learning is a sub-branch of Machine Learning in which the learning
system interactively queries the user to obtain the data it needs.
● The goal of our system is to:
○ Collect the data the model needs in order to improve
○ Improve the model by training on that data
● What we offer:
○ Answer queries from the user
○ Learn from the user
● What the user provides:
○ Annotation on the data
Baseline Model
● BERT (Bidirectional Encoder Representations from Transformers) from Google AI
● Contextual vs Context Free
○ Bank account
○ River Bank
[Diagram: Pre-train Network -> Contextual Representation -> Downstream Model -> Output]
Baseline Model
● Unprecedented results on SQuAD
● Power of Bidirectional Flow
○ Versus Left->Right; Right->Left
○ Allows learning a word from all of its context
● Masked training
User Interaction - Tutorial
User Interaction - Post Question
User Interaction - Receive Answer
User Interaction - Correct Answer
User Guidance
● Which Scene the user needs to work on
○ Ensure all scenes are evenly annotated
● Which Type of question the user needs to work on
○ Type we have least data on
○ Type the model performs worst on
● User Experience: Too Monotonous?
User Guidance
● Scene Selection
○ Randomly select from the least-annotated scenes
● Type Selection
○ Use a probability function to control the random selection
User Guidance
● Constant c is used to linearly scale the probabilities
● Describes the degree of discrepancy between question types
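The probability function itself is not reproduced in this transcript, so the following is only one possible shape for it. It assumes each question type's weight grows linearly, by a factor `c`, with how far its dev score falls below the best-performing type; the function names, the score dictionary, and the weighting rule are all hypothetical.

```python
import random


def type_probabilities(dev_scores, c=1.0):
    """Map per-type dev scores to selection probabilities.

    Assumed rule (not from the deck): weight = 1 + c * (best - score),
    so c = 0 gives a uniform choice and larger c favors weak types more.
    """
    best = max(dev_scores.values())
    weights = {t: 1.0 + c * (best - s) for t, s in dev_scores.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def pick_question_type(dev_scores, c=1.0, rng=random):
    """Randomly pick the question type the user is asked to work on."""
    probs = type_probabilities(dev_scores, c)
    types = list(probs)
    return rng.choices(types, weights=[probs[t] for t in types], k=1)[0]
```

Keeping the choice random rather than always picking the weakest type is one way to address the monotony concern raised earlier, since users still see a mix of question types.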
User Guidance
● Train - Train the model
● Dev - Obtain statistics for guidance
● Test - Evaluate performance
● Test statistics are never shown to the system
Tech Stack Overview
● Front-End: HTML, JavaScript, jQuery, CSS
● Back-End: Django framework (Routing, Request Parsing, ORM), Python
● Database: MySQL
● Machine Learning Service: TensorFlow
● Deployment: AWS EC2 instance
Model View Controller (MVC)
● View: User Interface
● Controller: Logic
● Model: Data Storage
[Diagram: Controller endpoints: get-scene (returns scene, type), post-question (returns answer), post-correction]
● REST API
● Unauthenticated
● GET get-scene
● POST post-question
● POST post-answer
Controller - Security
● Server needs to know which question the user is changing
● A dummy id could create a loophole
● It would allow a malicious user to change the responses of others
● Session is anonymous, unauthenticated
post-correction:
question-id: 1
question-id: 3/26-s1-e1-c1-1
Controller - Security
● Solution - Hashing + Salt
● Password should not be stored in plain text
● Salt mitigates brute-force attacks
● Hash also prevents secret disclosure:
○ Prevents users from learning how we compute the hash
● The hash itself is returned to user
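One way the scheme above could work is sketched below: the server derives an unguessable token from a random salt plus the internal question id, stores the response under that token, and hands only the token back to the user. The use of SHA-256, the in-memory `responses` dictionary standing in for the User Response table, and both function names are assumptions for illustration.

```python
import hashlib
import secrets

# Hypothetical in-memory stand-in for the User Response table, keyed by hash.
responses = {}


def register_response(question_id, answer):
    """Store a response and return the opaque token given to the user."""
    salt = secrets.token_hex(16)  # fresh random salt for every response
    token = hashlib.sha256((salt + question_id).encode("utf-8")).hexdigest()
    responses[token] = {"question_id": question_id, "answer": answer}
    return token


def post_correction(token, corrected_answer):
    """Apply a correction only if the caller presents a known token."""
    row = responses.get(token)
    if row is None:
        return False  # guessed or sequential ids such as "1" are rejected
    row["answer"] = corrected_answer
    return True
```

Because the token depends on a random salt, a user who knows the id format (for example `3/26-s1-e1-c1-1`) still cannot compute the token for someone else's response.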
Django Object-Relational Mapping (ORM)
● Mapping Between Database Language and Programming Language
● SQL <-> Python
● Apply structural changes to Database
● Query Database in Programming Language
● Widely used in industry and reduces errors
Database Schema
Optimization on DB
● Index the fields that are queried
○ hash in User Response
○ count in Scene
● Delay in Database writes:
○ Receive Request -> Handle Request -> Return Response -> Database IO
Concurrency on DB
● Two users could work on the same question type / scene
● Both increment the count at the same time
● Pessimistic Row-Level Locking
○ Must acquire lock before write
○ Prevents dirty write
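The idea behind the pessimistic row-level lock can be illustrated in plain Python, with one lock per row guarding the read-modify-write of its count. This is a stand-in for the database behavior, not the deck's code: the `SceneCounts` class and its in-memory storage are hypothetical.

```python
import threading


class SceneCounts:
    """Hypothetical in-memory stand-in for the count column of the Scene table."""

    def __init__(self, scene_ids):
        self._counts = {scene_id: 0 for scene_id in scene_ids}
        # One lock per row, so writers to different scenes never block each other.
        self._row_locks = {scene_id: threading.Lock() for scene_id in scene_ids}

    def increment(self, scene_id):
        # The lock must be acquired before the write; without it, two
        # concurrent read-modify-write cycles could lose an update.
        with self._row_locks[scene_id]:
            current = self._counts[scene_id]
            self._counts[scene_id] = current + 1

    def count(self, scene_id):
        with self._row_locks[scene_id]:
            return self._counts[scene_id]
```

With the Django ORM, the usual mechanism for this kind of lock is `QuerySet.select_for_update()` inside a `transaction.atomic()` block; the deck does not say whether that is what the system uses.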
BERT Service
● Performance
○ Reduce Overhead
● Concurrency
○ Modularize into workers
○ Synchronize
● Update
BERT Service - Predict
● Workers
○ Dedicated Model
○ Dedicated Local Space for compute
● Worker Array - Size N
● Mutex Array - Size N
● Semaphore - N available
● Acquire Semaphore first
● Then acquire mutex
● Exception handling ensures no deadlock
[Diagram: five workers (W) guarded by one semaphore and five mutexes (M)]
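The semaphore-then-mutex protocol described above might look like the following sketch. Workers are modeled as plain callables taking a scene and a question, which is an assumption; the `WorkerPool` name and its `predict` signature are likewise hypothetical.

```python
import threading


class WorkerPool:
    """N workers, N mutexes, and a semaphore counting the free workers."""

    def __init__(self, workers):
        self._workers = list(workers)
        self._mutexes = [threading.Lock() for _ in self._workers]
        self._available = threading.Semaphore(len(self._workers))

    def predict(self, scene, question):
        # Step 1: wait on the semaphore until at least one worker is free.
        self._available.acquire()
        try:
            # Step 2: claim a free worker's mutex without blocking. At most
            # N - 1 other callers hold the semaphore, so a mutex is free.
            while True:
                for worker, mutex in zip(self._workers, self._mutexes):
                    if mutex.acquire(blocking=False):
                        try:
                            return worker(scene, question)
                        finally:
                            mutex.release()  # released even if the worker raises
        finally:
            # Releasing in finally blocks keeps a failing worker from
            # leaking the mutex or the semaphore slot.
            self._available.release()
```

The semaphore keeps callers from spinning over a fully busy pool, while the per-worker mutex guarantees that each model and its local compute space are used by one request at a time.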
BERT Service - Train
● Query DB for new responses
● Check batch size
● Train with batch
● Populate new worker array
● Change pointers
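The training steps above could be sketched as follows: wait for a full batch, train, build a complete new worker array off to the side, then swap a single reference so requests never see a half-built array. The `HotSwapService` class, the `train_fn` callback, and the use of `copy.deepcopy` to give each worker a dedicated model are assumptions made for this illustration.

```python
import copy
import threading


class HotSwapService:
    """Serves from the current worker array and swaps in a retrained one."""

    def __init__(self, workers, batch_size):
        self._workers = list(workers)
        self._batch_size = batch_size
        self._swap_lock = threading.Lock()

    def current_workers(self):
        with self._swap_lock:
            return self._workers

    def maybe_train(self, new_responses, train_fn, n_workers):
        # Check batch size: skip training until enough responses arrived.
        if len(new_responses) < self._batch_size:
            return False
        new_model = train_fn(new_responses)
        # Populate the new worker array first, one dedicated copy per worker.
        new_workers = [copy.deepcopy(new_model) for _ in range(n_workers)]
        # Change the pointer in one step; the old array keeps serving
        # any request that already holds a reference to it.
        with self._swap_lock:
            self._workers = new_workers
        return True
```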
BERT Service
[Diagram: two worker arrays (W), five workers each]
Snapshot
● Keep track of model progress
● Cron Jobs
● Use the latest worker to test against
○ dev dataset
○ test dataset
● Record:
○ Respective performance
○ Counts
○ User-Model F1
Production
● Advertised through email to students in the department
● Collected data for 7 days
● Will continue online in future
Result - System Performance
● Measured as the average of 100 requests
● Predict interface measured on 100 randomly selected scenes with test questions
● Performance measured in the deployment environment
Results - Data Collection
● Collected 151 responses
● Concentrated on weak types (72.18% vs 50.64%)
● No evaluation improvement yet
● 1.76% of training data
Result - User-Model F1
● Model cannot learn from its own prediction
● Denotes the inverse of the similarity between model response and user input
Future Work
● Funding
● Current Major Limitation: Responses
● More advertising through:
○ Community of NLP
○ Community of Friends
“Question Answering”

Active Learning on Question Answering with Dialogues

  • 1. Active Learning on Question Answering with Dialogues Shen Gao
  • 2. Content ● Question Answering ● Data Collection ● Active Learning ● User Interaction ● System Architecture ● Results & Future Work ● “Question Answering”
  • 3. Question Answering Question Answering is a Computer Science discipline that focuses on building automated systems able to answer questions posed by humans in natural language.
  • 5. Question Answering Data Sets (dataset: text source; QA source) ● Quasar-T: Search Engine (Google / Bing); Trivia ● SearchQA: Search Engine (Google / Bing); Jeopardy! ● SQuAD: Wikipedia Articles; Annotation
  • 6. Why Dialogue? ● Natural ● Machine User Interaction ● Availability ○ Transcripts ○ Texting ● Little previous work Source: Statistics Brain
  • 7. Question Answering in Dialogue ● TV Series Friends ● 10 Seasons ● 236 Episodes ● 3000 + Scenes ● Datasets from Character Mining ○ JSON formatted data ○ Tokenized ○ Season - Episode - Scene - Utterance ○ Plots available for 44% scenes
  • 8. Classification on Question Types ● Based on type of answer: ○ Categorical (Multiple-choice) - Binary (Polar) ○ Continuous (Span of text) ● Based on inference: ○ Explicit ○ Implicit ● Based on answerability (newly introduced in SQuAD): ○ Unanswerable ○ Answerable
  • 9. Explicit Questions Q1: Does the job interview include cooking a salad?
  • 10. Implicit Questions Q2: Is the interviewer picky?
  • 11. Explicit vs Implicit ● The contextual similarity between question and answer ● The amount of inference needed to resolve the question ● Q1: Explicit; Abundant Similarity; Little Inference ● Q2: Implicit; Little / No Similarity; Substantial Inference
  • 13. Annotation ● Annotation Phases: ○ Experimental Phase - Small Data Chunk ○ Production Phase - All Data ● Tasks per phase: ○ Question & Answer Generation ○ Verification - Inter-Annotator Agreement ● [Diagram labels: Experimental; Revision on Template; Stable; High IAT; Production]
  • 14. Challenges on Annotation ● Ambiguous Pronouns: ○ Example: In a scene with both Chandler and Joey: "Is he excited about the date?" ● Exact wording copied from the original text ● Low measured agreement ● Attempted Resolution: ○ Update instructions ○ Integrate Plots in Scene ○ Reduce the number of Questions
  • 15. Evaluation Metric ● Binary Questions - Exact Match (EM) ● Continuous Questions - F1 Score
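A minimal sketch of the two metrics named on slide 15. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the slides do not specify the normalization, and the function names are illustrative only.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0 (binary questions)."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between predicted and gold answer spans (continuous questions)."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For a binary question, `exact_match("Yes", "yes.")` counts as a match under this normalization; for a span answer, partial overlap earns partial F1 credit.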
  • 16. Results from Annotation ● Second Round: ○ Added Plot ○ Updated Instructions ● Third Round: Reduced the number of Questions ● Random guess would give 50%! ● Could not obtain high-quality data
  • 17. Change in Path ● [Diagram labels: Dialogue QA; Continuous; Binary; Annotation, Model Dev, Analysis; Annotation, Model Dev, Analysis; Active Learning; System Dev, Online Production, Analysis]
  • 18. Active Learning ● Active Learning is a sub-branch of Machine Learning in which the learning system interactively queries the user to obtain the data it needs. ● The goal of our system is to: ○ Collect the data the model needs in order to improve ○ Improve the model by training on this data ● What we offer: ○ Answers to the user's queries ○ Learning from the user ● What the user provides: ○ Annotations on the data
  • 19. Baseline Model ● BERT (Bidirectional Encoder Representations from Transformers) from Google AI ● Contextual vs Context Free ○ Bank account ○ River bank ● [Diagram labels: Pre-train Network; Contextual Representation; Downstream Model; Output]
  • 20. Baseline Model ● Unprecedented results on SQuAD ● Power of Bidirectional Flow ○ Versus Left->Right; Right->Left ○ Allows learning a word from all of its context ● Masked training
  • 21. User Interaction - Tutorial
  • 22. User Interaction - Post Question
  • 23. User Interaction - Receive Answer
  • 24. User Interaction - Correct Answer
  • 25. User Guidance ● Which scene the user needs to work on ○ Ensure all scenes are evenly annotated ● Which type of question the user needs to work on ○ Type we have the least data on ○ Type the model performs worst on ● User Experience: Too Monotonous?
  • 26. User Guidance ● Scene Selection ○ Randomly select from the least annotated scenes ● Type Selection ○ Use a probability function to control the random selection
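Scene selection as described on slide 26 can be sketched as below. The scene ids and the `annotation_counts` mapping are hypothetical; the source only says that a scene is drawn at random from the least annotated ones.

```python
import random
from typing import Dict, Optional


def pick_scene(annotation_counts: Dict[str, int],
               rng: Optional[random.Random] = None) -> str:
    """Return a random scene among those with the fewest annotations so far."""
    rng = rng or random.Random()
    fewest = min(annotation_counts.values())
    candidates = [scene for scene, n in annotation_counts.items() if n == fewest]
    return rng.choice(candidates)


# Hypothetical counts keyed by season-episode-scene ids.
counts = {"s1-e1-c1": 3, "s1-e1-c2": 1, "s1-e2-c1": 1}
chosen = pick_scene(counts, random.Random(0))
```

Breaking ties at random keeps annotation spread evenly without always sending users to the same scene.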
  • 27. User Guidance ● A constant c is used to linearly scale the probabilities ● c describes the degree of discrepancy between question types
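The probability function itself did not survive in this transcript, so the following is not the author's formula. It is one hypothetical way a constant `c` could linearly scale type-selection probabilities: a linear blend between a uniform distribution (`c = 0`, no discrepancy between types) and a distribution proportional to an assumed per-type "need" score (`c = 1`). The `need` values, the range of `c`, and both function names are assumptions.

```python
import random
from typing import Dict, Optional


def type_probabilities(need: Dict[str, float], c: float) -> Dict[str, float]:
    """Blend uniform and need-proportional probabilities; c in [0, 1] (assumed range)."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie in [0, 1]")
    uniform = 1.0 / len(need)
    total = sum(need.values())
    if total == 0:
        # No type is needier than another: fall back to uniform.
        return {t: uniform for t in need}
    return {t: (1 - c) * uniform + c * (n / total) for t, n in need.items()}


def pick_type(need: Dict[str, float], c: float,
              rng: Optional[random.Random] = None) -> str:
    """Draw one question type according to the blended probabilities."""
    rng = rng or random.Random()
    probs = type_probabilities(need, c)
    types = list(probs)
    return rng.choices(types, weights=[probs[t] for t in types], k=1)[0]
```

In this sketch, `need` could come from dev-set error or from how little data a type has, which matches the two criteria listed on slide 25; how the original system combined them is not stated.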
  • 28. User Guidance ● Train - Train the model ● Dev - Obtain statistics for guidance ● Test - Evaluate performance ● Test statistics are never shown to the system
  • 29. Tech Stack Overview ● Front-End: HTML, JavaScript, jQuery, CSS ● Back-End: Django backend framework (Routing, Request Parsing, ORM), Python ● Database: MySQL ● Machine Learning Service: TensorFlow ● Deployment: AWS EC2 instance
  • 30. Model View Controller (MVC) ● View: User Interface ● Controller: Logic ● Model: Data Storage
  • 31. Controller ● REST API ● Unauthenticated ● GET get-scene ● POST post-question ● POST post-answer ● [Diagram labels: get-scene; scene, type; post-question; answer; post-correction]
  • 32. Controller - Security ● The server needs to know which question the user is changing ● A dummy id could create a loophole ○ It would allow a malicious user to change the responses of others ● The session is anonymous and unauthenticated ● Example: post-correction: question-id: 1; question-id: 3/26-s1-e1-c1-1
  • 33. Controller - Security ● Solution - Hashing + Salt ● Password should not be stored in plain text ● Salt mitigates brute-force attacks ● Hash also prevents secret disclosure: ○ Prevents the user from knowing how we compute the hash ● The hash itself is returned to the user
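A minimal sketch of the hashing-plus-salt idea on slide 33, assuming SHA-256 and a server-side secret salt. The slides name neither the hash function nor how the salt is stored, and `SERVER_SALT`, `public_token`, and `matches` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; it is never sent to the client.
SERVER_SALT = secrets.token_bytes(16)


def public_token(question_id: str, salt: bytes = SERVER_SALT) -> str:
    """Derive the opaque identifier returned to the user for a stored response."""
    return hashlib.sha256(salt + question_id.encode("utf-8")).hexdigest()


def matches(token: str, question_id: str, salt: bytes = SERVER_SALT) -> bool:
    """Check a client-supplied token against an internal id in constant time."""
    return hmac.compare_digest(token, public_token(question_id, salt))
```

Because the client only ever sees the digest, it cannot guess a neighbouring id such as `question-id: 1` and overwrite another user's response.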
  • 34. Django Object-Relational Mapping (ORM) ● Mapping between database language and programming language ● SQL <-> Python ● Apply structural changes to the database ● Query the database in the programming language ● Widely used in industry; reduces errors
  • 36. Optimization on DB ● Index the fields that are queried ○ hash in User Response ○ count in Scene ● Delay database writes: Receive Request -> Handle Request -> Return Response -> Database IO
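One way to return the response before the database IO happens is to queue the write for a background thread. This is a sketch under that assumption; the slides do not say how the delay was implemented, and `DeferredWriter` and `handle_request` are hypothetical names.

```python
import queue
import threading
from typing import Callable, List


class DeferredWriter:
    """Queue records and persist them on a background thread."""

    def __init__(self, write: Callable[[dict], None]) -> None:
        self._write = write
        self._pending: "queue.Queue[dict]" = queue.Queue()
        self.errors: List[Exception] = []
        threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, record: dict) -> None:
        """Enqueue a record; returns immediately without touching the database."""
        self._pending.put(record)

    def _drain(self) -> None:
        while True:
            record = self._pending.get()
            try:
                self._write(record)
            except Exception as exc:  # keep the thread alive; a real service would log
                self.errors.append(exc)
            finally:
                self._pending.task_done()

    def flush(self) -> None:
        """Block until every queued record has been handed to the writer."""
        self._pending.join()


def handle_request(payload: dict, writer: DeferredWriter) -> dict:
    """Answer first; the write happens after the response is built."""
    writer.submit(payload)
    return {"status": "ok"}
```

The trade-off is that a crash between the response and the write loses that record, so this pattern suits data that can tolerate occasional loss.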
  • 37. Concurrency on DB ● Two users could work on the same question type / scene ○ Both would increment the count at the same time ● Pessimistic Row-Level Locking ○ Must acquire lock before write ○ Prevents dirty writes
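The acquire-before-write discipline on slide 37 can be illustrated with an in-memory stand-in for the count table, using one lock per row. This is a simulation, not the project's database code: in Django the same idea is usually expressed with `select_for_update()` inside `transaction.atomic()`, but the slides do not confirm which mechanism was used. `CountTable` is a hypothetical name.

```python
import threading
from typing import Dict


class CountTable:
    """In-memory stand-in for a table of per-scene annotation counts."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._row_locks: Dict[str, threading.Lock] = {}
        self._table_lock = threading.Lock()  # guards creation of row locks only

    def _lock_for(self, key: str) -> threading.Lock:
        with self._table_lock:
            return self._row_locks.setdefault(key, threading.Lock())

    def increment(self, key: str) -> int:
        """Read-modify-write one row while holding that row's lock."""
        with self._lock_for(key):
            new_value = self._counts.get(key, 0) + 1
            self._counts[key] = new_value
            return new_value

    def get(self, key: str) -> int:
        with self._lock_for(key):
            return self._counts.get(key, 0)


table = CountTable()
threads = [
    threading.Thread(target=lambda: [table.increment("s1-e1-c1") for _ in range(500)])
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Writers to different rows do not block each other, while two writers to the same row are serialized, so no increment is lost.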
  • 38. BERT Service ● Performance ○ Reduce Overhead ● Concurrency ○ Modularize into workers ○ Synchronize ● Update
  • 39. BERT Service - Predict ● Workers ○ Dedicated Model ○ Dedicated Local Space for compute ● Worker Array - Size N ● Mutex Array - Size N ● Semaphore - N available ● Acquire the semaphore first ● Then acquire a mutex ● Exception handling ensures no deadlock ● [Diagram: worker array (W), semaphore, mutex array (M)]
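The semaphore-then-mutex scheme on slide 39 might look like the sketch below. Workers are modelled as plain callables standing in for dedicated model instances; `WorkerPool` and its method names are hypothetical, and the actual service code is not shown in the slides.

```python
import threading
from typing import Callable, List


class WorkerPool:
    """N workers, N mutexes, and a semaphore counting free workers."""

    def __init__(self, workers: List[Callable[[str, str], str]]) -> None:
        self._workers = workers
        self._mutexes = [threading.Lock() for _ in workers]
        self._available = threading.Semaphore(len(workers))

    def predict(self, scene: str, question: str) -> str:
        self._available.acquire()  # wait until at least one worker is free
        try:
            # Holding a permit means some mutex is free; rescan in case one
            # is released behind the current scan position.
            while True:
                for worker, mutex in zip(self._workers, self._mutexes):
                    if mutex.acquire(blocking=False):
                        try:
                            return worker(scene, question)
                        finally:
                            mutex.release()  # released even if the worker raises
        finally:
            self._available.release()  # permit returned on every exit path
```

The `try/finally` pairs are what the slide calls exception handling: a failing prediction still releases its mutex and its permit, so later requests cannot deadlock on a worker that will never be freed.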
  • 40. BERT Service - Train ● Query DB for new responses ● Check batch size ● Train with batch ● Populate new worker array ● Change pointers ● [Diagram: BERT Service with two worker arrays (W)]
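The update steps on slide 40 can be sketched as building a fresh worker array off to the side and then swapping a single reference. `ModelService`, `maybe_train`, and the `train` callable are hypothetical; workers are represented by placeholder values rather than real models.

```python
import threading
from typing import Any, Callable, List, Sequence


class ModelService:
    """Serve predictions from one worker array while a new one is prepared."""

    def __init__(self, workers: Sequence[Any], batch_size: int) -> None:
        self._workers: List[Any] = list(workers)
        self._batch_size = batch_size
        self._swap_lock = threading.Lock()

    def current_workers(self) -> List[Any]:
        with self._swap_lock:
            return self._workers

    def maybe_train(
        self,
        new_responses: Sequence[Any],
        train: Callable[[List[Any], Sequence[Any]], List[Any]],
    ) -> bool:
        """Train only when a full batch is available, then swap the array."""
        if len(new_responses) < self._batch_size:
            return False  # not enough new data yet
        fresh = train(self._workers, new_responses)  # populate new worker array
        with self._swap_lock:
            self._workers = fresh  # change pointers
        return True
```

Requests that already hold a reference to the old array finish on the old workers, while new requests pick up the updated ones after the swap.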
  • 41. Snapshot ● Keep track of model progress ● Cron Jobs ● Use the latest worker to test against ○ dev dataset ○ test dataset ● Record: ○ Respective performance ○ Counts ○ User-Model F1
  • 42. Production ● Advertised through email to students in the department ● Collected data for 7 days ● Will continue online in future
  • 43. Result - System Performance ● Measured as the average over 100 requests ● Predict interface measured on 100 randomly selected scenes with test questions ● Performance measured in the deployment environment
  • 44. Results - Data Collection ● Collected 151 responses ● Concentrated on weak types (72.18% vs 50.64%) ● No evaluation improvement yet ● 1.76% of training data
  • 45. Result - User-Model F1 ● The model cannot learn from its own predictions ● Denotes the reverse of the similarity between the model's response and the user's input
  • 46. Future Work ● Funding ● Current Major Limitation: Responses ● More advertising through: ○ Community of NLP ○ Community of Friends