Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk

Saurabh Saxena, Enterprise Java and Machine Learning engineer
Deep learning enabled Question Answering models
PROJECT WORK PRESENTATION
Saurabh Saxena
2015HT12604
Introduction
DEEP LEARNING AND QUESTION ANSWERING SYSTEMS
Deep learning
• What is deep learning?
Deep learning is a new area of machine learning research that uses multi-layered artificial neural networks. The objective is to learn multiple levels of representation and abstraction that help make sense of data such as images, sound, and text. It is becoming increasingly relevant for three key reasons:
  • An infinitely flexible function – universal function approximation via neural networks
  • All-purpose parameter fitting – gradient descent and its variants
  • Fast and scalable – availability of cheap GPUs for fast matrix multiplications
• Typical applications of deep learning
  • Convolutional Neural Networks (CNN) in computer vision and machine translation
  • Recurrent Neural Networks (RNN) such as LSTM/GRU in language modeling
  • Tree Neural Networks (TNN) in sentiment analysis
  • Reinforcement learning in game playing and intelligent agents
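The "all-purpose parameter fitting" point above can be illustrated with a minimal sketch of batch gradient descent. The model (a straight line), the learning rate, the step count, and the toy data are all assumptions chosen for illustration; deep networks apply the same update rule to many more parameters.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = float(len(xs))
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
```

On this toy data the fitted `w` and `b` approach the slope and intercept used to generate the points.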
Basic Building blocks of deep learning
Most DL networks (including Question Answering models) are composed of
these basic building blocks:
• Fully Connected Network
• Word Embedding
• Convolutional Neural Network
• Recurrent Neural Network
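As a rough sketch of how two of these blocks fit together, the following plain-Python example chains a word-embedding lookup into a fully connected layer with a softmax output. The vocabulary, the embedding values, and the weights are made up for illustration; in a real model they would be learned.

```python
import math

# Hypothetical toy vocabulary with 2-dimensional embeddings (values are made up).
EMBEDDINGS = {
    "reset": [1.0, 0.0],
    "password": [0.8, 0.2],
    "printer": [0.0, 1.0],
}


def embed_sentence(tokens):
    """Word embedding block: look up each known token and average the vectors."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    dim = len(next(iter(EMBEDDINGS.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def dense(x, weights, bias):
    """Fully connected block: one output unit per row of `weights`."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


x = embed_sentence(["reset", "password"])
probs = softmax(dense(x, [[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]))
```

Convolutional and recurrent blocks replace the simple averaging step with position-aware or order-aware processing, which this sketch does not attempt to show.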
General Architecture of a Deep model
• What is a Question Answering system?
The basic idea of an automated QA system is to extract information from documents and, given a user query, provide a short and concise answer that meets the user's information needs.
• Traditional QA systems are of two types:
  • Information Retrieval (IR) based QA – match-and-rank, broad-domain QA over mostly unstructured data. Example: search engines
  • Knowledge-based (KB) QA – semantic representation of the query over structured data such as triple stores or SQL. Examples: Freebase, DBpedia, and Wolfram Alpha
• Question types
  • Factoid questions – DeepMind CNN/DailyMail dataset
  • Cloze-style questions – MCTest dataset and bAbI
  • Open-domain question answering – WikiQA and LAMBADA
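To make the cloze-style question type concrete, here is a minimal sketch that turns a sentence into a fill-in-the-blank query by masking the answer. The helper name `make_cloze` and the `@placeholder` marker are assumptions for illustration, not part of any dataset's tooling.

```python
def make_cloze(sentence, answer, placeholder="@placeholder"):
    """Mask the first occurrence of `answer` in `sentence` with a placeholder."""
    if answer not in sentence:
        raise ValueError("answer must appear in the sentence")
    return sentence.replace(answer, placeholder, 1), answer


question, answer = make_cloze("Mary went to the kitchen", "kitchen")
```

A model is then asked to recover the masked word or phrase from the accompanying context.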
QA systems
QA scenarios
Motivations – What can deep learning do for QA systems?
• The traditional QA pipeline relies heavily on manual feature engineering. Deep learning models aim to eliminate this.
• The aim is to build systems that can directly read documents and then answer questions based on those documents.
• RNNs have been successful in language modeling and generation but have had limited success in QA because they cannot store enough context in their hidden states. To answer complex questions, models need supporting facts from far back in the input.
• RNNs also suffer from the vanishing gradient problem when too many time-steps are used.
• Solution: incorporate an explicit memory in the model, along with a way to address that memory for reads and writes.
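The vanishing gradient point can be illustrated numerically. The sketch below assumes a deliberately simplified scalar RNN, h_t = tanh(w * h_{t-1}), with no inputs; the weight 0.9 and the step counts are arbitrary choices for illustration.

```python
import math


def gradient_through_time(w, steps, h0=0.5):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - h^2).
        grad *= w * (1.0 - h * h)
    return grad


grad_5 = gradient_through_time(w=0.9, steps=5)
grad_100 = gradient_through_time(w=0.9, steps=100)
```

Each step multiplies the gradient by a factor below 1, so the signal from 100 steps back is orders of magnitude weaker than the signal from 5 steps back. This is one reason a fact seen early in a long story is hard for a plain RNN to use.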
Memory networks for QA
AND THEIR VARIANTS
What are Memory Networks?
• A class of models that combine a large memory with a learning component that can read from and write to it.
• They incorporate reasoning with attention over memory (RAM).
• Most ML models have limited memory, which is more or less all that is needed for "low level" tasks such as object detection.
• Long-term memory is required to read a story and then, for example, answer questions about it.
• It is also required for dialog: to remember previous dialog (short- and long-term) and respond.
• The models are scalable: they can store and read large amounts of data in memory, up to an entire KB.
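"Attention over memory" can be sketched as a soft read: score every memory slot against a query, normalize the scores with a softmax, and return the weighted sum of the slots. The dot-product scoring and the toy vectors below are assumptions for illustration; actual memory network variants differ in how they embed and score.

```python
import math


def attention_read(query, memory):
    """Soft read: softmax over dot-product scores, then a weighted sum of slots."""
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory[0])
    read = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]
    return weights, read


# Three hypothetical memory slots and one query vector.
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
weights, read = attention_read([3.0, 0.0], memory)
```

Because every step is differentiable, a read of this kind can be trained with gradient descent together with the rest of the model.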
All MemNNs have four component networks (which may or may not share parameters):
• I (input feature map): converts incoming data to the internal feature representation.
• G (generalization): updates memories given new input.
• O (output feature map): produces new output (in the feature representation space) given the memories.
• R (response): converts the output of O into the response seen by the outside world.
Step 1: The controller converts incoming data to the internal feature representation (I).
Step 2: The write head updates the memories and writes the data into memory (G).
Step 3: Given the external input, the read head reads the memory and fetches relevant data (O).
Step 4: The controller combines the external data with the memory contents returned by the read head to generate the output (O, R).
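The four components and steps can be sketched as a toy class. Everything here is a hand-written stand-in, not a trained model: `TinyMemNN`, its stopword list, and the word-overlap scoring are assumptions chosen so that the flow I, G, O, R is easy to follow.

```python
class TinyMemNN(object):
    """Toy memory network with hand-written (not learned) I, G, O, R components."""

    STOPWORDS = {"is", "the", "to", "where", "went", "a"}

    def __init__(self):
        self.memory = []  # list of (features, original sentence) slots

    def I(self, text):
        """Input feature map: lowercase bag of content words."""
        words = text.lower().replace("?", "").replace(".", "").split()
        return set(w for w in words if w not in self.STOPWORDS)

    def G(self, text):
        """Generalization: write the new input into the next free memory slot."""
        self.memory.append((self.I(text), text))

    def O(self, question):
        """Output: pick the slot with the largest word overlap (ties: most recent)."""
        q = self.I(question)
        best = max(range(len(self.memory)),
                   key=lambda i: (len(q & self.memory[i][0]), i))
        return q, self.memory[best]

    def R(self, q, supporting):
        """Response: a content word of the supporting fact that is not in the question."""
        features, sentence = supporting
        candidates = [w for w in sentence.lower().rstrip(".").split()
                      if w in features and w not in q]
        return candidates[-1] if candidates else None

    def answer(self, question):
        q, supporting = self.O(question)
        return self.R(q, supporting)


net = TinyMemNN()
for fact in ["Mary went to the kitchen.",
             "John went to the garden.",
             "Mary went to the office."]:
    net.G(fact)
result = net.answer("Where is Mary?")
```

In a real memory network each of these components is a learned network and the scoring in O is trained end to end; the sketch only mirrors the data flow.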
State-of-the-art Memory Networks
Datasets to train Deep QA models
BABI, LAMBADA, MCTEST AND MORE…
Datasets available to train/test QA models
• Facebook bAbI QA tasks – a set of 20 tasks for testing text understanding and reasoning. For each task, there are 10,000 questions for training and 1,000 for testing. Each task tests the machine on a specific skill.
https://research.fb.com/downloads/babi/
• Facebook bAbI Children's Book Test (CBT) – text passages and corresponding questions drawn from Project Gutenberg children's books: 669,343 training questions, 8,000 dev questions, and 10,000 test questions.
• MCTest – consists of 500 stories and 2,000 questions. Because the stories are fictional, the answer can typically be found only in the story itself. It requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
http://research.microsoft.com/en-us/um/redmond/projects/mctest/
• Language Modeling Broadened to Account for Discourse Aspects (LAMBADA dataset) – consists of 10,022 passages, divided into 4,869 development and 5,153 test passages (extracted from 1,331 and 1,332 disjoint novels, respectively). The average passage consists of 4.6 sentences of context plus 1 target sentence, for a total length of 75.4 tokens (dev) / 75 tokens (test).
http://clic.cimec.unitn.it/lambada/
• DeepMind CNN and DailyMail dataset – a collection of news articles and corresponding cloze queries. The two datasets contain about 90k and 197k documents respectively, and each document has roughly 4 questions on average. Each question is a sentence with one missing word or phrase that can be found in the accompanying document/context.
http://cs.nyu.edu/~kcho/DMQA/
• Stanford Question Answering Dataset (SQuAD) – a reading comprehension dataset consisting of questions posed by crowd-workers on a set of Wikipedia articles. The answer to every question is a segment of text, or span, from the corresponding reading passage. There are 100,000+ question-answer pairs on 500+ articles.
https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/
• AI2 Science Exams – elementary science questions from US state and regional science exams: 170 multi-state and 108 4th-grade questions.
http://allenai.org/data/science-exam-questions.html
• WikiQA – 3,047 questions sampled from Bing query logs. Each question is associated with a Wikipedia page. All sentences in the summary paragraph of the page become the candidate answers. Only about one third of the questions have a correct answer in the candidate answer set.
https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/
Facebook bAbI dataset – 20 tasks
• Single supporting fact
• Two supporting facts
• Three supporting facts
• Two argument relations
• Three argument relations
• Yes/No questions
• Counting
• Lists/sets
• Simple Negation
• Indefinite Knowledge
• Basic Coreference
• Conjunction
• Compound Coreference
• Time Reasoning
• Basic Deduction
• Basic Induction
• Positional Reasoning
• Size Reasoning
• Path Finding
• Agent’s Motivations
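To train on these tasks, the text files first have to be turned into (story, question, answer) examples. The sketch below assumes the plain-text layout of the public bAbI release: each line starts with a number that restarts at 1 for every new story, and question lines carry the answer and the supporting-fact ids separated by tabs. The function name `parse_babi` and the sample lines are illustrative.

```python
def parse_babi(lines):
    """Parse bAbI-style lines into (story, question, answer) triples."""
    examples, story = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        idx, text = line.split(" ", 1)
        if int(idx) == 1:
            story = []  # line numbering restarts with each new story
        if "\t" in text:
            # Question line: question <TAB> answer <TAB> supporting fact ids.
            question, answer, _supporting = text.split("\t")
            examples.append((list(story), question.strip(), answer))
        else:
            story.append(text)
    return examples


sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
    "4 Daniel went back to the hallway.",
    "5 Sandra moved to the garden.",
    "6 Where is Daniel? \thallway\t4",
]
examples = parse_babi(sample)
```

Each example keeps the story sentences seen up to the question, which matches the single-supporting-fact setting where the answer depends on one earlier sentence.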
20 tasks in brief..
End-to-End MemNN
Dynamic MemNN
Key-value MemNN Architecture
Experimental Setup to train deep models
GPU, THEANO, KERAS, CUDA, CUDNN AND MORE…
Component                 Description
Operating System          Ubuntu 16.04 VM on an Intel octa-core CPU with 6.5 GB RAM
Graphics Card             NVIDIA Tesla K80 with 12 GB RAM and 2080 CUDA cores
Graphics Toolkit          CUDA 8.0 with cuDNN 6.0
Python Package Manager    Anaconda (Continuum Analytics) for Python 2.7
Deep learning library     Keras v2.0.2 with Theano v0.9.0 backend
Other Python modules      • bcolz v1.0.0 for fast saving/loading of trained weights
                          • NumPy v1.12.1 for all multi-dimensional numeric manipulations
                          • scikit-learn v0.18.1 for preprocessing, pipelining, feature extraction, decomposition, dataset splits, and general non-deep machine learning algorithms
                          • cPickle for saving models
                          • NLTK toolkit for traditional linguistic tasks
                          • Matplotlib v2.0.0 for visualizing data
                          • pydot v1.0.28 and Graphviz v2.38.0 for visualizing deep models
                          • OpenBLAS 0.2.19 for fast linear algebra operations
                          • pandas v0.19.2 for structured data manipulation
                          • Protobuf 3.0.0 for protocol buffers
                          • Flask v0.12 for web display
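With Keras on a Theano backend, the backend choice and the GPU device are normally set through configuration rather than code. The sketch below shows one way this could look. It assumes the `keras.json` keys and the `KERAS_BACKEND` and `THEANO_FLAGS` environment variables used by Keras 2 and Theano; the helper names are hypothetical, and the right `device` value depends on which Theano GPU backend is installed.

```python
import json
import os


def keras_theano_config(floatx="float32"):
    """Return a keras.json-style dict that selects the Theano backend."""
    return {
        "backend": "theano",
        "floatx": floatx,
        "epsilon": 1e-07,
    }


def theano_flags(device="cuda", floatx="float32"):
    """Build a THEANO_FLAGS string; `device` must match the local Theano GPU backend."""
    return "device={0},floatX={1}".format(device, floatx)


# JSON text that could be saved as the Keras configuration file.
config_text = json.dumps(keras_theano_config(), indent=2, sort_keys=True)

# Environment variables must be set before keras or theano is imported.
os.environ.setdefault("KERAS_BACKEND", "theano")
os.environ.setdefault("THEANO_FLAGS", theano_flags())
```

Using `setdefault` leaves any value already exported in the shell untouched, so the snippet does not override an existing setup.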
Experimental setup in Google Cloud
Compute Engine setup in Google Cloud
GPU details
Training Summary
MODELS, TEST ACCURACY AND MORE…
Model summary for bAbI Task #1
Training summary for bAbI Task #1 – one supporting fact
Training summary for bAbI Task #2 – two supporting facts
Joint training on all 20 tasks simultaneously
Demo on bAbI tasks – Correct answers
Demo – Incorrect answer
Future work
• Train a Dynamic Memory Network on the bAbI dataset
• Train a Key-value Memory Network on the bAbI dataset
• Evaluate the performance of the current models on other datasets such as LAMBADA and Stanford SQuAD
• Explore transfer learning so that models trained on open-source datasets can be applied to corporate datasets with only fine-tuning
• Explore the use of trained models in dialog modeling for helpdesk question answering
Thanks
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk

  • 1. Deep learning enabled Question Answering models PROJECT WORK PRESENTATION Saurabh Saxena 2015HT12604
  • 2. Introduction DEEP LEARNING AND QUESTION ANSWERING SYSTEMS
  • 3. Deep learning
      - What is deep learning? Deep learning is a new area of machine learning research that uses multi-layered artificial neural networks. The objective is to learn multiple levels of representation and abstraction that help make sense of data such as images, sound, and text. It is becoming increasingly relevant for three key reasons:
          - An infinitely flexible function – universal function approximation via neural networks
          - All-purpose parameter fitting – using gradient descent and its derivative algorithms
          - Fast and scalable – availability of cheap GPUs for fast matrix multiplications
      - Typical applications of deep learning:
          - Convolutional Neural Networks (CNN) in computer vision and machine translation
          - Recurrent Neural Networks (RNN) such as LSTM/GRU in language modeling
          - Tree Neural Networks (TNN) in sentiment analysis
          - Reinforcement learning in game playing and intelligent agents
  • 4. Basic building blocks of deep learning. Most DL networks (including Question Answering models) are composed of these basic building blocks:
      - Fully Connected Network
      - Word Embedding
      - Convolutional Neural Network
      - Recurrent Neural Network
  • 5. General Architecture of a Deep model
  • 6. What is a Question Answering System? The basic idea of an automated QA system is to extract information from documents and, given a user query, provide a short and concise answer that meets the user's information needs.
      - Traditional QA systems are of two basic types:
          - Information Retrieval (IR) based QA – match-and-rank, broad-domain QA using mostly unstructured data. Example: search engines
          - Knowledge-based (KB) QA – semantic representation of the query over structured data such as triple stores or SQL. Examples: Freebase, DBpedia, and Wolfram Alpha
      - Question types:
          - Factoid questions – DeepMind CNN/DailyMail dataset
          - Cloze style questions – MCTest dataset and bAbI
          - Open domain question answering – WikiQA and LAMBADA QA systems
  • 11. Motivations – What can deep learning do for QA systems?
      - The traditional QA pipeline relies heavily on manual feature engineering. Deep learning models aim to eliminate this.
      - The aim is to build systems that can directly read documents and then answer questions based on those documents.
      - RNNs have been successful in language modeling and generation but have not achieved much success in QA, because they cannot store enough context in their hidden states. To answer complex questions, models need supporting facts from far back in the past.
      - RNNs also suffer from the vanishing gradient problem if too many time-steps are used.
      - Solution: incorporate explicit memory in the model, together with a way to address that memory for reading and writing.
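The vanishing gradient point on slide 11 can be illustrated with a little arithmetic. The sketch below is not from the slides: it assumes a hypothetical one-unit RNN with update h_t = tanh(w * h_{t-1}) and no external input, and multiplies the per-step derivatives to show how the gradient of the last state with respect to the first shrinks as the number of time-steps grows.

```python
import math


def gradient_through_time(weight, steps, h0=0.5):
    """Return d h_steps / d h_0 for the toy recurrence h_t = tanh(weight * h_{t-1}).

    By the chain rule the gradient is a product of per-step factors
    weight * (1 - h_t ** 2); with |weight| < 1 each factor is below 1,
    so the product decays roughly geometrically with the number of steps.
    """
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(weight * h)
        grad *= weight * (1.0 - h * h)
    return grad


short_range = gradient_through_time(0.9, 5)
long_range = gradient_through_time(0.9, 50)
print("gradient over 5 steps: ", short_range)
print("gradient over 50 steps:", long_range)
```

A supporting fact seen 50 steps earlier therefore contributes a far smaller learning signal than one seen 5 steps earlier, which is one reason an explicit, addressable memory is attractive.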
  • 12. Memory networks for QA AND THEIR VARIANTS
  • 13. What are Memory Networks?
      - A class of models that combine a large memory with a learning component that can read from and write to it.
      - They incorporate reasoning with attention over memory (RAM).
      - Most ML models have limited memory, which is more or less all that is needed for "low level" tasks such as object detection.
      - Long-term memory is required to read a story and then, for example, answer questions about it.
      - It is also required for dialog: to remember previous dialog (short- and long-term) and respond.
      - Models are scalable: they can store and read a large amount of data in memory, up to an entire KB.
  • 14. All MemNNs have four component networks (which may or may not have shared parameters):
      - I (input feature map): convert incoming data to the internal feature representation.
      - G (generalization): update memories given new input.
      - O (output): produce new output (in feature representation space) given the memories.
      - R (response): convert output O into the response seen by the outside world.
    Processing steps:
      - Step 1: the controller converts incoming data to the internal feature representation (I)
      - Step 2: the write head updates the memories and writes the data into memory (G)
      - Step 3: given the external input, the read head reads the memory and fetches relevant data (O)
      - Step 4: the controller combines the external data with the memory contents returned by the read head to generate the output (O, R)
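The four components on slide 14 can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the model trained in this project: the function names are hypothetical, a bag-of-words dot product stands in for the learned scoring function, and the response step uses a fixed rule (return the last token of the retrieved fact) instead of a learned layer.

```python
import math
from collections import Counter


def input_map(text):
    """I: convert raw text to an internal representation (lower-cased tokens)."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return cleaned.split()


def generalize(memory, tokens):
    """G: update the memory; the simplest choice writes to the next free slot."""
    memory.append(tokens)
    return memory


def match(query, slot):
    """Bag-of-words dot product, standing in for a learned scoring function."""
    query_counts, slot_counts = Counter(query), Counter(slot)
    return sum(n * slot_counts[word] for word, n in query_counts.items())


def output_map(memory, query):
    """O: softmax attention over all slots; returns the weights and the best slot."""
    scores = [match(query, slot) for slot in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    best = max(range(len(memory)), key=lambda i: weights[i])
    return weights, memory[best]


def respond(supporting_fact):
    """R: turn the retrieved memory into an answer (toy rule: its last token)."""
    return supporting_fact[-1]


def answer(story, question):
    """Run steps 1-4: encode, write to memory, read with attention, respond."""
    memory = []
    for sentence in story:
        generalize(memory, input_map(sentence))
    weights, fact = output_map(memory, input_map(question))
    return respond(fact), weights


story = ["Mary moved to the bathroom.", "John went to the hallway."]
reply, attention = answer(story, "Where is Mary?")
print(reply, attention)
```

In a trained memory network the token lists would be replaced by learned embeddings and the matching and response functions by learned layers; the read/write structure stays the same.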
  • 16. Datasets to train Deep QA models BABI, LAMBADA, MCTEST AND MORE…
  • 17. Datasets available to train/test QA models
      - Facebook bAbI Simplequestions – A set of 20 tasks for testing text understanding and reasoning. For each task, there are 10,000 questions for training and 1,000 for testing. Each task tests the machine on a specific skill set. https://research.fb.com/downloads/babi/
      - Facebook bAbI Children's Book Test (CBT) – Text passages and corresponding questions drawn from Project Gutenberg children's books. 669,343 training questions, 8,000 dev questions and 10,000 test questions.
      - MCTest – Consists of 500 stories and 2,000 questions. Because the stories are fictional, the answer can typically be found only in the story itself. It requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension. http://research.microsoft.com/en-us/um/redmond/projects/mctest/
  • 18. Datasets (continued)
      - Language Modeling Broadened to Account for Discourse Aspects (LAMBADA dataset) – Consists of 10,022 passages, divided into 4,869 development and 5,153 test passages (extracted from 1,331 and 1,332 disjoint novels, respectively). The average passage consists of 4.6 sentences in the context plus 1 target sentence, for a total length of 75.4 tokens (dev) / 75 tokens (test). http://clic.cimec.unitn.it/lambada/
      - DeepMind CNN and DailyMail dataset – Collection of news articles and corresponding cloze queries. Each dataset contains many documents (90k and 197k each), and each document has approximately 4 questions on average. Each question is a sentence with one missing word or phrase that can be found in the accompanying document/context. http://cs.nyu.edu/~kcho/DMQA/
  • 19. Datasets (continued)
      - Stanford Question Answering Dataset (SQuAD) – Reading comprehension dataset consisting of questions posed by crowd-workers on a set of Wikipedia articles. The answer to every question is a segment of text, or span, from the corresponding reading passage. There are 100,000+ question-answer pairs on 500+ articles. https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/
      - AI2 Science Exams – Elementary science questions from US state and regional science exams. 170 multi-state and 108 4th grade questions. http://allenai.org/data/science-exam-questions.html
      - WikiQA – 3,047 questions sampled from Bing query logs. Each question is associated with a Wikipedia page. All sentences in the summary paragraph of the page become the candidate answers. Only one third of the questions have a correct answer in the candidate answer set. https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/
  • 20. Facebook bAbI dataset – 20 tasks
      - Task 1: Single supporting fact
      - Task 2: Two supporting facts
      - Task 3: Three supporting facts
      - Task 4: Two argument relations
      - Task 5: Three argument relations
      - Task 6: Yes/No questions
      - Task 7: Counting
      - Task 8: Lists/sets
      - Task 9: Simple negation
      - Task 10: Indefinite knowledge
      - Task 11: Basic coreference
      - Task 12: Conjunction
      - Task 13: Compound coreference
      - Task 14: Time reasoning
      - Task 15: Basic deduction
      - Task 16: Basic induction
      - Task 17: Positional reasoning
      - Task 18: Size reasoning
      - Task 19: Path finding
      - Task 20: Agent's motivations
  • 21. 20 tasks in brief..
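To make the structure of a bAbI task concrete, here is a minimal parsing sketch. It assumes the plain-text layout of the publicly released bAbI task files (each line starts with a running index that resets to 1 at a new story, and question lines carry the answer and the supporting-fact indices separated by tabs); the slides do not describe this layout, and the function name `parse_stories` and the sample text are illustrative only.

```python
def parse_stories(lines):
    """Turn bAbI-style lines into (story, question, answer, supporting) samples."""
    samples = []
    story = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        index, text = line.split(" ", 1)
        if int(index) == 1:
            story = []  # index 1 marks the start of a new story
        if "\t" in text:
            # Question line: question <TAB> answer <TAB> supporting fact ids
            question, answer, supporting = text.split("\t")
            samples.append({
                "story": list(story),  # snapshot of the facts seen so far
                "question": question.strip(),
                "answer": answer,
                "supporting": [int(i) for i in supporting.split()],
            })
        else:
            story.append(text)  # statement line: add it to the running story
    return samples


# Illustrative sample in the style of Task 1 (single supporting fact).
SAMPLE = (
    "1 Mary moved to the bathroom.\n"
    "2 John went to the hallway.\n"
    "3 Where is Mary? \tbathroom\t1\n"
    "4 Daniel went back to the hallway.\n"
    "5 Sandra moved to the garden.\n"
    "6 Where is Daniel? \thallway\t4\n"
)

samples = parse_stories(SAMPLE.splitlines())
for sample in samples:
    print(sample["question"], "->", sample["answer"], sample["supporting"])
```

Each sample keeps only the statements seen before its question, which is the context a QA model would be given for that question.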
  • 27. Experimental Setup to train deep models GPU, THEANO, KERAS, CUDA, CUDNN AND MORE…
  • 28. Experimental setup in Google Cloud (Component: Description)
      - Operating System: Ubuntu 16.04 VM on Intel octa-core CPU with 6.5 GB RAM
      - Graphics Card: NVIDIA Tesla K80 with 12 GB RAM and 2080 CUDA cores
      - Graphics Toolkit: CUDA 8.0 with cuDNN 6.0
      - Python Package Manager: Anaconda (Continuum Analytics) for Python 2.7
      - Deep learning library: Keras v2.0.2 with Theano v0.9.0 backend
      - Other Python modules:
          - Bcolz v1.0.0 for fast saving/loading of trained weights
          - NumPy v1.12.1 for all multi-dimensional numeric manipulations
          - Scikit-learn v0.18.1 for preprocessing, pipelining, feature extraction, decomposition, dataset splits and all general non-deep machine learning algorithms
          - cPickle for saving models
          - NLTK toolkit for traditional linguistic tasks
          - Matplotlib v2.0.0 for visualizing data
          - Pydot v1.0.28 and GraphViz v2.38.0 for visualizing deep models
          - OpenBLAS 0.2.19 for fast linear algebra operations
          - Pandas v0.19.2 for structured data manipulation
          - Protobuf 3.0.0 for protocol buffering
          - Flask v0.12 for web display
  • 29. Compute Engine setup in Google Cloud
  • 31. Training Summary MODELS, TEST ACCURACY AND MORE…
  • 32. Model summary for bAbI Task#1
  • 33. Training summary for bAbI Task#1 – one supporting fact; training summary for bAbI Task#2 – two supporting facts
  • 34. Joint training on all 20 tasks simultaneously
  • 35. Demo on bAbI tasks – correct answers
  • 38. Future work
      - Train a Dynamic Memory Network on the bAbI dataset
      - Train a Key-Value Memory Network on the bAbI dataset
      - Evaluate the performance of the current models on other datasets such as LAMBADA and Stanford SQuAD
      - Explore transfer learning, so that models trained on open-source datasets can be applied to corporate datasets with only fine-tuning
      - Explore the use of trained models in dialog modeling for helpdesk question answering