RV College of
Engineering
Go, change the world
Dr. G. Shobha
Professor, CSE Department
RV College of Engineering, Bengaluru - 59
Natural Language to SQL Query conversion using
Machine Learning Techniques on HPCC Systems
Platform
PRESENTATION CONTENTS
• Introduction and Motivation
• Components involved in NLP for NL to SQL Conversion
• Rule Based Architecture for NL to SQL conversion
• Machine Learning Based Architecture to Enrich NL for SQL Conversion
• HPCC Systems Architecture
• Results & Conclusions
Introduction and Motivation
Key Factors of NL to SQL
• Databases serve as the forefront for most systems today.
• Structured query language (SQL) is used to access and manipulate the
data stored in a relational database.
• Most end users have limited knowledge of SQL and thus face difficulties in accessing such data.
• Access to the data is nevertheless critical.
• Users would otherwise have to learn the query language and understand its syntax.
Components Involved in NLP for NL to SQL
Components of NLP
NLP
Part of Computer Science and Artificial Intelligence
which deals with Human Languages
Rule Based Architecture for NL to SQL Conversion
Preprocessor
• Tokenizes the natural language input.
• Removes redundant tokens.
• The output of the preprocessor is duplicated and supplied to two major components:
- Entity Recognizer
- Intent Recognizer
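The preprocessing step above can be sketched roughly as follows. This is only an illustration: the slides do not say which tokenizer or which list of redundant tokens is used, so the regex tokenizer, the `REDUNDANT_TOKENS` set, and the `preprocess` function are all assumptions.

```python
import re

# Assumption: "redundant tokens" are approximated by a tiny stop-word list.
REDUNDANT_TOKENS = {"the", "a", "an", "of", "please", "me"}

def preprocess(query: str) -> list[str]:
    """Tokenize a natural language query and drop redundant tokens."""
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    return [t for t in tokens if t not in REDUNDANT_TOKENS]

tokens = preprocess("Show me the names of all customers")

# The same token list is duplicated and handed to both downstream components.
entity_input = list(tokens)
intent_input = list(tokens)
```

Copying the list keeps the two recognizers independent, so one component can modify its input without affecting the other.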
Entity Recognizer
• An entity extractor
• A classifier
• A filter
Entity Extractor
• Uses part-of-speech tagging and a date parser to extract important keywords from the sentence.
• These keywords are strong candidates to form relation names, attribute names, or data values.
• They are then fed into a classifier along with the user-defined schema mappings of relation names and attribute names.
Classifier
• The classifier uses checks such as direct match, concatenation, n-gram, hypernym, and synonym matching to separate the keywords into relation names, attribute names, and residual keywords.
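A minimal sketch of how such a classifier might work, covering only the direct, concatenation, and synonym checks. The function name, its parameters, and the plain-dictionary synonym lookup are assumptions made for illustration; the slides do not describe the actual implementation, and the n-gram and hypernym checks are omitted.

```python
def classify_keywords(keywords, relations, attributes, synonyms=None):
    """Split keywords into relation names, attribute names, and residual words.

    relations / attributes: schema names supplied by the user.
    synonyms: optional dict mapping a word to a schema name (assumption:
    stands in for the synonym lookups mentioned in the slides).
    """
    synonyms = synonyms or {}
    rel_lookup = {r.lower(): r for r in relations}
    attr_lookup = {a.lower(): a for a in attributes}
    found_rel, found_attr, residual = [], [], []
    i = 0
    while i < len(keywords):
        window = keywords[i:i + 2]
        pair = "".join(window).lower()
        single = keywords[i].lower()
        single = synonyms.get(single, single).lower()
        if len(window) == 2 and pair in attr_lookup:
            # Concatenation check: "marital" + "status" -> "MaritalStatus".
            found_attr.append(attr_lookup[pair])
            i += 2
        elif single in rel_lookup:
            # Direct (or synonym) match against a relation name.
            found_rel.append(rel_lookup[single])
            i += 1
        elif single in attr_lookup:
            # Direct (or synonym) match against an attribute name.
            found_attr.append(attr_lookup[single])
            i += 1
        else:
            # Anything unmatched is passed on to the filter as a residual word.
            residual.append(keywords[i])
            i += 1
    return found_rel, found_attr, residual
```

For example, with the hypothetical schema `relations=["customers"]` and `attributes=["MaritalStatus", "Gender"]`, the keywords `["clients", "marital", "status", "single"]` and the synonym map `{"clients": "customers"}` yield one relation, one attribute, and the residual word `single`.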
Filter
• The residual words are filtered to extract the words that form part of the data items of
the SQL query.
Intent Recognizer
• Creates a template of the SQL query by performing checks for each SQL clause.
• Techniques such as context identification, distance metrics, keyword spotting, and grammar rules are applied to check for the existence of a particular clause.
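Of the techniques listed, keyword spotting is the simplest to illustrate. The sketch below builds a SQL template with placeholders from a token list. The trigger words in `CLAUSE_KEYWORDS` and `AGGREGATES` are invented for the example; the slides do not list the actual rules.

```python
# Assumption: illustrative trigger words, not the rules used in the presented system.
CLAUSE_KEYWORDS = {
    "WHERE": {"who", "whose", "with", "where", "having"},
    "GROUP BY": {"each", "per"},
    "ORDER BY": {"sorted", "ordered", "ascending", "descending"},
}
AGGREGATES = {"total": "SUM", "sum": "SUM", "average": "AVG", "count": "COUNT"}

def build_template(tokens):
    """Return a SQL template with placeholders for every clause that was spotted."""
    words = set(tokens)
    aggregate = next((AGGREGATES[t] for t in tokens if t in AGGREGATES), None)
    if aggregate:
        select = f"SELECT {aggregate}(<attribute>)"
    else:
        select = "SELECT <attributes>"
    parts = [select, "FROM <relations>"]
    if words & CLAUSE_KEYWORDS["WHERE"]:
        parts.append("WHERE <conditions>")
    if words & CLAUSE_KEYWORDS["GROUP BY"]:
        parts.append("GROUP BY <attributes>")
    if words & CLAUSE_KEYWORDS["ORDER BY"]:
        parts.append("ORDER BY <attributes>")
    return " ".join(parts)
```

The placeholders would later be filled with the relation names, attribute names, and data values found by the entity recognizer.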
Challenges faced
• Specific Schema
• Identification of partial or implied data values
• Identification of descriptive values
Go-to solution: Machine learning techniques for NL to SQL
Technologies Involved in Machine Learning for NLP to SQL
Feedforward neural networks
Recurrent Neural Networks (RNNs)
• Networks with feedback loops (recurrent edges)
• Output at the current time step depends on the current input as well as the previous state (via recurrent edges)
Training RNNs
Problem: can’t capture long-term dependencies due to vanishing/exploding gradients during backpropagation
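The vanishing-gradient effect can be seen in a toy example. The sketch below assumes a single-unit RNN with state update h_t = tanh(w * x_t + u * h_{t-1}); the weights and inputs are arbitrary values chosen for illustration. The gradient of the final state with respect to the initial state is the product of the per-step factors u * (1 - h_t^2), which shrinks quickly when each factor is below 1 in magnitude.

```python
import math

def gradient_through_time(w, u, inputs, h0=0.0):
    """Run a one-unit RNN and return d h_T / d h_0.

    Each step contributes the factor u * (1 - h_t**2), the derivative of
    tanh(w * x_t + u * h_{t-1}) with respect to h_{t-1}.
    """
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * x + u * h)
        grad *= u * (1.0 - h * h)
    return grad

short_run = gradient_through_time(w=1.0, u=0.5, inputs=[1.0] * 5)
long_run = gradient_through_time(w=1.0, u=0.5, inputs=[1.0] * 50)
```

Here `long_run` is many orders of magnitude smaller than `short_run`, so an error signal from step 50 barely influences the earliest steps. With a large recurrent weight and unsaturated states, the same product can instead grow without bound.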
Go-to solution: Long Short-Term Memory (LSTM) model
A type of RNN architecture that addresses the vanishing/exploding gradient problem and allows learning of long-term dependencies.
Has recently risen to prominence with state-of-the-art performance in speech recognition, language modeling, translation, and image captioning.
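For reference, one step of a standard LSTM cell can be sketched with scalars as below. This is the textbook formulation, not code from the presented system; the parameter dictionary `p` and its gate names are assumptions for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a single-unit LSTM cell.

    p maps each gate name to a tuple (input weight, recurrent weight, bias).
    """
    def pre(name):
        w, u, b = p[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much new information to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value to write
    c = f * c_prev + i * g           # additive cell-state update
    h = o * math.tanh(c)
    return h, c
```

When the forget gate is close to 1 and the input gate close to 0, the cell state is carried forward almost unchanged, which is how the cell can preserve information over many time steps.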
Machine Learning Based Architecture to Enrich NL for SQL Conversion
Data Set Extraction
• Data extracted from RDBMS
• Apache Commons CSV library: used to extract the dataset in the form of a CSV file.
• Attributes that contain descriptive values (e.g., Experience, Description) are also provided as input.
• Three separate components work synchronously to extract
maximum latent information from the dataset, which can either
be used to enrich the natural language or be stored to use during
conversion.
Partial and Implied Values
• Pre-processing techniques
• Embedding Layer
• Long Short Term Memory
• Classification of Inputs
Machine Learning for Implied Data Values
Pre-processing techniques
Embedding Layer
LSTM Model
Proposed Model – Implied Data Values
Classification of Inputs
• The input natural language query is tokenized and split into different sequences.
• Sequences from 1 word (1-gram) up to n words (n-gram, where n is determined by the number of tokens) are considered for prediction.
• Only the largest sequences and their classifications are considered (i.e., sub-sequences are ignored).
The final, high-confidence classifications given by the LSTM model can be used in multiple ways, two of which are outlined below:
• Enrich the natural language query
• Store the data values and attribute names
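The span selection and enrichment described above might look roughly like this. The `predict` callable stands in for the trained LSTM classifier, and `TOY_MODEL`, its labels, its confidence values, and the 0.9 threshold are all made up for illustration.

```python
def spans_longest_first(tokens):
    """Yield (start, end) for every contiguous token span, longest first."""
    n = len(tokens)
    for size in range(n, 0, -1):
        for start in range(n - size + 1):
            yield start, start + size

def classify_spans(tokens, predict, threshold=0.9):
    """Keep the largest high-confidence spans and ignore their sub-sequences.

    predict(phrase) -> (label, confidence) is a hypothetical stand-in for
    the trained LSTM model.
    """
    accepted = []
    for start, end in spans_longest_first(tokens):
        if any(s <= start and end <= e for s, e, _ in accepted):
            continue  # contained in a span that was already accepted
        label, confidence = predict(" ".join(tokens[start:end]))
        if label is not None and confidence >= threshold:
            accepted.append((start, end, label))
    return sorted(accepted)

def enrich(tokens, spans):
    """Prefix each classified span with its attribute name.

    Assumes the spans are sorted and do not overlap.
    """
    out, pos = [], 0
    for start, end, label in spans:
        phrase = " ".join(tokens[start:end])
        out.extend(tokens[pos:start])
        out.append(f"{label} '{phrase}'")
        pos = end
    out.extend(tokens[pos:])
    return " ".join(out)

# Toy predictor with invented labels and confidences.
TOY_MODEL = {
    "new south wales": ("StateProvince", 0.97),
    "australia": ("CountryRegion", 0.95),
    "south": ("Direction", 0.95),
}

def toy_predict(phrase):
    return TOY_MODEL.get(phrase, (None, 0.0))

tokens = "get the orders from new south wales australia".split()
spans = classify_spans(tokens, toy_predict)
enriched = enrich(tokens, spans)
```

Because longer spans are tried first, the 1-gram `south` is never classified on its own once `new south wales` has been accepted.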
Elastic Search – Descriptive Values
Elastic Search
Components of Elastic Search
1. Analyzers
Stop Analyzer: discards the stop words.
Ex:
Input: Get the doctors with masters degree
Analyzer: Get doctors masters degree
English Language Analyzer: converts the words of the input query to their root words.
Ex:
Input: Show all products which are road bikes.
Analyzer: Show all product which road bike
• The extracted CSV file is used to create an index in Elastic Search.
• Elastic Search's Bulk API provides the functions needed to index large amounts of data in a single request.
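As a rough illustration of the indexing step, the sketch below converts CSV text into the newline-delimited JSON body that the Bulk API accepts: one action line followed by one document line per row. The function name, the index name, and the sample columns are assumptions; sending the body (a POST to the `_bulk` endpoint with content type `application/x-ndjson`) is left out.

```python
import csv
import io
import json

def bulk_payload(csv_text, index_name):
    """Convert CSV rows into a newline-delimited JSON body for the bulk API."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Action line: index the following document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Document line: the CSV row as a JSON object keyed by column name.
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"  # the body must end with a newline

# Hypothetical sample data and index name.
sample_csv = "ProductName,Description\nRoad Tire Tube,Conventional all-purpose tube.\n"
payload = bulk_payload(sample_csv, "products")
```

Batching rows this way avoids one HTTP round trip per document when the extracted dataset is large.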
Proposed Model – Descriptive Values
Components of Elastic Search
2. Searching through multiple attributes
Multiple columns can be searched in Elastic Search by using the "multi_match" keyword:
{ "query":
  { "multi_match":
    { "query": <input query>,
      "fields": [<list of descriptive column names>]
    }
  }
}
3. Generation of suitable fieldname-value pairs in the WHERE clause
WHERE fieldname1 = value1 AND fieldname2 = value2 AND ... fieldnameN = valueN
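The two steps above can be sketched as a pair of small helpers: one builds the `multi_match` request body, the other turns matched fieldname-value pairs into a WHERE clause. Both function names are hypothetical, and how the matched values are read back from the search hits is not covered by the slides, so it is omitted here.

```python
def multi_match_query(user_query, descriptive_columns):
    """Build a multi_match request body that searches several columns at once."""
    return {
        "query": {
            "multi_match": {
                "query": user_query,
                "fields": list(descriptive_columns),
            }
        }
    }

def where_clause(pairs):
    """Join (fieldname, value) pairs with AND, doubling single quotes in values."""
    conditions = []
    for field, value in pairs:
        escaped = str(value).replace("'", "''")
        conditions.append(f"{field} = '{escaped}'")
    return "WHERE " + " AND ".join(conditions)

body = multi_match_query("mountain wheel for entry-level rider", ["Description"])
clause = where_clause(
    [("t_prds.Description", "Replacement mountain wheel for entry-level rider.")]
)
```

Quote doubling is only a minimal safeguard for the illustration; a production system would more likely pass the values as bound parameters.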
HPCC Systems Platform
Key Factors of HPCC Systems
Platform
Go-to solution: synchronous combination of a hybrid machine learning model, Elastic Search, WordNet, and the HPCC Systems platform
• Highly integrated system environment
- capabilities from raw data processing to high-performance queries and data analysis using a common language
• Optimized cluster approach
- provides high performance at a much lower system cost than other system alternatives
• Stable and reliable processing environment proven in production applications for varied organizations over a 15-year period
• Innovative data-centric programming language (ECL)
• High level of fault resilience and capabilities
• Suitable for a wide range of data-intensive applications
Results
Input natural language query: show all unmarried customers who are men
Enriched natural language query: show all single Gender 'male' customers
Output SQL query: SELECT * FROM t_cstmrs WHERE LOWER( MaritalStatus ) = 'single' AND LOWER( Gender ) = 'male'

Input natural language query: Names of customers who have graduated and from germany or france
Enriched natural language query: FullName Names of customers who have Education 'graduate degree' and from CountryRegion 'germany' or CountryRegion 'france'
Output SQL query: SELECT t_cstmrs.FullName FROM t_cstmrs INNER JOIN t_ggrphy ON t_ggrphy.GeographyKey = t_cstmrs.GeographyKey WHERE ( LOWER( t_ggrphy.CountryRegion ) = 'germany' OR LOWER( t_ggrphy.CountryRegion ) = 'france' ) AND ( LOWER( t_cstmrs.Education ) = 'graduate degree' )
Input natural language query: get the price of red or dark helmet
Enriched natural language query: get the price of Color 'red' or Color 'black' ProductSubCategoryName 'helmet'
Output SQL query: SELECT ListPrice , Color FROM t_prdsubcat INNER JOIN t_prds ON t_prdsubcat.ProductSubCategoryKey = t_prds.ProductSubCategoryKey WHERE LOWER( Color ) = 'red' OR LOWER( Color ) = 'black'

Input natural language query: how much does tire tube cost
Enriched natural language query: how much does ProductName 'road tire tube' cost
Output SQL query: SELECT ListPrice , ProductName FROM t_prds WHERE LOWER( ProductName ) = 'road tire tube'

Input natural language query: get the orders from new south wales australia
Enriched natural language query: get the orders from StateProvince 'new south wales' CountryRegion 'australia'
Output SQL query: SELECT t_saldtls.OrderQuantity, t_ggrphy.CountryRegion, t_cstmrs.FullName , t_ggrphy.StateProvince FROM t_ggrphy INNER JOIN t_cstmrs ON t_cstmrs.GeographyKey = t_ggrphy.GeographyKey INNER JOIN t_saldtls ON t_cstmrs.CustomerKey = t_saldtls.CustomerKey WHERE LOWER( t_cstmrs.StateProvince) = 'new south wales' AND LOWER( t_ggrphy.CountryRegion ) = 'australia'

Input natural language query: show subtotal of orders for helmet
Enriched natural language query: show subtotal of orders for ProductSubCategoryName 'helmet'
Output SQL query: SELECT SUM( t_saldtls.SalesOrderint ) FROM t_prds INNER JOIN t_saldtls ON t_prds.ProductKey = t_saldtls.ProductKey WHERE LOWER( t_prds.ProductName ) = 'helmet'
Results – Descriptive values
Input natural language query: Select an item with mountain wheel for entry-level rider.
Output SQL query: SELECT * FROM t_prds WHERE t_prds.Description = 'Replacement mountain wheel for entry-level rider.'

Input natural language query: Name the items which have pioneering frame technology as the HQ steel frame.
Output SQL query: SELECT t_prds.ProductName FROM t_prds WHERE t_prds.Description = 'The same pioneering frame technology is used to give you the highest value as the HQ steel frame.'
Conclusion
• Partial and implied data values in the natural language queries are identified by a trained hybrid
ML model.
• WordNet is also used as a safety net to understand implied data values where the vocabulary of
the input relational database is not expressive.
• Descriptive values are identified with the help of Elastic Search.
• The accuracy of the system is 91.7% on the IMDb database.
Acknowledgements
Students of RVCE
1. Shubham Phal
2. Yatish H R
3. Tanmay Hukkeri
4. Akshar Prasad
5. Sourabh S Badhya
6. Yashwanth YS
7. Shetty Rohan
RV College of
Engineering
29
Go, change the world

Más contenido relacionado

La actualidad más candente

Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionRrubaa Panchendrarajan
 
NLP Project Presentation
NLP Project PresentationNLP Project Presentation
NLP Project PresentationAryak Sengupta
 
Machine Learning vs. Deep Learning
Machine Learning vs. Deep LearningMachine Learning vs. Deep Learning
Machine Learning vs. Deep LearningBelatrix Software
 
Natural language processing
Natural language processingNatural language processing
Natural language processingHansi Thenuwara
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining ConceptsDung Nguyen
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Saeedeh Shekarpour
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersYoung Seok Kim
 
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented GenerationDataScienceConferenc1
 
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdfRetrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdfPo-Chuan Chen
 
Text summarization
Text summarizationText summarization
Text summarizationkareemhashem
 
Introduction to text classification using naive bayes
Introduction to text classification using naive bayesIntroduction to text classification using naive bayes
Introduction to text classification using naive bayesDhwaj Raj
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdfcaa28steve
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
 
Learning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesLearning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesUKM university
 
Natural language processing
Natural language processingNatural language processing
Natural language processingAbash shah
 
Nlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesNlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesankit_ppt
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfDavid Rostcheck
 
Machine learning || Introduction || Main Components || Examples || Techniques
Machine learning || Introduction || Main Components || Examples || TechniquesMachine learning || Introduction || Main Components || Examples || Techniques
Machine learning || Introduction || Main Components || Examples || TechniquesSamra Shahzadi
 

La actualidad más candente (20)

Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity Recognition
 
NLP Project Presentation
NLP Project PresentationNLP Project Presentation
NLP Project Presentation
 
Machine Learning vs. Deep Learning
Machine Learning vs. Deep LearningMachine Learning vs. Deep Learning
Machine Learning vs. Deep Learning
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
 
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
 
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdfRetrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
 
Text summarization
Text summarizationText summarization
Text summarization
 
Introduction to text classification using naive bayes
Introduction to text classification using naive bayesIntroduction to text classification using naive bayes
Introduction to text classification using naive bayes
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdf
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
 
Learning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesLearning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniques
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Nlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesNlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniques
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
 
Machine learning || Introduction || Main Components || Examples || Techniques
Machine learning || Introduction || Main Components || Examples || TechniquesMachine learning || Introduction || Main Components || Examples || Techniques
Machine learning || Introduction || Main Components || Examples || Techniques
 

Similar a Natural Language to SQL Query conversion using Machine Learning Techniques on HPCC Systems

Intelligent query converter a domain independent interfacefor conversion
Intelligent query converter a domain independent interfacefor conversionIntelligent query converter a domain independent interfacefor conversion
Intelligent query converter a domain independent interfacefor conversionIAEME Publication
 
Expressive Querying of Semantic Databases with Incremental Query Rewriting
Expressive Querying of Semantic Databases with Incremental Query RewritingExpressive Querying of Semantic Databases with Incremental Query Rewriting
Expressive Querying of Semantic Databases with Incremental Query RewritingAlexandre Riazanov
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital.AI
 
Ryan-Symposium-v5
Ryan-Symposium-v5Ryan-Symposium-v5
Ryan-Symposium-v5Kevin Ryan
 
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)Bob Ward
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia VoulibasiISSEL
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...eswcsummerschool
 
U-SQL - Azure Data Lake Analytics for Developers
U-SQL - Azure Data Lake Analytics for DevelopersU-SQL - Azure Data Lake Analytics for Developers
U-SQL - Azure Data Lake Analytics for DevelopersMichael Rys
 
Introduction to SoapUI day 1
Introduction to SoapUI day 1Introduction to SoapUI day 1
Introduction to SoapUI day 1Qualitest
 
Soap UI - Getting started
Soap UI - Getting startedSoap UI - Getting started
Soap UI - Getting startedQualitest
 
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilNLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilDatabricks
 
microsoft r server for distributed computing
microsoft r server for distributed computingmicrosoft r server for distributed computing
microsoft r server for distributed computingBAINIDA
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Herman Wu
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Databricks
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNico Ludwig
 

Similar a Natural Language to SQL Query conversion using Machine Learning Techniques on HPCC Systems (20)

Intelligent query converter a domain independent interfacefor conversion
Intelligent query converter a domain independent interfacefor conversionIntelligent query converter a domain independent interfacefor conversion
Intelligent query converter a domain independent interfacefor conversion
 
Expressive Querying of Semantic Databases with Incremental Query Rewriting
Expressive Querying of Semantic Databases with Incremental Query RewritingExpressive Querying of Semantic Databases with Incremental Query Rewriting
Expressive Querying of Semantic Databases with Incremental Query Rewriting
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
 
Ryan-Symposium-v5
Ryan-Symposium-v5Ryan-Symposium-v5
Ryan-Symposium-v5
 
Clean architecture
Clean architectureClean architecture
Clean architecture
 
Database part2-
Database part2-Database part2-
Database part2-
 
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
 
Resume database programmer oracle-2018
Resume database programmer oracle-2018Resume database programmer oracle-2018
Resume database programmer oracle-2018
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
 
U-SQL - Azure Data Lake Analytics for Developers
U-SQL - Azure Data Lake Analytics for DevelopersU-SQL - Azure Data Lake Analytics for Developers
U-SQL - Azure Data Lake Analytics for Developers
 
Introduction to SoapUI day 1
Introduction to SoapUI day 1Introduction to SoapUI day 1
Introduction to SoapUI day 1
 
Soap UI - Getting started
Soap UI - Getting startedSoap UI - Getting started
Soap UI - Getting started
 
Santosh_Nayak_CV
Santosh_Nayak_CVSantosh_Nayak_CV
Santosh_Nayak_CV
 
oodb.ppt
oodb.pptoodb.ppt
oodb.ppt
 
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilNLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
 
microsoft r server for distributed computing
microsoft r server for distributed computingmicrosoft r server for distributed computing
microsoft r server for distributed computing
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_iv
 

Más de HPCC Systems

Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsHPCC Systems
 
Towards Trustable AI for Complex Systems
Towards Trustable AI for Complex SystemsTowards Trustable AI for Complex Systems
Towards Trustable AI for Complex SystemsHPCC Systems
 
Closing / Adjourn
Closing / Adjourn Closing / Adjourn
Closing / Adjourn HPCC Systems
 
Community Website: Virtual Ribbon Cutting
Community Website: Virtual Ribbon CuttingCommunity Website: Virtual Ribbon Cutting
Community Website: Virtual Ribbon CuttingHPCC Systems
 
Release Cycle Changes
Release Cycle ChangesRelease Cycle Changes
Release Cycle ChangesHPCC Systems
 
Geohashing with Uber’s H3 Geospatial Index
Geohashing with Uber’s H3 Geospatial Index Geohashing with Uber’s H3 Geospatial Index
Geohashing with Uber’s H3 Geospatial Index HPCC Systems
 
Advancements in HPCC Systems Machine Learning
Advancements in HPCC Systems Machine LearningAdvancements in HPCC Systems Machine Learning
Advancements in HPCC Systems Machine LearningHPCC Systems
 
Expanding HPCC Systems Deep Neural Network Capabilities
Expanding HPCC Systems Deep Neural Network CapabilitiesExpanding HPCC Systems Deep Neural Network Capabilities
Expanding HPCC Systems Deep Neural Network CapabilitiesHPCC Systems
 
Leveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC SystemsLeveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC SystemsHPCC Systems
 
DataPatterns - Profiling in ECL Watch
DataPatterns - Profiling in ECL Watch DataPatterns - Profiling in ECL Watch
DataPatterns - Profiling in ECL Watch HPCC Systems
 
Leveraging the Spark-HPCC Ecosystem
Leveraging the Spark-HPCC Ecosystem Leveraging the Spark-HPCC Ecosystem
Leveraging the Spark-HPCC Ecosystem HPCC Systems
 
Work Unit Analysis Tool
Work Unit Analysis ToolWork Unit Analysis Tool
Work Unit Analysis ToolHPCC Systems
 
Community Award Ceremony
Community Award Ceremony Community Award Ceremony
Community Award Ceremony HPCC Systems
 
Dapper Tool - A Bundle to Make your ECL Neater
Dapper Tool - A Bundle to Make your ECL NeaterDapper Tool - A Bundle to Make your ECL Neater
Dapper Tool - A Bundle to Make your ECL NeaterHPCC Systems
 
A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...
A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...
A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...HPCC Systems
 
Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...
Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...
Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...HPCC Systems
 
Using High Dimensional Representation of Words (CBOW) to Find Domain Based Co...
Using High Dimensional Representation of Words (CBOW) to Find Domain Based Co...Using High Dimensional Representation of Words (CBOW) to Find Domain Based Co...
Using High Dimensional Representation of Words (CBOW) to Find Domain Based Co...HPCC Systems
 

Más de HPCC Systems (20)

Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
 
Towards Trustable AI for Complex Systems
Towards Trustable AI for Complex SystemsTowards Trustable AI for Complex Systems
Towards Trustable AI for Complex Systems
 
Welcome
WelcomeWelcome
Welcome
 
Closing / Adjourn
Closing / Adjourn Closing / Adjourn
Closing / Adjourn
 
Community Website: Virtual Ribbon Cutting
Community Website: Virtual Ribbon CuttingCommunity Website: Virtual Ribbon Cutting
Community Website: Virtual Ribbon Cutting
 
Path to 8.0
Path to 8.0 Path to 8.0
Path to 8.0
 
Release Cycle Changes
Release Cycle ChangesRelease Cycle Changes
Release Cycle Changes
 
Geohashing with Uber’s H3 Geospatial Index
Geohashing with Uber’s H3 Geospatial Index Geohashing with Uber’s H3 Geospatial Index
Geohashing with Uber’s H3 Geospatial Index
 
Advancements in HPCC Systems Machine Learning
Advancements in HPCC Systems Machine LearningAdvancements in HPCC Systems Machine Learning
Advancements in HPCC Systems Machine Learning
 
Docker Support
Docker Support Docker Support
Docker Support
 
Natural Language to SQL Query conversion using Machine Learning Techniques on HPCC Systems

  • 1. RV College of Engineering. Go, change the world. Dr. G. Shobha, Professor, CSE Department, RV College of Engineering, Bengaluru - 59. Natural Language to SQL Query Conversion Using Machine Learning Techniques on HPCC Systems Platform
  • 2. Presentation contents: Introduction and Motivation; Components Involved in NLP for NL to SQL Conversion; Rule Based Architecture for NL to SQL Conversion; Machine Learning Based Architecture to Enrich NL for SQL Conversion; HPCC Systems Architecture; Results & Conclusions.
  • 3. Introduction and Motivation. Key factors of NL to SQL: Databases underpin most systems today. Structured Query Language (SQL) is used to access and manipulate the data stored in a relational database. Most end users have limited knowledge of SQL and thus face difficulties in accessing such data. Access to the data is critical, yet users must learn the query language and understand its syntax.
  • 4. Components Involved in NLP for NL to SQL. Components of NLP. NLP: the part of computer science and artificial intelligence that deals with human languages.
  • 5. Rule Based Architecture for NL to SQL Conversion
  • 6. Rule Based Architecture for NL to SQL Conversion. Preprocessor: tokenizes the natural language input; removes the redundant tokens; its output is duplicated and supplied to two major components, the Entity Recognizer and the Intent Recognizer. The Entity Recognizer consists of an entity extractor, a classifier and a filter.
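The preprocessor on slide 6 can be pictured with a minimal sketch. The list of redundant tokens and the regular expression used for tokenizing are assumptions made for illustration; the slides do not say which tokens the system treats as redundant.

```python
import re

# Hypothetical list of redundant tokens; the deck does not specify one.
REDUNDANT_TOKENS = {"the", "a", "an", "of", "all", "who", "are", "which", "is"}

def preprocess(query: str) -> list[str]:
    """Lowercase and tokenize the query, then drop redundant tokens."""
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    return [t for t in tokens if t not in REDUNDANT_TOKENS]

tokens = preprocess("Show all unmarried customers who are men")

# The output is duplicated so each recognizer works on its own copy.
entity_input = list(tokens)
intent_input = list(tokens)
```

Copying the token list keeps the two recognizers independent, so a change made by one cannot leak into the other.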
  • 7. Rule Based Architecture for NL to SQL Conversion. Entity Extractor: uses part-of-speech tagging and a date parser to extract important keywords from the sentence, namely those that are strong candidates for relation names, attribute names or data. These are then fed into a classifier along with the user-defined schema mappings of relation names and attribute names. Classifier: uses various checks such as direct, concatenation, n-gram, hypernym and synonym checks to separate the keywords into relation names, attribute names and residual keywords. Filter: the residual words are filtered to extract the words that form part of the data items of the SQL query.
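A rough sketch of the classifier step might look like the following. It covers only the direct and concatenation checks; the n-gram, hypernym and synonym checks are omitted. The schema mappings are hypothetical, with table and column names borrowed from the results slides.

```python
# Hypothetical user-defined schema mappings: surface form -> schema name.
RELATIONS = {"customers": "t_cstmrs", "products": "t_prds"}
ATTRIBUTES = {"gender": "Gender", "maritalstatus": "MaritalStatus", "fullname": "FullName"}

def lookup(term):
    """Return (kind, schema name) if the term is in the schema mappings."""
    if term in RELATIONS:
        return "relation", RELATIONS[term]
    if term in ATTRIBUTES:
        return "attribute", ATTRIBUTES[term]
    return None

def classify(keywords):
    """Split keywords into relation names, attribute names and residual words."""
    result = {"relation": [], "attribute": [], "residual": []}
    i = 0
    while i < len(keywords):
        # Concatenation check: join two neighbouring keywords ("marital" + "status").
        joined = lookup("".join(keywords[i:i + 2])) if i + 1 < len(keywords) else None
        # Direct check: the keyword itself is a known name.
        direct = lookup(keywords[i])
        if joined:
            kind, name = joined
            result[kind].append(name)
            i += 2
        elif direct:
            kind, name = direct
            result[kind].append(name)
            i += 1
        else:
            result["residual"].append(keywords[i])
            i += 1
    return result

classified = classify(["customers", "marital", "status", "single"])
```

Here "single" ends up in the residual list, which is what the filter would then inspect for data items.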
  • 8. Rule Based Architecture for NL to SQL Conversion. Intent Recognizer: creates a template of the SQL query by performing checks for each SQL clause. Various techniques such as context identification, distance metrics, keyword spotting and grammar rules are applied to check for the existence of a particular clause.
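Of the techniques listed for the intent recognizer, keyword spotting is the simplest to illustrate. The sketch below builds a query template by checking each clause against a set of cue words. The cue words and the placeholder slots are assumptions, not the rules used by the authors.

```python
# Hypothetical cue words per SQL clause, checked in clause order.
CLAUSE_CHECKS = [
    ("WHERE", {"who", "whose", "with", "from", "where"}, "<conditions>"),
    ("GROUP BY", {"each", "per"}, "<grouping attributes>"),
    ("ORDER BY", {"sorted", "ordered", "ascending", "descending"}, "<sort attributes>"),
]

def build_template(tokens):
    """Create a SQL template containing only the clauses whose cues were spotted."""
    present = set(tokens)
    parts = ["SELECT <attributes>", "FROM <relations>"]
    for clause, cues, slot in CLAUSE_CHECKS:
        if present & cues:
            parts.append(f"{clause} {slot}")
    return " ".join(parts)

template = build_template("names of customers from germany sorted by name".split())
```

The slots would later be filled with the relation names, attribute names and data items found by the entity recognizer.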
  • 9. Rule Based Architecture for NL to SQL Conversion. Challenges faced: specific schema; identification of partial or implied data values; identification of descriptive values. Go-to solution: Machine Learning Techniques for NL to SQL.
  • 10. Technologies Involved in Machine Learning for NLP to SQL. Feedforward neural networks. Recurrent Neural Networks (RNNs): networks with feedback loops (recurrent edges); the output at the current time step depends on the current input as well as the previous state (via recurrent edges). Training RNNs: the problem is that they can't capture long-term dependencies due to vanishing/exploding gradients during backpropagation.
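The recurrence described on slide 10 can be sketched in plain Python. This is a generic Elman-style step with made-up toy weights, not the network used in the project.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))            # current input
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))  # recurrent edge
        h.append(math.tanh(total))
    return h

# Toy weights chosen only for illustration.
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

h = [0.0, 0.0]
states = []
for x_t in ([1.0], [0.0], [0.0]):  # a single non-zero input, then silence
    h = rnn_step(x_t, h, W_xh, W_hh, b)
    states.append(h)
```

With these small recurrent weights the trace of the first input shrinks at every step. This gives an intuition for why information from distant time steps is hard to retain, although the vanishing-gradient problem itself concerns the backward pass.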
  • 11. Technologies Involved in ML for NLP to SQL. Go-to solution: the Long Short Term Memory (LSTM) model, a type of RNN architecture that addresses the vanishing/exploding gradient problem and allows learning of long-term dependencies. It has recently risen to prominence with state-of-the-art performance in speech recognition, language modeling, translation and image captioning.
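For reference, the commonly used LSTM cell can be written as follows. The slides do not state which LSTM variant was used, so this standard formulation is an assumption. Here $x_t$ is the input, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid and $\odot$ element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The additive update of $c_t$ is what lets information and gradients pass across many time steps when the forget gate $f_t$ stays close to 1.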
  • 12. Technologies Involved in Machine Learning for NLP to SQL
  • 13. Machine Learning Based Architecture to Enrich NL for SQL Conversion
  • 14. Data Set Extraction: Data is extracted from the RDBMS. The Apache Commons CSV library is used to extract the dataset in the form of a CSV file. Attributes that contain descriptive values (e.g. Experience, Description) are also provided as input. Three separate components work synchronously to extract maximum latent information from the dataset, which can either be used to enrich the natural language or be stored for use during conversion. Machine Learning for Implied Data Values. Partial and implied values: pre-processing techniques; embedding layer; Long Short Term Memory; classification of inputs.
  • 15. Machine Learning for Implied Data Values: Pre-processing techniques
  • 16. Machine Learning for Implied Data Values: Embedding Layer
  • 17. Machine Learning for Implied Data Values: LSTM Model
  • 18. Proposed Model – Implied Data Values. Classification of inputs: The input natural language query is tokenized and split into different sequences. Sequences from 1 word (1-gram) up to n words (n-gram, where n is determined by the number of tokens) are considered for prediction. Only the largest sequences and their classifications are kept (i.e., sub-sequences are ignored). The final, high-confidence classifications given by the LSTM model can be used in multiple ways, a couple of which are: enrich the natural language query; store the data values and attribute names.
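The sequence handling on slide 18 can be sketched as below. A hypothetical dictionary lookup (`KNOWN_VALUES`) stands in for the trained LSTM classifier, and the function names are illustrative. The sketch generates every n-gram, longest first, keeps the largest classified sequences, ignores their sub-sequences, and then enriches the query.

```python
# Stand-in for the LSTM model: hypothetical data value -> attribute name lookup.
KNOWN_VALUES = {
    "new south wales": "StateProvince",
    "wales": "CountryRegion",
    "australia": "CountryRegion",
}

def ngrams_longest_first(tokens):
    """Yield (start, end, text) for every contiguous sequence, largest n first."""
    n = len(tokens)
    for size in range(n, 0, -1):
        for start in range(n - size + 1):
            yield start, start + size, " ".join(tokens[start:start + size])

def classify_sequences(tokens):
    """Keep the largest classified sequences; skip anything they already cover."""
    covered = [False] * len(tokens)
    spans = []
    for start, end, text in ngrams_longest_first(tokens):
        label = KNOWN_VALUES.get(text)
        if label is None or any(covered[start:end]):
            continue
        spans.append((start, end, label))
        covered[start:end] = [True] * (end - start)
    return sorted(spans)

def enrich(query):
    """Insert the predicted attribute name before each recognized data value."""
    tokens = query.lower().split()
    out, pos = [], 0
    for start, end, label in classify_sequences(tokens):
        out.extend(tokens[pos:start])
        out.append(f"{label} '{' '.join(tokens[start:end])}'")
        pos = end
    out.extend(tokens[pos:])
    return " ".join(out)

enriched = enrich("get the orders from new south wales australia")
```

Because "new south wales" is accepted first, the shorter match "wales" inside it is ignored. The enriched query matches the form shown on the results slides.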
  • 19. Elastic Search – Descriptive Values. Components of Elastic Search: 1. Analyzers. The extracted CSV file is used to create an index in Elastic Search. Elastic Search's Bulk API provides the necessary functions to create and store large amounts of data at once. Stop Analyzer: discards the stop words. Ex: Input: Get the doctors with masters degree. Analyzer: Get doctors masters degree. English Language Analyzer: converts the words of the input query to their root words. Ex: Input: Show all products which are red bikes. Analyzer: Show all product which road bike.
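As a sketch of how the extracted CSV file could be turned into a Bulk API request, the snippet below builds the newline-delimited JSON body that the `_bulk` endpoint expects: one action line followed by one document line per row. The index name `products` and the sample rows are assumptions, and sending the request is left out.

```python
import csv
import io
import json

def bulk_body(csv_text, index_name):
    """Build an NDJSON Bulk API body: an action line, then the document, per row."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(row))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical sample of the extracted dataset.
sample_csv = (
    "ProductName,Description\n"
    "Road Tire Tube,Conventional all-purpose tube.\n"
    "Mountain Wheel,Replacement mountain wheel for entry-level rider.\n"
)

body = bulk_body(sample_csv, "products")
```

The resulting string would be sent to the `_bulk` endpoint with the content type `application/x-ndjson`.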
  • 20. Proposed Model – Descriptive Values. Components of Elastic Search: 2. Searching through multiple attributes: multiple columns can be searched in Elastic Search by using the "multi_match" keyword: { "query": { "multi_match": { "query": input query, "fields": [list of descriptive column names] } } }. 3. Generation of suitable fieldname-value pairs in the WHERE clause: WHERE fieldname1 = value1 AND fieldname2 = value2 AND … AND fieldnameN = valueN
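The two steps on slide 20 can be sketched together: building the `multi_match` request body and turning matched fieldname-value pairs into a WHERE clause. The function names are illustrative, and the matched pair is taken from the descriptive-values results slide. How the best hit is selected from the search response is not covered here.

```python
import json

def multi_match_query(user_query, descriptive_columns):
    """Request body that searches the user's query across several columns."""
    return {
        "query": {
            "multi_match": {
                "query": user_query,
                "fields": list(descriptive_columns),
            }
        }
    }

def where_clause(pairs):
    """Join fieldname-value pairs with AND, doubling single quotes in values."""
    conditions = []
    for field, value in pairs:
        escaped = value.replace("'", "''")
        conditions.append(f"{field} = '{escaped}'")
    return "WHERE " + " AND ".join(conditions)

request_body = json.dumps(
    multi_match_query("mountain wheel for entry-level rider", ["Description"])
)
clause = where_clause(
    [("t_prds.Description", "Replacement mountain wheel for entry-level rider.")]
)
```

In production code a parameterized query would be safer than string concatenation; the string form is used here only to mirror the WHERE pattern on the slide.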
  • 21. Proposed Model – Descriptive Values
  • 22. HPCC Systems Platform. Go-to solution: a synchronous combination of the hybrid machine learning model, Elastic Search, WordNet and the HPCC Systems Platform. Key factors of the HPCC Systems Platform: a highly integrated system environment, with capabilities from raw data processing to high-performance queries and data analysis using a common language; an optimized cluster approach, which provides high performance at a much lower system cost than other system alternatives; a stable and reliable processing environment proven in production applications for varied organizations over a 15-year period; an innovative data-centric programming language (ECL); a high level of fault resilience and capabilities; suitable for a wide range of data-intensive
  • 23. Introduction and Motivation
  • 24. Results (each entry gives the input natural language query, the enriched natural language query and the output SQL query)
      Input: show all unmarried customers who are men
      Enriched: show all single Gender 'male' customers
      SQL: SELECT * FROM t_cstmrs WHERE LOWER( MaritalStatus ) = 'single' AND LOWER( Gender ) = 'male'
      Input: Names of customers who have graduated and from germany or france
      Enriched: FullName Names of customers who have Education 'graduate degree' and from CountryRegion 'germany' or CountryRegion 'france'
      SQL: SELECT t_cstmrs.FullName FROM t_cstmrs INNER JOIN t_ggrphy ON t_ggrphy.GeographyKey = t_cstmrs.GeographyKey WHERE ( LOWER( t_ggrphy.CountryRegion ) = 'germany' OR LOWER( t_ggrphy.CountryRegion ) = 'france' ) AND ( LOWER( t_cstmrs.Education ) = 'graduate degree' )
  • 25. Results (continued)
      Input: get the price of red or dark helmet
      Enriched: get the price of Color 'red' or Color 'black' ProductSubCategoryName 'helmet'
      SQL: SELECT ListPrice , Color FROM t_prdsubcat INNER JOIN t_prds ON t_prdsubcat.ProductSubCategoryKey = t_prds.ProductSubCategoryKey WHERE LOWER( Color ) = 'red' OR LOWER( Color ) = 'black'
      Input: how much does tire tube cost
      Enriched: how much does ProductName 'road tire tube' cost
      SQL: SELECT ListPrice , ProductName FROM t_prds WHERE LOWER( ProductName ) = 'road tire tube'
      Input: get the orders from new south wales australia
      Enriched: get the orders from StateProvince 'new south wales' CountryRegion 'australia'
      SQL: SELECT t_saldtls.OrderQuantity, t_ggrphy.CountryRegion, t_cstmrs.FullName , t_ggrphy.StateProvince FROM t_ggrphy INNER JOIN t_cstmrs ON t_cstmrs.GeographyKey = t_ggrphy.GeographyKey INNER JOIN t_saldtls ON t_cstmrs.CustomerKey = t_saldtls.CustomerKey WHERE LOWER( t_cstmrs.StateProvince ) = 'new south wales' AND LOWER( t_ggrphy.CountryRegion ) = 'australia'
      Input: show subtotal of orders for helmet
      Enriched: show subtotal of orders for ProductSubCategoryName 'helmet'
      SQL: SELECT SUM( t_saldtls.SalesOrderint ) FROM t_prds INNER JOIN t_saldtls ON t_prds.ProductKey = t_saldtls.ProductKey WHERE LOWER( t_prds.ProductName ) = 'helmet'
  • 26. Results – Descriptive values
      Input: Select an item with mountain wheel for entry-level rider.
      SQL: SELECT * FROM t_prds WHERE t_prds.Description = 'Replacement mountain wheel for entry-level rider.'
      Input: Name the items which have pioneering frame technology as the HQ steel frame.
      SQL: SELECT t_prds.ProductName FROM t_prds WHERE t_prds.Description = 'The same pioneering frame technology is used to give you the highest value as the HQ steel frame.'
  • 27. Conclusion: Partial and implied data values in the natural language queries are identified by a trained hybrid ML model. WordNet is also used as a safety net to understand implied data values where the vocabulary of the input relational database is not expressive. Descriptive values are identified with the help of Elastic Search. The accuracy of the system is 91.7% on the IMDb database.
  • 28. Acknowledgements: Students of RVCE: 1. Shubham Phal 2. Yatish H R 3. Tanmay Hukkeri 4. Akshar Prasad 5. Sourabh S Badhya 6. Yashwanth YS 7. Shetty Rohan