2018 HPCC Systems Summit Community Day
Deep Content Learning in Traffic Prediction and Text Classification
Jingqing Zhang
Prof. Yike Guo
Data Science Institute
Imperial College London
Outline
• Imperial DSI
• Deep Content Learning
• Research Projects
– Traffic Prediction
– Zero-shot Text Classification
• TensorLayer
• HPCC Systems + TensorLayer
The Success of Deep Learning
Johnson, Justin, Andrej Karpathy, and Li Fei-Fei. "Densecap: Fully convolutional localization networks for dense captioning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
CV & NLP, Medical, Game
Deep Learning + Content Providers
Deep Learning + Content Providers → Deep Content Learning
Deep Content Learning
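[Figure: Environment → Perception → Reasoning → Decision Making, with Content Providers feeding Reasoning]
• Perception: machine learning and deep learning applied to data
• Reasoning: knowledge, logics and rules
• Decision Making: decision, suggestion
• Example: an image is perceived as “Dog”; a content provider supplies “Huskies usually have a thick double coat that can be gray, black, copper red, or white. Their eyes are typically pale blue, although they may also be brown, green, blue, yellow, or heterochromic.”; the decision is “Husky”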
Concrete Projects (Completed So Far)
• P1: Traffic Prediction
– Deep Sequence Learning with Auxiliary Information for Traffic Prediction, KDD 2018
• P2: Zero-shot Text Classification
– Integrating Semantic Knowledge to Tackle Zero-shot Text Classification, submitted for review
P1: Deep Sequence Learning with Auxiliary Information for Traffic Prediction
• How does online info affect traffic?
– Example: navigation to the Marriott Buckhead by map apps during the HPCC Systems Summit
• Spearman’s rank correlation coefficient between map-app queries and traffic speed: 𝜌 = −0.52, p-value = 1.23 × 10⁻⁴
Deep Sequence Learning with Auxiliary Information for Traffic Prediction, Binbing Liao, Jingqing Zhang, Chao Wu, Douglas McIlwraith, Tong Chen, Shengwen Yang, Yike Guo, and Fei Wu, KDD 2018
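A Spearman coefficient like the one quoted on this slide can be sketched in a few lines of plain Python: rank both series, then take the Pearson correlation of the ranks. The query counts and speeds below are hypothetical values invented for illustration, not data from the paper, and the p-value calculation is omitted.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical readings for one road segment (assumed, for illustration only).
queries = [12, 15, 40, 85, 130, 90, 30, 14]   # map-app queries per interval
speeds = [52, 50, 41, 30, 22, 33, 45, 51]     # mean speed in km/h
rho = spearman_rho(queries, speeds)
print(f"rho = {rho:.3f}")
```

A negative value means that intervals with more queries tend to have lower speeds, which is the direction of the relationship reported on the slide.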
Solution
[Figure: Environment → Perception → Reasoning → Decision Making]
• Perception: sequence learning on traffic data
• Reasoning: query impact, event discovery
• Decision Making: traffic prediction
Event Discovery in Query Records
• The events discovered by query records can correspond with real events.
Modelling
[Figure: encoder-decoder model built from LSTM cells.
Encoder: at each step 1..𝑡, the input 𝑣ᵢ is concatenated (Concat) with a Graph CNN over its neighbours 𝑁𝐵(𝑣ᵢ) and fed to an LSTM.
Encoder for Query Impact: an LSTM over 𝑄𝐼(𝑡+1), 𝑄𝐼(𝑡+2), …, 𝑄𝐼(𝑡+𝑡′).
Decoder: an LSTM running from <START> to <END>, with Concat of 𝐴𝑇(𝑣ₜ₊₁), …, 𝐴𝑇(𝑣ₜ₊ₜ′) and FC layers producing 𝑣ₜ₊₁, …, 𝑣ₜ₊ₜ′.
Module labels: Traffic, Perception, Sequence Learning, Reasoning, Query Impact, Decision Making.]
Result
• It is more challenging to predict traffic when events happen.
• The query impact is more informative and more closely related to real-time traffic.
• More information is available: https://github.com/JingqingZ/BaiduTraffic
P2: Integrating Semantic Knowledge to Tackle Zero-shot
Text Classification
• Zero-shot Learning: learn about a new category without a training instance
– Which is “Okapi”?
– a zebra-striped, four-legged animal with a brown torso and a deer-like face
Zero-shot Text Classification
[Figure: Environment → Perception → Reasoning → Decision Making. Traditional text classification uses text documents only; zero-shot text classification uses text documents plus knowledge.]
• Example: “Imperial College London is a public research university located in London.” → Education
Reasoning – Relationship Vectors
ConceptNet
Relationship vectors
– Find the relation between words and
classes without any training data
– Particular types of relations
– The length of the shortest path
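The shortest-path feature listed above can be illustrated with a breadth-first search over a small knowledge graph. The graph below is a hypothetical toy, not real ConceptNet data, and the function name is an assumption made for this sketch; how the project actually encodes path lengths into relationship vectors is not stated on the slide.

```python
from collections import deque


def shortest_path_length(graph, start, goal):
    """Number of edges on the shortest path from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical undirected toy graph (assumed, for illustration only).
toy_graph = {
    "university": ["education", "campus"],
    "education": ["university", "school"],
    "campus": ["university"],
    "school": ["education"],
    "goal": ["football"],
    "football": ["goal", "sport"],
    "sport": ["football"],
}

print(shortest_path_length(toy_graph, "campus", "education"))
print(shortest_path_length(toy_graph, "campus", "sport"))
```

A short path suggests that a word is closely related to a class label, and an unreachable pair gives no evidence of a relation. No labelled documents are needed to compute either.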
Reasoning – Topic Translation
• In the learning stage, there is no information about unseen classes.
• In the inference stage, the unseen classes are known (label, description), but there is still no training data.
• Can we infer what the documents from unseen classes would look like?
• Can we generate fake documents that look like real data from unseen classes?
[Figure: vector space with class 𝑐: Germany and word 𝑤: Berlin; for class 𝑐′: France, the translated word 𝑤′: ?]
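The Germany/Berlin to France/? analogy can be sketched with the familiar vector-offset approach: compute 𝑤 − 𝑐 + 𝑐′ and pick the nearest remaining word by cosine similarity. This is one common way to solve such analogies and is an assumption here; the slide does not say which analogy method the project uses. The three-dimensional vectors are made up for illustration.

```python
def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def translate_word(word, source_class, target_class, vectors):
    """Return the vocabulary word closest to w - c + c' (vector-offset analogy)."""
    w, c, c_new = vectors[word], vectors[source_class], vectors[target_class]
    query = [wi - ci + ti for wi, ci, ti in zip(w, c, c_new)]
    excluded = {word, source_class, target_class}
    candidates = [k for k in vectors if k not in excluded]
    return max(candidates, key=lambda k: cosine(query, vectors[k]))


# Hypothetical toy embeddings (assumed values, not trained vectors).
toy_vectors = {
    "Germany": [1.0, 0.0, 0.0],
    "France": [0.0, 1.0, 0.0],
    "Berlin": [1.0, 0.0, 1.0],
    "Paris": [0.0, 1.0, 1.0],
    "Lyon": [0.1, 0.9, 0.5],
    "Munich": [0.9, 0.1, 0.5],
}

print(translate_word("Berlin", "Germany", "France", toy_vectors))
```

Applying the same translation to every word of a document from a seen class is one way to picture how a document could be shifted towards an unseen class.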
Example of Translated Documents
Animal (Original): Mitra perdulca is a species of sea snail a marine gastropod mollusk in the family Mitridae the miters or miter snails.
Animal → Plant: Arecaceae perdulca is a flowering of port aster a naval mollusk gastropod in the fabaceae Clusiaceae the tiliaceae or rockery amaryllis.
Animal → Athlete: Mira perdulca is a swimmer of sailing sprinter an Olympian limpets gastropod in the basketball Middy the miters or miter skater.
• Not completely understandable, but the translated documents contain the tone of the target class.
Decision Making – Two-phase Inference
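• Phase 1, binary classification: is the document from a seen or an unseen class?
• Phase 2, fine-grained classification: which specific class is it?
• Example: “Plants, also called green plants, are multicellular eukaryotes of the kingdom Plantae.” → Seen / Unseen → Plant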
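The two-phase control flow can be sketched as follows: a first step decides whether a document belongs to a seen or an unseen class, and a second step picks the specific class within that group. The keyword tables and both classifiers are hypothetical stand-ins invented for this sketch; the project's actual models are neural and are not described on the slide.

```python
# Hypothetical keyword tables standing in for trained classifiers.
SEEN_KEYWORDS = {
    "Animal": {"snail", "mollusk"},
    "Athlete": {"sprinter", "olympian"},
}
UNSEEN_KEYWORDS = {
    "Plant": {"plants", "plantae"},
    "Company": {"founded", "headquartered"},
}


def tokens(doc):
    """Lower-case the document and split it into a set of words."""
    return set(doc.lower().replace(",", " ").replace(".", " ").split())


def looks_seen(doc):
    """Phase 1 stand-in: treat a document as 'seen' if any seen-class keyword occurs."""
    words = tokens(doc)
    return any(words & keywords for keywords in SEEN_KEYWORDS.values())


def best_class(doc, keyword_table):
    """Phase 2 stand-in: pick the class with the largest keyword overlap."""
    words = tokens(doc)
    return max(keyword_table, key=lambda label: len(words & keyword_table[label]))


def two_phase_predict(doc):
    """Run the binary phase first, then the fine-grained phase for that group."""
    if looks_seen(doc):
        return "seen", best_class(doc, SEEN_KEYWORDS)
    return "unseen", best_class(doc, UNSEEN_KEYWORDS)


doc = ("Plants, also called green plants, are multicellular eukaryotes "
       "of the kingdom Plantae.")
print(two_phase_predict(doc))
```

Splitting the decision this way lets the fine-grained step use a different model for each group, for example one that relies on semantic knowledge when no training documents exist.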
Result – Overall Performance
• The proposed two-phase inference with integrated semantic knowledge is a promising way to tackle the challenging task of zero-shot text classification.
• More information about this project will be released soon.
Implementation of Deep Learning
Gaps
• Abstraction gap: TensorFlow provides low-level APIs, while deep learning is expressed as high-level neural networks
• Performance gap: industry requires high performance
TensorLayer – What is TensorLayer?
• TensorLayer is a unique TensorFlow wrapper library that can
I. teach deep learning
II. help cutting-edge research
III. run in the real world
• From late 2016 to present, on GitHub:
– > 4000 Stars
– > 1000 Forks
– > 70 Contributors
HPCC Systems + TensorLayer
[Figure labels: HPCC Systems, Py3embed, TensorLayer (high-level wrapper), Horovod (distributed framework), TensorFlow, Server 1, Server 2]
• Data parallelism
• Synchronous distributed training
• GPU acceleration + CPU input pipeline
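Data parallelism with synchronous training can be illustrated without any framework: each worker holds a replica of the model and a shard of the data, computes a local gradient, and all workers apply the same averaged gradient. The sketch below is a plain-Python simulation under that assumption. It does not use the Horovod, TensorLayer or TensorFlow APIs, and `allreduce_mean` is a hypothetical stand-in for a real allreduce.

```python
def gradient(w, shard):
    """Gradient with respect to w of mean((w * x - y) ** 2) over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an allreduce: every worker receives the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def train_synchronously(data, num_workers, steps, lr):
    """Simulate synchronous data-parallel SGD on a one-parameter linear model."""
    shards = [data[i::num_workers] for i in range(num_workers)]  # split the data
    weights = [0.0] * num_workers  # one replica per worker, identical start
    for _ in range(steps):
        local_grads = [gradient(w, s) for w, s in zip(weights, shards)]
        averaged = allreduce_mean(local_grads)  # synchronisation point
        weights = [w - lr * g for w, g in zip(weights, averaged)]
    return weights


# Toy data for y = 3x (assumed, for illustration only).
data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
weights = train_synchronously(data, num_workers=2, steps=200, lr=0.01)
print(weights)
```

Because every replica starts from the same value and applies the same averaged gradient, the replicas stay identical after every step, which is what makes the training synchronous.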
HPCC Systems + TensorLayer
#GPU  Dataset   #Epoch  Batch Size  Time (s)  #Images  #Images/sec  Accuracy  GPU Mem (MB)
1     MNIST     50      512         135       2.5M     18.4K        0.98      ~315
2     MNIST     50      512         122       2.5M     20.4K        0.99      ~315
1     CIFAR-10  50      512         232       2.5M     10.6K        0.69      ~1435
2     CIFAR-10  50      512         221       2.5M     11.1K        0.71      ~1435
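The training times in the table can be turned into a speedup and a scaling efficiency for the two-GPU runs. The sketch below uses only the Time (s) column as reported; the helper names are assumptions made for this example.

```python
# Training time in seconds, copied from the Time (s) column of the table.
runs = {
    ("MNIST", 1): 135,
    ("MNIST", 2): 122,
    ("CIFAR-10", 1): 232,
    ("CIFAR-10", 2): 221,
}


def speedup(dataset, gpus):
    """Single-GPU time divided by the time on the given number of GPUs."""
    return runs[(dataset, 1)] / runs[(dataset, gpus)]


def efficiency(dataset, gpus):
    """Speedup divided by the GPU count; 1.0 would be perfect linear scaling."""
    return speedup(dataset, gpus) / gpus


for name in ("MNIST", "CIFAR-10"):
    print(name, round(speedup(name, 2), 2), round(efficiency(name, 2), 2))
```

On these numbers the second GPU shortens training by roughly 5 to 11 percent. The models are small enough that the gain from the extra GPU is limited, which suggests why larger models are a natural next test for distributed training.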
HPCC Systems + TensorLayer
• It is still too early to draw a conclusion.
• Future work
– Larger models to test distributed training, e.g. OpenPose.
– Closer integration of HPCC Systems and TensorLayer.
• https://github.com/tensorlayer/openpose-plus
• https://github.com/tensorlayer/tensorlayer/tree/master/examples/distributed_training
[Figure labels: Data Processing, Distributed Training, Deployment]
Summary
• Deep Content Learning: Environment → Perception → Reasoning → Decision Making
• HPCC Systems + TensorLayer: Data Processing, Distributed Training, Deployment
Q & A
Thanks
Jingqing Zhang
Prof. Yike Guo
Data Science Institute
Imperial College London
For more information, please visit
http://www.doc.ic.ac.uk/~jz9215/


Editor's notes

  1. Hello everyone. It is my great pleasure to celebrate this Community Day and to introduce research advances at the Data Science Institute, Imperial College London. I hope you will enjoy my talk.
  2. This is the outline of my talk. I will first introduce ourselves, the Imperial College Data Science Institute. I will then propose the idea of deep content learning, with two projects we have conducted so far. Next, I will introduce TensorLayer, which is a development tool for deep learning models. Finally, I would like to share some work we have done to integrate HPCC Systems with TensorLayer.
  3. The Data Science Institute at Imperial College London was launched in 2014. The DSI aims to enhance Imperial's excellence in data-driven research across its faculties. We therefore receive support from the faculties of engineering, medicine and natural sciences, as well as the business school. The DSI consists of seven parts: one hub and six labs. Each lab has its own focus, as you can see in this figure, and the Hub mainly focuses on data management, analysis and machine learning. I am doing my PhD at the DSI Hub, so my research focuses on machine learning, deep learning and their applications. As you may notice, [click to next page]
  4. Deep learning has achieved great success in many scenarios, including computer vision, natural language processing, medical imaging and game playing. In many tasks, for example object recognition in images, deep learning models perform even better than humans. Those tasks need to be well defined and, most importantly, a huge amount of data is necessary. However, tasks that require semantic understanding, inference or reasoning can be very challenging for deep learning models, for example question answering, chatbots and medical diagnosis. So current AI systems are still far from the ultimate goal of AI, which is that AI should be able to do what humans can do. [click to next page]
  5. The good news is that nowadays we are not only in the era of big data; we also have lots of content providers. Content providers are providers that can organise and provide knowledge in general or specific domains. The content they provide is also a kind of data, but the data should be better organised, structured and of high quality. [click] A good example is the content provided by Elsevier and LexisNexis. [click] Therefore, we believe the combination of deep learning with content will be essential in our future AI research. We call it Deep Content Learning. [click next page]
  6. We think that in Deep Content Learning there are at least three key modules: perception, reasoning and decision making. The perception module is the stage that extracts features and representations from data, which is what machine learning and deep learning were initially defined to do. The reasoning module should bring in additional knowledge from content providers to infer something related to the scenario. The final decision making module combines all the results and makes the right decision, driven by the utility. For example, [click] given an image of a dog, the perception module extracts the features of this dog, such as the coat colour and the eye colour. The reasoning module should find the knowledge that describes this specific kind of dog. And the decision making module should predict that this is a husky instead of just saying that it is a dog. [click next page]
  7. We have conducted two concrete projects under the idea of Deep Content Learning: traffic prediction with auxiliary information, and zero-shot text classification with semantic knowledge.
  8. As we know, traffic is normally periodic. There are peak hours when the traffic is heavy and off-peak hours when the traffic is light. In this case, it is easy for models to predict the traffic. However, if a place is holding a public event, like the HPCC Systems Summit here today, the traffic may no longer be normal. A crowd of people will come here, the traffic nearby will be abnormally heavy, and a classic traffic prediction model may fail. But how can we detect such a condition? I believe most people nowadays can’t drive without a navigation app like Google Maps. If one person is searching for this hotel, maybe everything is fine. [click] But if a lot of people are searching for this hotel, I am a little worried about the traffic here. [click] This figure shows how the search queries from a map app are related to the traffic speed. The blue lines are the normal condition and the red lines are the abnormal condition. As you can see, there is a very clear negative correlation between traffic speed and online search queries, and the statistics verify this idea. [click to next page]
  9. In this project, we used conventional sequence learning as the perception module. We quantified the query impact on traffic and performed event discovery in the reasoning module. The decision making module integrated all the information and predicted the traffic speed. [click next page]
  10. This table shows the events we discovered from the query records. [click] For example, in this row of the table, we find that the number of queries searching for Capital Gym in this time period is much higher than the normal query count. We find some other popular locations as well. [click] Actually, these events correspond to real public events such as concerts, forums and attractions. [click next page]
  11. This slide introduces the modelling, which builds up in stages: pure temporal --> spatial relations --> attributes --> query impact
  12. The key challenge is to transfer knowledge from familiar to unfamiliar classes (generalisation). Research on zero-shot learning can be very useful when the training data for emerging classes is insufficient or even unavailable. And the problem of emerging classes is common in many domains, such as research topics, social media, advertisement, object recognition and medical diagnosis. Few previous studies have addressed zero-shot text classification.
  13. Zero-shot text classification means recognising text documents from categories that have never been seen in the training stage.
  14. As we have no training data for unseen classes, the model can be biased towards the data we have. So the model may not be able to distinguish between seen and unseen classes, especially when there is some semantic overlap between classes.
  15. Reasons to use TensorFlow: largest user base; widest production adoption; well-maintained documentation; battle-proven quality. But: hard to master; low-level interfaces.
  16. We hope in the future TensorLayer can be integrated into HPCC Systems to provide powerful
  17. The improvement from distributed training on 2 GPUs isn’t significant so far.
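
Notes 8 and 10 describe two computations from the traffic project: the rank correlation between map-search query counts and traffic speed (the slides report a Spearman's rank correlation of ρ = −0.52), and the discovery of events as time periods where the query count is much higher than normal. A minimal Python sketch of both ideas follows. All numbers are made up for illustration, the function names are hypothetical, and the median-ratio threshold is an assumed stand-in for event detection, not the method used in the KDD 2018 paper.

```python
from statistics import mean, median


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def discover_events(counts, ratio=3.0):
    """Flag time slots whose query count exceeds `ratio` times the median.

    Assumption: the median stands in for the "normal query count"; the
    threshold ratio is an arbitrary illustrative choice.
    """
    baseline = median(counts)
    return [i for i, c in enumerate(counts) if baseline > 0 and c > ratio * baseline]


# Illustrative data only: query counts for one location and nearby speed (km/h).
query_counts = [12, 15, 40, 95, 130, 60, 20]
speed_kmh = [52.0, 50.0, 41.0, 28.0, 22.0, 35.0, 51.0]
rho = spearman_rho(query_counts, speed_kmh)  # negative: more queries, lower speed

# Illustrative hourly query counts with a burst in slots 4 and 5.
hourly_queries = [20, 22, 19, 21, 150, 180, 23, 20]
event_slots = discover_events(hourly_queries)
```

A negative `rho` matches the observation in note 8 that traffic speed falls as search queries rise, and `event_slots` lists the periods that would be inspected as candidate public events.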