As part of the 2018 HPCC Systems Community Day event:
In this talk, Jingqing will introduce recent advances at the Data Science Institute, Imperial College London, and focus on a general framework named Deep Content Learning. Two recent projects will be discussed as examples. In the traffic prediction project, we released a new large-scale traffic dataset with auxiliary information, including search queries from the Baidu Map app, and proposed hybrid models that achieve state-of-the-art prediction accuracy. The other project, on zero-shot text classification, integrated semantic knowledge and used a two-phase architecture to tackle the challenge of zero-shot learning on textual data. The integration of TensorLayer and HPCC Systems will also be discussed in the talk.
Jingqing Zhang is a first-year PhD student (HiPEDS) at the Data Science Institute, Imperial College London, under the supervision of Prof. Yi-Ke Guo. His research interests include text mining, data mining, deep learning and their applications. He received his MRes degree in Computing from Imperial College with Distinction in 2017 and his BEng in Computer Science and Technology from Tsinghua University in 2016.
Deep Content Learning in Traffic Prediction and Text Classification
1. 2018 HPCC Systems Summit Community Day
Deep Content Learning in Traffic
Prediction and Text Classification
Jingqing Zhang
Prof. Yike Guo
Data Science Institute
Imperial College London
2. Outline
• Imperial DSI
• Deep Content Learning
• Research Projects
– Traffic Prediction
– Zero-shot Text Classification
• TensorLayer
• HPCC Systems + TensorLayer
3.
4. The Success of Deep Learning
Johnson, Justin, Andrej Karpathy, and Li Fei-Fei. "Densecap: Fully convolutional localization networks for dense captioning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
CV & NLP Medical Game
5. Deep Learning + Content Providers
Deep
Learning
Content
Providers
Deep Content
Learning
6. Deep Content Learning
Environment Perception
Decision
Making
Reasoning
machine learning
deep learning
data
knowledge
logics and rules
decision
suggestion
Content
Providers
Dog
• Huskies usually have a thick
double coat that can be gray,
black, copper red, or white.
Their eyes are typically pale
blue, although they may also
be brown, green, blue, yellow,
or heterochromic.
• Husky
7. Concrete Projects (Completed So Far)
• P1: Traffic Prediction
– Deep Sequence Learning with Auxiliary Information for Traffic Prediction, KDD 2018
• P2: Zero-shot Text Classification
– Integrating Semantic Knowledge to Tackle Zero-shot Text Classification, submitted for reviews
8. P1: Deep Sequence Learning with Auxiliary Information for Traffic
Prediction
Marriott
Buckhead
navigation to
by map apps
• Spearman’s rank correlation coefficient with
• 𝜌 = −0.52, p-value = 1.23 × 10⁻⁴
HPCC Systems Summit
• How does online info affect traffic ?
Deep Sequence Learning with Auxiliary Information for Traffic Prediction, Binbing Liao, Jingqing Zhang, Chao Wu, Douglas McIlwraith, Tong Chen, Shengwen Yang, Yike
Guo, and Fei Wu, KDD 2018
12. Result
• It is more challenging to predict traffic when
events happen.
• The query impact is more informative and
more closely related to real-time traffic.
More information is available: https://github.com/JingqingZ/BaiduTraffic
13. P2: Integrating Semantic Knowledge to Tackle Zero-shot
Text Classification
• Zero-shot Learning: learn about a new category without a training instance
– Which is “Okapi”?
– a zebra-striped four legged animal with a brown torso and a deer-like face
14. Zero-shot Text Classification
Environment Perception
Decision
Making
Reasoning
Traditional text
classification
Text documents Knowledge Zero-shot text
classification
Imperial College
London is a public
research university
located in London.
Education
15. Reasoning – Relationship Vectors
ConceptNet
Relationship vectors
– Find the relation between words and
classes without any training data
– Particular types of relations
– The length of the shortest path
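The shortest-path feature above can be illustrated with a minimal sketch. It assumes the knowledge graph is available as a plain undirected edge list; the toy edges below are hypothetical and are not taken from ConceptNet, and this is not the project's actual implementation.

```python
from collections import deque

# Hypothetical toy edges standing in for a knowledge graph such as ConceptNet.
EDGES = [
    ("husky", "dog"), ("dog", "animal"), ("animal", "organism"),
    ("oak", "tree"), ("tree", "plant"), ("plant", "organism"),
]

def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

graph = build_graph(EDGES)
# One feature per (word, class) pair: shorter paths suggest a closer relation.
features = {label: shortest_path_length(graph, "husky", label)
            for label in ("animal", "plant")}
```

No training data is needed to compute such features, which is what makes them usable for unseen classes.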
16. Reasoning – Topic Translation
• In the learning stage, no information about unseen classes
• In the inference stage, the unseen classes are known (label, description), but still no training data
• Can we infer what the documents from unseen classes would look like?
• Can we generate fake documents that look like real data from unseen classes?
𝑐: Germany
𝑤: Berlin
𝑐′: France
𝑤′: ?
Vector Space
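The Germany/Berlin/France example can be read as a word analogy in vector space. A minimal sketch, assuming the common analogy rule 𝑤′ ≈ 𝑤 − 𝑐 + 𝑐′ followed by a cosine nearest-neighbour lookup; both the rule and the tiny hand-made embeddings are assumptions for illustration, not the project's actual model.

```python
import math

# Hypothetical 4-d embeddings: [german, french, country-ness, capital-ness].
EMBEDDINGS = {
    "Germany": [1.0, 0.0, 1.0, 0.0],
    "Berlin":  [1.0, 0.0, 0.0, 1.0],
    "France":  [0.0, 1.0, 1.0, 0.0],
    "Paris":   [0.0, 1.0, 0.0, 1.0],
    "Lyon":    [0.0, 1.0, 0.0, 0.3],
}

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def translate(word, source_class, target_class, embeddings):
    """Map a word from the source class to the target class: w' ~ w - c + c'."""
    w = embeddings[word]
    c = embeddings[source_class]
    c_new = embeddings[target_class]
    query = [wi - ci + ni for wi, ci, ni in zip(w, c, c_new)]
    # Exclude the three input words so the answer is a genuinely new word.
    excluded = {word, source_class, target_class}
    candidates = [k for k in embeddings if k not in excluded]
    return max(candidates, key=lambda k: cosine(query, embeddings[k]))

translated = translate("Berlin", "Germany", "France", EMBEDDINGS)
```

Applying such a word-by-word mapping to a whole document is one way to produce the translated documents shown on the next slide.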
17. Example of Translated Documents
Animal (Original) Mitra perdulca is a species of sea snail a marine gastropod mollusk in the family
Mitridae the miters or miter snails.
Animal Plant Arecaceae perdulca is a flowering of port aster a naval mollusk gastropod in the
fabaceae Clusiaceae the tiliaceae or rockery amaryllis.
Animal Athlete Mira perdulca is a swimmer of sailing sprinter an Olympian limpets gastropod in
the basketball Middy the miters or miter skater.
• Not completely understandable, but the translated documents contain the tone of the target class.
18. Decision Making – Two-phase Inference
Binary
Classification
Fine-grained
Classification
Plants, also called
green plants, are
multicellular
eukaryotes of the
kingdom Plantae.
Seen
Unseen
Plant
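The two-phase flow on this slide can be sketched as a small dispatcher: a binary step decides whether a document belongs to a seen or an unseen class, then a fine-grained step picks the class. The keyword-overlap classifiers and all class vocabularies below are hypothetical stand-ins for trained models.

```python
# Hypothetical keyword sets; a real system would use trained classifiers.
SEEN_CLASSES = {
    "Animal": {"animal", "species", "mammal"},
    "Athlete": {"athlete", "olympic", "player"},
}
UNSEEN_CLASSES = {
    "Plant": {"plants", "green", "kingdom", "flowering"},
    "Company": {"company", "business", "founded"},
}

def tokens(doc):
    """Lower-case the text and split on whitespace, dropping commas and periods."""
    return set(doc.lower().replace(",", " ").replace(".", " ").split())

def best_class(doc, classes):
    """Fine-grained step: pick the class with the largest keyword overlap."""
    words = tokens(doc)
    return max(classes, key=lambda name: len(words & classes[name]))

def is_seen(doc):
    """Binary step: does the document look like any seen class?"""
    words = tokens(doc)
    return any(words & keywords for keywords in SEEN_CLASSES.values())

def two_phase_predict(doc):
    """Phase 1 chooses seen vs unseen; phase 2 classifies within that group."""
    group = SEEN_CLASSES if is_seen(doc) else UNSEEN_CLASSES
    return best_class(doc, group)

plant_doc = ("Plants, also called green plants, are multicellular "
             "eukaryotes of the kingdom Plantae.")
animal_doc = "Mitra perdulca is a species of sea snail."
predictions = [two_phase_predict(plant_doc), two_phase_predict(animal_doc)]
```

Separating the two decisions keeps the fine-grained step from always favouring the seen classes it was trained on.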
19. Result – Overall Performance
• The proposed two-phase inference with integrated semantic knowledge is a promising way to tackle the
challenging task of zero-shot text classification.
More information about this project will be released soon.
21. Gaps
TensorFlow: low-level APIs Deep Learning: high-level
neural networks
Industry: high performance
Abstraction
gap
Performance
gap
22. TensorLayer – What is TensorLayer?
• TensorLayer is a unique TensorFlow wrapper library that can
I. teach deep learning
II. help cutting-edge research
III. run in the real-world
• From late 2016 to present
– > 4000 Stars
– > 1000 Forks
– > 70 Contributors
– on GitHub
23. HPCC Systems + TensorLayer
HPCC Systems
TensorLayer
Horovod
TensorFlow
Server 1 Server 2
Py3embed
High-level
wrapper Distributed
framework
Data parallelism
Synchronous distributed training
GPU acceleration + CPU input pipeline
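Synchronous data-parallel training, as listed on this slide, can be illustrated without any framework. The sketch below is a single-process simulation in plain Python: each "worker" computes a gradient on its own data shard, the gradients are averaged (the role an all-reduce plays in a framework such as Horovod), and every worker applies the same update. It does not call Horovod, TensorFlow or TensorLayer; the function names and data are hypothetical.

```python
def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an all-reduce: every worker receives the mean gradient."""
    return sum(values) / len(values)

def train(shards, steps=50, learning_rate=0.05):
    """Synchronous SGD: all workers share the same weight after each step."""
    w = 0.0
    for _ in range(steps):
        gradients = [local_gradient(w, shard) for shard in shards]  # in parallel
        w -= learning_rate * allreduce_mean(gradients)              # same update
    return w

# Hypothetical data following y = 3x, split evenly across two workers.
worker_shards = [
    [(1.0, 3.0), (2.0, 6.0)],    # server 1
    [(3.0, 9.0), (4.0, 12.0)],   # server 2
]
learned_w = train(worker_shards)
```

With equally sized shards, the averaged gradient equals the full-batch gradient, so the result matches single-machine training while the per-step work is split across workers.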
25. HPCC Systems + TensorLayer
• Still too early to draw a conclusion.
• Future work
– Larger models to test distributed training, e.g. OpenPose.
– Closer integration of HPCC Systems and TensorLayer.
• https://github.com/tensorlayer/openpose-plus
• https://github.com/tensorlayer/tensorlayer/tree/master/examples/distributed_training
Data
Processing
Deployment
Distributed
Training
27. Q & A
Thanks
Jingqing Zhang
Prof. Yike Guo
Data Science Institute
Imperial College London
For more information, please visit
http://www.doc.ic.ac.uk/~jz9215/
Editor's notes
Hello everyone,
It’s my great pleasure to celebrate this community day and introduce research advances at the Data Science Institute, Imperial College London. I hope you will enjoy my talk.
This is the outline of my talk.
I will first introduce us: the Imperial College Data Science Institute.
I will then present the idea of Deep Content Learning, together with two projects we have conducted so far.
And I will introduce TensorLayer, which is a development tool for deep learning models. Finally, I would like to share some work we have done to integrate HPCC Systems with TensorLayer.
The Data Science Institute at Imperial College London was launched in 2014.
The DSI aims to enhance Imperial's excellence in data-driven research across its faculties. Therefore, we receive support from the faculties of engineering, medicine and natural sciences, as well as the business school.
The DSI consists of seven parts: one hub and six labs. Each lab has its own focus, as you may find in this figure. The Hub mainly focuses on data management, analysis and machine learning.
I am doing my PhD at the DSI Hub, so my research focuses on machine learning, deep learning and their applications.
As you may notice, [click to next page]
Deep learning has achieved great success in many scenarios including
computer vision and natural language processing
medical imaging
and game playing
In many tasks, deep learning models perform even better than humans, for example object recognition in images. Those tasks need to be well defined and, most importantly, a huge amount of data is necessary.
However, tasks that require semantic understanding, inference or reasoning can be very challenging for deep learning models, for example question answering, chatbots and medical diagnosis.
So current AI systems are still far behind the ultimate goal of AI, which is that AI should be able to do what humans can do.
[click to next page]
The good news is that nowadays we are not only in the era of big data, we also have lots of content providers. Content providers are the providers that can organise and provide knowledge in general or specific domains. The content they provide is also a kind of data, but it is better organised, structured and of high quality.
[click]
A good example is the content provided by Elsevier and LexisNexis.
[click]
Therefore, we believe the combination of deep learning with the content would be essential in our future AI research.
And we call it Deep Content Learning
[click next page]
We think that in Deep Content Learning there are at least three key modules: perception, reasoning and decision making.
The perception module is a stage to extract features and representations from data. This is what machine learning and deep learning were originally designed to do.
The reasoning module should include additional knowledge from content providers to infer something related to the scenario.
The final decision making module would combine all the results and make the right decision, driven by the utility.
For example [click]
Given an image of a dog, the perception module extracts the features of this dog, the colour, the eye colour.
The reasoning module should find the knowledge that describes this specific kind of dog.
And the decision making module should predict that this is a husky instead of just saying that it is a dog.
[click next page]
We have conducted two concrete projects under the idea of Deep Content Learning.
The traffic prediction with auxiliary information and
the zero-shot text classification with semantic knowledge.
As we know, traffic is normally periodic. There are peak hours when the traffic is heavy and off-peak hours when the traffic is light. In this case, it is easy for models to predict the traffic.
However, if a place is holding a public event, like here today at the HPCC Systems Summit, the traffic may no longer be normal. A crowd of people will come here, the traffic nearby will be abnormally heavy, and a classic traffic prediction model may fail. But how can we detect such a condition?
I believe most people nowadays can’t drive without a navigation app like Google Maps. If one person is searching for this hotel, maybe everything is fine.
[click]
But if a lot of people are searching for this hotel, I am a little worried about the traffic here.
[click]
This figure shows how the search queries from the map app are related to the traffic speed.
The blue lines are the normal condition and the red lines are the abnormal condition. As you may find, there is a very clear negative correlation between traffic speed and online search queries. And the statistics have verified this idea.
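The statistic reported on the slide is Spearman's rank correlation. A minimal sketch of how it can be computed: rank both series (averaging tied ranks), then take the Pearson correlation of the ranks. The hourly numbers below are hypothetical and are not from the released dataset.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical hourly values: map-app query counts and traffic speed (km/h).
query_counts = [12, 15, 40, 95, 130, 80, 30, 14]
speed_kmh = [52, 50, 41, 28, 19, 30, 29, 51]
rho = spearman(query_counts, speed_kmh)
```

A negative rho means that hours with more queries tend to have lower speeds, which is the pattern the figure shows.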
[click to next page]
In this project, we used conventional sequence learning as the perception module.
We quantified the query impact on traffic and did event discovery in the reasoning module.
The decision making integrated all the information and did traffic speed prediction.
[click next page]
This table shows the events we discovered from the query records.
[click]
For example, in this row of the table, we find that the number of queries searching for the capital gym in this time period is much higher than the normal query count. And we find some other popular locations as well.
[click]
Actually, these events correspond to real public events like concerts, forums and attractions.
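The idea of spotting events from unusually high query counts can be sketched as a simple threshold rule. This is an assumption for illustration, not the method used in the paper: the median baseline, the ratio of 3 and the hourly counts are all hypothetical.

```python
from statistics import median

def discover_events(counts, ratio=3.0):
    """Return indices whose query count exceeds `ratio` times the median."""
    baseline = median(counts)  # robust to the spikes we are trying to find
    if baseline <= 0:
        return []
    return [i for i, count in enumerate(counts) if count > ratio * baseline]

# Hypothetical hourly query counts for one location.
hourly_queries = [20, 22, 19, 21, 18, 150, 170, 23, 20]
event_hours = discover_events(hourly_queries)
```

The flagged periods can then be checked against known public events at that location.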
[click next page]
The slide introduces the modelling.
pure temporal
--> spatial relations
--> attributes
--> query impact
The key challenge is to transfer knowledge from familiar to unfamiliar classes (generalisation).
Research on zero-shot learning can be very useful when the training data for emerging classes is insufficient or even unavailable. And the problem of emerging classes is common in many domains, such as research topics, social media, advertisement, object recognition and medical diagnosis.
Little previous research has studied zero-shot text classification.
The task is to recognise text documents from categories that have never been seen in the training stage.
As we have no training data for unseen classes, the model can be biased towards the data we have. So the model may not be able to differentiate between seen and unseen classes, especially when there is some semantic overlap between classes.
Reasons to use TensorFlow
Largest user base
Widest production adoption
Well-maintained documents
Battlefield-proof quality
But hard to master
Low-level interfaces
We hope in the future TensorLayer can be integrated into HPCC Systems to provide powerful
The improvement from distributed training on two GPUs isn’t significant so far.