Austin, TX Meetup presentation: TensorFlow (final, Oct 26 2017)
1. Deep Learning Lecture Series
IBM Executive Briefing Center
Austin, TX
Session: Introduction to TensorFlow
Presenter: Clarisse Taaffe-Hedglin
clarisse@us.ibm.com
Executive HPC/HPDA Architect
IBM Systems WW Client Centers
5. Exploding Data Sources
ImageNet 10,000,000 labeled images
depicting 10,000+ object categories
CIFAR-10 (RGB)
https://quickdraw.withgoogle.com/data
Learned filter for AlexNet, Krizhevsky et al. 2012
MNIST 0-9
300,000 Labeled images
Over 1000 datasets at:
https://www.kaggle.com/datasets
6. Data & Compute Drive Training & Inference
Training
•Data intensive:
historical data sets
•Compute intensive:
100% accelerated
•Develop a model for use
on the edge as inference
Inference
•Enables the computer
to act in real time
•Low Power
•Out at the edge
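The split between training and inference can be sketched with a toy perceptron. This is only an illustration under stated assumptions: the AND data set, the function names, and the learning rate are hypothetical and do not come from the slides. Training loops repeatedly over labeled historical data, while inference is a single cheap forward pass.

```python
# Toy illustration of the training/inference split (not from the original slides).

def predict(weights, bias, x):
    """Inference: one forward pass, cheap enough to run at the edge."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train(samples, epochs=20, lr=1.0):
    """Training: repeated passes over labeled historical data."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, label in samples:
            # Perceptron rule: adjust the weights only when the prediction is wrong.
            error = label - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical "historical" data set: the logical AND function.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_DATA)                                  # compute-intensive phase
predictions = [predict(weights, bias, x) for x, _ in AND_DATA]   # lightweight phase
```

Only `weights` and `bias` need to be shipped to the device that runs `predict`; the training data and the training loop stay behind.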
7. Technique Increasing in Complexity
Artificial Neural Networks are evolving
Perceptron
GoogLeNet
Recurrent Neural Network
8. Frameworks Address Technique
Frameworks enable developers to build, implement and maintain
machine learning systems, generate new projects and create new
impactful systems (Models).
The choice of analytics tools and AI frameworks implemented by data science
engineers is often driven by researcher and data scientist preferences.
9. Models Deployed Across All Industries
Automotive and Transportation
• Autonomous driving:
• Pedestrian detection
• Accident avoidance
Auto, trucking, heavy equipment, Tier 1 suppliers

Security and Public Safety
• Video surveillance
• Image analysis
• Facial recognition and detection
Local and national police, public and private safety/security

Consumer Web, Mobile, Retail
• Image tagging
• Speech recognition
• Natural language
• Sentiment analysis
Hyperscale web companies, large retail

Medicine and Biology
• Drug discovery
• Diagnostic assistance
• Cancer cell detection
Pharmaceutical, medical equipment, diagnostic labs

Broadcast, Media and Entertainment
• Captioning
• Search
• Recommendations
• Real time translation
Consumer facing companies with large streaming of existing media, or real time content
10. Using a Range of Data Science Software
Tool                        % change   2017 % usage   2016 % usage
Microsoft, CNTK             294%       3.4%           0.9%
Tensorflow                  195%       20.2%          6.8%
Microsoft Power BI          84%        10.2%          5.6%
Alteryx                     76%        5.3%           3.0%
SQL on Hadoop tools         42%        10.3%          7.3%
Microsoft other tools       40%        2.2%           1.6%
Anaconda                    37%        21.8%          16.0%
Caffe                       32%        3.1%           2.3%
Orange                      30%        4.0%           3.1%
DL4J                        30%        2.2%           1.7%
Other Deep Learning Tools   30%        4.8%           3.7%
Microsoft Azure ML          26%        6.4%           5.1%
Source: http://www.kdnuggets.com/2017/05/poll-analytics-data-science-machine-learning-software-leaders.html
Deep Learning tools used by 32% of all
respondents (18% in 2016, 9% in 2015)
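For readers who want to check the "% change" column, the usual relative-change formula is sketched below. The helper name `percent_change` is hypothetical. Recomputing from the rounded shares in the table only approximates the published figures; the assumption here is that the poll computed its column from unrounded shares.

```python
def percent_change(old_share, new_share):
    """Relative change from old_share to new_share, in percent."""
    return (new_share - old_share) / old_share * 100.0


# Rounded usage shares copied from the table: (2016 %, 2017 %).
usage = {
    "Tensorflow": (6.8, 20.2),
    "Anaconda": (16.0, 21.8),
    "Caffe": (2.3, 3.1),
}

for tool, (share_2016, share_2017) in usage.items():
    # Results differ slightly from the published column because of rounding.
    print(f"{tool}: {percent_change(share_2016, share_2017):.0f}%")
```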
12. TensorFlow Overview
Framework developed by Google (Google Brain Team)
Created for machine learning & deep neural networks research
For numerical computation using data flow graphs
TensorFlow has been open source since Nov 2015, released under the Apache 2.0 license
https://github.com/tensorflow/tensorflow
Very strong developer/user community: 36,790+ forks, 1,100 Contributors
Written in C++, CUDA, some Python; Python and Matlab interfaces
14. TensorFlow Constructs
Model Development
Learning model described by data flow graphs:
Nodes: represent mathematical operations (a.k.a. ops)
• General purpose
• Neural Net
Edges: represent data in N-D Arrays (Tensors)
Backward graph and update are added automatically to
graph
Inference
Execute forward path on graph
• TensorFlow Core is lowest level API for complete programming control
• Higher level APIs available (e.g. skflow, a Scikit Learn style API for TensorFlow)
• Higher level abstractions for common patterns, structures and functionality
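The data flow graph idea on this slide can be sketched in plain Python. This is a toy model, not the TensorFlow API: the `Node` class and the `add`, `mul`, and `backward` helpers are hypothetical names, and values are scalars rather than N-D tensors. It shows ops as nodes, values flowing along edges, and a backward pass derived from the recorded graph instead of being written by hand.

```python
class Node:
    """One graph node: an op (or input) whose output value flows along edges."""

    def __init__(self, value, parents=(), local_grads=()):
        self.value = value              # result of the forward computation
        self.parents = parents          # input nodes (incoming edges)
        self.local_grads = local_grads  # d(self)/d(parent), one per parent
        self.grad = 0.0                 # d(output)/d(self), filled in by backward()


def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))


def mul(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))


def backward(output):
    """Walk the graph in reverse topological order, applying the chain rule."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in zip(node.parents, node.local_grads):
            parent.grad += local * node.grad


# Forward path (all that inference needs): y = w * x + b
w, x, b = Node(3.0), Node(2.0), Node(1.0)
y = add(mul(w, x), b)

# Backward path, derived automatically from the recorded graph.
backward(y)
```

After `backward(y)`, `w.grad` holds dy/dw, which is the quantity a training step would use to update `w`.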
20. Framework Scalability and Flexibility
Scalability-oriented
▶ Use-cases in mind
  ▶ Image/speech recognition system
  ▶ Fast DL as a service in cloud
▶ Problem type
  ▶ A few general applications
  ▶ 10+ million training samples
  ▶ 10+ nodes cluster w/ fast network
▶ Possible bottleneck
  ▶ Tuning of well-known algorithms
  ▶ Distributed computation for model/data-parallel training

Flexibility-oriented
▶ Use-cases in mind
  ▶ New algorithm research
  ▶ R&D projects for AI products
▶ Problem type
  ▶ Various specific applications
  ▶ 10+ k training samples
  ▶ 1 node with multiple GPUs
▶ Possible bottleneck
  ▶ Trial-and-error in prototyping
  ▶ Debugging, profiling & refactoring
  ▶ (wait time during compilation)
Source: Preferred Networks presentation,
2017 OpenPOWER Developer Congress
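Data-parallel training, named on this slide as a scalability concern, can be sketched as follows. This is a minimal single-process illustration under stated assumptions: the linear model, the data, and the function names are hypothetical, the "workers" are plain list slices, and the averaging step stands in for the network all-reduce a real cluster would perform.

```python
def gradient(w, shard):
    """Mean gradient of the squared error (w*x - y)**2 over one shard of data."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, shards, lr):
    """One synchronous step: per-worker gradients are computed, then averaged."""
    worker_grads = [gradient(w, shard) for shard in shards]  # would run in parallel
    avg_grad = sum(worker_grads) / len(worker_grads)         # stands in for an all-reduce
    return w - lr * avg_grad


# Hypothetical data following y = 2x, split evenly across two "workers".
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

With equally sized shards, the averaged gradient equals the full-batch gradient, so each worker only has to hold its own slice of the data. On a real cluster the averaging step is where the fast network matters.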