Deep Learning Cases:
Image, Text and Control
Grigory Sapunov
Founders Institute
Moscow 19.10.2017
gs@inten.to
AI/ML/DL
● Artificial Intelligence (AI) is a broad field of
study dedicated to complex problem solving.
● Machine Learning (ML) is usually considered
a subfield of AI. ML is a data-driven
approach focused on creating algorithms that
can learn from data without being explicitly
programmed.
● Deep Learning (DL) is a subfield of ML focused
on deep neural networks (NN) able to
automatically learn hierarchical
representations.
“Simple” Image & Video Processing
Typical tasks for CNNs
https://research.facebook.com/blog/learning-to-segment/
The detection task is harder than classification, but both are almost solved,
and with better-than-human quality.
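To make the "typical CNN tasks" concrete, here is a minimal image-classification sketch in Keras; the input size, layer widths and 10-class output are illustrative assumptions, not taken from any system mentioned here.

```python
# Minimal CNN image classifier sketch (Keras). Shapes/classes are illustrative.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),  # e.g. 10 object classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```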
Super-human recognition
● Blue: Traditional CV
● Purple: Deep Learning
● Red: Human
Case #1: IJCNN 2011
The German Traffic Sign Recognition Benchmark
● Classification, >40 classes
● >50,000 real-life images
● First Superhuman Visual Pattern Recognition
○ 2x better than humans
○ 3x better than the closest artificial competitor
○ 6x better than the best non-neural method
Method Correct (Error)
1 Committee of CNNs 99.46 % (0.54%)
2 Human Performance 98.84 % (1.16%)
3 Multi-Scale CNNs 98.31 % (1.69%)
4 Random Forests 96.14 % (3.86%)
http://people.idsia.ch/~juergen/superhumanpatternrecognition.html
Case #2: ILSVRC 2010-2015
Large Scale Visual Recognition Challenge (ILSVRC)
● Object detection (200 categories, ~0.5M images)
● Classification + localization (1000 categories, 1.2M images)
Examples: Object Detection
Example: Face Detection + Emotion Classification
Example: Face Detection + Classification + Regression
Examples: Food Recognition
Examples: Computer Vision on the Road
Examples: Pedestrian Detection
Examples: Activity Recognition
Examples: Road Sign Recognition (on mobile!)
More complex Image & Video Processing
Semantic Segmentation
NYU Semantic Segmentation with a Convolutional Network (33 categories)
https://www.youtube.com/watch?v=ZJMtDRbqH40
Image Colorization
http://richzhang.github.io/colorization/
Fun: Deep Dream
http://blogs.wsj.com/digits/2016/02/29/googles-computers-paint-like-van-gogh-and-the-art-sells-for-thousands/
More Fun: Neural Style
http://www.boredpanda.com/inceptionism-neural-network-deep-dream-art/
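The core trick behind neural style transfer (Gatys et al.) is matching Gram matrices of CNN feature activations. A minimal NumPy sketch of that style loss, assuming features are given as (positions, channels) arrays extracted from some CNN layer:

```python
# Gram-matrix style loss sketch (NumPy), per Gatys et al.
import numpy as np

def gram_matrix(features):
    # features: (height*width, channels) activations from one CNN layer
    return features.T @ features

def style_loss(gen_features, style_features):
    G, A = gram_matrix(gen_features), gram_matrix(style_features)
    n, m = gen_features.shape[1], gen_features.shape[0]  # channels, positions
    return np.sum((G - A) ** 2) / (4 * n**2 * m**2)
```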
More Fun: Neural Doodle
http://arxiv.org/abs/1603.01768 Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks
(a) Original painting by Renoir, (b) semantic annotations,
(c) desired layout, (d) generated output.
More Fun: Photo-realistic Style Transfer
https://arxiv.org/abs/1703.07511 Deep Photo Style Transfer
More Fun: Photo-realistic Style Transfer
https://arxiv.org/abs/1703.07511 Deep Photo Style Transfer
Generative Adversarial Networks (GANs)
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
https://arxiv.org/abs/1511.06434
Generative Adversarial Networks (GANs)
http://www.evolvingai.org/ppgn
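A minimal sketch of the adversarial training loop these papers build on, in Keras; the dense architecture and hyperparameters are illustrative assumptions (real DCGANs use convolutional generators and discriminators):

```python
# Minimal GAN training sketch (Keras). Shapes and sizes are illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 64

# Generator: noise -> flattened 28x28 "image"
generator = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(784, activation="sigmoid"),
])

# Discriminator: image -> probability of being real
discriminator = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,)),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Combined model trains the generator against a frozen discriminator
discriminator.trainable = False
gan = keras.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

def train_step(real_images, batch_size=32):
    noise = np.random.normal(size=(batch_size, latent_dim))
    fake_images = generator.predict(noise, verbose=0)
    # 1) Train discriminator: real images -> 1, generated images -> 0
    discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))
    # 2) Train generator to make the discriminator output "real"
    gan.train_on_batch(noise, np.ones((batch_size, 1)))
```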
Text Processing / NLP
Deep Learning and NLP
Variety of tasks:
● Finding synonyms
● Fact extraction: people and company names, geography, prices, dates,
product names, …
● Classification: genre and topic detection, positive/negative sentiment
analysis, authorship detection, …
● Machine translation
● Search (written and spoken)
● Question answering
● Dialog systems
● Language modeling, Part of speech recognition
Example: Semantic Spaces (word2vec, GloVe)
vector('king') - vector('man') + vector('woman') = vector('queen')
https://code.google.com/archive/p/word2vec/
Example: Semantic Spaces (word2vec, GloVe)
http://nlp.stanford.edu/projects/glove/
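The analogy above can be reproduced with, e.g., gensim and a pretrained model; the file name below is the standard Google News vectors and is an assumption, not something shipped with this deck:

```python
# word2vec analogy sketch with gensim and pretrained vectors (assumed local).
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# king - man + woman ≈ queen
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```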
Encoding semantics
Using word2vec instead of word indexes allows you to better deal with the word
meanings (e.g. no need to enumerate all synonyms because their vectors are
already close to each other).
But the naive way of working with word2vec vectors still gives you a “bag of words”
model, where the phrases “The man killed the tiger” and “The tiger killed the man” are
equal.
We need models that pay attention to word ordering: paragraph2vec, sentence
embeddings (using RNNs/LSTMs), even World2Vec (LeCun @CVPR2015). The sketch below
demonstrates the problem.
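A tiny sketch of the “bag of words” problem: averaging word vectors makes the two sentences above indistinguishable. Here `wv` is assumed to be a gensim KeyedVectors object, as in the earlier example:

```python
# Averaged word vectors ignore word order entirely (bag-of-words behavior).
import numpy as np

def sentence_vector(words, wv):
    # Mean of word vectors: any permutation of the words gives the same result
    return np.mean([wv[w] for w in words if w in wv], axis=0)

a = sentence_vector("the man killed the tiger".split(), wv)
b = sentence_vector("the tiger killed the man".split(), wv)
assert np.allclose(a, b)  # identical vectors: order information is lost
```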
Multi-modal learning
http://arxiv.org/abs/1411.2539 Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
Example: More multi-modal learning
Caption Generation
http://arxiv.org/abs/1411.4555 “Show and Tell: A Neural Image Caption Generator”
Example: NeuralTalk and Walk
Ingredients:
● https://github.com/karpathy/neuraltalk2
Project for learning Multimodal Recurrent Neural Networks that describe
images with sentences
● Webcam/notebook
Result:
● https://vimeo.com/146492001
More hacking: NeuralTalk and Walk
Product of the near future: DenseCap and ?
http://arxiv.org/abs/1511.07571 DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Example: Image generation by text
StackGAN: Text to Photo-realistic Image Synthesis with
Stacked Generative Adversarial Networks, https://arxiv.org/abs/1612.03242
Visual Question Answering
https://avisingh599.github.io/deeplearning/visual-qa/
Case: Sentiment analysis
http://nlp.stanford.edu/sentiment/
Can capture complex cases where bag-of-words models fail.
“This movie was actually neither that funny, nor super witty.”
Case: Sentiment analysis
https://blog.openai.com/unsupervised-sentiment-neuron/
“Our research implies that simply training large unsupervised next-step-prediction
models on large amounts of data may be a good approach to use when creating
systems with good representation learning capabilities.”
Case: Machine Translation
Sequence to Sequence Learning with Neural Networks, http://arxiv.org/abs/1409.3215
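A minimal encoder-decoder sketch in Keras, following the general sequence-to-sequence idea of the paper above; vocabulary sizes and dimensions are illustrative assumptions:

```python
# Minimal seq2seq (encoder-decoder LSTM) sketch in Keras.
from tensorflow import keras
from tensorflow.keras import layers

src_vocab, tgt_vocab, dim = 8000, 8000, 256  # illustrative sizes

# Encoder: read the source sentence, keep only the final LSTM state
enc_in = keras.Input(shape=(None,))
enc_emb = layers.Embedding(src_vocab, dim)(enc_in)
_, h, c = layers.LSTM(dim, return_state=True)(enc_emb)

# Decoder: generate the target sentence conditioned on the encoder state
dec_in = keras.Input(shape=(None,))
dec_emb = layers.Embedding(tgt_vocab, dim)(dec_in)
dec_out = layers.LSTM(dim, return_sequences=True)(dec_emb, initial_state=[h, c])
logits = layers.Dense(tgt_vocab, activation="softmax")(dec_out)

model = keras.Model([enc_in, dec_in], logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```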
Case: Automated Speech Translation
Translating voice calls and video calls in 8 languages and instant messages in over 50.
https://www.skype.com/en/features/skype-translator/
Speech Recognition: Word Error Rate (WER)
“Google now has just an 8 percent error rate. Compare that to 23 percent in
2013” (2015)
http://venturebeat.com/2015/05/28/google-says-its-speech-recognition-technology-now-has-only-an-8-word-error-rate/
IBM Watson. “The performance of our new system – an 8% word error rate – is
36% better than previously reported external results.” (2015)
https://developer.ibm.com/watson/blog/2015/05/26/ibm-watson-announces-breakthrough-in-conversational-speech-transcription/
Baidu. “We are able to reduce error rates of our previous end-to-end system in
English by up to 43%, and can also recognize Mandarin speech with high
accuracy. Creating high-performing recognizers for two very different languages,
English and Mandarin, required essentially no expert knowledge of the
languages” (2015)
http://arxiv.org/abs/1512.02595
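For reference, Word Error Rate is just word-level edit distance (substitutions + insertions + deletions) divided by the reference length; a plain-Python sketch:

```python
# Word Error Rate: edit distance over words, normalized by reference length.
def wer(reference: str, hypothesis: str) -> float:
    r, h = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edits needed to turn r[:i] into h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat", "the cat sit"))  # 0.333...: one substitution in three words
```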
Example: Baidu Deep Speech 2 (2015)
● “The Deep Speech 2 ASR pipeline approaches or exceeds the accuracy of Amazon Mechanical
Turk human workers on several benchmarks, works in multiple languages with little modification,
and is deployable in a production setting.”
● “Table 13 shows that the DS2 system outperforms humans in 3 out of the 4 test sets and is
competitive on the fourth. Given this result, we suspect that there is little room for a generic speech
system to further improve on clean read speech without further domain adaptation”
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, http://arxiv.org/abs/1512.02595
Case: Baidu Automated Speech Recognition (ASR)
More Fun: MtG cards
http://www.escapistmagazine.com/articles/view/scienceandtech/14276-Magic-The-Gathering-Cards-Made-by-Artificial-Intelligence
Case: Review generation
https://arxiv.org/abs/1708.08151 Automated Crowdturfing Attacks and Defenses in Online Review Systems
Case: Question Answering
A Neural Network for Factoid Question Answering over Paragraphs, https://cs.umd.edu/~miyyer/qblearn/
Case: Dialogue Systems
A Neural Conversational Model,
Oriol Vinyals, Quoc Le
http://arxiv.org/abs/1506.05869
What for: Conversational Commerce
https://medium.com/chris-messina/2016-will-be-the-year-of-conversational-commerce-1586e85e3991
What for: Conversational Commerce
[Robotic] Control
Reinforcement Learning
Simulated race car control (2013)
http://people.idsia.ch/~juergen/gecco2013torcs.pdf
http://people.idsia.ch/~juergen/compressednetworksearch.html
Reinforcement Learning
Human-level control through deep reinforcement learning (2014)
http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html
Playing Atari with Deep Reinforcement Learning (2013)
http://arxiv.org/abs/1312.5602
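The update at the heart of these systems is Q-learning; DQN replaces the table below with a deep network over raw pixels. A minimal tabular sketch (states, actions and the environment are deliberately left abstract):

```python
# Tabular Q-learning sketch: the core update behind DQN-style deep RL.
import random
from collections import defaultdict

alpha, gamma, epsilon = 0.1, 0.99, 0.1     # illustrative hyperparameters
Q = defaultdict(float)                     # Q[(state, action)] -> expected return

def choose_action(state, actions):
    if random.random() < epsilon:          # explore
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])  # exploit

def update(state, action, reward, next_state, actions):
    best_next = max(Q[(next_state, a)] for a in actions)
    # Q-learning target: r + gamma * max_a' Q(s', a')
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```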
Reinforcement Learning
Game of Go: Computer-Human 4:1
AlphaGo in datacenters
“We’ve managed to reduce the amount of energy we use for cooling by up to 40 percent.”
https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/
Drone control
http://www.digitaltrends.com/cool-tech/swiss-drone-ai-follows-trails/
This drone can automatically follow forest
trails to track down lost hikers
Car control
Meet the 26-Year-Old Hacker Who Built a
Self-Driving Car... in His Garage
https://www.youtube.com/watch?v=KTrgRYa2wbI
Car driving
https://www.youtube.com/watch?v=YuyT2SDcYrU
“Actually a “Perception to Action” system. The visual perception and control
system is a Deep learning architecture trained end to end to transform pixels
from the cameras into steering angles. And this car uses regular color cameras,
not LIDARS like the Google cars. It is watching the driver and learns.”
Example: Sensorimotor Deep Learning
“In this project we aim to develop deep learning techniques that can be deployed
on a robot to allow it to learn directly from trial-and-error, where the only
information provided by the teacher is the degree to which it is succeeding at the
current task.”
http://rll.berkeley.edu/deeplearningrobotics/
Summary
DL/Multi-modal Learning
Deep Learning models are becoming multi-modal: they use 2+ modalities
simultaneously, e.g.:
● Image caption generation: images + text
● Search Web by an image: images + text
● Video describing: the same but added time dimension
● Visual question answering: images + text
● Speech recognition: audio + video (lips motion)
● Image classification and navigation: RGB-D (color + depth)
Where is this heading?
● Toward a common metric space for all concepts, a “thought vector”. It will then be
possible to match different modalities easily.
DL/Transfer of Ideas
Methods developed for one modality are successfully transferred to another:
● Convolutional Neural Networks, CNNs (originally developed for image
recognition) work well on texts, speech and some time-series signals (e.g.
ECG).
● Recurrent Neural Networks, RNNs (mostly used on language and other
sequential data) seem to work on images.
If the technologies successfully transfer from one modality to another (for
example, from images to texts for CNNs), then ideas that worked in one domain
will probably work in another (e.g. style transfer for images could be transferred to texts).
Why is Deep Learning helpful? Or even a game-changer?
● Works on raw data (pixels, sound, text or chars); no need for feature
engineering
○ Some features are really hard to develop (they require years of work by a
group of experts)
○ Some features are patented (e.g. SIFT, SURF for images)
● Allows end-to-end learning (pixels to category, sound to sentence, English
sentence to Chinese sentence, etc.)
○ No need to do segmentation, etc. (a lot of manual labor)
⇒ You can iterate faster (and get superior quality at the same time!)
Still some issues exist: Datasets
● No dataset -- no deep learning
There is a lot of data available (and deep learning requires it;
otherwise simpler models may do better)
○ But sometimes you have no dataset…
■ Nonetheless, some hacks are available: transfer learning (see the
sketch below), data augmentation, Mechanical Turk, …
http://www.spacemachine.net/views/2016/3/datasets-over-algorithms
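One common transfer-learning recipe, as a hedged Keras sketch: freeze an ImageNet-pretrained backbone and train only a small new head on your own (small) dataset. The backbone choice and 10-class output are illustrative:

```python
# Transfer learning sketch (Keras): frozen pretrained backbone + small head.
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pretrained feature extractor

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # e.g. 10 classes of your own
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```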
Still some issues exist: Datasets
Still some issues exist: Computing power
● Requires a lot of computation. Without a cluster or GPU machines,
much more time is required
● Currently GPUs (mostly NVIDIA) are the only realistic choice
● Waiting for FPGAs/ASICs to come into this field (Google TPU gen. 2, Intel
2017+). The situation resembles the path of Bitcoin mining
● Neuromorphic computing is on the rise (IBM TrueNorth, memristors, etc.)
● Quantum computing can benefit machine learning as well (but it probably
won’t be a desktop or in-house server solution)
Datasets and computing power are growing
Computing power is growing
● Google TPU gen.2
○ 180 TFLOPS?
● NVIDIA DGX-1 ($129,000)
○ 170 TFLOPS (FP16)
○ 85 TFLOPS (FP32)
● NVIDIA Tesla V100/P100
○ 15/10.6 TFLOPS
○ 120 TFLOPS on V100 Tensor Core units
● NVIDIA GTX Titan X (Pascal [new] / Maxwell [old]) ($1000)
○ 11/6.1 TFLOPS (FP32)
● NVIDIA GTX 1080/1080 Ti ($700)
○ 8/11.3 TFLOPS (FP32)
● NVIDIA Drive PX-2 / PX
○ 8.0/2.3 TFLOPS
● NVIDIA Jetson TK1/TX1/TX2
○ 192/256/256 CUDA Cores
○ 64/64/128-bit 4/4/6-Core ARM CPU, 2/4/8 Gb Mem
● Raspberry Pi 3
○ 1.2 GHz 64-bit quad-core ARM Cortex-A53, 1 Gb SDRAM, US$35
● Tablets, Smartphones
○ Qualcomm Snapdragon 835, Apple A11 Bionic
● Google Project Tango
Deep Learning goes mobile!
Still some issues exist: Reasoning
Deep learning is mainly about perception, but there is a lot of inference involved in
everyday human reasoning.
● Neural networks lack common sense
● Cannot find information by inference
● Cannot explain the answer
○ This could be a must-have requirement in some areas, e.g. law or medicine.
Still some issues exist: Reasoning
The most fruitful approach is likely to be a hybrid neural-symbolic system. This is a
topic of active research right now.
And it seems all major players are already going this way (Watson, Siri, Cyc, …).
There is a lot of knowledge available (or extractable) in the world: large
knowledge bases about the real world (Cyc/OpenCyc, FreeBase, Wikipedia,
schema.org, RDF, …, scientific journals + text mining, …)
So what to do next?
Universal Libraries and Frameworks
● Torch7, PyTorch (http://torch.ch/, http://pytorch.org) [Lua, Python]
● TensorFlow (https://www.tensorflow.org/) [Python, C++]
● Keras (http://keras.io/) [Python]
● Theano (http://deeplearning.net/software/theano/) [Python]
○ Lasagne (https://github.com/Lasagne/Lasagne)
○ blocks (https://github.com/mila-udem/blocks)
○ pylearn2 (https://github.com/lisa-lab/pylearn2)
● Microsoft Cognitive Toolkit (CNTK) (http://www.cntk.ai/) [Python, C++, C#,
BrainScript]
● Neon (http://neon.nervanasys.com/) [Python]
● Deeplearning4j (http://deeplearning4j.org/) [Java]
● MXNet (http://mxnet.io/) [C++, Python, R, Scala, Julia, Matlab, JavaScript]
● …
Libraries & Frameworks for image/video processing
● OpenCV (http://opencv.org/)
● Caffe/Caffe2 (http://caffe.berkeleyvision.org/, https://caffe2.ai/)
● Torch7 (http://torch.ch/)
● clarifai (http://clarif.ai/)
● Google Vision API (https://cloud.google.com/vision/)
● …
● + all universal libraries
Libraries & Frameworks for speech
● Microsoft Cognitive Toolkit (CNTK) (http://www.cntk.ai/) [Python, C++, C#,
BrainScript]
● KALDI (http://kaldi-asr.org/) [C++]
● Google Speech API (https://cloud.google.com/)
● Yandex SpeechKit (https://tech.yandex.ru/speechkit/)
● Baidu Speech API (http://www.baidu.com/)
● wit.ai (https://wit.ai/)
● …
Libraries & Frameworks for text processing
● Torch7 (http://torch.ch/)
● Theano/Keras/…
● TensorFlow (https://www.tensorflow.org/)
● Google Translate API (https://cloud.google.com/translate/)
● Salesforce Einstein
(https://www.salesforce.com/products/einstein/overview/)
● Machine Translation Benchmark (July 2017)
(https://www.slideshare.net/KonstantinSavenkov/intento-machine-translation-benchmark-july-2017)
● Intent Detection Benchmark (August 2017)
(https://www.slideshare.net/KonstantinSavenkov/nlu-intent-detection-benchmark-by-intento-august-2017)
● ...
What to read and where to study?
- CS231n: Convolutional Neural Networks for Visual Recognition, Fei-Fei
Li, Andrej Karpathy, Stanford
(http://vision.stanford.edu/teaching/cs231n/index.html)
- CS224d: Deep Learning for Natural Language Processing, Richard
Socher, Stanford (http://cs224d.stanford.edu/index.html)
- Neural Networks for Machine Learning, Geoffrey Hinton
(https://www.coursera.org/course/neuralnets)
- Computer Vision course collection
(http://eclass.cc/courselists/111_computer_vision_and_navigation)
- Deep learning course collection
(http://eclass.cc/courselists/117_deep_learning)
- Book “Deep Learning”, Ian Goodfellow, Yoshua Bengio and Aaron Courville
(http://www.deeplearningbook.org/)
What to read and where to study?
- Google+ Deep Learning community
(https://plus.google.com/communities/112866381580457264725)
- VK Deep Learning community (http://vk.com/deeplearning)
- Quora (https://www.quora.com/topic/Deep-Learning)
- FB Deep Learning Moscow
(https://www.facebook.com/groups/1505369016451458/)
- Twitter Deep Learning Hub (https://twitter.com/DeepLearningHub)
- NVidia blog (https://devblogs.nvidia.com/parallelforall/tag/deep-learning/)
- IEEE Spectrum blog (http://spectrum.ieee.org/blog/cars-that-think)
- http://deeplearning.net/
- Arxiv Sanity Preserver http://www.arxiv-sanity.com/
- ...
Whom to follow?
- Jürgen Schmidhuber (http://people.idsia.ch/~juergen/)
- Geoffrey E. Hinton (http://www.cs.toronto.edu/~hinton/)
- Google DeepMind (http://deepmind.com/)
- Yann LeCun (http://yann.lecun.com, https://www.facebook.com/yann.lecun)
- Yoshua Bengio (http://www.iro.umontreal.ca/~bengioy,
https://www.quora.com/profile/Yoshua-Bengio)
- Andrej Karpathy (http://karpathy.github.io/)
- Andrew Ng (http://www.andrewng.org/)
- ...
[Bonus] Hardware
Hardware: Overview
Serious problems with the current processors are:
● energy efficiency (DeepMind used 1,202 CPUs and 176 GPUs)
● architecture (not well-suited for brain-like computations)
Computing power is growing
● Google TPU gen.2
○ 180 TFLOPS?
● NVIDIA DGX-1 ($129,000)
○ 170 TFLOPS (FP16)
○ 85 TFLOPS (FP32)
● NVIDIA Tesla V100/P100
○ 15/10.6 TFLOPS
○ 120 TFLOPS on V100 Tensor Core units
● NVIDIA GTX Titan X (Pascal [new] / Maxwell [old]) ($1000)
○ 11/6.1 TFLOPS (FP32)
● NVIDIA GTX 1080/1080 Ti ($700)
○ 8/11.3 TFLOPS (FP32)
● NVIDIA Drive PX-2 / PX
○ 8.0/2.3 TFLOPS
Mobile AI: Apple
(Sep 23, 2017) Inside iPhone 8: Apple's A11 Bionic introduces 5 new custom silicon engines
“Creating an entirely new GPU architecture "wasn't innovative enough," so A11 Bionic also
features an entirely new Neural Engine within its Image Signal Processor, tuned to solve very specific
problems such as matching, analyzing and calculating thousands of reference
points within a flood of image data rushing from the camera sensor.
Those tasks could be sent to the GPU, but having logic optimized specifically for
matrix multiplications and floating-point processing allows the Neural Engine to
excel at those tasks.”
http://appleinsider.com/articles/17/09/23/inside-iphone-8-apples-a11-bionic-introduces-5-new-custom-silicon-engines
Mobile AI: Qualcomm
(Aug 16, 2017) We are making on-device AI ubiquitous
“In fact, the Hexagon DSP with Qualcomm Hexagon Vector
eXtensions on Snapdragon 835 has been shown to offer a
25X improvement in energy efficiency and an 8X
improvement in performance when compared against running the same
workloads (GoogleNet Inception Network) on the Qualcomm Kryo CPU.
We have introduced the Snapdragon Neural Processing Engine (NPE) Software
Developer Kit (SDK). This features an accelerated runtime for on-device execution
of convolutional neural networks (CNN) and recurrent neural networks (RNN) —
which are great for tasks like image recognition and natural language processing,
respectively”
https://www.qualcomm.com/news/onq/2017/08/16/we-are-making-device-ai-ubiquitous
FPGA/ASIC
● FPGA (field-programmable gate array) is an integrated circuit designed to be
configured by a customer or a designer after manufacturing
● ASIC (application-specific integrated circuit) is an integrated circuit customized
for a particular use, rather than intended for general-purpose use.
● Both FPGAs and ASICs are usually much more energy-efficient than general
purpose processors (so more productive with respect to GFLOPS per Watt).
● OpenCL can serve as a development language for FPGAs, and more ML/DL
libraries are supporting OpenCL too (for example, Caffe). So an easy
way to do ML on FPGAs should appear.
● Bitcoin mining is another heavy-lifting task which passed from CPUs
through GPUs to FPGAs and finally ASICs. History could repeat itself with
deep learning.
FPGA/ASIC custom chips
There is a lot of movement to FPGA/ASIC right now:
● Mobileye chips with specially developed ASIC cores are used in BMW, Tesla, Volvo, etc.
● Microsoft develops Project Catapult that uses clusters of FPGAs
https://blogs.msdn.microsoft.com/msr_er/2015/11/12/project-catapult-servers-available-to-academic-researchers/
● Baidu tries to use FPGAs for DL
http://www.hotchips.org/wp-content/uploads/hc_archives/hc26/HC26-12-day2-epub/HC26.12-5-FPGAs-epub/HC26.12.545-Soft-Def-Acc-Ouyang-baidu-v3--baidu-v4.pdf
● Altera (one of the FPGA monsters) was acquired by Intel in 2015. Intel is working on a
hybrid Xeon+FPGA chip
http://www.nextplatform.com/2016/03/14/intel-marrying-fpga-beefy-broadwell-open-compute-future/
● Nervana plans to make a special chip to make machine learning faster (acquired by Intel)
http://www.eetimes.com/document.asp?doc_id=1328523&
● Movidius (acquired by Intel) Myriad X VPU - a dedicated hardware accelerator for deep
neural network inferences.
https://www.movidius.com/myriadx
ASIC: Google TPU
● (May 18, 2016) Google announced
Tensor Processing Unit (TPU)
○ a custom ASIC built specifically for machine learning
— and tailored for TensorFlow
○ Has been running TPUs inside Google’s data centers
for more than a year.
○ Server racks with TPUs used in the AlphaGo matches
with Lee Sedol
https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html
https://cloudplatform.googleblog.com/2017/04/quantifying-the-performance-of-the-TPU-our-first-machine-learning-chip.html
ASIC: Google TPU gen.2
● (May 17, 2017) Build and train machine learning models
on our new Google Cloud TPUs
○ Second generation of a custom ASIC built specifically for machine
learning
○ Now supports training, not only inference
○ Up to 180 teraflops of floating-point performance
https://blog.google/topics/google-cloud/google-cloud-offer-tpus-machine-learning/
https://cloud.google.com/tpu/
A “TPU pod” built with 64 second-generation TPUs delivers up to
11.5 petaflops of machine learning acceleration.
https://cloud.google.com/tpu/
FPGA: Intel DLIA
(Nov 15, 2016) Intel Unveils FPGA to Accelerate
Neural Networks
The Intel Deep Learning Inference Accelerator
(DLIA) combines traditional Intel CPUs with field
programmable gate arrays (FPGAs), semiconductors
that can be reprogrammed to perform specialized
computing tasks. FPGAs allow users to tailor compute
power to specific workloads or applications.
http://datacenterfrontier.com/intel-unveils-fpga-to-accelerate-ai-neural-networks/
ASIC: Intel Knights Mill
(Aug 24, 2017) Intel Spills Details on Knights Mill
Processor
Knights Mill, a Xeon Phi processor tweaked for machine
learning applications.
Knights Mill represents the chipmaker’s first Xeon Phi offering aimed exclusively
at the machine learning market, specifically for the training of deep neural
networks. For the inferencing side of deep learning, Intel points to its
Altera-based FPGA products, which are being used extensively by Microsoft in its
Azure cloud.
Knights Mill is scheduled for launch in Q4 of this year.
https://www.top500.org/news/intel-spills-details-on-knights-mill-processor/
ASIC: Intel Nervana NNP
(Oct 17, 2017) Announcing Industry’s First
Neural Network Processor
Intel will ship the industry’s first silicon for neural
network processing, the Intel® Nervana™
Neural Network Processor (NNP), before the end
of this year (ex-Lake Crest processor).
● New memory architecture designed for maximizing utilization of silicon
computation
● Massive bi-directional data transfer to achieve true model parallelism where
neural network parameters are distributed across multiple chips.
● A new numeric format called Flexpoint
https://newsroom.intel.com/editorials/intel-pioneers-new-technologies-advance-artificial-intelligence/
Neuromorphic chips
● DARPA SyNAPSE program (Systems of Neuromorphic Adaptive Plastic
Scalable Electronics)
● IBM TrueNorth; Stanford Neurogrid; HRL neuromorphic chip; Human Brain
Project SpiNNaker and HICANN; Qualcomm.
https://www.technologyreview.com/s/526506/neuromorphic-chips/
http://www.eetimes.com/document.asp?doc_id=1327791
Neuromorphic chips: Snapdragon 820
Over the years, Qualcomm’s primary focus has been making mobile
processors for smartphones and tablets, but the company is now trying
to expand into other areas, including chips for automobiles and
robots. The company is also marketing the Kryo as its
neuromorphic, cognitive computing platform Zeroth.
http://www.extremetech.com/computing/200090-qualcomms-cognitive-compute-processors-are-coming-to-snapdragon-820
Neuromorphic chips: IBM TrueNorth
● 1M neurons, 256M synapses, 4096 neurosynaptic
cores on a chip, est. 46B synaptic ops per sec per W
● Uses 70mW; power density is 20 milliwatts per
cm^2 — almost 1/10,000th the power of most modern
microprocessors
● “Our sights are now set high on the ambitious goal of
integrating 4,096 chips in a single rack with 4B neurons and 1T synapses while
consuming ~4kW of power”.
● Currently IBM is making plans to commercialize it.
● (2016) Lawrence Livermore National Lab got a cluster of 16 TrueNorth chips
(16M neurons, 4B synapses, for context, the human brain has 86B neurons).
When running flat out, the entire cluster will consume a grand total of 2.5 watts.
http://spectrum.ieee.org/tech-talk/computing/hardware/ibms-braininspired-computer-chip-comes-from-the-future
Neuromorphic chips: IBM TrueNorth
● (03.2016) IBM Research demonstrated convolutional neural nets with close to
state of the art performance:
“Convolutional Networks for Fast, Energy-Efficient Neuromorphic Computing”, http://arxiv.org/abs/1603.08270
Neuromorphic chips: Intel Loihi
(Sep 25, 2017) As part of an effort within Intel Labs, Intel has
developed a first-of-its-kind self-learning neuromorphic chip –
codenamed Loihi – that mimics how the brain functions by
learning to operate based on various modes of feedback from the
environment. This extremely energy-efficient chip, which uses the
data to learn and make inferences, gets smarter over time and
does not need to be trained in the traditional way. It takes a novel
approach to computing via asynchronous spiking.
It is up to 1,000 times more energy-efficient than general purpose
computing required for typical training systems.
In the first half of 2018, the Loihi test chip will be shared with
leading university and research institutions with a focus on
advancing AI.
https://newsroom.intel.com/editorials/intels-new-self-learning-chip-promises-accelerate-artificial-intelligence/
Neuromorphic chips: Intel Loihi
● Fully asynchronous neuromorphic many core mesh that
supports a wide range of sparse, hierarchical and recurrent
neural network topologies
● Each neuromorphic core includes a learning engine that can
be programmed to adapt network parameters during
operation, supporting supervised, unsupervised,
reinforcement and other learning paradigms.
● Fabrication on Intel’s 14 nm process technology.
● A total of 130,000 neurons and 130 million synapses.
● Development and testing of several algorithms with high
algorithmic efficiency for problems including path planning,
constraint satisfaction, sparse coding, dictionary learning,
and dynamic pattern learning and adaptation.
https://newsroom.intel.com/editorials/intels-new-self-learning-chip-promises-accelerate-artificial-intelligence/
Memristors
● Neuromorphic chips generally use the same silicon transistors and digital
circuits that make up ordinary computer processors. There is another way to
build brain inspired chips.
https://www.technologyreview.com/s/537211/a-better-way-to-build-brain-inspired-chips/
● Memristors (memory resistors) are exotic electronic devices only confirmed to exist
in 2008. A memristor's electrical resistance is not constant but depends on
the history of current that has previously flowed through the device, i.e. the
device remembers its history. An analog memory device.
● Some startups are trying to make special chips for low-power machine learning, e.g.
Knowm
http://www.forbes.com/sites/alexknapp/2015/09/09/this-startup-has-a-brain-inspired-chip-for-machine-learning/#5007095d51a2
http://www.eetimes.com/document.asp?doc_id=1327068
https://www.technologyreview.com/s/603495/10-breakthrough-technologies-2017-practical-quantum-computers/
Quantum Computing: D-Wave
● May 2013 Google teamed with NASA and launched Quantum AI Lab, equipped
with a quantum computer from D-Wave Systems (D-Wave 2, 512 qubits).
● Aug 2015 D-Wave announced D-Wave 2X (1000+ qubits)
● Actually D-Wave computers are not full quantum computers.
Quantum Computing: D-Wave
● (May 2013)
“We’ve already developed some quantum machine learning algorithms. One
produces very compact, efficient recognizers -- very useful when you’re short
on power, as on a mobile device. Another can handle highly polluted training
data, where a high percentage of the examples are mislabeled, as they often
are in the real world. And we’ve learned some useful principles: e.g., you get
the best results not with pure quantum computing, but by mixing quantum and
classical computing.”
https://research.googleblog.com/2013/05/launching-quantum-artificial.html
Quantum Computing: D-Wave
● (Jun 2014) Yet results on the D-Wave 2 computer seem controversial:
“Using random spin glass instances as a benchmark, we find no evidence of
quantum speedup when the entire data set is considered, and obtain
inconclusive results when comparing subsets of instances on an
instance-by-instance basis. Our results do not rule out the possibility of
speedup for other classes of problems and illustrate the subtle nature of the
quantum speedup question.”
http://science.sciencemag.org/content/early/2014/06/18/science.1252319
Quantum Computing: D-Wave
● (Dec 2015)
“We found that for problem instances involving nearly 1000 binary variables,
quantum annealing significantly outperforms its classical counterpart, simulated
annealing. It is more than 10^8 times faster than simulated annealing running on a
single core. We also compared the quantum hardware to another algorithm
called Quantum Monte Carlo. This is a method designed to emulate the behavior
of quantum systems, but it runs on conventional processors. While the scaling
with size between these two methods is comparable, they are again separated
by a large factor sometimes as high as 10^8.”
https://research.googleblog.com/2015/12/when-can-quantum-annealing-win.html
Quantum Computing: Google
● (Jul 2016)
“We have performed the first completely scalable quantum simulation of a
molecule…
In our experiment, we focus on an approach known as the variational quantum
eigensolver (VQE), which can be understood as a quantum analog of a
neural network. The quantum advantage of VQE is that quantum bits can
efficiently represent the molecular wavefunction whereas exponentially many
classical bits would be required. Using VQE, we quantum computed the energy
landscape of molecular hydrogen, H2.”
https://research.googleblog.com/2016/07/towards-exact-quantum-description-of.html
Quantum Computing: Google
(May 2017) Google Plans to Demonstrate the Supremacy
of Quantum Computing
“Google’s quantum computing chip is a 2-by-3 array of qubits.
The company hopes to make a 7-by-7 array later this year.
By the end of this year, the team aims to increase the number of superconducting
qubits it builds on integrated circuits to create a 7-by-7 array. With this quantum IC,
the Google researchers aim to perform operations at the edge of what’s possible with
even the best supercomputers, and so demonstrate “quantum supremacy.””
https://spectrum.ieee.org/computing/hardware/google-plans-to-demonstrate-the-supremacy-of-quantum-computing
Quantum Computing: IBM
(Sep 13, 2017) IBM Makes Breakthrough in Race to
Commercialize Quantum Computers
“IBM has been pushing to commercialize quantum computers and
recently began allowing anyone to experiment with running
calculations on a 16-qubit quantum computer it has built to
demonstrate the technology.”
https://www.bloomberg.com/news/articles/2017-09-13/ibm-makes-breakthrough-in-race-to-commercialize-quantum-computers
“IBM announced on May 17, 2017 that it has successfully built and tested its most powerful universal quantum
computing processors. Its upgraded 16 qubit processor (pictured) will be available for use by developers,
researchers, and programmers to explore quantum computing using a real quantum processor at no cost via
the IBM Cloud. IBM first opened public access to its quantum processors one year ago, to serve as an
enablement tool for scientific research, a resource for university classrooms, and a catalyst of enthusiasm for
the field. To date users have run more than 300,000 quantum experiments on the IBM Cloud”
https://phys.org/news/2017-05-ibm-powerful-universal-quantum-processors.html
Quantum Computing: Intel
(Oct 10, 2017) Quantum Inside: Intel Manufactures
an Exotic New Chip
“Intel’s quantum chip uses superconducting qubits.
The approach builds on an existing electrical circuit
design but uses a fundamentally different electronic phenomenon that only works at
very low temperatures. The chip, which can handle 17 qubits, was developed over
the past 18 months by researchers at a lab in Oregon and is being manufactured at
an Intel facility in Arizona.”
https://www.technologyreview.com/s/609094/quantum-inside-intel-manufactures-an-exotic-new-chip/
https://newsroom.intel.com/news/intel-delivers-17-qubit-superconducting-chip-advanced-packaging-qutech/
Quantum Computing
● Quantum computers can provide significant speedups for many problems in
machine learning (training of classical Boltzmann machines, Quantum Bayesian
inference, SVM, PCA, Linear algebra, etc) and can enable fundamentally
different types of learning.
https://www.youtube.com/watch?v=ETJcALOplOA
● The three known types of quantum computing:
○ Universal Quantum: offers the potential to be exponentially faster than traditional computers for
a number of important applications: Machine Learning, Cryptography, Material Science, etc. The
hardest to build. Current estimates: >100,000 physical qubits.
○ Analog Quantum: will be able to simulate complex quantum interactions that are intractable for
any known conventional machine: Quantum Chemistry, Quantum Dynamics, etc. Could happen
within the next 5 years. It is conjectured that it will contain 50-100 physical qubits.
○ Quantum Annealer: a very specialized form of quantum computing, suited for optimization
problems. The easiest to build, but it has no known advantages over conventional computing.
http://www.research.ibm.com/quantum/expertise.html
Hardware: Summary
● Ordinary CPUs are general-purpose and not as effective as they could be
● GPUs are becoming more and more powerful each year (but still consume a
lot of power).
● ASICs/FPGAs are on the rise. We’ve already seen some and will probably
see even more interesting announcements this year.
● Neuromorphic chips etc. are probably much farther from the market (3-5
years?) while already showing interesting results.
● Memristors are probably even farther out, but keep an eye on them.
● Quantum computing: still unclear. It will probably come as cloud solutions, not
desktop ones.
https://ru.linkedin.com/in/grigorysapunov
gs@inten.to
Thanks!

Más contenido relacionado

Más de Grigory Sapunov

Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware LandscapeGrigory Sapunov
 
Modern neural net architectures - Year 2019 version
Modern neural net architectures - Year 2019 versionModern neural net architectures - Year 2019 version
Modern neural net architectures - Year 2019 versionGrigory Sapunov
 
AI - Last Year Progress (2018-2019)
AI - Last Year Progress (2018-2019)AI - Last Year Progress (2018-2019)
AI - Last Year Progress (2018-2019)Grigory Sapunov
 
Практический подход к выбору доменно-адаптивного NMT​
Практический подход к выбору доменно-адаптивного NMT​Практический подход к выбору доменно-адаптивного NMT​
Практический подход к выбору доменно-адаптивного NMT​Grigory Sapunov
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Grigory Sapunov
 
Sequence learning and modern RNNs
Sequence learning and modern RNNsSequence learning and modern RNNs
Sequence learning and modern RNNsGrigory Sapunov
 
Введение в Deep Learning
Введение в Deep LearningВведение в Deep Learning
Введение в Deep LearningGrigory Sapunov
 
Введение в машинное обучение
Введение в машинное обучениеВведение в машинное обучение
Введение в машинное обучениеGrigory Sapunov
 
Введение в архитектуры нейронных сетей / HighLoad++ 2016
Введение в архитектуры нейронных сетей / HighLoad++ 2016Введение в архитектуры нейронных сетей / HighLoad++ 2016
Введение в архитектуры нейронных сетей / HighLoad++ 2016Grigory Sapunov
 
Artificial Intelligence - Past, Present and Future
Artificial Intelligence - Past, Present and FutureArtificial Intelligence - Past, Present and Future
Artificial Intelligence - Past, Present and FutureGrigory Sapunov
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Grigory Sapunov
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingGrigory Sapunov
 
Computer Vision and Deep Learning
Computer Vision and Deep LearningComputer Vision and Deep Learning
Computer Vision and Deep LearningGrigory Sapunov
 
Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...
Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...
Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...Grigory Sapunov
 
Как (не) надо делать MOOC
Как (не) надо делать MOOCКак (не) надо делать MOOC
Как (не) надо делать MOOCGrigory Sapunov
 
Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014
Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014
Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014Grigory Sapunov
 

Más de Grigory Sapunov (20)

Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware Landscape
 
Modern neural net architectures - Year 2019 version
Modern neural net architectures - Year 2019 versionModern neural net architectures - Year 2019 version
Modern neural net architectures - Year 2019 version
 
AI - Last Year Progress (2018-2019)
AI - Last Year Progress (2018-2019)AI - Last Year Progress (2018-2019)
AI - Last Year Progress (2018-2019)
 
Практический подход к выбору доменно-адаптивного NMT​
Практический подход к выбору доменно-адаптивного NMT​Практический подход к выбору доменно-адаптивного NMT​
Практический подход к выбору доменно-адаптивного NMT​
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018
 
Sequence learning and modern RNNs
Sequence learning and modern RNNsSequence learning and modern RNNs
Sequence learning and modern RNNs
 
Введение в Deep Learning
Введение в Deep LearningВведение в Deep Learning
Введение в Deep Learning
 
Введение в машинное обучение
Введение в машинное обучениеВведение в машинное обучение
Введение в машинное обучение
 
Введение в архитектуры нейронных сетей / HighLoad++ 2016
Введение в архитектуры нейронных сетей / HighLoad++ 2016Введение в архитектуры нейронных сетей / HighLoad++ 2016
Введение в архитектуры нейронных сетей / HighLoad++ 2016
 
Artificial Intelligence - Past, Present and Future
Artificial Intelligence - Past, Present and FutureArtificial Intelligence - Past, Present and Future
Artificial Intelligence - Past, Present and Future
 
Multidimensional RNN
Multidimensional RNNMultidimensional RNN
Multidimensional RNN
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image Processing
 
Computer Vision and Deep Learning
Computer Vision and Deep LearningComputer Vision and Deep Learning
Computer Vision and Deep Learning
 
Apache Spark & MLlib
Apache Spark & MLlibApache Spark & MLlib
Apache Spark & MLlib
 
Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...
Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...
Международная научно-практическая конференция учителей / Яндекс, МФТИ / 05.12...
 
EdCrunch
EdCrunchEdCrunch
EdCrunch
 
Как (не) надо делать MOOC
Как (не) надо делать MOOCКак (не) надо делать MOOC
Как (не) надо делать MOOC
 
MOOCs 101
MOOCs 101MOOCs 101
MOOCs 101
 
Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014
Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014
Lifelong Learning @TEDxPokrovkaSt / Moscow / 15.05.2014
 

Último

UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 

Último (20)

UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 

Deep learning cases - Founders Institute/Moscow - 2017.10.19

  • 1. Deep Learning Cases: Image, Text and Control Grigory Sapunov Founders Institute Moscow 19.10.2017 gs@inten.to
  • 2. AI/ML/DL ● Artificial Intelligence (AI) is a broad field of study dedicated to complex problem solving. ● Machine Learning (ML) is usually considered as a subfield of AI. ML is a data-driven approach focused on creating algorithms that has the ability to learn from the data without being explicitly programmed. ● Deep Learning (DL) is a subfield of ML focused on deep neural networks (NN) able to automatically learn hierarchical representations.
  • 3. “Simple” Image & Video Processing
  • 4. Typical tasks for CNNs https://research.facebook.com/blog/learning-to-segment/ Detection task is harder than classification, but both are almost done. And with better-than-human quality.
  • 5. Super-human recognition ● Blue: Traditional CV ● Purple: Deep Learning ● Red: Human
  • 6. Case #1: IJCNN 2011 The German Traffic Sign Recognition Benchmark ● Classification, >40 classes ● >50,000 real-life images ● First Superhuman Visual Pattern Recognition ○ 2x better than humans ○ 3x better than the closest artificial competitor ○ 6x better than the best non-neural method Method Correct (Error) 1 Committee of CNNs 99.46 % (0.54%) 2 Human Performance 98.84 % (1.16%) 3 Multi-Scale CNNs 98.31 % (1.69%) 4 Random Forests 96.14 % (3.86%) http://people.idsia.ch/~juergen/superhumanpatternrecognition.html
  • 7. Case #2: ILSVRC 2010-2015 Large Scale Visual Recognition Challenge (ILSVRC) ● Object detection (200 categories, ~0.5M images) ● Classification + localization (1000 categories, 1.2M images)
  • 9. Example: Face Detection + Emotion Classification
  • 10. Example: Face Detection + Classification + Regression
  • 15. Examples: Road Sign Recognition (on mobile!)
  • 16. More complex Image & Video Processing
  • 17. https://www.youtube.com/watch?v=ZJMtDRbqH40 NYU Semantic Segmentation with a Convolutional Network (33 categories) Semantic Segmentation
  • 21. More Fun: Neural Style http://www.boredpanda.com/inceptionism-neural-network-deep-dream-art/
  • 22. More Fun: Neural Doodle http://arxiv.org/abs/1603.01768 Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks (a) Original painting by Renoir, (b) semantic annotations, (c) desired layout, (d) generated output.
  • 23. More Fun: Photo-realistic Style Transfer https://arxiv.org/abs/1703.07511 Deep Photo Style Transfer
  • 24. More Fun: Photo-realistic Style Transfer https://arxiv.org/abs/1703.07511 Deep Photo Style Transfer
  • 25. Generative Adversarial Networks (GANs) Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks https://arxiv.org/abs/1511.06434
  • 26. Generative Adversarial Networks (GANs) http://www.evolvingai.org/ppgn
  • 28. Deep Learning and NLP Variety of tasks: ● Finding synonyms ● Fact extraction: people and company names, geography, prices, dates, product names, … ● Classification: genre and topic detection, positive/negative sentiment analysis, authorship detection, … ● Machine translation ● Search (written and spoken) ● Question answering ● Dialog systems ● Language modeling, Part of speech recognition
  • 29. https://code.google.com/archive/p/word2vec/ Example: Semantic Spaces (word2vec, GloVe) vector('king') - vector('man') + vector('woman') = vector('queen')
  • 31. Encoding semantics Using word2vec instead of word indexes allows you to better deal with the word meanings (e.g. no need to enumerate all synonyms because their vectors are already close to each other). But the naive way to work with word2vec vectors still gives you a “bag of words” model, where phrases “The man killed the tiger” and “The tiger killed the man” are equal. Need models which pay attention to the word ordering: paragraph2vec, sentence embeddings (using RNN/LSTM), even World2Vec (LeCunn @CVPR2015).
  • 32. Multi-modal learning http://arxiv.org/abs/1411.2539 Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
  • 35. Caption Generation http://arxiv.org/abs/1411.4555 “Show and Tell: A Neural Image Caption Generator”
  • 37. Example: NeuralTalk and Walk Ingredients: ● https://github.com/karpathy/neuraltalk2 Project for learning Multimodal Recurrent Neural Networks that describe images with sentences ● Webcam/notebook Result: ● https://vimeo.com/146492001
  • 39. Product of the near future: DenseCap and ? http://arxiv.org/abs/1511.07571 DenseCap: Fully Convolutional Localization Networks for Dense Captioning
  • 40. Example: Image generation by text StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, https://arxiv.org/abs/1612.03242
  • 42. Case: Sentiment analysis http://nlp.stanford.edu/sentiment/ Can capture complex cases where bag-of-words models fail. “This movie was actually neither that funny, nor super witty.”
  • 43. Case: Sentiment analysis https://blog.openai.com/unsupervised-sentiment-neuron/ “Our research implies that simply training large unsupervised next-step-prediction models on large amounts of data may be a good approach to use when creating systems with good representation learning capabilities.”
  • 44. Case: Machine Translation Sequence to Sequence Learning with Neural Networks, http://arxiv.org/abs/1409.3215
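The core idea of the cited paper is an encoder-decoder pair: one LSTM compresses the source sentence into a fixed state, another generates the target sentence from that state. A compact Keras sketch of this idea (vocabulary sizes and dimensions are illustrative assumptions, not the paper's exact setup):

```python
# Sequence-to-sequence sketch: encoder LSTM -> state -> decoder LSTM.
from tensorflow.keras import layers, models

src = layers.Input(shape=(None,), name="source_tokens")
tgt = layers.Input(shape=(None,), name="target_tokens")

enc_emb = layers.Embedding(30000, 256)(src)
_, state_h, state_c = layers.LSTM(512, return_state=True)(enc_emb)  # sentence -> state

dec_emb = layers.Embedding(30000, 256)(tgt)
dec_out = layers.LSTM(512, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])       # state -> target sequence
probs = layers.Dense(30000, activation="softmax")(dec_out)  # next-word distribution

model = models.Model([src, tgt], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```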
  • 45. Case: Automated Speech Translation Translating voice calls and video calls in 8 languages and instant messages in over 50. https://www.skype.com/en/features/skype-translator/
  • 46. Speech Recognition: Word Error Rate (WER) ● Google: “Google now has just an 8 percent error rate. Compare that to 23 percent in 2013” (2015) http://venturebeat.com/2015/05/28/google-says-its-speech-recognition-technology-now-has-only-an-8-word-error-rate/ ● IBM Watson: “The performance of our new system – an 8% word error rate – is 36% better than previously reported external results.” (2015) https://developer.ibm.com/watson/blog/2015/05/26/ibm-watson-announces-breakthrough-in-conversational-speech-transcription/ ● Baidu: “We are able to reduce error rates of our previous end-to-end system in English by up to 43%, and can also recognize Mandarin speech with high accuracy. Creating high-performing recognizers for two very different languages, English and Mandarin, required essentially no expert knowledge of the languages” (2015) http://arxiv.org/abs/1512.02595
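For reference, WER is the word-level edit distance between the recognizer's output and the reference transcript, normalized by the reference length: WER = (substitutions + deletions + insertions) / N. A small self-contained sketch:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed as a word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[-1][-1] / len(ref)

print(word_error_rate("the cat sat", "the cat sat down"))  # 0.33: one insertion
```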
  • 47. Example: Baidu Deep Speech 2 (2015) ● “The Deep Speech 2 ASR pipeline approaches or exceeds the accuracy of Amazon Mechanical Turk human workers on several benchmarks, works in multiple languages with little modification, and is deployable in a production setting.” ● “Table 13 shows that the DS2 system outperforms humans in 3 out of the 4 test sets and is competitive on the fourth. Given this result, we suspect that there is little room for a generic speech system to further improve on clean read speech without further domain adaptation” Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, http://arxiv.org/abs/1512.02595
  • 48. Case: Baidu Automated Speech Recognition (ASR)
  • 49. More Fun: MtG cards http://www.escapistmagazine.com/articles/view/scienceandtech/14276-Magic-The-Gathering-Cards-Made-by-Artificial-Intelligence
  • 50. Case: Review generation Automated Crowdturfing Attacks and Defenses in Online Review Systems, https://arxiv.org/abs/1708.08151
  • 51. Case: Question Answering A Neural Network for Factoid Question Answering over Paragraphs, https://cs.umd.edu/~miyyer/qblearn/
  • 52. Case: Dialogue Systems A Neural Conversational Model, Oriol Vinyals, Quoc Le http://arxiv.org/abs/1506.05869
  • 53. What for: Conversational Commerce https://medium.com/chris-messina/2016-will-be-the-year-of-conversational-commerce-1586e85e3991
  • 56. Reinforcement Learning Simulated race car control (2013) http://people.idsia.ch/~juergen/gecco2013torcs.pdf http://people.idsia.ch/~juergen/compressednetworksearch.html
  • 58. Reinforcement Learning Human-level control through deep reinforcement learning (2015) http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html Playing Atari with Deep Reinforcement Learning (2013) http://arxiv.org/abs/1312.5602
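At the heart of these papers is Q-learning with a neural network: the network predicts action values Q(s, a) and is regressed toward the target y = r + γ · max over a' of Q(s', a'). A toy sketch of that update (the network size and 8-float state are illustrative assumptions, not the Atari setup):

```python
# Toy sketch of the Q-learning update used in DQN (Mnih et al., 2013):
# y = r + gamma * max_a' Q(s', a').
import numpy as np
from tensorflow.keras import layers, models

n_actions = 4
q_net = models.Sequential([
    layers.Dense(32, activation="relu", input_shape=(8,)),  # toy 8-float state
    layers.Dense(n_actions),                                # one Q-value per action
])
q_net.compile(optimizer="adam", loss="mse")

def dqn_target(reward, next_state, gamma=0.99, done=False):
    if done:
        return reward
    return reward + gamma * np.max(q_net.predict(next_state[None], verbose=0))

# One training step: move Q(s, a) toward the target for the action taken.
state, action, reward, next_state = np.random.rand(8), 2, 1.0, np.random.rand(8)
target_q = q_net.predict(state[None], verbose=0)
target_q[0, action] = dqn_target(reward, next_state)
q_net.fit(state[None], target_q, verbose=0)
```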
  • 60. Game of Go: Computer-Human 4:1
  • 61. AlphaGo in datacenters “We’ve managed to reduce the amount of energy we use for cooling by up to 40 percent.” https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/
  • 62. Drone control http://www.digitaltrends.com/cool-tech/swiss-drone-ai-follows-trails/ This drone can automatically follow forest trails to track down lost hikers
  • 63. Car control Meet the 26-Year-Old Hacker Who Built a Self-Driving Car... in His Garage https://www.youtube.com/watch?v=KTrgRYa2wbI
  • 64. Car driving https://www.youtube.com/watch?v=YuyT2SDcYrU “Actually a “Perception to Action” system. The visual perception and control system is a Deep learning architecture trained end to end to transform pixels from the cameras into steering angles. And this car uses regular color cameras, not LIDARS like the Google cars. It is watching the driver and learns.”
  • 65. Example: Sensorimotor Deep Learning “In this project we aim to develop deep learning techniques that can be deployed on a robot to allow it to learn directly from trial-and-error, where the only information provided by the teacher is the degree to which it is succeeding at the current task.” http://rll.berkeley.edu/deeplearningrobotics/
  • 67. DL/Multi-modal Learning Deep Learning models become multi-modal: they use 2+ modalities simultaneously, e.g.: ● Image caption generation: images + text ● Searching the Web by an image: images + text ● Video description: the same, plus a time dimension ● Visual question answering: images + text ● Speech recognition: audio + video (lip motion) ● Image classification and navigation: RGB-D (color + depth) Where is this heading? ● A common metric space for each concept, a “thought vector”. It will then be easy to match different modalities.
  • 68. DL/Transfer of Ideas Methods developed for one modality are successfully transferred to another: ● Convolutional Neural Networks, CNNs (originally developed for image recognition) work well on text, speech and some time-series signals (e.g. ECG). ● Recurrent Neural Networks, RNNs (mostly used on language and other sequential data) seem to work on images. If the technology successfully transfers from one modality to another (for example, from images to text for CNNs), then the ideas that worked in one domain will probably work in another (style transfer for images could carry over to text).
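The CNN-on-text transfer mentioned above is easy to make concrete: the same convolution idea, applied 1-D over a sequence of word embeddings instead of 2-D over pixels. A minimal Keras sketch (sizes are illustrative assumptions):

```python
# A CNN applied to text: 1-D convolutions over word embeddings.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Embedding(input_dim=20000, output_dim=128),
    layers.Conv1D(filters=64, kernel_size=5, activation="relu"),  # n-gram-like detectors
    layers.GlobalMaxPooling1D(),   # keep the strongest response per filter
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```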
  • 69. Why is Deep Learning helpful? Or even a game-changer? ● Works on raw data (pixels, sound, text or chars), no need for feature engineering ○ Some features are really hard to develop (they require years of work by a group of experts) ○ Some features are patented (e.g. SIFT, SURF for images) ● Allows end-to-end learning (pixels to category, sound to sentence, English sentence to Chinese sentence, etc.) ○ No need to do segmentation, etc. (a lot of manual labor) ⇒ You can iterate faster (and get superior quality at the same time!)
  • 71. Still some issues exist: Datasets ● No dataset -- no deep learning. A lot of data is available (and deep learning requires it; otherwise simpler models may perform better) ○ But sometimes you have no dataset… ■ Nonetheless, some hacks are available: transfer learning (see the sketch below), data augmentation, Mechanical Turk, …
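A minimal sketch of the transfer-learning hack: reuse an ImageNet-pretrained CNN as a frozen feature extractor and train only a small head on your small dataset. The 10-class head and input size are assumptions for illustration:

```python
# Transfer learning sketch: frozen ImageNet CNN + small trainable head.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the pretrained features fixed

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),  # assumed 10 target classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(...) on your small dataset; only the head's weights are updated.
```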
  • 73. Still some issues exist: Computing power ● Requires a lot of computation. Without a cluster or GPU machines, much more time is required ● Currently GPUs (mostly NVIDIA) are effectively the only choice ● Waiting for FPGAs/ASICs to enter this field (Google TPU gen.2, Intel 2017+). The situation resembles the path of Bitcoin mining ● Neuromorphic computing is on the rise (IBM TrueNorth, memristors, etc.) ● Quantum computing can benefit machine learning as well (but it probably won't be a desktop or in-house server solution)
  • 74. Datasets and computing power are growing
  • 75. Computing power is growing ● Google TPU gen.2 ○ 180 TFLOPS? ● NVIDIA DGX-1 ($129,000) ○ 170 TFLOPS (FP16) ○ 85 TFLOPS (FP32) ● NVIDIA Tesla V100/P100 ○ 15/10.6 TFLOPS ○ 120 TFLOPS on V100 Tensor Core units ● NVIDIA GTX Titan X (Pascal [new] / Maxwell [old]) ($1000) ○ 11/6.1 TFLOPS (FP32) ● NVIDIA GTX 1080/1080 Ti ($700) ○ 8/11.3 TFLOPS (FP32) ● NVIDIA Drive PX-2 / PX ○ 8.0/2.3 TFLOPS
  • 76. Deep Learning goes mobile! ● NVIDIA Jetson TK1/TX1/TX2 ○ 192/256/256 CUDA Cores ○ 64/64/128-bit 4/4/6-Core ARM CPU, 2/4/8 GB Mem ● Raspberry Pi 3 ○ 1.2 GHz 64-bit quad-core ARM Cortex-A53, 1 GB SDRAM, US$35 ● Tablets, Smartphones ○ Qualcomm Snapdragon 835, Apple A11 Bionic ● Google Project Tango
  • 77. Still some issues exist: Reasoning Deep learning is mainly about perception, but there is a lot of inference involved in everyday human reasoning. ● Neural networks lack common sense ● Cannot find information by inference ● Cannot explain the answer ○ This could be a must-have requirement in some areas, e.g. law or medicine.
  • 78. Still some issues exist: Reasoning The most fruitful approach is likely to be a hybrid neural-symbolic system, a topic of active research right now. It seems all major players are already going this way (Watson, Siri, Cyc, …). There is a lot of knowledge available (or extractable) in the world: large knowledge bases about the real world (Cyc/OpenCyc, FreeBase, Wikipedia, schema.org, RDF, …, scientific journals + text mining, …)
  • 79. So what to do next?
  • 80. Universal Libraries and Frameworks ● Torch7, PyTorch (http://torch.ch/, http://pytorch.org) [Lua, Python] ● TensorFlow (https://www.tensorflow.org/) [Python, C++] ● Keras (http://keras.io/) [Python] ● Theano (http://deeplearning.net/software/theano/) [Python] ○ Lasagne (https://github.com/Lasagne/Lasagne) ○ blocks (https://github.com/mila-udem/blocks) ○ pylearn2 (https://github.com/lisa-lab/pylearn2) ● Microsoft Cognitive Toolkit (CNTK) (http://www.cntk.ai/) [Python, C++, C#, BrainScript] ● Neon (http://neon.nervanasys.com/) [Python] ● Deeplearning4j (http://deeplearning4j.org/) [Java] ● MXNet (http://mxnet.io/) [C++, Python, R, Scala, Julia, Matlab, Javascript] ● …
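To give a feel for these frameworks, here is a complete "hello world" in Keras, one of the libraries listed above (the MNIST-style 784-float input and 10 classes are illustrative assumptions):

```python
# Minimal Keras example: a small MLP classifier.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,)),  # flattened 28x28 image
    layers.Dense(10, activation="softmax"),                    # 10 classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=5) with arrays shaped (n, 784) and (n,)
```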
  • 81. Libraries & Frameworks for image/video processing ● OpenCV (http://opencv.org/) ● Caffe/Caffe2 (http://caffe.berkeleyvision.org/, https://caffe2.ai/) ● Torch7 (http://torch.ch/) ● clarifai (http://clarif.ai/) ● Google Vision API (https://cloud.google.com/vision/) ● … ● + all universal libraries
  • 82. Libraries & Frameworks for speech ● Microsoft Cognitive Toolkit (CNTK) (http://www.cntk.ai/) [Python, C++, C#, BrainScript] ● KALDI (http://kaldi-asr.org/) [C++] ● Google Speech API (https://cloud.google.com/) ● Yandex SpeechKit (https://tech.yandex.ru/speechkit/) ● Baidu Speech API (http://www.baidu.com/) ● wit.ai (https://wit.ai/) ● …
  • 83. Libraries & Frameworks for text processing ● Torch7 (http://torch.ch/) ● Theano/Keras/… ● TensorFlow (https://www.tensorflow.org/) ● Google Translate API (https://cloud.google.com/translate/) ● Salesforce Einstein (https://www.salesforce.com/products/einstein/overview/) ● Machine Translation Benchmark (July 2017) (https://www.slideshare.net/KonstantinSavenkov/intento-machine-translation-benchmark-july-2017) ● Intent Detection Benchmark (August 2017) (https://www.slideshare.net/KonstantinSavenkov/nlu-intent-detection-benchmark-by-intento-august-2017) ● ...
  • 84. What to read and where to study? - CS231n: Convolutional Neural Networks for Visual Recognition, Fei-Fei Li, Andrej Karpathy, Stanford (http://vision.stanford.edu/teaching/cs231n/index.html) - CS224d: Deep Learning for Natural Language Processing, Richard Socher, Stanford (http://cs224d.stanford.edu/index.html) - Neural Networks for Machine Learning, Geoffrey Hinton (https://www.coursera.org/course/neuralnets) - Computer Vision course collection (http://eclass.cc/courselists/111_computer_vision_and_navigation) - Deep learning course collection (http://eclass.cc/courselists/117_deep_learning) - Book “Deep Learning”, Ian Goodfellow, Yoshua Bengio and Aaron Courville (http://www.deeplearningbook.org/)
  • 85. What to read and where to study? - Google+ Deep Learning community (https://plus.google.com/communities/112866381580457264725) - VK Deep Learning community (http://vk.com/deeplearning) - Quora (https://www.quora.com/topic/Deep-Learning) - FB Deep Learning Moscow (https://www.facebook.com/groups/1505369016451458/) - Twitter Deep Learning Hub (https://twitter.com/DeepLearningHub) - NVidia blog (https://devblogs.nvidia.com/parallelforall/tag/deep-learning/) - IEEE Spectrum blog (http://spectrum.ieee.org/blog/cars-that-think) - http://deeplearning.net/ - Arxiv Sanity Preserver http://www.arxiv-sanity.com/ - ...
  • 86. Whom to follow? - Jürgen Schmidhuber (http://people.idsia.ch/~juergen/) - Geoffrey E. Hinton (http://www.cs.toronto.edu/~hinton/) - Google DeepMind (http://deepmind.com/) - Yann LeCun (http://yann.lecun.com, https://www.facebook.com/yann.lecun) - Yoshua Bengio (http://www.iro.umontreal.ca/~bengioy, https://www.quora.com/profile/Yoshua-Bengio) - Andrej Karpathy (http://karpathy.github.io/) - Andrew Ng (http://www.andrewng.org/) - ...
  • 88. Hardware: Overview Serious problems with the current processors are: ● energy efficiency (DeepMind used 1,202 CPUs and 176 GPUs) ● architecture (not well-suitable for brain-like computations)
  • 89. Computing power is growing ● Google TPU gen.2 ○ 180 TFLOPS? ● NVIDIA DGX-1 ($129,000) ○ 170 TFLOPS (FP16) ○ 85 TFLOPS (FP32) ● NVIDIA Tesla V100/P100 ○ 15/10.6 TFLOPS ○ 120 TFLOPS on V100 Tensor Core units ● NVIDIA GTX Titan X (Pascal [new] / Maxwell [old]) ($1000) ○ 11/6.1 TFLOPS (FP32) ● NVIDIA GTX 1080/1080 Ti ($700) ○ 8/11.3 TFLOPS (FP32) ● NVIDIA Drive PX-2 / PX ○ 8.0/2.3 TFLOPS
  • 90. Mobile AI: Apple (Sep 23, 2017) Inside iPhone 8: Apple's A11 Bionic introduces 5 new custom silicon engines “Creating an entirely new GPU architecture "wasn't innovative enough," so A11 Bionic also features an entirely new Neural Engine within its Image Signal Processor, tuned to solve very specific problems such as matching, analyzing and calculating thousands of reference points within a flood of image data rushing from the camera sensor. Those tasks could be sent to the GPU, but having logic optimized specifically for matrix multiplications and floating-point processing allows the Neural Engine to excel at those tasks.” http://appleinsider.com/articles/17/09/23/inside-iphone-8-apples-a11-bionic-introduces-5-new-custom-silicon-engines
  • 91. Mobile AI: Qualcomm (Aug 16, 2017) We are making on-device AI ubiquitous “In fact, the Hexagon DSP with Qualcomm Hexagon Vector eXtensions on Snapdragon 835 has been shown to offer a 25X improvement in energy efficiency and an 8X improvement in performance when compared against running the same workloads (GoogleNet Inception Network) on the Qualcomm Kryo CPU. We have introduced the Snapdragon Neural Processing Engine (NPE) Software Developer Kit (SDK). This features an accelerated runtime for on-device execution of convolutional neural networks (CNN) and recurrent neural networks (RNN) — which are great for tasks like image recognition and natural language processing, respectively” https://www.qualcomm.com/news/onq/2017/08/16/we-are-making-device-ai-ubiquitous
  • 92. FPGA/ASIC ● FPGA (field-programmable gate array) is an integrated circuit designed to be configured by a customer or a designer after manufacturing ● ASIC (application-specific integrated circuit) is an integrated circuit customized for a particular use, rather than intended for general-purpose use. ● Both FPGAs and ASICs are usually much more energy-efficient than general-purpose processors (so more productive with respect to GFLOPS per watt). ● OpenCL can be the development language for FPGAs, and more ML/DL libraries are gaining OpenCL support (for example, Caffe). So an easy way to do ML on FPGAs should appear. ● Bitcoin mining is another heavy-lifting task that passed from CPU through GPU to FPGA and finally to ASICs. The history could repeat itself with deep learning.
  • 93. FPGA/ASIC custom chips There is a lot of movement to FPGA/ASIC right now: ● Mobileye chips with specially developed ASIC cores are used in BMW, Tesla, Volvo, etc. ● Microsoft develops Project Catapult that uses clusters of FPGAs https://blogs.msdn.microsoft.com/msr_er/2015/11/12/project-catapult-servers-available-to-academic-researchers/ ● Baidu tries to use FPGAs for DL http://www.hotchips.org/wp-content/uploads/hc_archives/hc26/HC26-12-day2-epub/HC26.12-5-FPGAs-epub/HC26.12.545-Soft-Def-Acc-Ouyang-baidu-v3--baidu-v4.pdf ● Altera (one of the FPGA monsters) was acquired by Intel in 2015. Intel is working on a hybrid Xeon+FPGA chip http://www.nextplatform.com/2016/03/14/intel-marrying-fpga-beefy-broadwell-open-compute-future/ ● Nervana plans to make a special chip to make machine learning faster (acquired by Intel) http://www.eetimes.com/document.asp?doc_id=1328523& ● Movidius (acquired by Intel) Myriad X VPU - a dedicated hardware accelerator for deep neural network inferences. https://www.movidius.com/myriadx
  • 94. ASIC: Google TPU ● (May 18, 2016) Google announced Tensor Processing Unit (TPU) ○ a custom ASIC built specifically for machine learning — and tailored for TensorFlow ○ Has been running TPUs inside Google’s data centers for more than a year. ○ Server racks with TPUs used in the AlphaGo matches with Lee Sedol https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html https://cloudplatform.googleblog.com/2017/04/quantifying-the-performance-of-the-TPU-our-first-machine-learning-chip.html
  • 95. ASIC: Google TPU gen.2 ● (May 17, 2017) Build and train machine learning models on our new Google Cloud TPUs ○ Second generation of a custom ASIC built specifically for machine learning ○ Now supports training, not only inference ○ Up to an enormous 180 teraflops of floating-point performance https://blog.google/topics/google-cloud/google-cloud-offer-tpus-machine-learning/ https://cloud.google.com/tpu/
  • 96. A “TPU pod” built with 64 second-generation TPUs delivers up to 11.5 petaflops of machine learning acceleration.
  • 98. FPGA: Intel DLIA (Nov 15, 2016) Intel Unveils FPGA to Accelerate Neural Networks The Intel Deep Learning Inference Accelerator (DLIA) combines traditional Intel CPUs with field programmable gate arrays (FPGAs), semiconductors that can be reprogrammed to perform specialized computing tasks. FPGAs allow users to tailor compute power to specific workloads or applications. http://datacenterfrontier.com/intel-unveils-fpga-to-accelerate-ai-neural-networks/
  • 99. ASIC: Intel Knights Mill (Aug 24, 2017) Intel Spills Details on Knights Mill Processor Knights Mill, a Xeon Phi processor tweaked for machine learning applications. Knights Mill represents the chipmaker’s first Xeon Phi offering aimed exclusively at the machine learning market, specifically for the training of deep neural networks. For the inferencing side of deep learning, Intel points to its Altera-based FPGA products, which are being used extensively by Microsoft in its Azure cloud. Knights Mill is scheduled for launch in Q4 of this year. https://www.top500.org/news/intel-spills-details-on-knights-mill-processor/
  • 100. ASIC: Intel Nervana NNP (Oct 17, 2017) Announcing Industry’s First Neural Network Processor Intel will ship the industry’s first silicon for neural network processing, the Intel® Nervana™ Neural Network Processor (NNP), before the end of this year (ex-Lake Crest processor). ● New memory architecture designed for maximizing utilization of silicon computation ● Massive bi-directional data transfer to achieve true model parallelism where neural network parameters are distributed across multiple chips. ● A new numeric format called Flexpoint https://newsroom.intel.com/editorials/intel-pioneers-new-technologies-advance-artificial-intelligence/
  • 101. Neuromorphic chips ● DARPA SyNAPSE program (Systems of Neuromorphic Adaptive Plastic Scalable Electronics) ● IBM TrueNorth; Stanford Neurogrid; HRL neuromorphic chip; Human Brain Project SpiNNaker and HICANN; Qualcomm. https://www.technologyreview.com/s/526506/neuromorphic-chips/ http://www.eetimes.com/document.asp?doc_id=1327791
  • 102. Neuromorphic chips: Snapdragon 820 Over the years, Qualcomm's primary focus had been making mobile processors for smartphones and tablets, but the company is now trying to expand into other areas, including chips for automobiles and robots. The company is also marketing the Kryo as its neuromorphic, cognitive computing platform Zeroth. http://www.extremetech.com/computing/200090-qualcomms-cognitive-compute-processors-are-coming-to-snapdragon-820
  • 103. Neuromorphic chips: IBM TrueNorth ● 1M neurons, 256M synapses, 4096 neurosynaptic cores on a chip, est. 46B synaptic ops per sec per watt ● Uses 70mW; power density is 20 milliwatts per cm^2, almost 1/10,000th the power of most modern microprocessors ● “Our sights are now set high on the ambitious goal of integrating 4,096 chips in a single rack with 4B neurons and 1T synapses while consuming ~4kW of power”. ● Currently IBM is making plans to commercialize it. ● (2016) Lawrence Livermore National Lab got a cluster of 16 TrueNorth chips (16M neurons, 4B synapses; for context, the human brain has 86B neurons). When running flat out, the entire cluster will consume a grand total of 2.5 watts. http://spectrum.ieee.org/tech-talk/computing/hardware/ibms-braininspired-computer-chip-comes-from-the-future
  • 104. Neuromorphic chips: IBM TrueNorth ● (03.2016) IBM Research demonstrated convolutional neural nets with close to state of the art performance: “Convolutional Networks for Fast, Energy-Efficient Neuromorphic Computing”, http://arxiv.org/abs/1603.08270
  • 105. Neuromorphic chips: Intel Loihi (Sep 25, 2017) As part of an effort within Intel Labs, Intel has developed a first-of-its-kind self-learning neuromorphic chip – codenamed Loihi – that mimics how the brain functions by learning to operate based on various modes of feedback from the environment. This extremely energy-efficient chip, which uses the data to learn and make inferences, gets smarter over time and does not need to be trained in the traditional way. It takes a novel approach to computing via asynchronous spiking. It is up to 1,000 times more energy-efficient than general purpose computing required for typical training systems. In the first half of 2018, the Loihi test chip will be shared with leading university and research institutions with a focus on advancing AI. https://newsroom.intel.com/editorials/intels-new-self-learning-chip-promises-accelerate-artificial-intelligence/
  • 106. Neuromorphic chips: Intel Loihi ● Fully asynchronous neuromorphic many core mesh that supports a wide range of sparse, hierarchical and recurrent neural network topologies ● Each neuromorphic core includes a learning engine that can be programmed to adapt network parameters during operation, supporting supervised, unsupervised, reinforcement and other learning paradigms. ● Fabrication on Intel’s 14 nm process technology. ● A total of 130,000 neurons and 130 million synapses. ● Development and testing of several algorithms with high algorithmic efficiency for problems including path planning, constraint satisfaction, sparse coding, dictionary learning, and dynamic pattern learning and adaptation. https://newsroom.intel.com/editorials/intels-new-self-learning-chip-promises-accelerate-artificial-intelligence/
  • 107. Memristors ● Neuromorphic chips generally use the same silicon transistors and digital circuits that make up ordinary computer processors. There is another way to build brain-inspired chips. https://www.technologyreview.com/s/537211/a-better-way-to-build-brain-inspired-chips/ ● Memristors (memory resistors) are exotic electronic devices only confirmed to exist in 2008. A memristor's electrical resistance is not constant but depends on the history of the current that has previously flowed through the device, i.e. the device remembers its history. An analog memory device. ● Some startups are trying to make special chips for low-power machine learning, e.g. Knowm http://www.forbes.com/sites/alexknapp/2015/09/09/this-startup-has-a-brain-inspired-chip-for-machine-learning/#5007095d51a2 http://www.eetimes.com/document.asp?doc_id=1327068
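The defining property, resistance that depends on the history of the current, is easy to illustrate with a toy state model (loosely inspired by the HP linear-drift memristor model; all constants below are arbitrary assumptions):

```python
# Toy memristor: resistance depends on the history of current through it.
import numpy as np

R_on, R_off = 100.0, 16000.0  # bounding resistances (ohms)
w = 0.5                       # internal state in [0, 1]
k, dt = 100.0, 1.0            # drift coefficient and time step (arbitrary)

def step(current):
    """Pass a current through the device; its resistance changes as a result."""
    global w
    w = np.clip(w + k * current * dt, 0.0, 1.0)  # state integrates the current
    return R_on * w + R_off * (1 - w)            # resistance depends on state

for i in [1e-3] * 5 + [-1e-3] * 5:    # forward, then reverse current
    print(f"R = {step(i):.0f} ohms")  # resistance drifts down, then back up
```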
  • 109. Quantum Computing: D-Wave ● May 2013: Google teamed with NASA and launched the Quantum AI Lab, equipped with a quantum computer from D-Wave Systems (D-Wave 2, 512 qubits). ● Aug 2015: D-Wave announced the D-Wave 2X (1000+ qubits) ● Strictly speaking, D-Wave machines are quantum annealers, not universal quantum computers.
  • 110. Quantum Computing: D-Wave ● (May 2013) “We’ve already developed some quantum machine learning algorithms. One produces very compact, efficient recognizers -- very useful when you’re short on power, as on a mobile device. Another can handle highly polluted training data, where a high percentage of the examples are mislabeled, as they often are in the real world. And we’ve learned some useful principles: e.g., you get the best results not with pure quantum computing, but by mixing quantum and classical computing.” https://research.googleblog.com/2013/05/launching-quantum-artificial.html
  • 111. Quantum Computing: D-Wave ● (Jun 2014) Yet results on the D-Wave 2 computer seem controversial: “Using random spin glass instances as a benchmark, we find no evidence of quantum speedup when the entire data set is considered, and obtain inconclusive results when comparing subsets of instances on an instance-by-instance basis. Our results do not rule out the possibility of speedup for other classes of problems and illustrate the subtle nature of the quantum speedup question.” http://science.sciencemag.org/content/early/2014/06/18/science.1252319
  • 112. Quantum Computing: D-Wave ● (Dec 2015) “We found that for problem instances involving nearly 1000 binary variables, quantum annealing significantly outperforms its classical counterpart, simulated annealing. It is more than 10^8 times faster than simulated annealing running on a single core. We also compared the quantum hardware to another algorithm called Quantum Monte Carlo. This is a method designed to emulate the behavior of quantum systems, but it runs on conventional processors. While the scaling with size between these two methods is comparable, they are again separated by a large factor sometimes as high as 10^8.” https://research.googleblog.com/2015/12/when-can-quantum-annealing-win.html
  • 114. Quantum Computing: Google ● (Jul 2016) “We have performed the first completely scalable quantum simulation of a molecule … In our experiment, we focus on an approach known as the variational quantum eigensolver (VQE), which can be understood as a quantum analog of a neural network. The quantum advantage of VQE is that quantum bits can efficiently represent the molecular wavefunction whereas exponentially many classical bits would be required. Using VQE, we quantum computed the energy landscape of molecular hydrogen, H2.” https://research.googleblog.com/2016/07/towards-exact-quantum-description-of.html
  • 115. Quantum Computing: Google (May 2017) Google Plans to Demonstrate the Supremacy of Quantum Computing “Google’s quantum computing chip is a 2-by-3 array of qubits. The company hopes to make a 7-by-7 array later this year. By the end of this year, the team aims to increase the number of superconducting qubits it builds on integrated circuits to create a 7-by-7 array. With this quantum IC, the Google researchers aim to perform operations at the edge of what’s possible with even the best supercomputers, and so demonstrate “quantum supremacy.”” https://spectrum.ieee.org/computing/hardware/google-plans-to-demonstrate-the-supremacy-of-quantum-computing
  • 116. Quantum Computing: IBM (Sep 13, 2017) IBM Makes Breakthrough in Race to Commercialize Quantum Computers “IBM has been pushing to commercialize quantum computers and recently began allowing anyone to experiment with running calculations on a 16-qubit quantum computer it has built to demonstrate the technology.” https://www.bloomberg.com/news/articles/2017-09-13/ibm-makes-breakthrough-in-race-to-commercialize-quantum-computers “IBM announced on May 17, 2017 that it has successfully built and tested its most powerful universal quantum computing processors. Its upgraded 16 qubit processor will be available for use by developers, researchers, and programmers to explore quantum computing using a real quantum processor at no cost via the IBM Cloud. IBM first opened public access to its quantum processors one year ago, to serve as an enablement tool for scientific research, a resource for university classrooms, and a catalyst of enthusiasm for the field. To date users have run more than 300,000 quantum experiments on the IBM Cloud” https://phys.org/news/2017-05-ibm-powerful-universal-quantum-processors.html
  • 117. Quantum Computing: Intel (Oct 10, 2017) Quantum Inside: Intel Manufactures an Exotic New Chip “Intel’s quantum chip uses superconducting qubits. The approach builds on an existing electrical circuit design but uses a fundamentally different electronic phenomenon that only works at very low temperatures. The chip, which can handle 17 qubits, was developed over the past 18 months by researchers at a lab in Oregon and is being manufactured at an Intel facility in Arizona.” https://www.technologyreview.com/s/609094/quantum-inside-intel-manufactures-an-exotic-new-chip/ https://newsroom.intel.com/news/intel-delivers-17-qubit-superconducting-chip-advanced-packaging-qutech/
  • 118. Quantum Computing ● Quantum computers can provide significant speedups for many problems in machine learning (training of classical Boltzmann machines, quantum Bayesian inference, SVM, PCA, linear algebra, etc.) and can enable fundamentally different types of learning. https://www.youtube.com/watch?v=ETJcALOplOA ● The three known types of quantum computing: ○ Universal Quantum: offers the potential to be exponentially faster than traditional computers for a number of important applications: machine learning, cryptography, material science, etc. The hardest to build. Current estimates: >100,000 physical qubits. ○ Analog Quantum: will be able to simulate complex quantum interactions that are intractable for any known conventional machine: quantum chemistry, quantum dynamics, etc. Could happen within the next 5 years. It is conjectured to contain 50-100 physical qubits. ○ Quantum Annealer: a very specialized form of quantum computing, suited for optimization problems. The easiest to build. Has no known advantages over conventional computing. http://www.research.ibm.com/quantum/expertise.html
  • 119. Hardware: Summary ● Ordinary CPUs are general-purpose and not as effective as they could be ● GPUs are becoming more and more powerful each year (but still consume a lot of power) ● ASICs/FPGAs are on the rise. We have already seen some interesting announcements and will probably see even more this year ● Neuromorphic chips etc. are probably much farther from the market (3-5 years?) while already showing interesting results ● Memristors are probably even farther away, but keep an eye on them ● Quantum computing: still unclear. These will probably be cloud solutions, not desktop ones.