On-Device AI
Kyuwoong Hwang
Qualcomm AI Research, Qualcomm Korea YH
Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.

Devices,
machines,
and things
are becoming
more intelligent

Source: Gartner Mar. ‘18
>8.6 billion
Cumulative smartphone unit shipments
forecast between 2018–2022
Mobile - THE most pervasive AI platform

Training
Execution and inference
Execution and inference
Training (emerging)
Process data closest to the
source, complement the cloud
Qualcomm Technologies
AI strategy
Privacy | Reliability | Low latency | Efficiency | Personalization

Consistent AI R&D investment is
the foundation for product leadership
Qualcomm®
Artificial Intelligence Research
Qualcomm AI Research is an organization within Qualcomm Technologies, Inc. Qualcomm Research is a division of Qualcomm
Technologies, Inc. Qualcomm Snapdragon and Qualcomm Neural Processing SDK are
products of Qualcomm Technologies, Inc. and/or its subsidiaries.
Consolidating long term, cutting-edge AI research efforts
Vision
Intelligence
Platform
Qualcomm®
Neural
Processing
SDK
1st Gen AI
(Snapdragon 820)
2nd Gen AI
(Snapdragon 835)
3rd Gen AI
(Snapdragon 845)
Snapdragon
660
Snapdragon
630
Brain Corp
raises $114M
Announced
Facebook
Caffe2 support
Collaboration
with Google on
TensorFlow
MWC demo
showcasing photo
sorting and hand
writing recognition
Acquired
EuVision
Opened Qualcomm
Research Netherlands
Research face
detection with deep
learning
Completed Brain
Corp joint research
Research artificial
neural processing
architectures
Investment and
collaboration with
Brain Corp
Research in spiking
neural networks
Qualcomm Research
initiates first AI project
2007
Deep-learning
based AlexNet wins
ImageNet competition
Qualcomm
Technologies
ships ONNX
supported
by Microsoft,
Facebook,
Amazon
2009 2012 2013 2014 2015 2016 2017 2018
Acquired
Scyfer
Announced
Qualcomm®
Neural
Processing SDK
Opened joint
research lab
with University
of Amsterdam
Qualcomm
Technologies
researchers
win best paper
at ICLR
Qualcomm AI
Research
2018

Evolution of AI on Snapdragon
Training
.prototxt + .weights
On-device
Execution/
Inference
CPU ops
ARM-V7A or NEON
Custom Runtime
2015 2016 2017 2018
Snapdragon 820

Training
On-device
Execution/
Inference
NPE quantize (DSP)
NPE TF8 convert & opt
NPE quantize (DSP)
NPE C2 convert & opt
.(TF8) .pb (Caffe2) .pb
.DLC
2015 2016 2017 2018
NEON OpenCL HVX
QSML Kernels Hexagon NN
Neural Processing SDK
CPU ops GPU ops DSP ops
Snapdragon 835

Training
Cognitive Toolkit
.pb .onnx .onnx .onnx .onnx .onnx .onnx
On-device
Execution/
Inference
2015 2016 2017 2018
SNPE Model
Loader
Android NN
SNPE Model
Loader
Android NN
NEON OpenCL HVX
QSML Kernels Hexagon NN
Snapdragon Neural Network Core
CPU ops GPU ops DSP ops
Snapdragon 845
Android NN
SNPE Model
Loader
…
Parrots

Qualcomm Artificial Intelligence Engine (AI Engine)
Qualcomm Hexagon, Qualcomm Adreno, Qualcomm Kryo, and Qualcomm Artificial Intelligence Engine are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
Qualcomm™
Hexagon
Hardware
• Qualcomm® Neural Processing SDK
• Android NN
• Hexagon NN
• Caffe/Caffe2
• TensorFlow/TensorFlow Lite
• ONNX
Ecosystem
• OEMs/ODMs
• ISVs
• Dev
• Cloud
Applications & Features
INT8 networks
FP32 and FP16 networks
FP32 and INT8 networks
Available on Snapdragon 660, 820, 835, 845, and 710. Available on Qualcomm QCS603 and QCS605.
Software
Frameworks
Qualcomm™
Adreno
Qualcomm™
Kryo

On-device AI
Benchmarking –
What Matters
Most
[Figure: Top-1 accuracy versus G-operations for image classification networks. Networks plotted: AlexNet, BN-AlexNet, BN-NIN, GoogLeNet, Inception-v3, Inception-v4, ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152, VGG-16, VGG-19. Axes: Top-1 accuracy (50% to 85%), G-operations (0 to 45). Regions marked: Efficient, Inefficient.]
Use Cases
Accuracy
Performance
Energy
Source: arXiv:1605.07678 [cs.CV]
Compute cost
too high for
mobile use
cases

Snapdragon AI - Accuracy
Source: AI Mark App
Model Accuracy Preservation
X
100% 99% 100% 99% 100% 99%
89%
96%
Inception-v3 ResNet-34
SDM660 - DSP SDM710 - DSP SDM845 - DSP Competitor's NPU

Snapdragon AI - Performance
17
24
93
84
31 36
131 128
32
23
140
18 14
30
Inception-v3 ResNet50 SqueezeNet MobileNet
SDM660 - DSP SDM710 - DSP SDM845 - DSP Competitor's NPU
142
AI Image Classification Performance (inf/sec)
Source: 3rd Party & Qualcomm Technologies internal testing/data.
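
The inf/sec figures on this slide are throughput numbers: inferences completed per second. The slide does not describe how they were measured, so the following is only a minimal sketch of one way to obtain such a number. `run_inference` and `dummy_model` are hypothetical stand-ins for a real model call, and the warm-up and iteration counts are arbitrary assumptions.

```python
import time

def measure_throughput(run_inference, sample, warmup=3, iterations=20):
    """Return inferences per second for a callable standing in for a model."""
    # Warm-up runs are excluded so one-time setup cost does not skew the result.
    for _ in range(warmup):
        run_inference(sample)
    start = time.perf_counter()
    for _ in range(iterations):
        run_inference(sample)
    elapsed = time.perf_counter() - start
    return iterations / elapsed

def dummy_model(x):
    # Placeholder workload (assumption), not a real neural network.
    return sum(v * v for v in x)

rate = measure_throughput(dummy_model, list(range(1000)))
print(f"{rate:.1f} inf/sec")
```

On a real device the same loop would wrap the runtime's inference call, and the result depends heavily on which core (CPU, GPU, or DSP) executes the network.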

Ongoing AI
Software
Optimizations
Performance across
Snapdragon SoC
portfolio doubled in a
year.
Cumulative AI software performance improvements over
the last 12 months – Inception v3 inf/sec
June 2017 October 2017 January 2018 May 2018
Performance improvements
2X
Source: Qualcomm Technologies internal testing/data.

Pixels → Edges → Parts → Objects → Output
Dog?
Cat?
Bayesian deep-learning
addresses these challenges
Inspired by brain functionality, introducing noise
to neural networks is beneficial
Noise can be a
good thing for AI
Compression
and quantization
• Reduce complexity
of the neural network model
• Reduce bit-width of the
parameters and activations
• Save power and
improve efficiency
Introduce noise to weights Noise propagates to activations
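
Reducing the bit-width of parameters can be viewed as adding a small, bounded amount of noise to each weight. A minimal sketch of symmetric uniform quantization illustrates this. The weight values and the single shared scale are assumptions chosen for illustration, not the scheme used by any particular Qualcomm tool.

```python
def quantize_symmetric(weights, bits=8):
    """Map floats to signed integers using one shared scale (symmetric quantization)."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]                  # hypothetical weights
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error <= scale / 2 + 1e-12)
```

A network that stays accurate when its weights are perturbed by noise of this size should also tolerate the quantization itself, which is the connection the slide draws between noise and compression.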

Apply Bayesian
deep-learning to
shrink the model
Compression
and quantization
• Quantize weights:
Use lower precision (bit-width)
• Prune activations:
Reduce number of activation
nodes
Pixels → Edges → Parts → Objects → Output (Y)
Dog
Cat
Add data (X)
w1
w2
Prune activations
(similar method as
quantization)
X
Quantize weights
(or even remove)
Prior P(W)—distribution before data (X)
w2
Posterior P(W| X,Y)—
distribution after data
w1 Initial weight values
w1 is pinned down
w2 is still highly uncertain
Introduce noise to parameters
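
The contrast between w1 (pinned down by the data) and w2 (still highly uncertain) suggests a simple compression rule, sketched here under stated assumptions. Each weight is assumed to have a Gaussian posterior with a known mean and standard deviation. Weights whose mean is small relative to their uncertainty are removed, and the rest get only as many bits as their uncertainty justifies. The thresholds, value range, and posterior numbers are hypothetical, and this heuristic is not necessarily the method behind the slide.

```python
import math

def compress_by_uncertainty(posterior, snr_threshold=1.0, value_range=2.0, max_bits=8):
    """posterior: dict of name -> (mean, std) for an assumed Gaussian posterior.

    Returns dict of name -> None (pruned) or the number of bits to keep.
    """
    plan = {}
    for name, (mu, sigma) in posterior.items():
        if sigma <= 0:
            plan[name] = max_bits           # no uncertainty: keep full precision
            continue
        if abs(mu) / sigma < snr_threshold:
            plan[name] = None               # too uncertain relative to its size: prune
            continue
        # Fewest bits whose quantization step (value_range / 2**bits) is <= sigma.
        bits = math.ceil(math.log2(value_range / sigma))
        plan[name] = max(1, min(max_bits, bits))
    return plan

# Hypothetical posteriors: w1 is pinned down, w2 is still highly uncertain.
posterior = {"w1": (0.80, 0.05), "w2": (0.10, 0.90)}
plan = compress_by_uncertainty(posterior)
print(plan)
```

With these numbers w1 is kept at 6 bits, because a step of 2/64 is already finer than its posterior spread, while w2 is pruned.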

Bayesian deep learning provides broad benefits
A powerful tool to address a variety of deep learning challenges
Compression
and quantization
Quantize parameters and
activations, prune model
components
Regularization
and generalization
Avoid overfitting data; choose
the simplest model to explain
observations (Occam’s razor)
Confidence
estimation
Generate the confidence
intervals of the predictions
Privacy/adversarial
robustness
Avoid storing personal
information in parameters, be
less sensitive to adversarial
attacks

[Figure: ResNet-18 on ImageNet. Axes: Top-5 accuracy (83% to 90%) versus compression ratio (1 to 3.5). Series: Baseline, Channel pruning, SVD, Bayesian pruning.]
Applying Bayesian deep learning to real use cases
Image classification
Zhang, Zou, He, & Sun 2015 (SVD);
He, Zhang, & Sun 2017 (channel pruning)
3X compression ratio while maintaining close to the same accuracy

On-device AI Phone Use Cases
Speaker Recognition
Object
Location
Noise
Suppression
Security
& Liveness
Voice/Audio
Keyword
AR/VR
Language
Translation
Natural Language
Understanding
Face Detection &
Recognition
Landmark Detection
Style Transfer
Text
Recognition
Spatial Audio
Detection
Video
Summarization
Scene
Classification
Gestures/Hand
Tracking
Diagnostics
& Power
Computational
Photography

AI Software
Partners and
features for
Snapdragon
AI Phones
Available today
Assisted
Photography
Face
Attributes
Relighting
Smart
Album
Image
Style
Transfer
2D
Face
Unlock
Bokeh
(single
camera)
Food
Classification
Object
Detection
People
detection
Inner
Magic
Animoji
Text
Translation

Snapdragon AI Cloud Partners
Windows ML
On-Device
Voice UI
AR
High-energy
Dance Studio
NN API

A Few AI Phones Powered by Snapdragon
On-device AI
Motorola X4
vivo X21UD
and X21
OPPO R15
Pro
Sony
Xperia X
Blackshark
Xiaomi Mi MIX 2S
Xiaomi
Mi6X
Asus
Zenfone 5Z
OnePlus 6
Smartisan r1

Adjusting
head rest
Driver recognized
Hi Carlos, I’m adjusting
the vehicle to your preferences
Adjusting steering
wheel height
Playing your kids’
favorite cartoon
Body temperature
is low; I’ll turn
down the AC
Changing from sport
to comfort mode
Traffic ahead
Taking your favorite
scenic route instead
Driver falling asleep
Playing your favorite
rock music
Surround view
Pedestrian is about
to cross the road
Icy road ahead
Slowing down

Uniquely positioned to make the intelligent wireless edge a reality
The future of AI
Is being built by Qualcomm Technologies with our customers
Multi-industry systems
design expertise
5G + AI
leadership
10+ years of AI
research
Broad industry
collaboration
Leader in wireless
edge development

On-Device Virtual Assistant
Kyuwoong Hwang
Qualcomm AI Research, Qualcomm Korea YH
Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.

Reasoning
Learn, infer context,
and anticipate
Perception
Hear, see, and
observe
Action
Act intuitively, interact
naturally, and protect
privacy
AI brings human-like
understanding and
behaviors to the
machines

Advancing AI research to make on-device AI ubiquitous
A common platform is fundamental to scaling AI internally and across the industry
Power efficiency
Model design, compression, quantization,
activation, algorithms, and efficient hardware
Efficient learning
Robust learning through minimal data,
unsupervised learning, and on-device learning
Personalization
Continuous learning, model adaptation,
and privacy-preserved distributed learning
System architecture
Multi-task and multi-modal learning, sensor fusion, and cloud-edge systems
Action
Reinforcement learning
for decision making
Reasoning
Scene understanding, language
understanding, behavior prediction
Perception
Object detection, speech
recognition, contextual fusion
Automotive IoT Mobile

A true personal assistant
One of many use cases requiring a broad set of AI capabilities
Power efficiency
Model design, compression, quantization,
activation, algorithms, and efficient hardware
Efficient learning
Robust learning through minimal data,
unsupervised learning, and on-device learning
Personalization
Continuous learning, model adaptation,
and privacy-preserved distributed learning
Automotive IoT Mobile
Action
Reinforcement learning
for decision making
Reasoning
Scene understanding, language
understanding, behavior prediction
Perception
Object detection, speech
recognition, contextual fusion
System architecture
Multi-task and multi-modal learning, sensor fusion, and cloud-edge systems

Designed to be:
Always-on
Conversational
Personal
Private
Critical to create a
true virtual assistant
Voice is the
transformative user
interface (UI) we’ve
been waiting for
AR

Voice UI components required for an end-to-end solution
Text-to-speech
Speech
synthesis
Natural language generation
Dialog management
Natural language understanding
Natural
language
processing
Speech-to-text
Speech
recognition
“Alexa,” “Hey Snapdragon”
Always-on keyword detection
Voice
activation
Echo cancellation
Speech denoising
Speech
pre-processing
Signal acquisition and playback
Front-end
processing
Machine speech chain: listener and speaker
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
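
The components listed on this slide form a chain in which each stage consumes the output of the previous one. A toy sketch of that structure follows. Every stage is a placeholder written for illustration (plain text stands in for audio), and none of the function names correspond to a real SDK.

```python
from typing import Callable, List, Optional

Stage = Callable[[object], object]

def run_chain(stages: List[Stage], signal: object) -> object:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        signal = stage(signal)
        if signal is None:          # e.g. keyword not detected: stay asleep
            return None
    return signal

# Placeholder stages; each stands in for a real model or DSP block.
def pre_process(audio: str) -> str:
    return audio.strip().lower()    # stands in for echo cancellation and denoising

def voice_activation(audio: str) -> Optional[str]:
    keyword = "hey snapdragon"
    return audio[len(keyword):].strip(" ,") if audio.startswith(keyword) else None

def speech_to_text(audio: str) -> str:
    return audio                    # text already stands in for audio in this toy

def understand(text: str) -> dict:
    return {"intent": "weather" if "weather" in text else "unknown"}

def respond(intent: dict) -> str:
    if intent["intent"] == "weather":
        return "Here is the forecast."
    return "Sorry, I did not catch that."

SPEECH_CHAIN = [pre_process, voice_activation, speech_to_text, understand, respond]
print(run_chain(SPEECH_CHAIN, "Hey Snapdragon, what is the weather?"))
```

Placing voice activation early in the chain means the later, more expensive stages run only after the keyword is heard, which matters for an always-on design.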

Machine learning has ignited the voice UI revolution
“As speech recognition accuracy goes from say 95% to 99%, all of us in the room will go from barely using it today to using it all
the time. Most people underestimate the difference between 95% and 99% accuracy— 99% is a gamechanger. No one wants
to wait 10 seconds for a response. Accuracy, followed by latency, are the two key metrics for a production speech system.”
— Andrew Ng
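
The jump from 95% to 99% looks small as an accuracy figure but is large as an error figure. A short worked calculation shows this, assuming the percentages are per-word accuracies (the quote does not say which unit it uses) and using a hypothetical 20-word utterance.

```python
def error_rate(accuracy):
    """Fraction of units recognized incorrectly."""
    return 1.0 - accuracy

def error_reduction_factor(old_accuracy, new_accuracy):
    """How many times fewer errors the new system makes than the old one."""
    return error_rate(old_accuracy) / error_rate(new_accuracy)

factor = error_reduction_factor(0.95, 0.99)
print(round(factor, 2))  # 5.0

# Expected wrong words in a hypothetical 20-word utterance.
errors_at_95 = 20 * error_rate(0.95)   # about one wrong word per utterance
errors_at_99 = 20 * error_rate(0.99)   # about one wrong word per five utterances
```

Going from 95% to 99% therefore cuts errors by a factor of five.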
GMM: Gaussian Mixture Model, CNN: Convolutional Neural Network, RNN: Recurrent Neural Network
[Figure: Machine automatic speech recognition accuracy over time (1970 to 2020), compared with human accuracy. Modeling approaches labeled: GMM, CNN, RNN + CNN. Accuracy values shown: 50%, 55%, 60%, 62%, 70%, 95%.]

Moving voice UI functionality to the end device
An end-to-end solution powered by machine learning
Automatic speech
recognition
(ASR)
Text-to-speech
(TTS)
News
SMS
Music
Maps Wikipedia
Weather Stocks
Voice
activation
Service manager
Multi-mic echo
cancellation,
beamforming,
and speech
denoising
Natural language
understanding
(NLU)
On-device processing
(always-on and real-time)
Cloud processing
(services)
Cloud centric (today)

Moving voice UI functionality to the end device
An end-to-end solution powered by machine learning
Automatic speech
recognition
(ASR)
Text-to-speech
(TTS)
News
SMS
Music
Maps Wikipedia
Weather Stocks
Voice
activation
Service manager
Multi-mic echo
cancellation,
beamforming,
and speech
denoising
Natural language
understanding
(NLU)
On-device processing (always-on and real-time) Cloud processing (services)
On-device centric (future)

Cloud tasks
Complex voice fallback
Training and model update
Knowledge base
Services
On-device
processing
of voice UI
Provides unique benefits
complementing the cloud
Challenge
Providing the voice UI functionality
within the power/thermal envelope
Machine learning models
Offline raw data
Queries
On-device tasks
Automatic speech recognition
Natural language processing
Always-on audio cognition
On-device training
Benefits
Privacy
Instant response
Always-on
Device context

Noisy speech spectrogram
Clean speech spectrogram
“If people were more generous,
there would be no need for welfare”
[Figure: noisy and clean speech spectrograms. Axes: Frequency (0 to 4000) versus Time (0.5 to 3).]
DL-based denoising model
trained with extensive speech
noise databases
DL-based
denoising
Speech
denoising
• Single or multiple mics
• Applicable for
◦ Two-way conversation
◦ Voice/speaker recognition
◦ Keyword spotting
• Deep learning (DL)
significantly improves the
performance over traditional
methods
• Robust in challenging
interference and noise
scenarios

Qualcomm
Voice
Activation
supports:
Qualcomm® Voice
Activation (VA)
High accuracy, robust to background
noise, and supports multiple languages
Deep learning is improving performance
Among state-of-the-art in terms of
performance vs. power consumption
-47%
-11%
2014 2015 2016 2017
Qualcomm VA power consumption
WCD9330
WCD9335
WCD9340
Amazon Alexa
Baidu DUEROS
Microsoft Cortana
Google Assistant
Qualcomm Voice Activation, Qualcomm WCD9330, Qualcomm WCD9335, and Qualcomm WCD9340 are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

Transcribe the
audio to text
Deep learning gives
state-of-the-art accuracy
on a mobile device
Personalization—adaptation
to individual accent and
acoustic environment
Automatic
speech
recognition
Natural language
understanding (NLU)
Acoustic features: Reduce input audio to essential information
Acoustic model: Deep learning converts input into linguistic units; adapted to each user’s accent and environment
Language model: Uses context and language statistics for best utterance estimation
speaking tendencies
Allows the same intention
to be expressed in multiple
ways
intent expressions
“Turn → on → the → light”
On-device automatic speech recognition (ASR)
User intention
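
The language model's role, using context and language statistics to pick the best utterance, can be sketched as rescoring candidate transcriptions. In this toy example the bigram probabilities, the acoustic scores, and the back-off value are all hypothetical numbers chosen for illustration. A real system would learn them from data.

```python
import math

# Hypothetical bigram log-probabilities; a real language model is trained on text.
BIGRAM_LOGP = {
    ("<s>", "turn"): math.log(0.5),
    ("turn", "on"): math.log(0.6),
    ("on", "the"): math.log(0.7),
    ("the", "light"): math.log(0.4),
    ("the", "lite"): math.log(0.001),
}
UNSEEN_LOGP = math.log(1e-4)  # crude back-off for unseen word pairs (assumption)

def lm_score(words):
    """Sum of bigram log-probabilities, starting from the sentence marker <s>."""
    pairs = zip(["<s>"] + words[:-1], words)
    return sum(BIGRAM_LOGP.get(p, UNSEEN_LOGP) for p in pairs)

def best_utterance(candidates, lm_weight=1.0):
    """candidates: list of (words, acoustic_log_score). Returns the best word list."""
    return max(candidates, key=lambda c: c[1] + lm_weight * lm_score(c[0]))[0]

candidates = [
    (["turn", "on", "the", "lite"], -4.0),    # slightly better acoustic match
    (["turn", "on", "the", "light"], -4.2),
]
print(" ".join(best_utterance(candidates)))   # turn on the light
```

The acoustically stronger candidate loses because "the lite" is far less likely than "the light" under the language statistics.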

An end-to-end on-device voice UI example for smart homes
99% on-device intent accuracy
is achieved for domain-specific command sets when adapted to accent and environmental conditions
Demo of automatic speech recognition and natural language understanding
Large command set
Turn on the living room lights
Click the kitchen lights off
Turn off all lights
Switch on the ceiling fan
Shut off the sprinklers
Start music
Pause song
Next track
Go back one
Play previous song
Turn speaker off
Increase temperature
Intent understanding
Turn on the kitchen light
Click kitchen light on
Switch on light in the kitchen
Turn the light on in the kitchen
NLU: These four phrases
map to the same intent
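
The four kitchen-light phrases can be reduced to one intent with a very small sketch. It uses keyword matching over hypothetical vocabularies, purely to illustrate the many-phrases-to-one-intent idea. A production NLU component would typically use a learned model rather than rules like these.

```python
# Hypothetical vocabularies for a small smart-home domain.
ACTIONS = {"on": "turn_on", "off": "turn_off"}
DEVICES = {"light", "lights", "fan", "sprinklers"}
ROOMS = {"kitchen", "living room"}

def parse_intent(utterance):
    """Map an utterance to an (action, device, room) intent using keyword matching."""
    text = utterance.lower()
    words = text.split()
    action = next((ACTIONS[w] for w in words if w in ACTIONS), None)
    # Strip a trailing plural "s" so "lights" and "light" name the same device.
    device = next((w.rstrip("s") for w in words if w in DEVICES), None)
    room = next((r for r in ROOMS if r in text), None)
    return {"action": action, "device": device, "room": room}

phrases = [
    "Turn on the kitchen light",
    "Click kitchen light on",
    "Switch on light in the kitchen",
    "Turn the light on in the kitchen",
]
intents = [parse_intent(p) for p in phrases]
print(all(i == intents[0] for i in intents), intents[0])
```

All four phrases resolve to the same action, device, and room, regardless of word order.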

A true virtual
assistant
A “digital me” sitting on the device:
context aware and personalized

Calendar
Messaging
Apps
On-device data
Contextual intelligence is required for personalization
The fusion of many types of sensors and personal information
Low power sensing, processing, and connectivity
Efficient, heterogeneous
architectures
Sensor fusion and
machine learning
Integrated, always-on
data capturing
Low-energy wireless technologies
(e.g. BT-LE, 5G NR IoT)
Cloud data
Off-device data
IoT data
Sensor fusion
Gyroscope
Compass
Camera
Ambient light
Temperature
Iris scan
Environment
Pulse
Humidity
Microphone
Sensor data
C-V2X

Creating personalized memories
Essential for a true virtual assistant
History, number
of people, identity
After the party, strolling on the
beach at sunset in La Jolla talking
with my son and laughing
Live sentiment
analysis
Strolling on the beach at sunset
in La Jolla talking with my son
and laughing
Activity analysis
Strolling on the beach at sunset
in La Jolla talking with my son
GPS location
La Jolla, California
Visual analysis
A sunset over the ocean
in La Jolla
Sound analysis
Talking with my son at sunset in
La Jolla

A true personal assistant is responsive and proactive
“Remember the time I was strolling with
my son after the party at La Jolla beach?”
“Yes I do, here is a picture you took of the sunset.
Should I share it with your family group?”
Responsive
Decision-making and conversation based
on contextual analysis and prompting
(e.g. finding memories)
Proactive
Decision-making and conversation based
on contextual analysis without prompting
(e.g. automatically sharing memories)
“I noticed that you are tired and stressed, I’m turning
on the Rocky III soundtrack and navigating you
to the gym for a workout and sauna.”
“This music gets my blood going and a
workout and sauna will help me relieve stress.”

Multi-mic echo
cancellation,
beamforming,
and speech
denoising
The first step to an on-device virtual assistant
Enabling on-device voice UI
News
SMS
Music
Maps Wikipedia
Weather Stocks
Voice
activation
Service manager
Automatic speech
recognition (ASR)
Natural language
understanding (NLU)
On-device processing Cloud processing
Text-to-speech
(TTS)

Adding an “AI agent” to create a true virtual assistant
The on-device AI agent continuously learns personal knowledge and acts intuitively
On-device processing Cloud processing (services)
Cloud knowledge graph
Automatic speech
recognition (ASR)
Multi-mic echo
cancellation and
beamforming
Sensors
Text-to-speech
(TTS)
Natural language
understanding (NLU)
Voice
activation
AI agent

Adding an “AI agent” to create a true virtual assistant
Contextualization allows personalization at acoustic, intent, and behavior levels
Cloud processing (services)
Automatic speech
recognition (ASR)
Voice
activation
Multi-mic echo
cancellation and
beamforming
Cloud knowledge
graph
On-device processing
Text-to-speech
(TTS)
Sensors
Natural language
understanding (NLU)
AI agent
Dialog
management
Speaker identification
Acoustic event detection
Gender and age detection
Voice activity detection
Emotion classification
Contextual fusion
and learning
Local knowledge
graph

We are advancing AI research
to make on-device AI ubiquitous
We are creating AI platform
innovations that are fundamental
to scaling AI across the industry
We provide the low-power
end-to-end on-device solution
for a true personal assistant
