Qualcomm Centriq Arm-based Servers for Edge Computing at ONS 2018
1. Qualcomm Centriq™ Arm-based Servers for Edge Computing
World’s First 10nm Server Processor
Chaitali Sengupta, PhD
Sr Director, Technology
Qualcomm Datacenter Technologies, Inc.
March 27, 2018 Open Networking Summit, North America
Qualcomm Centriq is a product of Qualcomm Datacenter Technologies, Inc
2. What is “Edge”
Devices / Premises: Smartphones | Connected Cars | Drones | IoT | Enterprise | Homes
◦ Customer Devices: <2 ms latency for millions of devices
◦ Customer Premises: <5 ms latency for thousands of devices
Edge Cloud: Edge Cloud | Cloudlets | Edge Gateways
◦ 5-20 ms latency
◦ Few server racks per site
Centralized Cloud: Cloud Service Providers | Datacenters
◦ > 100 ms latency
◦ 5-10 per operator or cloud service provider
◦ 100s-1000s of server racks per site
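The latency figures on this slide can be read as a rough placement rule: a workload can sit in the most centralized tier whose latency still fits its budget. The sketch below is illustrative only. The `Tier` type and `most_centralized_tier` function are hypothetical names, and treating the slide's figures as hard thresholds is a simplifying assumption, not something the slide states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float      # latency figure quoted on the slide
    is_lower_bound: bool   # True when the slide gives "> X ms"


# Ordered from most centralized to closest to the user.
TIERS = [
    Tier("Centralized Cloud", 100.0, True),   # "> 100 ms"
    Tier("Edge Cloud", 20.0, False),          # "5-20 ms", upper end
    Tier("Customer Premises", 5.0, False),    # "<5 ms"
    Tier("Customer Devices", 2.0, False),     # "<2 ms"
]


def most_centralized_tier(budget_ms: float) -> Optional[str]:
    """Return the most centralized tier that fits the latency budget.

    Assumption: a tier "fits" when the budget covers the slide's figure
    (strictly exceeds it for the "> 100 ms" centralized cloud).
    """
    for tier in TIERS:
        if tier.is_lower_bound:
            fits = budget_ms > tier.latency_ms
        else:
            fits = budget_ms >= tier.latency_ms
        if fits:
            return tier.name
    return None  # tighter than any tier listed on the slide


if __name__ == "__main__":
    for budget in (150, 20, 10, 3, 1):
        print(f"{budget} ms budget ->", most_centralized_tier(budget))
```

Under these assumptions, a 20 ms budget lands in the edge cloud, while anything that tolerates more than 100 ms can stay in a centralized datacenter.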
6. Qualcomm Centriq™ 2400
Qualcomm® Falkor™ CPU
5th-Generation Custom Core Design /
ARMv8-Compliant
Highly Integrated Server SoC
Single Chip Platform-level Solution /
ARM SBSA Level 3 Compliant /
60 MB L3 cache /
32 Lanes PCIe Gen3
High core count
Up to 48 cores / 2.6 GHz all cores peak
frequency
Purpose-built for Edge and
Centralized Cloud
Qualcomm Centriq & Falkor are a product of Qualcomm Datacenter Technologies, Inc
Software ecosystem to enable NFV and Edge Computing
• Network Function Virtualization infrastructure support via OPNFV and leading partners to enable components such as OpenStack, DPDK, etc.
• Microservices, Containers, Virtualization support (e.g. KVM, Docker)
7. Why the Centriq™ 2400 Server Processor is a good fit for Cloudlets / Edge
• High thread density and high performance per thread at lower power
– Large number of VMs and containers running independent processing for radio and edge applications for multiple bearers/users/services
• Thread isolation and predictable latency
• Quality of service features to ensure resources are allocated fairly and no one service hogs them
– Isolation between multiple bearers/users/services
– Service each user in real time
SPECint®_rate2006 Estimate per Thread (Qualcomm Centriq vs. Intel Xeon)
– Qualcomm Centriq 2460 (120W TDP) vs. Intel Xeon Platinum 8180 (205W TDP*): 13.7 vs. 13.9, ~Parity
– Qualcomm Centriq 2452 (120W TDP) vs. Intel Xeon Gold 6152 (140W TDP*): 13.8 vs. 12.8, 8% better
– Qualcomm Centriq 2434 (110W TDP) vs. Intel Xeon Silver 4116 (85W TDP*): 14.1 vs. 13.6, 4% better
Performance per thread leadership vs. top end Intel Xeon
Throughput performance leadership at same thread count
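The "~Parity", "8% better" and "4% better" labels can be checked from the per-thread chart values. The sketch below assumes the chart's first series (13.9, 12.8, 13.6) is Intel Xeon and the second (13.7, 13.8, 14.1) is Qualcomm Centriq. That is a reading of the chart that happens to match the labels, not something the slide states outright. The `percent_better` helper is a hypothetical name.

```python
# SPECint_rate2006 estimate per thread, as read from the chart.
# Assumption: (centriq, xeon) pairing inferred from the "% better" labels.
PER_THREAD = {
    "Centriq 2460 vs. Xeon Platinum 8180": (13.7, 13.9),
    "Centriq 2452 vs. Xeon Gold 6152": (13.8, 12.8),
    "Centriq 2434 vs. Xeon Silver 4116": (14.1, 13.6),
}


def percent_better(centriq: float, xeon: float) -> float:
    """Relative advantage of the Centriq score over the Xeon score, in percent."""
    return (centriq / xeon - 1.0) * 100.0


if __name__ == "__main__":
    for pairing, (centriq, xeon) in PER_THREAD.items():
        print(f"{pairing}: {percent_better(centriq, xeon):+.1f}%")
```

A result within a couple of percent of zero corresponds to the slide's "~Parity" label for the top-end pairing.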
8. Why the Centriq™ 2400 Server Processor is a good fit for Cloudlets / Edge
• Performance per watt and low power leadership
– Suitable for limited power budget at edge cloud
– Cloudflare: “Although it has a TDP of 120W, during my tests it never went above 89W (for the go benchmark). In comparison Skylake and Broadwell both went over 160W, while the TDP of the two CPUs is 170W.”
SPECint®_rate2006 Estimate per TDP W (Qualcomm Centriq vs. Intel Xeon)
– Qualcomm Centriq 2460 (120W TDP) vs. Intel Xeon Platinum 8180 (205W TDP*): 5.5 vs. 3.8, 45% better
– Qualcomm Centriq 2452 (120W TDP) vs. Intel Xeon Gold 6152 (140W TDP*): 5.3 vs. 4.0, 32% better
– Qualcomm Centriq 2434 (110W TDP) vs. Intel Xeon Silver 4116 (85W TDP*): 5.1 vs. 3.9, 31% better
NGINX test data (requests/second and CPU power consumption in W)
– Broadwell: 12,403 requests/second at 161 W
– Skylake: 16,311 requests/second at 165 W
– Centriq: 15,393 requests/second at 72 W
Equivalent performance: Qualcomm Centriq 46-core comparable to two Skylake 12-core processors
Performance per Watt leadership vs. top end Intel Xeon
https://blog.cloudflare.com/arm-takes-wing/ : Centriq “managed to get 214 requests/watt vs the Skylake’s 99 requests/watt and Broadwell’s 77”.
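The requests/watt figures quoted from Cloudflare follow directly from the NGINX chart values on this slide (requests/second divided by CPU power). A minimal sketch of that arithmetic, using the chart's numbers as given; the dictionary and function names are illustrative only.

```python
# NGINX chart values from the slide: (requests per second, CPU power in watts).
NGINX_RESULTS = {
    "Broadwell": (12_403, 161),
    "Skylake": (16_311, 165),
    "Centriq": (15_393, 72),
}


def requests_per_watt(requests_per_second: float, watts: float) -> float:
    """Throughput normalized by CPU power draw."""
    return requests_per_second / watts


if __name__ == "__main__":
    for cpu, (rps, watts) in NGINX_RESULTS.items():
        print(f"{cpu}: {requests_per_watt(rps, watts):.0f} requests/watt")
```

Rounded to whole numbers, the three results line up with the 77, 99 and 214 requests/watt quoted above.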
9. Why the Centriq™ 2400 Server Processor is a good fit for Cloudlets / Edge
• More performance per CPU$ and TCO leadership
– Cost efficiency is essential for edge computing to be viable
• Analysis by Cloudflare
– Improved server density per rack at same power >> cost efficiency
Performance per CPU $ vs. top end Intel Xeon Platinum, Gold, and Silver
SPECint®_rate2006 Estimate per CPU$ (Qualcomm Centriq vs. Intel Xeon)
– Qualcomm Centriq 2460 (120W TDP) vs. Intel Xeon Platinum 8180 (205W TDP*): 0.33 vs. 0.08, 4x better
– Qualcomm Centriq 2452 (120W TDP) vs. Intel Xeon Gold 6152 (140W TDP*): 0.46 vs. 0.15, 3x better
– Qualcomm Centriq 2434 (110W TDP) vs. Intel Xeon Silver 4116 (85W TDP*): 0.64 vs. 0.33, 2x better
Improved density (Skylake: 40 servers, Centriq: 60 servers)
Can fit 60 Qualcomm servers per cabinet using same power as 40 Intel Skylake servers
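The "2x / 3x / 4x better" labels and the density claim reduce to simple ratios. The sketch below assumes the per-CPU$ chart's lower series (0.08, 0.15, 0.33) is Intel Xeon and the higher series (0.33, 0.46, 0.64) is Qualcomm Centriq, paired Platinum/Gold/Silver in that order. That pairing is inferred from the labels rather than stated on the slide, and the helper name is hypothetical.

```python
# SPECint_rate2006 estimate per CPU dollar, as read from the chart.
# Assumption: (centriq, xeon) pairing inferred from the "Nx better" labels.
PER_CPU_DOLLAR = {
    "Centriq 2460 vs. Xeon Platinum 8180": (0.33, 0.08),
    "Centriq 2452 vs. Xeon Gold 6152": (0.46, 0.15),
    "Centriq 2434 vs. Xeon Silver 4116": (0.64, 0.33),
}

# Servers per cabinet at the same cabinet power, from the density chart.
SKYLAKE_SERVERS = 40
CENTRIQ_SERVERS = 60


def ratio(centriq: float, xeon: float) -> float:
    """How many times higher the Centriq figure is than the Xeon figure."""
    return centriq / xeon


if __name__ == "__main__":
    for pairing, (centriq, xeon) in PER_CPU_DOLLAR.items():
        print(f"{pairing}: {ratio(centriq, xeon):.1f}x")
    density_gain = CENTRIQ_SERVERS / SKYLAKE_SERVERS
    # Same cabinet power spread over more servers implies less power per server.
    power_per_server = SKYLAKE_SERVERS / CENTRIQ_SERVERS
    print(f"Density: {density_gain:.1f}x servers per cabinet")
    print(f"Implied power per server: {power_per_server:.2f}x of a Skylake server")
```

The density figure implies each Centriq server draws about two thirds of the power of a Skylake server in this comparison.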
10. Enabling Edge Computing on Qualcomm Centriq™ Arm-based Servers
• Data Plane performance is critical for most Edge Computing use cases
• Network scaling from 10 to 40 to 100GigE consumes progressively more CPU resources
• As a result of data plane processing optimization and the higher core count of Centriq:
– Centriq™ 2400 offers headroom for significantly more additional compute at the same networking line-rate performance as a comparable Intel Skylake*
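To see why scaling from 10 to 40 to 100GigE consumes progressively more CPU, it helps to compute the worst-case packet rate and the cycles a core has per packet. The sketch below uses standard Ethernet framing (a 64-byte minimum frame plus 20 bytes of preamble and inter-frame gap on the wire) and the 2.6 GHz peak frequency from the Centriq 2400 slide. It is a back-of-the-envelope model, not QDT's benchmark methodology, and the function names are hypothetical.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def packets_per_second(link_gbps: float, frame_bytes: int = 64) -> float:
    """Maximum packet rate at line rate for a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def cycles_per_packet(link_gbps: float, core_ghz: float = 2.6,
                      cores: int = 1, frame_bytes: int = 64) -> float:
    """CPU cycles available per packet when `cores` cores share the load."""
    total_cycles_per_second = core_ghz * 1e9 * cores
    return total_cycles_per_second / packets_per_second(link_gbps, frame_bytes)


if __name__ == "__main__":
    for gbps in (10, 40, 100):
        pps = packets_per_second(gbps)
        budget = cycles_per_packet(gbps)
        print(f"{gbps:>3} GigE: {pps / 1e6:6.2f} Mpps, "
              f"{budget:6.1f} cycles/packet on one 2.6 GHz core")
```

In this model the per-packet cycle budget shrinks in proportion to link speed, so holding line rate at 100GigE takes roughly ten times the cores needed at 10GigE. A higher core count leaves more cores free for application compute.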
Ecosystem Components
– Edge-specific Optimizations: Packet processing, Containers, Networking acceleration
– Leverage NFV Infrastructure: OPNFV Release D, E; OpenStack Nova, Neutron, Keystone
– Virtualization & Containers: KVM, Kubernetes, Docker
– Data Plane Optimizations: DPDK, OVS, FD.IO/VPP
– NIC/SmartNIC HW & SW Optimizations: Mellanox, Netronome
* Based on QDT internal benchmarking of Centriq™ 2400 with Intel Skylake Gold 6152
11. Edge Computing Use Case on Centriq™: Cloud Gaming
48 Cores. More Cloud Apps. More Instances Serviced.
Qualcomm Centriq™ 2400 Server Platform
OS: Ubuntu
LXD + Anbox
Containers (two shown), each running: Android Runtime, Gaming App
32-bit to 64-bit Binary Translator
• Game logic runs on server
• Real time rendering in the client (Android, iOS, any other OS)
• Client takes user input in real time
• Exceptional compute density for a high
number of game instances per CPU socket
• Full gaming experience live at >60fps
• Real time response with low latency
• Low CPU utilization
12. Call for Action.
As edge computing becomes popular, edge computing SW infrastructure will proliferate.
We (the ecosystem) need to work together to ensure it all gets ported to and optimized on Arm!
• http://www.etsi.org/technologies-clusters/technologies/multi-access-edge-computing
• https://www.openstack.org/edge-computing/
• https://www.openfogconsortium.org/
• http://openedgecomputing.org/
• https://www.akraino.org/
• http://www.telecominfraproject.com/project-groups-2/access-projects/edge-computing/
• … …
Enabling Edge Computing on Arm via Industry Initiatives