SlideShare una empresa de Scribd logo
1 de 28
Descargar para leer sin conexión
Shinnosuke Furuya, Ph.D., HPC Developer Relations, NVIDIA
2021/08/26
組み込みから HPC まで
ARM コアで実現するエコシステム
2
Automotive
NVIDIA Drive
HPC
NVIDIA Grace CPU
Network
NVIDIA BlueField DPU
AGENDA
3
“World’s Best Performing CEO”
HARVARD BUSINESS REVIEW
“100 Best Companies to Work For”
FORTUNE
“Best Places to Work in 2021”
GLASSDOOR
“World’s Best CEOs”
BARRON’S
“50 Smartest Companies”
MIT TECH REVIEW
“Most Innovative Companies”
FAST COMPANY
Founded in 1993 Jensen Huang, Founder & CEO 19,000 Employees $16.7B in FY21
4
AUTOMOTIVE
5
TEGRA SOC GENERATIONS
Tegra K1 Tegra X1 Parker
CPU Arm Cortex A15 (4 cores) Arm Cortex A57 (4 cores)
NVIDIA Denver 2 (2 cores)
Arm Cortex A57 (4 cores)
GPU Kepler (192 cores) Maxwell (256 cores) Pascal (256 cores)
Products
NVIDIA SHIELD Tablet
NVIDIA Jetson TK1
Google Chromebook
Nintendo Switch
Nintendo Switch Lite
NVIDIA SHIELD TV
NVIDIA Jetson TX1
NVIDIA Jetson Nano
Tesla
Mercedes Benz
Magic Leap 1
NVIDIA Jetson TX2
Xavier Orin Atlan
CPU NVIDIA Carmel (8 cores) Arm Cortex A78AE (12 cores) new Arm
GPU Volta (384 cores) next generation next generation
Products
Toyota
NVIDIA Jetson AGX Xavier
NVIDIA Jetson Xavier NX
NVIDIA DRIVE Pegasus
Mercedes Benz
Volvo
(many automotive companies)
6
NVIDIA ORIN
24.5 billion transistors
12 A78 (Hercules) ARM64 CPUs
254 INT8 TOPS - CUDA Tensor Core GPU + DLA
205 GB/s memory bandwidth
4 10Gbps ENET
8K 30 Dec | 4K 60 Enc – H264 / H265 / VP9
4 R52 Lock-step Pairs Integrated Safety Island ASIL-D
Secure key storage
FUSA ASIL-B Chip | ASIL-D Systematic
Advanced, Software-defined Platform
for Autonomous Machines
7
NVIDIA DRIVE ATLAN
Fusing Next Generation AI and BlueField
Industry’s First 1,000 TOPS SoC
400 Gbps Networking with Secure Gateway
ASIL-D Safety Island
TOPS is the New Horsepower
8
HYPERION 8
AV PLATFORM
2x Orin AV Computer
1x Orin IX Computer
4x Orin + 4x MLNX 3D GT Data Recorder
Sensor Suite: 8 Cameras [8MP], 4 Fisheyes [3MP], 3 In-
Cabin, 9 Radar, 2 Lidar
State-of-the-Art Advances for Data Collection,
Development and Testing
9
THE FUTURE CAR IS SOFTWARE DEFINED
10
NVIDIA DRIVE AV
11
HPC
12
GIANT MODELS PUSHING LIMITS OF EXISTING ARCHITECTURE
Requires a New Architecture
GPU 8,000 GB/sec
CPU 200 GB/sec
PCIE Gen4 (Effective Per GPU) 16 GB/sec
Mem-to-GPU 64 GB/sec
System Bandwidth Bottleneck
DDR4 HBM2e
GPU
GPU
GPU
GPU
x86
ELMo (94M)
BERT-Large
(340M)
GPT-2
(1.5B)
Megatron-LM
(8.3B)
T5 (11B)
Turing-NLG
(17.2B)
GPT-3 (175B)
0.00001
0.0001
0.001
0.01
0.1
1
10
100
1000
2018 2019 2020 2021 2022 2023
Model
Size
(Trillions
of
Parameters)
100 TRILLION PARAMETER MODELS BY 2023
13
ANNOUNCING NVIDIA GRACE
Breakthrough CPU Designed for Giant-Scale AI and HPC Applications
FASTEST INTERCONNECTS
>900 GB/s Cache Coherent NVLink CPU To GPU (14x)
>600GB/s CPU To CPU (2x)
NEXT GENERATION ARM NEOVERSE CORES
>300 SPECrate2017_int_base est.
Availability 2023
HIGHEST MEMORY BANDWIDTH
>500GB/s LPDDR5x w/ ECC
>2x Higher B/W
10x Higher Energy Efficiency
14
TURBOCHARGED TERABYTE SCALE ACCELERATED COMPUTING
CURRENT x86 ARCHITECTURE
DDR4 HBM2e
Evolving Architecture For New Workloads
INTEGRATED CPU-GPU ARCHITECTURE
LPDDR5x HBM2e
3 DAYS FROM 1 MONTH
Fine-Tune Training of 1T Model
REAL-TIME INFERENCE
ON 0.5T MODEL
Interactive Single Node NLP Inference
GPU
GPU
GPU
GPU
GRACE
GRACE
GRACE
GRACE
GPU
GPU
GPU
GPU
x86
Transfer 2TB in 30 secs Transfer 2TB in 1 secs
GPU 8,000 GB/sec
CPU 200 GB/sec
PCIE Gen4
(Effective Per GPU)
16 GB/sec
Mem-to-GPU 64 GB/sec
GPU 8,000 GB/sec
CPU 500 GB/sec
NVLink 500 GB/sec
Mem-to-GPU 2000 GB/sec
Bandwidth claims rounded to nearest hundred for illustration.
Performance results based on projections on these configurations Grace : 8xGrace and 8xA100 with 4th Gen NVIDIA NVLink Connection between CPU and GPU and x86: DGX A100.
Training: 1 Month of training is Fine-Tuning a 1T parameter model on a large custom data set on 64xGrace+64xA100 compared to 8xDGXA100 (16xX86+64xA100)
Inference: 530B Parameter model on 8xGrace+8xA100 compared to DGXA100.
15
ANNOUNCING
THE WORLD’S FASTEST
SUPERCOMPUTER FOR AI
20 Exaflops of AI
Accelerated w/ NVIDIA Grace CPU and NVIDIA GPU
HPC and AI For Scientific and Commercial Apps
Advance Weather, Climate, and Material Science
16
NETWORK
17
INTRODUCING NVIDIA BLUEFIELD-3 DPU
First 400Gb/s Data Processing Unit
Offloads and Accelerates Data Center Infrastructure
Isolates Application from Control and Management Plane
Powerful CPU – 16x Arm A78 Cores
Datapath Accelerator – 16x Cores, 256 Threads
Process Networking, Storage, and Security at 400 Gbps
18
INTRODUCING NVIDIA BLUEFIELD-3 DPU
First 400Gb/s Data Processing Unit
22 Billion Transistors
400Gb/s Ethernet & InfiniBand Connectivity
400Gb/s Crypto Acceleration
18M IOP/s Elastic Block Storage
300 Equivalent x86 Cores
CONNECTX-7
DATA PATH ACCELERATOR
PCIe GEN 5.0
DDR5 MEMORY INTERFACE
ARM CORES
ACCELERATION
ENGINES
19
BLUEFIELD DPU GENERATIONS
BlueField BlueField-2 BlueField-3
Port speed
2 x 100Gb/s
InfiniBand and Ethernet
2 x 100Gb/s, 1 x 200Gb/s
InfiniBand and Ethernet
1 x 400Gb/s, 2x200Gb/s
InfiniBand and Ethernet
Performance
Bandwidth: 200Gb/s
DPDK Max Msg Rate:150Mpps
RDMA max msg rate: 200Mpps
Bandwidth: 200Gb/s
DPDK Max Msg Rate: 215Mpps
RDMA max msg rate: 215Mpps
Bandwidth: 400Gb/s
DPDK max msg rate: 250Mpps
RDMA max msg rate: 330Mpps
Modulation NRZ NRZ & 50G PAM4 NRZ & 100G PAM4
DDR Channels DDR4-2400MT/s Dual channels DDR4-3200MT/s Single channel 2 x DDR5-5600 Interfaces
Max Arm Cores 16 x A72 Arm cores 8 x A72 Arm cores 16 x A78 Arm cores (Hercules)
Embedded ASIC ConnectX-5 ConnectX-6 Dx ConnectX-7
PCIe Gen3.0 x32 / Gen4.0 x16 Gen4.0 x16 Gen5.0 x32
20
NVIDIA DOCA
Enabling Broad BlueField Partner Ecosystem
Software Development Framework for BlueField DPUs
Offload, Accelerate, and Isolate Infrastructure Processing
Support for Hyperscale, Enterprise, Supercomputing and
Hyperconverged Infrastructure
Software Compatibility for Generations of BlueField DPUs
DOCA is for DPUs what CUDA is for GPUs
CYBER
SECURITY
EDGE
STORAGE
PLATFORM
INFRASTRUCTURE
ORCHESTRATION
MANAGEMENT
TELEMETRY
SECURITY NETWORKING STORAGE
ACCELERATION LIBRARIES
DOCA
21
BLUEFIELD-3 USE CASES
Unprecedented Innovation for Modern Data Centers
Cloud Computing
Bare-Metal I Virtualized I Containerized
Private I Public I Hybrid Cloud
Cyber Security
Distributed Security | NGFW I
Micro-segmentation
HPC & AI
Cloud-Native Supercomputing |
Accelerated DLRM
Telco & Edge
Telco Cloud | CloudRAN |
Edge Compute
Media Streaming
Visual High Quality I
8K Video I CDN
Data Storage
HCI I Elastic Block Storage I
Instance Storage
22
BLUEFIELD ENABLES
CLOUD-NATIVE
SUPERCOMPUTING
Collective offload with UCC accelerator
Smart MPI progression
User-defined algorithms
1.4X higher application performance
Multi-Tenancy with Zero-Trust Security
23
NVIDIA DPU ROADMAP
Exponential Growth in Data Center Infrastructure Processing
2020 2022
1X
10X
100X
BlueField-2
7B Transistors
9 SPECint
0.7 TOPS
200 Gbps
BlueField-3
22B Transistors
42 SPECint
1.5 TOPS
400 Gbps
BlueField-4
64B Transistors
160 SPECint
1000 TOPS
800 Gbps
2024
DOCA — ONE ARCHITECTURE
* BlueField-4 product to include opt-in GPU and non-GPU configurations
24
SUMMARY
25
3 CHIPS. YEARLY LEAPS. ONE ARCHITECTURE.
26
MEGATRON cuQUANTUM MORPHEUS
MERLIN
MAXINE CLARA METROPOLIS ISAAC DRIVE
AERIAL
APPLICATION FRAMEWORKS
CHIPS & SYSTEMS
PLATFORM SOFTWARE
NVIDIA ECOSYSTEM PLATFORM
AGX
DGX
RTX EGX
HGX
FLEET COMMAND
OMNIVERSE AI ENTERPRISE
VGPU
GPU CPU DPU
GEFORCE RIVA
27
SUMMARY
• Tegra SoC has a long history, and that experience has been applied to current Xavier, the next generation Orin, and
the next generation Atlan
• The future car is software defined, and NVIDIA provide whole ecosystem such as DRIVE Hyperion and DGX Systems
• Grace CPU is designed for giant-scale AI and HPC applications
• BlueField-3 DPU is the first 400 Gb/s data processing unit
• DOCA enables broad BlueField ecosystem
• GPU, CPU and DPU chips make a yearly leaps in one architecture
組み込みから HPC まで ARM コアで実現するエコシステム

Más contenido relacionado

La actualidad más candente

NVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October SummaryNVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October Summary
NVIDIA
 

La actualidad más candente (20)

BIG IP F5 GTM Presentation
BIG IP F5 GTM PresentationBIG IP F5 GTM Presentation
BIG IP F5 GTM Presentation
 
Server model-comparison
Server model-comparisonServer model-comparison
Server model-comparison
 
Dell Technologies Complete Portfolio on a single Page - ISO A0 Poster
Dell Technologies Complete Portfolio on a single Page - ISO A0 PosterDell Technologies Complete Portfolio on a single Page - ISO A0 Poster
Dell Technologies Complete Portfolio on a single Page - ISO A0 Poster
 
uCPE and VNFs Explained
uCPE and VNFs ExplaineduCPE and VNFs Explained
uCPE and VNFs Explained
 
Dell Technologies Portfolio On One Single Page - POSTER - v4b June 2020
Dell Technologies Portfolio On One Single Page - POSTER - v4b June 2020Dell Technologies Portfolio On One Single Page - POSTER - v4b June 2020
Dell Technologies Portfolio On One Single Page - POSTER - v4b June 2020
 
cisco-aci-virtualization-guide-52x
cisco-aci-virtualization-guide-52xcisco-aci-virtualization-guide-52x
cisco-aci-virtualization-guide-52x
 
NVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October SummaryNVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October Summary
 
Akraino and Edge Computing
Akraino and Edge ComputingAkraino and Edge Computing
Akraino and Edge Computing
 
Dell Technologies Dell EMC POWERMAX Storage On One Single Page - POSTER - v1a...
Dell Technologies Dell EMC POWERMAX Storage On One Single Page - POSTER - v1a...Dell Technologies Dell EMC POWERMAX Storage On One Single Page - POSTER - v1a...
Dell Technologies Dell EMC POWERMAX Storage On One Single Page - POSTER - v1a...
 
Dell Technologies Dell EMC ISILON Storage On One Single Page - POSTER - v1a S...
Dell Technologies Dell EMC ISILON Storage On One Single Page - POSTER - v1a S...Dell Technologies Dell EMC ISILON Storage On One Single Page - POSTER - v1a S...
Dell Technologies Dell EMC ISILON Storage On One Single Page - POSTER - v1a S...
 
The future of AIOps
The future of AIOpsThe future of AIOps
The future of AIOps
 
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUsDCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
 
5G Automotive, V2X Opportunity and Challenges
5G Automotive, V2X Opportunity and Challenges5G Automotive, V2X Opportunity and Challenges
5G Automotive, V2X Opportunity and Challenges
 
Arista Networks - Building the Next Generation Workplace and Data Center Usin...
Arista Networks - Building the Next Generation Workplace and Data Center Usin...Arista Networks - Building the Next Generation Workplace and Data Center Usin...
Arista Networks - Building the Next Generation Workplace and Data Center Usin...
 
IoT Device Fleet Management: Create a Robust Solution with Azure
IoT Device Fleet Management: Create a Robust Solution with AzureIoT Device Fleet Management: Create a Robust Solution with Azure
IoT Device Fleet Management: Create a Robust Solution with Azure
 
Cisco Digital Network Architecture - Introducing the Network Intuitive
Cisco Digital Network Architecture - Introducing the Network IntuitiveCisco Digital Network Architecture - Introducing the Network Intuitive
Cisco Digital Network Architecture - Introducing the Network Intuitive
 
Red Hat NFV solution overview
Red Hat NFV solution overview   Red Hat NFV solution overview
Red Hat NFV solution overview
 
AI Hardware Landscape 2021
AI Hardware Landscape 2021AI Hardware Landscape 2021
AI Hardware Landscape 2021
 
Cisco Security portfolio update
Cisco Security portfolio updateCisco Security portfolio update
Cisco Security portfolio update
 
A Software Defined WAN Architecture
A Software Defined WAN ArchitectureA Software Defined WAN Architecture
A Software Defined WAN Architecture
 

Similar a 組み込みから HPC まで ARM コアで実現するエコシステム

Similar a 組み込みから HPC まで ARM コアで実現するエコシステム (20)

Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligence
 
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfNVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
 
Hardware in Space
Hardware in SpaceHardware in Space
Hardware in Space
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoWebinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
 
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
 
Dell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation WebinarDell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation Webinar
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and ML
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
 
GIST AI-X Computing Cluster
GIST AI-X Computing ClusterGIST AI-X Computing Cluster
GIST AI-X Computing Cluster
 
Advances in GPU Computing
Advances in GPU ComputingAdvances in GPU Computing
Advances in GPU Computing
 
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoTVEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
 
NVIDIA Keynote #GTC21
NVIDIA Keynote #GTC21 NVIDIA Keynote #GTC21
NVIDIA Keynote #GTC21
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
 
GTC 2016 Opening Keynote
GTC 2016 Opening KeynoteGTC 2016 Opening Keynote
GTC 2016 Opening Keynote
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
 
Nvidia at SEMICon, Munich
Nvidia at SEMICon, MunichNvidia at SEMICon, Munich
Nvidia at SEMICon, Munich
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server Solution
 

Último

VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
ankushspencer015
 

Último (20)

Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 

組み込みから HPC まで ARM コアで実現するエコシステム

  • 1. Shinnosuke Furuya, Ph.D., HPC Developer Relations, NVIDIA 2021/08/26 組み込みから HPC まで ARM コアで実現するエコシステム
  • 2. 2 Automotive NVIDIA Drive HPC NVIDIA Grace CPU Network NVIDIA BlueField DPU AGENDA
  • 3. 3 “World’s Best Performing CEO” HARVARD BUSINESS REVIEW “100 Best Companies to Work For” FORTUNE “Best Places to Work in 2021” GLASSDOOR “World’s Best CEOs” BARRON’S “50 Smartest Companies” MIT TECH REVIEW “Most Innovative Companies” FAST COMPANY Founded in 1993 Jensen Huang, Founder & CEO 19,000 Employees $16.7B in FY21
  • 5. 5 TEGRA SOC GENERATIONS Tegra K1 Tegra X1 Parker CPU Arm Cortex A15 (4 cores) Arm Cortex A57 (4 cores) NVIDIA Denver 2 (2 cores) Arm Cortex A57 (4 cores) GPU Kepler (192 cores) Maxwell (256 cores) Pascal (256 cores) Products NVIDIA SHIELD Tablet NVIDIA Jetson TK1 Google Chromebook Nintendo Switch Nintendo Switch Lite NVIDIA SHIELD TV NVIDIA Jetson TX1 NVIDIA Jetson Nano Tesla Mercedes Benz Magic Leap 1 NVIDIA Jetson TX2 Xavier Orin Atlan CPU NVIDIA Carmel (8 cores) Arm Cortex A78AE (12 cores) new Arm GPU Volta (384 cores) next generation next generation Products Toyota NVIDIA Jetson AGX Xavier NVIDIA Jetson Xavier NX NVIDIA DRIVE Pegasus Mercedes Benz Volvo (many automotive companies)
  • 6. 6 NVIDIA ORIN 24.5 billion transistors 12 A78 (Hercules) ARM64 CPUs 254 INT8 TOPS - CUDA Tensor Core GPU + DLA 205 GB/s memory bandwidth 4 10Gbps ENET 8K 30 Dec | 4K 60 Enc – H264 / H265 / VP9 4 R52 Lock-step Pairs Integrated Safety Island ASIL-D Secure key storage FUSA ASIL-B Chip | ASIL-D Systematic Advanced, Software-defined Platform for Autonomous Machines
  • 7. 7 NVIDIA DRIVE ATLAN Fusing Next Generation AI and BlueField Industry’s First 1,000 TOPS SoC 400 Gbps Networking with Secure Gateway ASIL-D Safety Island TOPS is the New Horsepower
  • 8. 8 HYPERION 8 AV PLATFORM 2x Orin AV Computer 1x Orin IX Computer 4x Orin + 4x MLNX 3D GT Data Recorder Sensor Suite: 8 Cameras [8MP], 4 Fisheyes [3MP], 3 In- Cabin, 9 Radar, 2 Lidar State-of-the-Art Advances for Data Collection, Development and Testing
  • 9. 9 THE FUTURE CAR IS SOFTWARE DEFINED
  • 12. 12 GIANT MODELS PUSHING LIMITS OF EXISTING ARCHITECTURE Requires a New Architecture GPU 8,000 GB/sec CPU 200 GB/sec PCIE Gen4 (Effective Per GPU) 16 GB/sec Mem-to-GPU 64 GB/sec System Bandwidth Bottleneck DDR4 HBM2e GPU GPU GPU GPU x86 ELMo (94M) BERT-Large (340M) GPT-2 (1.5B) Megatron-LM (8.3B) T5 (11B) Turing-NLG (17.2B) GPT-3 (175B) 0.00001 0.0001 0.001 0.01 0.1 1 10 100 1000 2018 2019 2020 2021 2022 2023 Model Size (Trillions of Parameters) 100 TRILLION PARAMETER MODELS BY 2023
  • 13. 13 ANNOUNCING NVIDIA GRACE Breakthrough CPU Designed for Giant-Scale AI and HPC Applications FASTEST INTERCONNECTS >900 GB/s Cache Coherent NVLink CPU To GPU (14x) >600GB/s CPU To CPU (2x) NEXT GENERATION ARM NEOVERSE CORES >300 SPECrate2017_int_base est. Availability 2023 HIGHEST MEMORY BANDWIDTH >500GB/s LPDDR5x w/ ECC >2x Higher B/W 10x Higher Energy Efficiency
  • 14. 14 TURBOCHARGED TERABYTE SCALE ACCELERATED COMPUTING CURRENT x86 ARCHITECTURE DDR4 HBM2e Evolving Architecture For New Workloads INTEGRATED CPU-GPU ARCHITECTURE LPDDR5x HBM2e 3 DAYS FROM 1 MONTH Fine-Tune Training of 1T Model REAL-TIME INFERENCE ON 0.5T MODEL Interactive Single Node NLP Inference GPU GPU GPU GPU GRACE GRACE GRACE GRACE GPU GPU GPU GPU x86 Transfer 2TB in 30 secs Transfer 2TB in 1 secs GPU 8,000 GB/sec CPU 200 GB/sec PCIE Gen4 (Effective Per GPU) 16 GB/sec Mem-to-GPU 64 GB/sec GPU 8,000 GB/sec CPU 500 GB/sec NVLink 500 GB/sec Mem-to-GPU 2000 GB/sec Bandwidth claims rounded to nearest hundred for illustration. Performance results based on projections on these configurations Grace : 8xGrace and 8xA100 with 4th Gen NVIDIA NVLink Connection between CPU and GPU and x86: DGX A100. Training: 1 Month of training is Fine-Tuning a 1T parameter model on a large custom data set on 64xGrace+64xA100 compared to 8xDGXA100 (16xX86+64xA100) Inference: 530B Parameter model on 8xGrace+8xA100 compared to DGXA100.
  • 15. 15 ANNOUNCING THE WORLD’S FASTEST SUPERCOMPUTER FOR AI 20 Exaflops of AI Accelerated w/ NVIDIA Grace CPU and NVIDIA GPU HPC and AI For Scientific and Commercial Apps Advance Weather, Climate, and Material Science
  • 17. 17 INTRODUCING NVIDIA BLUEFIELD-3 DPU First 400Gb/s Data Processing Unit Offloads and Accelerates Data Center Infrastructure Isolates Application from Control and Management Plane Powerful CPU – 16x Arm A78 Cores Datapath Accelerator – 16x Cores, 256 Threads Process Networking, Storage, and Security at 400 Gbps
  • 18. 18 INTRODUCING NVIDIA BLUEFIELD-3 DPU First 400Gb/s Data Processing Unit 22 Billion Transistors 400Gb/s Ethernet & InfiniBand Connectivity 400Gb/s Crypto Acceleration 18M IOP/s Elastic Block Storage 300 Equivalent x86 Cores CONNECTX-7 DATA PATH ACCELERATOR PCIe GEN 5.0 DDR5 MEMORY INTERFACE ARM CORES ACCELERATION ENGINES
  • 19. 19 BLUEFIELD DPU GENERATIONS BlueField BlueField-2 BlueField-3 Port speed 2 x 100Gb/s InfiniBand and Ethernet 2 x 100Gb/s, 1 x 200Gb/s InfiniBand and Ethernet 1 x 400Gb/s, 2x200Gb/s InfiniBand and Ethernet Performance Bandwidth: 200Gb/s DPDK Max Msg Rate:150Mpps RDMA max msg rate: 200Mpps Bandwidth: 200Gb/s DPDK Max Msg Rate: 215Mpps RDMA max msg rate: 215Mpps Bandwidth: 400Gb/s DPDK max msg rate: 250Mpps RDMA max msg rate: 330Mpps Modulation NRZ NRZ & 50G PAM4 NRZ & 100G PAM4 DDR Channels DDR4-2400MT/s Dual channels DDR4-3200MT/s Single channel 2 x DDR5-5600 Interfaces Max Arm Cores 16 x A72 Arm cores 8 x A72 Arm cores 16 x A78 Arm cores (Hercules) Embedded ASIC ConnectX-5 ConnectX-6 Dx ConnectX-7 PCIe Gen3.0 x32 / Gen4.0 x16 Gen4.0 x16 Gen5.0 x32
  • 20. 20 NVIDIA DOCA Enabling Broad BlueField Partner Ecosystem Software Development Framework for BlueField DPUs Offload, Accelerate, and Isolate Infrastructure Processing Support for Hyperscale, Enterprise, Supercomputing and Hyperconverged Infrastructure Software Compatibility for Generations of BlueField DPUs DOCA is for DPUs what CUDA is for GPUs CYBER SECURITY EDGE STORAGE PLATFORM INFRASTRUCTURE ORCHESTRATION MANAGEMENT TELEMETRY SECURITY NETWORKING STORAGE ACCELERATION LIBRARIES DOCA
  • 21. 21 BLUEFIELD-3 USE CASES Unprecedented Innovation for Modern Data Centers Cloud Computing Bare-Metal I Virtualized I Containerized Private I Public I Hybrid Cloud Cyber Security Distributed Security | NGFW I Micro-segmentation HPC & AI Cloud-Native Supercomputing | Accelerated DLRM Telco & Edge Telco Cloud | CloudRAN | Edge Compute Media Streaming Visual High Quality I 8K Video I CDN Data Storage HCI I Elastic Block Storage I Instance Storage
  • 22. 22 BLUEFIELD ENABLES CLOUD-NATIVE SUPERCOMPUTING Collective offload with UCC accelerator Smart MPI progression User-defined algorithms 1.4X higher application performance Multi-Tenancy with Zero-Trust Security
  • 23. 23 NVIDIA DPU ROADMAP Exponential Growth in Data Center Infrastructure Processing 2020 2022 1X 10X 100X BlueField-2 7B Transistors 9 SPECint 0.7 TOPS 200 Gbps BlueField-3 22B Transistors 42 SPECint 1.5 TOPS 400 Gbps BlueField-4 64B Transistors 160 SPECint 1000 TOPS 800 Gbps 2024 DOCA — ONE ARCHITECTURE * BlueField-4 product to include opt-in GPU and non-GPU configurations
  • 25. 25 3 CHIPS. YEARLY LEAPS. ONE ARCHITECTURE.
  • 26. 26 MEGATRON cuQUANTUM MORPHEUS MERLIN MAXINE CLARA METROPOLIS ISAAC DRIVE AERIAL APPLICATION FRAMEWORKS CHIPS & SYSTEMS PLATFORM SOFTWARE NVIDIA ECOSYSTEM PLATFORM AGX DGX RTX EGX HGX FLEET COMMAND OMNIVERSE AI ENTERPRISE VGPU GPU CPU DPU GEFORCE RIVA
  • 27. 27 SUMMARY • Tegra SoC has a long history, and that experience has been applied to current Xavier, the next generation Orin, and the next generation Atlan • The future car is software defined, and NVIDIA provide whole ecosystem such as DRIVE Hyperion and DGX Systems • Grace CPU is designed for giant-scale AI and HPC applications • BlueField-3 DPU is the first 400 Gb/s data processing unit • DOCA enables broad BlueField ecosystem • GPU, CPU and DPU chips make a yearly leaps in one architecture