13th ANNUAL WORKSHOP 2017
INTEL® OMNI-PATH STATUS, UPSTREAMING
AND ONGOING WORK
Todd Rimmer, Omni-Path Lead Software Architect
March, 2017
Intel Corporation
OpenFabrics Alliance Workshop 2017
LEGAL DISCLAIMERS
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED
BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED
WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY
PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH
MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS
AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING
IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF
ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or
"undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without
notice. Do not finalize a design with this information.
The cost reduction scenarios described in this document are intended to enable you to get a better understanding of how the purchase of a given Intel product, combined with a number of situation-specific variables,
might affect your future cost and savings. Circumstances will vary and there may be unaccounted-for costs related to the use and deployment of a given product. Nothing in this document should be interpreted as either
a promise of or contract for a given level of costs.
Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or
configuration may affect your actual performance.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families: Go to: Learn About Intel® Processor Numbers
All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm
The High-Performance Linpack (HPL) benchmark is used in the Intel® FastFabrics toolset included in the Intel® Fabric Suite. The HPL product includes software developed at the University of Tennessee, Knoxville,
Innovative Computing Libraries.
Intel, Intel Xeon, Intel Xeon Phi™ are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries.
Copyright © 2016, Intel Corporation
AGENDA
▪ Omni-Path Overview
▪ Omni-Path Innovations
▪ Omni-Path Upstreaming
▪ Omni-Path Ongoing Work
OMNI-PATH OVERVIEW
Omni-Path is a new fabric technology
• Omni-Path is NOT InfiniBand
Significant deployments in the Nov 2016 Top500 list
▪ Clusters: 28
▪ Flops: 43.7 PF
▪ Top10: 1 system (#6)
▪ Top15: 2 systems (#6, #12)
▪ Top50: 4 systems
▪ Top100: 10 systems
▪ Xeon Efficiency: 88.5%
▪ HPCG list: #3
INTEL® OMNI-PATH HOST SOFTWARE COMPONENTS
[Figure: Intel® OPA fabric diagram. Compute nodes (CN0 to CNn), management nodes, file system servers, boot servers, and login servers connect to the Intel® OPA fabric of edge switches and director switches; a LAN/WAN is also shown.]
Host Software Stack
• Runs on all Intel® OPA-connected
host nodes
• High performance, highly
scalable MPI implementation
and extensive set of upper layer
protocols
Fabric Management Stack
• Runs on OPA-connected
management nodes
• Initializes, configures and
monitors the fabric routing, QoS,
security, and performance
• Includes toolkit for TCO
functions: configuration,
monitoring, diags, and repair
Fabric Management GUI
• Runs on workstation with a
local screen/keyboard
• Provides interactive GUI access
to Fabric Management TCO
features (configuration,
monitoring, diagnostics,
element management drill
down)
LAYER 4: TRANSPORT LAYER AND KEY SOFTWARE
▪ OpenFabrics Interfaces (OFI) libfabric
• Framework providing an API that applications and middleware can use across multiple vendors and L4 protocols
• Allows vendor innovation without requiring application churn
▪ Performance Scaled Messaging (PSM)
• API and corresponding L4 protocol designed for the needs of HPC
▪ OpenFabrics Alliance verbs
• API and corresponding L4 protocol designed for RDMA I/O
OMNI-PATH LINK LAYER INNOVATIONS
▪ 8K and 10K MTUs
▪ Up to 31 VLs and 32 SLs
▪ LIDs extended to 24 bits
▪ Packet Integrity Protection - link-level retry
▪ Dynamic Lane Scaling - link width downgrade
▪ Traffic Flow Optimization - packet preemption
▪ SCs - topology deadlock avoidance
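To put the 24-bit LID extension in perspective, the minimal sketch below compares the raw number of bit patterns in a 16-bit LID field (the InfiniBand width) with a 24-bit field. The function names are hypothetical, and the count ignores any values a fabric reserves (for example for multicast or special addresses), so it is an upper bound rather than a usable-address count.

```python
"""Raw address-space arithmetic for LID widths (illustrative sketch only)."""

IB_LID_BITS = 16   # InfiniBand LID width
OPA_LID_BITS = 24  # extended LID width quoted on the slide


def raw_lid_values(bits: int) -> int:
    """Return how many distinct bit patterns a LID field of this width holds."""
    return 1 << bits


def growth_factor(old_bits: int, new_bits: int) -> int:
    """Return how many times larger the new raw space is than the old one."""
    return raw_lid_values(new_bits) // raw_lid_values(old_bits)


if __name__ == "__main__":
    print(f"{IB_LID_BITS}-bit LIDs: {raw_lid_values(IB_LID_BITS):,} raw values")
    print(f"{OPA_LID_BITS}-bit LIDs: {raw_lid_values(OPA_LID_BITS):,} raw values")
    print(f"growth: {growth_factor(IB_LID_BITS, OPA_LID_BITS)}x")
```

The eight extra bits multiply the raw space by 256, which is the headroom behind the scalability theme of the talk.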
FABRIC MANAGEMENT
Scalability for Exascale
▪ Optimized management protocols
• 2K MAD packets
• Allowing more efficient SMA, PMA, SA, PA protocols
• New attributes, attribute layouts, and fields
• P-Key based security for SMA packets
• New and extended PMA counters
• Architected PA protocol
▪ Optimized SA clients
• ibacm plug-ins, ib_sa use of ibacm
▪ Many new HW features, diagnostics, controls
• Drives new SMA and PMA attributes, fields, attribute layouts
▪ IB compatibility for key name services SA queries
• PathRecord
• MulticastMemberRecord
• ServiceRecord
• NodeRecord
• Notices
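A rough sense of what 2K MADs buy for large SA responses can be sketched as follows. The sketch assumes 56 bytes of MAD, RMPP, and SA headers per packet (the figure for InfiniBand SA MADs, assumed unchanged here for the 2K case), a hypothetical 64-byte record size, and records packed back to back. It ignores retries, acknowledgements, and windowing, so treat the numbers as an estimate, not a measurement.

```python
"""Estimate packets needed for a multi-packet SA response (illustrative)."""

IB_MAD_SIZE = 256        # bytes per InfiniBand MAD
OPA_MAD_SIZE = 2048      # bytes per 2K MAD, as quoted on the slide
SA_HEADER_OVERHEAD = 56  # assumption: MAD + RMPP + SA headers, in bytes


def payload_per_mad(mad_size: int, overhead: int = SA_HEADER_OVERHEAD) -> int:
    """Bytes of record data that fit in one MAD after the headers."""
    return mad_size - overhead


def mads_needed(response_bytes: int, mad_size: int,
                overhead: int = SA_HEADER_OVERHEAD) -> int:
    """Ceiling of response size over per-MAD payload, using integer math."""
    payload = payload_per_mad(mad_size, overhead)
    return -(-response_bytes // payload)


if __name__ == "__main__":
    records = 10_000   # hypothetical query result size
    record_bytes = 64  # hypothetical record size
    total = records * record_bytes
    print("256-byte MADs:", mads_needed(total, IB_MAD_SIZE))
    print("2K MADs:      ", mads_needed(total, OPA_MAD_SIZE))
```

Under these assumptions the same response needs roughly a tenth as many packets with 2K MADs, which is why the larger MAD size matters for SA scalability.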
OMNI-PATH OPEN SOURCE STRATEGY
▪ ALL OPA-specific host software is open sourced and upstreamed
• Drivers
• Providers/libraries
• Management software, including FM GUI
• OFA enhancements
• 2K MADs, extended LIDs, etc.
• Included in RHEL 7.3 and SLES 12 SP2
OMNI-PATH OPEN SOURCE STRATEGY
▪ OFA enhancements via community collaboration
• Ensuring ongoing app interop and multi-vendor OFA
▪ Older distros supported via “delta package”
• Intel-shipped additions to standard distro
• Minimal footprint of changes
• Surgically add OPA enablement
• Without changing fundamental OFA rev in distro
▪ Delta package used for new OPA features and optimizations
• Rapid delivery of OPA optimizations, fixes, special features
• src.rpms and open source on github/01org
▪ Create proper API abstractions to allow vendor innovation
• OFI/Libfabric
• Allow vendor innovation in drivers/libraries without OFA API churn
[Figure: diagram relating Developer, OFA and kernel.org, the distro, and the Intel OFA Delta for each distro]
INTEL OFA DELTA
[Figure: diagram with labels OFA Common Delta, OFA Common in-distro, OPA Specific Delta, OPA Specific in-distro]
▪ Minimize distro changes
▪ Avoid replacing common OFA code in distro
• Changes driven through standard open processes
• OPA support fully in current distros
▪ OPA-specific value adds
• New features not yet in distro
• Items where Intel is Maintainer
• HFI Driver, OPA libraries/providers, OPA Mgmt
• All open sourced and upstreamed
CHALLENGES
▪ OFA core and verbs are very InfiniBand oriented
• Even forced design decisions in OPA HW architecture
• No forward-compatible way for apps to identify OPA cards/features
• Dependency on IB-specific enums, such as speed, rate, MTU, address ranges
▪ Support for extended LIDs not ideal
• Didn’t want to break existing APIs/ABIs
• Forced to overload fields in ibv_global_route to carry extended LID fields
▪ Difficult to get extended MTU into some components
• Don’t want to break existing APIs or ABIs
• IPoIB UD mode on OPA limited to 4K MTU
• Apps that use SA PathRecord or RDMA CM can be given 8K MTU
• Works if the value is passed through as given; issues arise if apps try to display or interpret it
• Assorted verbs benchmarks limited to 4K MTU
▪ Mgmt interfaces assume IB MADs
• Example: SRP needed IB PortInfo records from SA to find targets with DMA
• Example: SMPs can’t be sent from a UD QP
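The MTU point above can be illustrated with a small sketch. InfiniBand verbs encode MTU as an enum with codes 1 through 5 (256 to 4096 bytes). The codes 6 and 7 used below for 8K and 10K are an assumption made for illustration, and the helper names are hypothetical. The sketch shows why code that only passes the MTU value through keeps working, while code that tries to interpret it with an IB-only table does not.

```python
"""Why IB-only MTU enums trip on extended MTUs (illustrative sketch)."""

# InfiniBand MTU enum codes and their sizes in bytes.
IB_MTU_ENUM = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}

# Assumption for illustration: extended codes for the 8K and 10K MTUs.
EXTENDED_MTU_ENUM = {**IB_MTU_ENUM, 6: 8192, 7: 10240}


def decode_mtu(code: int, table: dict) -> int:
    """Translate an MTU enum code to bytes, failing loudly on unknown codes."""
    try:
        return table[code]
    except KeyError:
        raise ValueError(f"unknown MTU code {code}") from None


def pass_through(path_record: dict) -> dict:
    """Copy the MTU code into connection parameters without interpreting it."""
    return {"mtu": path_record["mtu"]}


if __name__ == "__main__":
    record = {"mtu": 6}          # hypothetical PathRecord carrying an 8K MTU
    print(pass_through(record))  # opaque copy keeps working
    print(decode_mtu(record["mtu"], EXTENDED_MTU_ENUM))
    try:
        decode_mtu(record["mtu"], IB_MTU_ENUM)
    except ValueError as err:
        print("IB-only decoder:", err)
```

An application that treats the field as opaque behaves like `pass_through`; a benchmark or display tool built around the IB-only table behaves like the failing `decode_mtu` call.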
ONGOING WORK
▪ Continued OFA and fabric scalability work
▪ Raw packet snoop/injection interface to driver
• Enable Wireshark, fabric HW testing, fabric simulation, HW emulation
▪ Improved trace and debug features
• Historical traces for “after incident” analysis without log clutter
▪ SA RMPP stack scalability
• Timeouts too large; need a way to time out a large query quickly when there is no response
• Get ALL SA queries through a common code path (ibacm and plugins)
▪ Introduction of vNIC driver
• Ethernet over fabric, optimized for Omni-Path
• New Ethernet Management Agent (EMA) in hosts
• Allow use of vNIC IP addresses as fabric address in RDMA CM, ibacm
SUMMARY
▪ Intel® Omni-Path is a new fabric
• In production and installed at many sites, including Top500 sites
▪ Omni-Path is not InfiniBand; many HW & SW innovations
• Extended LIDs, 2K MADs, TFO, etc.
▪ Omni-Path is open sourced and integrated into OFA
• Some challenges, but now upstream and in standard distros
• Delta distribution model to support older distros
▪ Omni-Path work is ongoing
• Community collaboration
13th ANNUAL WORKSHOP 2017
THANK YOU
Todd Rimmer, Omni-Path Lead Software Architect
Intel Corporation

Intel, Intel Xeon, Intel Xeon Phi™ are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. Copyright © 2016, Intel Corporation
  • 3. AGENDA
      • Omni-Path Overview
      • Omni-Path Innovations
      • Omni-Path Upstreaming
      • Omni-Path Ongoing Work
  • 4. OMNI-PATH OVERVIEW
      • Omni-Path is a new fabric technology
          • Omni-Path is NOT InfiniBand
      • Significant deployments in the Nov 2016 Top 500 list
          • Clusters: 28
          • Flops: 43.7 PF
          • Top10: 1 system (#6)
          • Top15: 2 systems (#6, #12)
          • Top50: 4 systems
          • Top100: 10 systems
          • Xeon Efficiency: 88.5%
          • HPCG list: #3
  • 5. INTEL® OMNI-PATH HOST SOFTWARE COMPONENTS
      [Diagram labels: Mgmt Node 1, Mgmt Node 2, File Sys Server, Boot Server, Login Server, Compute Nodes CN0 to CNn, Director Switches, Edge Switches, Intel® OPA Fabric, Mgmt Work Station, LAN/WAN]
      • Host Software Stack
          • Runs on all Intel® OPA-connected host nodes
          • High performance, highly scalable MPI implementation and extensive set of upper layer protocols
      • Fabric Management Stack
          • Runs on OPA-connected management nodes
          • Initializes, configures and monitors the fabric routing, QoS, security, and performance
          • Includes toolkit for TCO functions: configuration, monitoring, diags, and repair
      • Fabric Management GUI
          • Runs on workstation with a local screen/keyboard
          • Provides interactive GUI access to Fabric Management TCO features (configuration, monitoring, diagnostics, element management drill down)
  • 6. LAYER 4: TRANSPORT LAYER AND KEY SOFTWARE
      • Open Fabrics Interface (OFI) libfabric
          • Framework providing an API applications and middleware can use for multiple vendors and L4 protocols
          • Allows vendor innovation w/o requiring application churn
      • Performance Scaled Messaging (PSM)
          • API and corresponding L4 protocol designed for the needs of HPC
      • Open Fabrics Alliance verbs
          • API and corresponding L4 protocol designed for RDMA IO
  • 7. OMNI-PATH LINK LAYER INNOVATIONS
      • 8K and 10K MTUs
      • Up to 31 VLs and 32 SLs
      • LIDs extended to 24 bits
      • Packet Integrity Protection - link level retry
      • Dynamic Lane Scaling - link width downgrade
      • Traffic Flow Optimization - packet preemption
      • SCs - topology deadlock avoidance
  • 8. FABRIC MANAGEMENT: Scalability for Exascale
      • Optimized management protocols
          • 2K MAD packets
          • Allowing more efficient SMA, PMA, SA, PA protocols
          • New attributes, attribute layouts, and fields
          • P-Key based security for SMA packets
          • New and extended PMA counters
          • Architected PA protocol
      • Optimized SA clients
          • ibacm plug-ins, ib_sa use of ibacm
      • Many new HW features, diagnostics, controls
          • Drives new SMA and PMA attributes, fields, attribute layouts
      • IB compatibility for key name services SA queries
          • PathRecord
          • MulticastMemberRecord
          • ServiceRecord
          • NodeRecord
          • Notices
  • 9. OMNI-PATH OPEN SOURCE STRATEGY
      • ALL OPA specific host software is open sourced and upstreamed
          • Drivers
          • Providers/libraries
          • Management software, including FM GUI
          • OFA enhancements
          • 2K MADs, extended LIDs, etc.
          • Included in RHEL 7.3 and SLES 12 SP2
  • 10. OMNI-PATH OPEN SOURCE STRATEGY
      • OFA enhancements via community collaboration
          • Ensuring ongoing app interop and multi-vendor OFA
      • Older distros supported via “delta package”
          • Intel shipped additions to standard distro
          • Minimal footprint of changes
          • Surgically add OPA enablement
          • Without changing fundamental OFA rev in distro
      • Delta package used for new OPA features and optimizations
          • Rapid delivery of OPA optimizations, fixes, special features
          • src.rpms and open source on github/01org
      • Create proper API abstractions to allow vendor innovation
          • OFI/Libfabric
          • Allow vendor innovation in drivers/libraries without OFA API churn
      [Diagram labels: OFA and kernel.org, Developer, Intel OFA Delta for each distro, distro]
  • 11. INTEL OFA DELTA
      [Diagram labels: OFA Common Delta, OFA Common in-distro, OPA Specific Delta, OPA Specific in-distro]
      • Minimize distro changes
      • Avoid replacing common OFA code in distro
          • Changes driven through standard open processes
          • OPA support fully in current distros
      • OPA specific value adds
          • New features not yet in distro
          • Items where Intel is Maintainer
          • HFI Driver, OPA libraries/providers, OPA Mgmt
          • All open sourced and upstreamed
  • 12. CHALLENGES
      • OFA core and verbs are very InfiniBand oriented
          • Even forced design decisions in OPA HW architecture
          • No forward compatible way for apps to identify OPA cards/features
          • Dependency on IB specific enums, such as speed, rate, MTU, address ranges
      • Support for extended LIDs not ideal
          • Didn’t want to break existing APIs/ABIs
          • Forced to overload fields in ibv_global_route to carry extended LID fields
      • Difficult to get extended MTU into some components
          • Don’t want to break existing APIs or ABIs
          • IPoIB UD mode on OPA limited to 4K MTU
          • Apps that use SA PathRecord or RDMA CM can be given 8K MTU
          • Works if passed through as given; issues if apps try to display or interpret it
          • Assorted verbs benchmarks limited to 4K MTU
      • Mgmt interfaces assume IB MADs
          • Example: SRP needed IB PortInfo records from SA to find targets with DMA
          • Example: SMPs can’t be sent from a UD QP
  • 13. ONGOING WORK
      • Continued OFA and fabric scalability work
      • Raw packet snoop/injection interface to driver
          • Enable wireshark, fabric HW testing, fabric simulation, HW emulation
      • Improved trace and debug features
          • Historical traces for “after incident” analysis without log clutter
      • SA RMPP stack scalability
          • Timeouts too large; need a way to time out a large query quickly when there is no response
          • Get ALL SA queries through a common code path (ibacm and plugins)
      • Introduction of vNIC driver
          • Ethernet over fabric, optimized for Omni-Path
          • New Ethernet Management Agent (EMA) in hosts
          • Allow use of vNIC IP addresses as fabric address in RDMACM, ibacm
  • 14. SUMMARY
      • Intel® Omni-Path is a new fabric
          • In production and installed at many sites, including Top500 sites
      • Omni-Path is not InfiniBand, many HW & SW innovations
          • Extended LIDs, 2K MADs, TFO, etc.
      • Omni-Path is open sourced and integrated into OFA
          • Some challenges, but now upstream and in standard distros
          • Delta distribution model to support older distros
      • Omni-Path work is ongoing
          • Community collaboration
  • 15. 13th ANNUAL WORKSHOP 2017 THANK YOU Todd Rimmer, Omni-Path Lead Software Architect Intel Corporation
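
The MTU item on the CHALLENGES slide can be illustrated with a small sketch. InfiniBand carries the MTU as an enumerated code (1 through 5 for 256 through 4096 bytes), so software that decodes the code with a fixed table has no entry for the 8K and 10K MTUs, while software that passes the code through unchanged is unaffected. The codes 6 and 7 used here for 8192 and 10240 bytes are an assumption made for illustration, and the function names are hypothetical; none of this is taken from the slides.

```python
# IB MTU codes (code -> bytes), as used by IB-oriented software.
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}

# ASSUMPTION: illustrative codes for the larger Omni-Path MTUs.
OPA_EXTRA_MTU_BYTES = {6: 8192, 7: 10240}


def decode_mtu_ib_only(code):
    """Decode as an IB-only application would; unknown codes raise."""
    try:
        return IB_MTU_BYTES[code]
    except KeyError:
        raise ValueError(f"unknown MTU code {code}") from None


def decode_mtu_tolerant(code):
    """Decode with the extended table; return None for codes still unknown."""
    table = {**IB_MTU_BYTES, **OPA_EXTRA_MTU_BYTES}
    return table.get(code)


def pass_through(path_record):
    """Hand the record on unchanged, without interpreting the MTU code."""
    return dict(path_record)
```

An application written like `decode_mtu_ib_only` fails as soon as it is handed a code above 5, which matches the "issues if apps try to display or interpret it" case; `pass_through` corresponds to the case that works.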