Sunku Ranganath
https://www.linkedin.com/in/sunkuranganath/
Legal Disclaimer
© 2019 Intel Corporation. Intel, the Intel logo, Intel Inside, the Intel Inside logo, Intel Experience What’s Inside, The Intel Experience What’s Inside logo, and Xeon are trademarks of Intel Corporation in the U.S.
and/or other countries. *Other names and brands may be claimed as the property of others.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.
Intel processors of the same SKU may vary in frequency or power as a result of natural variability in the production process.
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
The cost reduction scenarios described are intended to enable you to get a better understanding of how the purchase of a given Intel based product, combined with a number of situation-specific variables, might
affect future costs and savings. Circumstances will vary and there may be unaccounted-for costs related to the use and deployment of a given product. Nothing in this document should be interpreted as either a
promise of or contract for a given level of costs or cost reduction.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2,
SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please
refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804.
No computer system can be absolutely secure.
Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to
operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and
system configuration and you can learn more at http://www.intel.com/go/turbo.
Available on select Intel® processors. Requires an Intel® HT Technology-enabled system. Your performance varies depending on the specific hardware and software you use. Learn more by visiting
http://www.intel.com/info/hyperthreading.
Acknowledgements
Timothy Verrall
John Browne
Damien Power
Emma Collins
Jean Christophe Bouche
Krzysztof Kepka
Agenda
Platform Observability
Service Assurance
Closed Loop Automation
Platform Observability & Service Assurance (SA)
• Observability: Ability to expose state of the platform to ensure Service Level
Objectives are met
• Observability Considerations: Logging, Metrics & Tracing
• Communications Service Provider Context:
• Care about overall Service Assurance
• Both Monitoring & Observability are important
• Service Assurance
• Application of policies to ensure services meet a pre-defined service quality level
• FCAPS (Fault, Configuration, Accounting, Performance & Security) attributes on
existing network infrastructure
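As a rough illustration of "application of policies to ensure services meet a pre-defined service quality level", the sketch below checks measured metrics against a set of policies. The `Policy` class, the metric names, and the thresholds are all hypothetical and chosen only for illustration; they are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical service-level policy: the metric must stay at or below a limit."""
    metric: str
    max_value: float


def violations(policies, measurements):
    """Return the policies whose metric is missing or above its limit."""
    failed = []
    for policy in policies:
        value = measurements.get(policy.metric)
        if value is None or value > policy.max_value:
            failed.append(policy)
    return failed


# Illustrative values only.
policies = [Policy("packet_loss_pct", 0.1), Policy("latency_ms", 5.0)]
measurements = {"packet_loss_pct": 0.05, "latency_ms": 7.2}
failed = violations(policies, measurements)
print([p.metric for p in failed])
```

A missing metric is treated as a violation here, on the assumption that a service whose state cannot be observed cannot be assured.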
Three Key Elements of SA Platform
• Monitoring: Enabling deeper management and tracking of specific service levels
• Presentation: Reporting to enable reaction to service level changes
• Provisioning: Enable configuration of service levels based on workload or service priority
Figure: Service Assurance elements mapping to ETSI NFV Model
Collectd Monitoring Agent
Collectd: Why & What
• Statistics collection daemon
• Uses read plugins to collect metrics and write plugins to send them to an endpoint
• Open source
• Widely adopted
• Configurable Collection Interval
Various Plugin types:
• Input/Output
• Binding Plugins
• Logging Plugins
• Notification Plugins
• Other: Network plugin with both send/receive feature
Figure: Collectd Architecture
https://github.com/collectd/collectd
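The read/write plugin split can be pictured with a toy model. This is a minimal sketch and not collectd's real API: the `ToyDaemon` class, its method names, and the sample metrics are assumptions made for illustration. Read plugins produce metrics, write plugins receive them, and the collection interval is a configurable attribute.

```python
from typing import Callable, Dict, List

Metric = Dict[str, float]


class ToyDaemon:
    """Toy model of a read/write plugin pipeline; not collectd's real API."""

    def __init__(self, interval_s: float = 10.0) -> None:
        self.interval_s = interval_s  # configurable collection interval
        self.readers: List[Callable[[], Metric]] = []
        self.writers: List[Callable[[Metric], None]] = []

    def register_read(self, plugin: Callable[[], Metric]) -> None:
        self.readers.append(plugin)

    def register_write(self, plugin: Callable[[Metric], None]) -> None:
        self.writers.append(plugin)

    def collect_once(self) -> None:
        """One cycle: hand each read plugin's metrics to every write plugin."""
        for read in self.readers:
            metrics = read()
            for write in self.writers:
                write(metrics)


received: List[Metric] = []
daemon = ToyDaemon(interval_s=5.0)
daemon.register_read(lambda: {"cpu_load": 0.42})       # hypothetical read plugin
daemon.register_read(lambda: {"mem_used_mb": 1024.0})  # hypothetical read plugin
daemon.register_write(received.append)                 # hypothetical write plugin
daemon.collect_once()
print(received)
```

A real daemon would call the collection cycle once per interval; the sketch runs a single cycle so that it stays deterministic.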
Platform Telemetry Exposure & Integration
Figure: Platform telemetry exposure and integration across the NFVI (Compute, Network, Storage; Virtualised Compute, Network, Storage; Hypervisor [RT/SA KVM4NFV extensions]). Labels in the figure:
• Intel Infrastructure Management Technologies: Intel® Run Sure Technology (Resilient System Technology: MCA*, PCIe AER; Resilient Memory Technology: SDDC, DDDC+1, Mirroring), Intel® RDT (CMT, CAT, MBM, CDP), Intel® Rapid Storage Technology (RAID/NVMe), Intel® Node Manager, Intel® Management Engine, Power, Out Of Band Telemetry
• Collectd and Collectd Plugins: PMU* counters, NIC counters, vSwitch counters, RAS, Hypervisor/Container Counters, IPMI, Redfish, VES Plugin, Gnocchi, Kafka, Prometheus
• Common / Standard Open APIs: SNMP API, Perfmon MIB, Enterprise MIB, NFV Platform MIB, SYSLOG, IPFIX, sFlow
• Consumers: Monitoring/Analytics Systems (includes NetFlow Collectors), Vendor SA Middleware, Container Monitoring Solutions (Prometheus ….), OpenStack VIM (Ceilometer, Aodh, Vitrage, Congress)
• Fast Path: Triggers on events or counters, e.g. VM Stall Detection/RT Stall Detection, Working/Protect Failover, Local Corrective Action
• Slow Path: Periodic Pull 1/15mins
• Legend: In progress; Done/Integrated; Intel Components; Open Platform Collectors; Standard Open APIs
PMU*: Performance Monitoring Unit
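The difference between the fast path (triggers on events or counters) and the slow path (periodic pull) can be sketched as follows. The `TelemetryPaths` class, the threshold, and the sample values are hypothetical; the sketch only contrasts the two styles and does not model any particular collector.

```python
class TelemetryPaths:
    """Hypothetical contrast of event-driven (fast) and periodic (slow) telemetry."""

    def __init__(self, threshold, pull_every):
        self.threshold = threshold    # fast path fires above this counter value
        self.pull_every = pull_every  # slow path reports once per N samples
        self.samples = []
        self.fast_events = []
        self.slow_reports = []

    def observe(self, counter_value):
        self.samples.append(counter_value)
        # Fast path: react as soon as a counter crosses the threshold.
        if counter_value > self.threshold:
            self.fast_events.append(counter_value)
        # Slow path: report the average of the last window periodically.
        if len(self.samples) % self.pull_every == 0:
            window = self.samples[-self.pull_every:]
            self.slow_reports.append(sum(window) / len(window))


paths = TelemetryPaths(threshold=100, pull_every=3)
for value in [10, 20, 150, 30, 40, 50]:
    paths.observe(value)
print(paths.fast_events, paths.slow_reports)
```

The spike is visible to the fast path the moment it occurs, while the slow path only sees it averaged into the next periodic report.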
Multiple Closed Loops
Figure: closed-loop lifecycle (Design, Plan & Provision, Orchestrate, Monitor, Analyze, Optimize) driven by telemetry, with offline and online processing.
Offline feedback loop
Use cases (Loops)
• Capacity planning
• Peering planning
• Cache placement
• …
Near-real-time feedback loop and real-time feedback loop
Use cases (Loops)
• Service assurance
• Security operations
• …
Use cases (Loops)
• Traffic Engineering: Network Optimization
• Demand placement
• Workload placement…
Real-time/Near Real-time Loops - Automated
Source: https://pndablog.com/2017/06/05/feedback-loops-and-closed-loop-control/
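A closed loop of the monitor, analyze, act kind can be sketched in a few lines. This is an illustrative simulation under stated assumptions: the load figure, the utilisation target, and the replica-scaling action are hypothetical and stand in for whatever telemetry and actuation a real loop would use.

```python
import math


def monitor(total_load, replicas):
    """Observe per-replica utilisation (simulated; a real loop would read telemetry)."""
    return total_load / replicas


def analyze(utilisation, replicas, target):
    """Decide how many replicas keep utilisation at or below the target."""
    return max(1, math.ceil(utilisation * replicas / target))


def run_closed_loop(total_load, replicas, target=0.7, max_iterations=10):
    """Repeat monitor -> analyze -> act until the replica count stops changing."""
    history = [replicas]
    for _ in range(max_iterations):
        utilisation = monitor(total_load, replicas)
        desired = analyze(utilisation, replicas, target)
        if desired == replicas:
            break  # steady state: nothing to act on
        replicas = desired  # act
        history.append(replicas)
    return history


history = run_closed_loop(total_load=2.7, replicas=2)
print(history)
```

The loop is "closed" because each action changes what the next monitoring step observes; it stops once observation and policy agree.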
Networking Closed Loops – High Level Architecture
Figure: High-level architecture of networking closed loops.
• Layers: Infrastructure, Control, Application
• Components: Business Applications (Setting of Policy); SDN/NMS; Network Services; Cloud and Virtual Management; MANO; EMS; VNFM; Platform Analytics Systems; Local Platform Agent; Platform Resources; Forwarding Plane; Interfaces; Traffic
• Flows: Platform Telemetry (telemetry distribution or storage or …..); Policy Based Provisioning Control Loops
Independent Closed Loops: SDN, Cloud & Virtual Mgt, Platform
Closed Loops – Networking Stack
Figure: Closed loops across the networking stack.
• Layers (grouped as Services, Management & Control, Infrastructure): Application Layer; Network Data Analytics; Orchestration, Management, Policy; Cloud & Virtual Management; Network Control; Operating Systems; Data Path; Hardware/Disaggregated Hardware
• Closed Loop Reaction Time: Micro-seconds/Milliseconds; Seconds/Mins; Mins/Hours/Days
• Domain Knowledge: Local to Platform; End to End
• Loop actions: HW Enabled Loops (e.g. RAS); Enforce DP Loops (HA etc.); Enforce Local Policy; Enforce Network Domain Policy; Map Policies; Deployment Policies; Analyze/Plan Policies; Analytics
High Speed Control Loops are Close to the Platform
Closed Loops – Business Cases
Figure: Closed loops mapped to business cases.
• Business use case categories: Improved Customer Experience; Cloud Optimization & Efficiency; Security
• Business use cases: Edge Placement; Service Healing; Differentiated QoS; Service Optimization; Energy Optimization; Capacity Optimization; Cloud Configurations; Threat Detection; Threat Response
• Stack: Business Applications; AI/ML/DL; NFV Orchestrator (NFVO) [e.g. ONAP/OSM]; VNF Manager (VNFM); OpenStack, Kubernetes, and Bare Metal, each with a Telemetry I/F; Monitoring/Storage; collectd; Local Policy Enforcement Agent(s) for Local Dynamic Control; Platform(s): Feature Exposure, Provisioning, Telemetry
• Intel Infrastructure Management Tech: Intel RDT; Power; Intel RunSure
• Other labels: Policy Based Provisioning Control Loops; Actively Contributing
Closed Loop Resiliency Demo
Goal: Maximize Service Availability of Virtual Border Network Gateway (vBNG) in memory error scenario
Figure 1: Service Recovery Timeline
Figure 1 Source: OpenSAF and VMware from the Perspective of High Availability - Ali Nikzad, Ferhat Khendek, Maria Toeroe; Concordia University, Ericsson; SVM’2013 – Zurich – October 2013
Figure 2: Closed Loop Resiliency Demo with Kubernetes
More Details on Demo: https://networkbuilders.intel.com/social-hub/video/closed-loop-platform-automation-workload-resiliency-demo
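The slides do not describe how the demo is implemented. Purely as a hypothetical sketch of the stated goal, the function below plans a move of workloads away from nodes whose corrected memory error count exceeds a threshold. The node names, workload names, error counts, and threshold are invented for illustration, and no Kubernetes API is used.

```python
def plan_failover(corrected_errors, placement, threshold):
    """Return a new placement that moves workloads off nodes whose corrected
    memory error count exceeds the threshold (hypothetical policy)."""
    unhealthy = {node for node, count in corrected_errors.items() if count > threshold}
    healthy = sorted(node for node in corrected_errors if node not in unhealthy)
    new_placement = dict(placement)
    if not healthy:
        return new_placement  # nowhere safer to move; leave placement unchanged
    # Prefer the healthy node reporting the fewest corrected errors.
    best = min(healthy, key=lambda node: corrected_errors[node])
    for workload, node in placement.items():
        if node in unhealthy:
            new_placement[workload] = best
    return new_placement


# Illustrative values only.
errors = {"node-a": 42, "node-b": 0, "node-c": 3}
placement = {"vbng-1": "node-a", "vbng-2": "node-c"}
new_placement = plan_failover(errors, placement, threshold=10)
print(new_placement)
```

Acting on corrected errors, before an uncorrectable one occurs, is what would let such a loop move the workload while the service is still available.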
Closed Loop Automation (CLA) – Communities, Standards
• Open Network Automation Platform (ONAP) – Closed Loop Automation Management Platform (CLAMP)
• OPNFV Working Group for CLA
• ETSI Zero Touch Service Management (ZSM)
• ETSI Experiential Networked Intelligence (ENI)
Ex: OPNFV WG
Ex: ONAP CLAMP
Use Cases & Gaps
• 5G Network Slicing
• Demand based Energy Savings
• Workload Resiliency
• Noisy Neighbor Detection & Avoidance
• And many more….
Figure: 5G Network Slicing Architecture
Source: https://www.researchgate.net/figure/5G-network-slicing-architecture_fig1_324175599
Gaps, On Going Work
• Telemetry tagging
• Policy delivery & management across VIM to NFVI
Summary
Platform Observability & Monitoring play a crucial role in ensuring service assurance
Platform telemetry, alongside application telemetry, strongly differentiates services
Various levels of closed loops are required for autonomous networks
Real-time & Near-Real-time closed loops require automation
Call To Action
Collaborate through Open Source Communities
Figure out use cases of interest
Leverage relevant infrastructure telemetry
Service Assurance “Phased” Evolution for NFV/SDN
• Strategic Framework for SA “Phase” Evolution
  • Phase 1 - Equivalence (Virtualized + Interworking with existing management systems)
  • Phase 2 - Automated by MANO+SDN Controller
  • Phase 3 - Predict failures and adapt automatically
Phase 1: Platform Service Assurance - Equivalence
• Platform Service Assurance supporting:
  • Intel RAS Technologies
  • Cache Config & Monitoring
  • BIOS Config & Reporting
  • Fastpath DPDK Interface Reporting
  • Fastpath DPDK Keep Alive
  • Virtual Switch Health
  • QAT Watchdog
  • Host Health
  • …….
Phase 2: Platform Service Assurance (MANO + SDN Controller)
• VIM and above, support:
  • Enable RAS Technologies
  • Enable Watchdog Metrics
  • Enable DPDK and Keep Alive
  • Enable Host Health
  • Policy Based Provisioning
  • …
Phase 3: Predictive Platform Service Assurance
• Predict Failures and Adapt Automatically:
  • Automated and Adaptive to changes notified in metrics
  • Closed loop and Dynamic SA environment
Evolving from Equivalence towards NFV/SDN Automation
Status labels in the figure: Never Stops; Solution of the day; Under Construction
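The slides do not say how Phase 3 prediction would be done. As one simple, hypothetical illustration of "predict failures and adapt automatically", the sketch below fits a least-squares line to equally spaced metric samples and estimates how many more sampling intervals remain before the trend reaches a threshold. The metric values and threshold are invented; a real system would likely use a richer model.

```python
def samples_until_threshold(values, threshold):
    """Fit a least-squares line to equally spaced samples and estimate how many
    further samples remain before the line reaches the threshold.

    Returns None when there are too few samples or the trend is flat or falling.
    """
    n = len(values)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope  # sample index where the line hits the threshold
    return max(0.0, crossing - (n - 1))


# Illustrative values only: an error counter growing by 2 per sample.
remaining = samples_until_threshold([0, 2, 4, 6], threshold=10)
print(remaining)
```

An estimate like this could feed a closed loop that acts before the threshold is crossed rather than after.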
Platform Plugins Contributed by Intel
Plugin Domain | Description
Intel RunSure/RAS | Mcelog, PCIe AER, logparser: Metrics & notifications pertaining to Intel RunSure technologies
Intel_RDT | Resource Director Technology related metrics
Virt | Libvirt related metrics
OVS | Ovs_stats, ovs_events: Metrics related to Open vSwitch
DPDK | Dpdk_stats, dpdk_events, hugepages: DPDK related metrics
OpenStack | Gnocchi, Aodh: Integration in OpenStack projects
Cloud | Write_Kafka, Write_Prometheus, VES: Integration into various cloud platforms
Storage | RAID, NVMe: Storage related metrics
Power/Energy | CPUFreq, Turbostat: Frequency & power related metrics
Platform | IPMI, RedFish, PMU: Out of Band metrics & platform counters
Infrastructure Metrics are as Crucial as Application Metrics
OPNFV Barometer
Barometer Strategy:
• Ensure platform metrics/events are accessible through open industry standard interfaces.
• Demonstrate IA platform technologies can be monitored, consumed and actioned in real time
One Click Install:
• Easy install/configuration for customers
• One command to install Collectd/Influxdb/Grafana
• Three container approach for Collectd:
  • Stable Container: latest stable branch
  • Master Container: up to date with master
  • Experimental Container: cherry pick features of interest
Source: https://opnfv-barometer.readthedocs.io/en/latest/release/userguide/docker.userguide.html

 

Recently uploaded (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Platform Observability and Infrastructure Closed Loops

  • 3. Acknowledgements: Timothy Verrall, John Browne, Damien Power, Emma Collins, Jean Christophe Bouche, Krzysztof Kepka
  • 5. Platform Observability & Service Assurance (SA)
      • Observability: the ability to expose the state of the platform to ensure Service Level Objectives are met
      • Observability considerations: logging, metrics & tracing
      • Communications Service Provider context:
          • Care about overall Service Assurance
          • Both monitoring & observability are important
      • Service Assurance:
          • Application of policies to ensure services meet a pre-defined service quality level
          • FCAPS (Fault, Configuration, Accounting, Performance & Security) attributes on existing network infrastructure
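The link between exposed metrics and a Service Level Objective can be illustrated with a minimal Python sketch. It assumes a latency-style objective expressed as "at least a given fraction of samples at or below a threshold"; the `LatencySLO` type, the function names, and the numbers are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencySLO:
    """Hypothetical SLO: at least `target_ratio` of samples at or below `threshold_ms`."""
    threshold_ms: float
    target_ratio: float


def slo_compliance(samples_ms: Sequence[float], slo: LatencySLO) -> float:
    """Return the fraction of samples that meet the latency threshold."""
    if not samples_ms:
        raise ValueError("no samples to evaluate")
    good = sum(1 for s in samples_ms if s <= slo.threshold_ms)
    return good / len(samples_ms)


def slo_met(samples_ms: Sequence[float], slo: LatencySLO) -> bool:
    """True when the observed compliance ratio reaches the SLO target."""
    return slo_compliance(samples_ms, slo) >= slo.target_ratio


if __name__ == "__main__":
    # Illustrative values only: 3 of 4 samples are within 5 ms.
    slo = LatencySLO(threshold_ms=5.0, target_ratio=0.75)
    samples = [1.2, 3.4, 4.9, 7.5]
    print(slo_compliance(samples, slo), slo_met(samples, slo))
```

In a service assurance setting, a check like this would be one input to a policy that decides whether a service still meets its pre-defined quality level.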
  • 6. Three Key Elements of SA Platform
      • Monitoring: enabling deeper management and tracking of specific service levels
      • Presentation: reporting to enable reaction to service level changes
      • Provisioning: enable configuration of service levels based on workload or service priority
      • Figure: Service Assurance elements mapping to ETSI NFV Model
  • 7. Collectd Monitoring Agent
      • Collectd: Why & What
          • Statistics collection daemon
          • Uses read and write plugins to collect metrics and write them to an endpoint
          • Open source
          • Widely adopted
          • Configurable collection interval
      • Various plugin types:
          • Input/Output
          • Binding plugins
          • Logging plugins
          • Notification plugins
          • Other: Network plugin with both send and receive features
      • Figure: Collectd Architecture
      • https://github.com/collectd/collectd
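The read/write plugin model on this slide can be sketched in a few lines of plain Python. This is a toy stand-in, not collectd's actual plugin API: `MiniCollector`, `register_read`, `register_write`, and the sample metric are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# One metric sample: (plugin name, metric name, value, timestamp)
Sample = Tuple[str, str, float, float]
ReadFn = Callable[[], Dict[str, float]]
WriteFn = Callable[[List[Sample]], None]


class MiniCollector:
    """Toy collection daemon (hypothetical; not collectd's API)."""

    def __init__(self, interval: float = 10.0) -> None:
        self.interval = interval  # configurable collection interval, in seconds
        self._readers: Dict[str, ReadFn] = {}
        self._writers: List[WriteFn] = []

    def register_read(self, name: str, fn: ReadFn) -> None:
        """Register a read plugin that returns {metric name: value}."""
        self._readers[name] = fn

    def register_write(self, fn: WriteFn) -> None:
        """Register a write plugin that receives each batch of samples."""
        self._writers.append(fn)

    def collect_once(self, now: Optional[float] = None) -> List[Sample]:
        """Call every read plugin once and hand the batch to every write plugin."""
        ts = time.time() if now is None else now
        samples: List[Sample] = [
            (plugin, metric, float(value), ts)
            for plugin, read in self._readers.items()
            for metric, value in read().items()
        ]
        for write in self._writers:
            write(samples)
        return samples

    def run(self, cycles: int) -> None:
        """Run a fixed number of collection cycles at the configured interval."""
        for _ in range(cycles):
            self.collect_once()
            time.sleep(self.interval)


if __name__ == "__main__":
    collector = MiniCollector(interval=0.1)
    collector.register_read("load", lambda: {"shortterm": 0.5})  # fixed demo value
    collector.register_write(lambda batch: print(batch))
    collector.run(cycles=2)
```

Keeping readers and writers separate is what lets the same collected metrics be sent to several endpoints without changing the code that gathers them.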
  • 8. Platform Telemetry Exposure & Integration (diagram; labels grouped by apparent role)
      • NFVI: Compute, Network, Storage; Virtualised Compute, Virtualised Network, Virtualised Storage; Hypervisor [RT/SA KVM4NFV extensions]
      • Fast path: triggers on events or counters; VM Stall Detection / RT Stall Detection; local corrective action (e.g. Working/Protect failover)
      • Slow path: periodic pull (1/15 mins)
      • Counters: PMU* counters, NIC counters, vSwitch counters, hypervisor/container counters, RAS, Perfmon
      • Interfaces: SNMP, API, Enterprise MIB, NFV Platform MIB, SYSLOG, IPFIX, sFlow, IPMI, Redfish, Common / Standard Open APIs
      • Intel Infrastructure Management Technologies: Intel® Node Manager; Intel® Run Sure Technology (MCA*, PCIe AER, Resilient System Technology, Resilient Memory Technology, SDDC, DDDC+1, Mirroring); Intel® Rapid Storage Technology (RAID/NVMe); Intel® Management Engine; Intel® RDT (CMT, CAT, MBM, CDP); Power; Out Of Band Telemetry
      • Collectors and consumers: Collectd and Collectd plugins; open platform collectors; container monitoring solutions (Prometheus …); NetFlow collectors; vendor SA middleware; monitoring/analytics systems; Kafka; Prometheus; VES plugin
      • OpenStack VIM: Ceilometer, Aodh, Vitrage, Congress, Gnocchi
      • Legend: In progress; Done/Integrated; Intel components
      • PMU*: Performance Monitoring Unit
  • 9. Multiple Closed Loops
      • Loop stages: Plan & Provision, Design, Analyze, Optimize, Monitor, Orchestrate
      • Offline feedback loop (offline processing of telemetry). Use cases (loops):
          • Capacity planning
          • Peering planning
          • Cache placement
          • …
      • Near-real-time feedback loop and real-time feedback loop (online processing of telemetry; automated)
      • Use cases (loops):
          • Service assurance
          • Security operations
          • …
      • Use cases (loops):
          • Traffic engineering: network optimization
          • Demand placement
          • Workload placement…
      • Source: https://pndablog.com/2017/06/05/feedback-loops-and-closed-loop-control/
  • 10. Networking Closed Loops – High Level Architecture (diagram labels)
      • Platform: Platform Resources; Forwarding Plane; Interfaces; Traffic; Local Platform Agent; Platform Telemetry
      • Infrastructure control: SDN/NMS; Network Services; Cloud and Virtual Management; MANO; EMS; VNFM
      • Above the platform: Platform Analytics Systems; Business Applications; Setting of Policy
      • Application-independent closed loops: SDN, Cloud & Virtual Mgt, Platform
      • Flows: telemetry distribution or storage; Policy Based Provisioning; Control Loops
  • 11. Closed Loops – Networking Stack (diagram labels)
      • Stack layers: Application Layer; Network Data Analytics; Orchestration, Management, Policy; Cloud & Virtual Management; Network Control; Operating Systems; Data Path; Hardware / Disaggregated Hardware
      • Layer groups: Services; Management & Control; Infrastructure
      • Closed loop reaction time: Micro-seconds/Milliseconds; Seconds/Mins; Mins/Hours/Days
      • Domain knowledge: Local to Platform; End to End
      • Loop actions: HW Enabled Loops (e.g. RAS); Enforce DP Loops (HA etc.); Enforce Local Policy; Enforce Network Domain Policy; Map Policies; Deployment Policies; Analyze/Plan Policies
      • High-speed control loops are close to the platform
  • 12. Closed Loops – Business Cases (diagram labels)
      • Business use cases: Improved Customer Experience; Cloud Optimization & Efficiency; Edge Placement; Service Healing; Differentiated QoS; Service Optimization; Energy Optimization; Capacity Optimization; Cloud Configurations
      • Security: Threat Detection; Threat Response
      • Analytics: AI/ML/DL platform(s); Business Applications
      • Orchestration: NFV Orchestrator (NFVO) [e.g. ONAP/OSM]; VNF Manager (VNFM); OpenStack; Kubernetes; Bare Metal
      • Platform: Feature Exposure; Provisioning; Telemetry; Telemetry I/F; Local Policy Enforcement; Agent(s) for Local Dynamic Control; collectd
      • Intel Infrastructure Management Tech: Intel RDT; Power; Monitoring/Storage; Intel RunSure
      • Policy Based Provisioning; Control Loops
      • Legend: Actively Contributing
  • 13. Closed Loop Resiliency Demo
      • Goal: maximize service availability of a Virtual Border Network Gateway (vBNG) in a memory error scenario
      • Figure 1: Service Recovery Timeline
      • Figure 1 source: OpenSAF and VMware from the Perspective of High Availability, Ali Nikzad, Ferhat Khendek, Maria Toeroe (Concordia University, Ericsson), SVM’2013 – Zurich – October 2013
      • Figure 2: Closed Loop Resiliency Demo with Kubernetes
      • More details on the demo: https://networkbuilders.intel.com/social-hub/video/closed-loop-platform-automation-workload-resiliency-demo
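As a rough illustration of the monitor, decide, act pattern behind a resiliency loop of this kind, the sketch below counts memory error events in a sliding time window and returns an action once a threshold is reached. It is not the demo's implementation: the class name, the threshold, the window length, and the `"failover_workload"` action string are all assumptions made for the example.

```python
from collections import deque
from typing import Deque, Optional


class MemoryErrorLoop:
    """Toy closed loop: monitor memory error events, decide whether to fail over.

    The threshold, window, and action name are illustrative assumptions,
    not values taken from the demo.
    """

    def __init__(self, max_errors: int = 3, window_s: float = 60.0) -> None:
        self.max_errors = max_errors
        self.window_s = window_s
        self._events: Deque[float] = deque()

    def observe(self, timestamp: float) -> Optional[str]:
        """Record one error event; return an action name if the policy trips."""
        self._events.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while self._events and timestamp - self._events[0] > self.window_s:
            self._events.popleft()
        if len(self._events) >= self.max_errors:
            self._events.clear()  # avoid re-triggering on the same burst
            return "failover_workload"
        return None


if __name__ == "__main__":
    loop = MemoryErrorLoop(max_errors=3, window_s=60.0)
    for ts in (0.0, 10.0, 100.0, 110.0, 120.0):
        print(ts, loop.observe(ts))
```

In a real deployment the returned action would be handed to an orchestrator; the point here is only that the decision is driven by platform telemetry rather than by the application failing first.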
  • 14. Closed Loop Automation (CLA) – Communities, Standards
      • Open Network Automation Platform (ONAP): Closed Loop Automation Management Platform (CLAMP)
      • OPNFV Working Group for CLA
      • ETSI Zero Touch Service Management (ZSM)
      • ETSI Experiential Networked Intelligence (ENI)
      • Figures: OPNFV WG example; ONAP CLAMP example
  • 15. Use Cases & Gaps
      • Use cases:
          • 5G network slicing
          • Demand-based energy savings
          • Workload resiliency
          • Noisy neighbor detection & avoidance
          • And many more…
      • Gaps, ongoing work:
          • Telemetry tagging
          • Policy delivery & management across VIM to NFVI
      • Figure: 5G Network Slicing Architecture. Source: https://www.researchgate.net/figure/5G-network-slicing-architecture_fig1_324175599
  • 16. Summary
      • Platform observability & monitoring play a crucial role in ensuring service assurance
      • Platform telemetry, alongside application telemetry, heavily differentiates services
      • Various levels of closed loops are required for autonomous networks
      • Real-time & near-real-time closed loops require automation
      • Call to action:
          • Collaborate through open source communities
          • Figure out use cases of interest
          • Leverage relevant infrastructure telemetry
  • 18. Service Assurance “Phased” Evolution for NFV/SDN
      • Strategic framework for SA “phase” evolution:
          • Phase 1: Equivalence (virtualized + interworking with existing management systems)
          • Phase 2: Automated by MANO + SDN Controller
          • Phase 3: Predict failures and adapt automatically
      • Phase 1, Platform Service Assurance (Equivalence). Platform Service Assurance supporting:
          • Intel RAS technologies
          • Cache config & monitoring
          • BIOS config & reporting
          • Fastpath DPDK interface reporting
          • Fastpath DPDK keep alive
          • Virtual switch health
          • QAT watchdog
          • Host health
          • …
      • Phase 2, Platform Service Assurance (MANO + SDN Controller). VIM and above support:
          • Enable RAS technologies
          • Enable watchdog metrics
          • Enable DPDK and keep alive
          • Enable host health
          • Policy based provisioning
          • …
      • Phase 3, Predictive Platform Service Assurance. Predict failures and adapt automatically:
          • Automated and adaptive to changes notified in metrics
          • Closed loop and dynamic SA environment
      • Evolving from Equivalence towards NFV/SDN automation
      • Diagram labels: Never Stops; Solution of the day; Under Construction
  • 19. Platform Plugins Contributed by Intel
      • Plugin domain: description
          • Intel RunSure/RAS: Mcelog, PCIe AER, logparser (metrics & notifications pertaining to Intel RunSure technologies)
          • Intel_RDT: Resource Director Technologies related metrics
          • Virt: Libvirt related metrics
          • OVS: Ovs_stats, ovs_events (metrics related to Open Virtual Switch)
          • DPDK: Dpdk_stats, dpdk_events, hugepages (DPDK related metrics)
          • OpenStack: Gnocchi, Aodh (integration in OpenStack projects)
          • Cloud: Write_Kafka, Write_Prometheus, VES (integration into various cloud platforms)
          • Storage: RAID, NVMe (storage related metrics)
          • Power/Energy: CPUFreq, Turbostat (frequency & power related metrics)
          • Platform: IPMI, RedFish, PMU (out-of-band metrics & platform counters)
      • Infrastructure metrics are as crucial as application metrics
  • 20. Barometer
      • Strategy:
          • Ensure platform metrics/events are accessible through open industry standard interfaces
          • Demonstrate that IA platform technologies can be monitored, consumed and actioned in real time
      • OPNFV Barometer One Click Install:
          • Easy install/configuration for customers
          • One command to install Collectd/InfluxDB/Grafana
      • Three-container approach for Collectd:
          • Stable container: latest stable branch
          • Master container: up to date with master
          • Experimental container: cherry-pick features of interest
      • Source: https://opnfv-barometer.readthedocs.io/en/latest/release/userguide/docker.userguide.html