Building a GPU-enabled
OpenStack Cloud for HPC
Blair Bethwaite, Lance Wilson (and many others)
MONASH

eRESEARCH
Monash eResearch Centre: 

Enabling and Accelerating 21st Century Discovery through the
application of advanced computing, data informatics, tools and
infrastructure, delivered at scale, and built with a “co-design”
principle (researcher + technologist)
brought to you by
MASSIVE Business Plan 2013 / 2014 DRAFT
• UniMelb, as lead agent for Nectar, established first Node/site of the
Research Cloud in Jan 2012 and opened doors to the research
community
• Now seven Nodes (10+ DCs) and >40k cores around Australia
• Nectar established an OpenStack ecosystem for research computing in
Australia
• M3 built as first service in a new “monash-03” zone of the Research
Cloud focusing on HPC (computing) & HPDA (data-analytics)
HPC
150 active projects
1000+ user accounts
100+ institutions across Australia
Interactive Vis
600+ users
Multi-modal Australian ScienceS Imaging and Visualisation Environment
Specialised Facility for Imaging and Visualisation
MASSIVE
Instrument Integration
Integrating with key Australian
Instrument Facilities.
– IMBL, XFM
– CryoEM
– MBI
– NCRIS: NIF, AMMRF
Large cohort of
researchers new to
HPC
~$2M per year funded by
partners and national
project funding
Partners
Monash University
Australian Synchrotron
CSIRO
Affiliate Partners
ARC Centre of Excellence in
Integrative Brain Function
ARC Centre of Excellence in
Advanced Molecular Imaging
M3 at Monash University (including recent upgrade)
A Computer for Next-Generation Data Science
2100 Intel Haswell CPU-cores
560 Intel Broadwell CPU-cores
NVIDIA GPU coprocessors for data processing and
visualisation:
• 48 NVIDIA Tesla K80
• 40 NVIDIA Pascal P100 (16GB PCIe) (upgrade)
• 8 NVIDIA Grid K1 (32 individual GPUs) for medium
and low end visualisation
A 1.15 petabyte Lustre parallel file system
100 Gb/s Ethernet Mellanox Spectrum
Supplied by Dell, Mellanox and NVIDIA
M3
Steve Oberlin, Chief Technology Officer
Accelerated Computing, NVIDIA
Alan Finkel, Australia’s Chief Scientist
www.openstack.org/science
openstack.org
The Crossroads of Cloud
and HPC: OpenStack
for Scientific Research
Exploring OpenStack cloud
computing for scientific workloads
Why OpenStack
‣Heterogeneous user requirements
‣same underlying infrastructure can be expanded to
accommodate multiple distinct and dynamic clusters (e.g.
bioinformatics focused, Hadoop)
‣Clusters need provisioning systems anyway
‣Forcing the cluster to be cloud-provisioned and managed makes it
easier to leverage other cloud resources, e.g. community science
cloud, commercial cloud
‣OpenStack is a big focus of innovation and effort in the industry -
benefits of association and osmosis
‣Business function boundaries at the APIs
Key tuning for HPC
‣ With hardware features & software tuning, HPC on a
virtualised cloud is very much possible and performance
is almost native
‣ CPU host-model / host-passthrough
‣ Expose host CPU and NUMA cell topology
‣ Pin virtual cores to physical cores
‣ Pin virtual memory to physical memory
‣ Back guest memory with hugepages
‣ Disable kernel consolidation features
‣ Remove host network overheads for high-performance data
http://frankdenneman.nl/2015/02/27/memory-deep-dive-numa-data-locality/
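Several of the tuning items above can be requested per flavor. The following is a minimal sketch, assuming the standard Nova flavor extra specs `hw:cpu_policy`, `hw:numa_nodes` and `hw:mem_page_size`; the slides do not state which keys M3 actually uses, and the helper names and the flavor name `example.hpc` are hypothetical.

```python
def hpc_extra_specs(numa_nodes=2, page_size="large"):
    """Return flavor extra specs for a pinned, NUMA-aware, hugepage-backed guest.

    Assumption: these are generic Nova extra-spec keys, not M3's actual config.
    """
    return {
        "hw:cpu_policy": "dedicated",      # pin virtual cores to physical cores
        "hw:numa_nodes": str(numa_nodes),  # expose NUMA cells to the guest
        "hw:mem_page_size": page_size,     # back guest memory with hugepages
    }


def flavor_set_command(flavor, specs):
    """Render an `openstack flavor set` command line for the given extra specs."""
    props = " ".join(f"--property {key}={value}" for key, value in sorted(specs.items()))
    return f"openstack flavor set {flavor} {props}"


# "example.hpc" is a placeholder flavor name used only for illustration.
print(flavor_set_command("example.hpc", hpc_extra_specs()))
```

CPU host-model / host-passthrough is a hypervisor-side setting (the `cpu_mode` option in the `[libvirt]` section of nova.conf) rather than a flavor property, so it is not covered by this sketch.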
https://www.mellanox.com/related-docs/whitepapers/WP_Solving_IO_Bottlenecks.pdf
M3 HPFS Integration
• special flavors for cluster instances specify a PCI passthrough SR-IOV vNIC
• hypervisor has NICs with VFs tied to data VLAN(s)
• data VLAN is RDMA capable, so e.g. Lustre can use the o2ib LNET driver
HPC-Cloud Interconnect
…
M3 Compute Performance
Snapshot
• Early system and virtualisation tuning on an m3d node
• Hardware & hypervisor:
• Dell R730, 2x E5-2680 v3 (2x 12 cores, HT off), 256GB RAM, 2x NVIDIA K80 cards,
Mellanox CX-4 50GbE DP
• Ubuntu Trusty host with Xenial kernel (4.4) and Mitaka Ubuntu Cloud archive hypervisor
(QEMU 2.5 + KVM)
• (Kernel samepage merging and transparent huge pages disabled to avoid performance
noise)
• Guest:
• M3 large GPU compute flavor (m3d) - 24 cores, 240GB RAM, 4x K80 GPUs, 1x Mellanox
CX-4 Virtual Function
• CentOS7 guest (3.10 kernel) running High Performance Linpack and Intel Optimised Linpack
So, all 👍 ?
• Early user on-boarding hit some speed bumps with inconsistent to poor performance on particular codes/workloads, e.g., slower than legacy clusters
• Initial tuning did not include hugepages because…
• Couldn’t start 240GB RAM guests backed by static hugepages - initial memory allocation in KVM is
single threaded and takes longer than 30 secs after which libvirt gives up and shoots guest
• enabled transparent hugepages (THP) for large memory guests, configured 1G static hugepages for
everything else and repeated tests for all hosts to ensure no “bad” nodes
• benchmarks from m3a nodes:
• Dell C6320, 2x E5-2680 v3 (2x 12 cores, HT off), 128GB RAM, Mellanox CX-4 Lx 25GbE DP
• Ubuntu Trusty host with Xenial kernel (4.4) and Mitaka Ubuntu Cloud archive hypervisor (QEMU
2.5 + KVM)
• M3 standard compute flavor (m3a) - 24 cores, 120GB RAM, 1x Mellanox CX-4 Lx Virtual Function
• CentOS7 guest (3.10 kernel) running High Performance Linpack
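To give a sense of scale for static hugepage backing, here is a rough sketch of the page counts involved. It assumes the m3a flavor's "120GB RAM" means 122880 MiB (120 GiB); the helper name is hypothetical.

```python
def hugepages_needed(guest_ram_mib, page_size_mib):
    """Number of hugepages needed to back a guest of the given size (rounded up)."""
    return -(-guest_ram_mib // page_size_mib)  # ceiling division


# Assumption: "120GB RAM" on the m3a flavor is 120 GiB = 122880 MiB.
m3a_ram_mib = 120 * 1024

print("1 GiB pages:", hugepages_needed(m3a_ram_mib, 1024))
print("2 MiB pages:", hugepages_needed(m3a_ram_mib, 2))
```

Static pages have to be reserved on the host ahead of time, which is typically done on the kernel command line so the memory is still unfragmented when it is claimed.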
[Figure: High Performance Linpack (HPL) performance characterisation, m3a nodes. Gigaflops (y-axis, 500 to 750) against Linpack matrix size (x-axis, 0 to 140,000) for three series: Hypervisor, Guest Without Hpages, Guest With Hpages.]
[Figure: m3a HPL 120k Ns. Gigaflops at Linpack matrix size 120,000 for Hypervisor, VM, and Hugepage backed VM.]
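The 120k matrix size in the comparison above nearly fills the guest. A quick worked check, assuming HPL stores an N x N matrix of 8-byte doubles and reading the m3a flavor's 120GB RAM as 120 GiB (both are assumptions, as is the helper name):

```python
def hpl_matrix_gib(n):
    """Memory taken by an n x n double-precision matrix, in GiB."""
    return 8 * n * n / 2**30


n = 120_000
guest_ram_gib = 120  # assumption: m3a flavor RAM read as 120 GiB

used = hpl_matrix_gib(n)
print(f"N={n}: {used:.1f} GiB, {100 * used / guest_ram_gib:.0f}% of guest RAM")
```

At that size almost all guest memory is touched by the benchmark, so it is a case where the way guest memory is backed can be expected to matter.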
GPU-accelerated OpenStack Instances
How-to?
1. Confirm hardware capability
• IOMMU - Intel VT-d, AMD-Vi (common in contemporary servers)
• GPU support
• https://etherpad.openstack.org/p/GPU-passthrough-model-success-failure
2. Prep nova-compute hosts/hypervisors
3. Configure OpenStack nova-scheduler
4. Create GPU flavor
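Part of step 1 can be checked from the host itself. This is a minimal sketch, assuming an Intel host that enables the IOMMU with `intel_iommu=on` and the usual sysfs location `/sys/kernel/iommu_groups`; the helper names are hypothetical and are not part of any OpenStack tool.

```python
import os


def intel_iommu_requested(cmdline):
    """True if the kernel command line asks for the Intel IOMMU to be enabled."""
    return "intel_iommu=on" in cmdline.split()


def iommu_group_count(sysfs_dir="/sys/kernel/iommu_groups"):
    """Number of IOMMU groups the kernel exposes (0 if the directory is missing)."""
    if not os.path.isdir(sysfs_dir):
        return 0
    return len(os.listdir(sysfs_dir))


if os.path.exists("/proc/cmdline"):
    with open("/proc/cmdline") as handle:
        print("intel_iommu=on requested:", intel_iommu_requested(handle.read()))
print("IOMMU groups found:", iommu_group_count())
```

A group count of zero usually means the IOMMU is disabled in firmware or was not enabled at boot.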
GPU-accelerated OpenStack Instances
1. Confirm hardware capability
2. Prep compute hosts/hypervisors
1. ensure IOMMU is enabled in BIOS
2. enable IOMMU in Linux, e.g., for Intel:
# in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt rd.modules-load=vfio-pci"
~$ update-grub
3. ensure no other drivers/modules claim GPUs, e.g., blacklist nouveau
4. Configure nova-compute.conf pci_passthrough_whitelist:
~$ lspci -nn | grep NVIDIA
03:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
82:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
# in /etc/nova/nova.conf:
pci_passthrough_whitelist=[{"vendor_id":"10de", "product_id":"15f8"}]
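The whitelist value is just the vendor:product pair that `lspci -nn` reports. As an illustrative sketch (the helper name is hypothetical, and the sample text is the lspci output from the slide), the entry could be derived like this:

```python
import json
import re

# Matches the "[vendor:product]" ID pair that `lspci -nn` prints for each device.
PCI_ID = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def whitelist_from_lspci(lspci_output, vendor_id="10de"):
    """Build whitelist entries for every distinct product of one vendor."""
    entries = []
    for line in lspci_output.splitlines():
        for vendor, product in PCI_ID.findall(line):
            entry = {"vendor_id": vendor, "product_id": product}
            if vendor == vendor_id and entry not in entries:
                entries.append(entry)
    return json.dumps(entries)


sample = """\
03:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
82:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
"""
print("pci_passthrough_whitelist=" + whitelist_from_lspci(sample))
```

Both GPUs share one product ID, so a single whitelist entry covers them.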
GPU-accelerated OpenStack Instances
1. Confirm hardware capability
2. Prep compute hosts/hypervisors
3. Configure OpenStack nova-scheduler
1. On nova-scheduler / cloud-controllers
# in /etc/nova/nova.conf:
pci_alias={"vendor_id":"10de", "product_id":"15f8", "name":"P100"}
scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
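Conceptually, the PCI filter keeps only hosts with enough free devices matching each requested alias. The following is a deliberately simplified model of that idea, not Nova's implementation; the function name and host names are hypothetical.

```python
def host_passes(free_devices, aliases, request):
    """Simplified model of a PCI passthrough scheduler filter.

    free_devices: list of (vendor_id, product_id) tuples currently free on the host
    aliases:      alias name -> (vendor_id, product_id)
    request:      alias name -> number of devices wanted
    """
    for name, count in request.items():
        if free_devices.count(aliases[name]) < count:
            return False
    return True


aliases = {"P100": ("10de", "15f8")}  # mirrors the pci_alias entry above
hosts = {
    "gpu-host": [("10de", "15f8"), ("10de", "15f8")],  # hypothetical host names
    "cpu-host": [],
}
request = {"P100": 2}

print([name for name, devices in hosts.items() if host_passes(devices, aliases, request)])
```

The model ignores NUMA affinity and the case of two aliases competing for the same device type.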
GPU-accelerated OpenStack Instances
1. Confirm hardware capability
2. Prep compute hosts/hypervisors
3. Configure OpenStack nova-scheduler
4. Create GPU flavor
~$ openstack flavor create --ram 122880 --disk 30 --vcpus 24 mon.m3.c24r120.2gpu-p100.mlx
~$ openstack flavor set mon.m3.c24r120.2gpu-p100.mlx --property pci_passthrough:alias='P100:2'
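The alias property is a comma-separated list of `name:count` pairs, and `--ram` is given in MiB. A small sketch of composing and reading that value (the helper names are hypothetical; the `MlxCX4-VF` alias name is taken from the flavor listing elsewhere in this deck):

```python
def build_alias_property(devices):
    """Render {'P100': 2} as a pci_passthrough:alias value such as 'P100:2'."""
    return ",".join(f"{name}:{count}" for name, count in devices.items())


def parse_alias_property(value):
    """Parse 'P100:2,MlxCX4-VF:1' back into {'P100': 2, 'MlxCX4-VF': 1}."""
    devices = {}
    for item in value.split(","):
        name, count = item.rsplit(":", 1)
        devices[name.strip()] = int(count)
    return devices


ram_mib = 120 * 1024  # 120 GiB expressed in MiB, as passed to --ram
print(ram_mib)
print(build_alias_property({"P100": 2, "MlxCX4-VF": 1}))
```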
GPU-accelerated OpenStack Instances
~$ openstack flavor show 56cd053c-b6a2-4103-b870-a83dd5d27ec1
+----------------------------+--------------------------------------------+
| Field                      | Value                                      |
+----------------------------+--------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                      |
| OS-FLV-EXT-DATA:ephemeral  | 1000                                       |
| disk                       | 30                                         |
| id                         | 56cd053c-b6a2-4103-b870-a83dd5d27ec1       |
| name                       | mon.m3.c24r120.2gpu-p100.mlx               |
| os-flavor-access:is_public | False                                      |
| properties                 | pci_passthrough:alias='P100:2,MlxCX4-VF:1' |
| ram                        | 122880                                     |
| rxtx_factor                | 1.0                                        |
| swap                       |                                            |
| vcpus                      | 24                                         |
+----------------------------+--------------------------------------------+
~$ openstack server list --all-projects --project d99… --flavor 56c…
+--------------------------------------+------------+--------+----------------------------------+
| ID                                   | Name       | Status | Networks                         |
+--------------------------------------+------------+--------+----------------------------------+
| 1d77bf12-0099-4580-bf6f-36c42225f2c0 | massive003 | ACTIVE | monash-03-internal=10.16.201.20  |
+--------------------------------------+------------+--------+----------------------------------+
GPU Instances - rough edges
• Hardware monitoring
• No OOB interface to monitor GPU hardware when it is assigned to
an instance (and doing so would require loading drivers in the host)
• P2P (peer-to-peer multi-GPU)
• PCIe topology not available in default guest configuration (not even
a PCIe bus on legacy QEMU i440fx machine type)
• PCIe ACS (Access Control Services - forces transactions through
the Root Complex which blocks/disallows P2P for security)
GPU Instances - rough edges
• PCIe security
• Compromised device could access privileged host memory via PCIe
ATS (Address Translation Services)
• Some special device registers should be blocked/proxied in multi-tenant environment
• Common to use cloud images for base OS+driver versioning and
standardisation, but new NVIDIA driver versions do not support some
existing hardware (e.g. K1)
• Requires multiple images or automated driver deployment/config -
no big thing just inconvenient
OpenStack Cyborg -
accelerator management
… aims to provide a general purpose
management framework for acceleration
resources (i.e. various types of
accelerators such as Crypto cards,
GPUs, FPGAs, NVMe/NOF SSDs, ODP,
DPDK/SPDK and so on)
(https://wiki.openstack.org/wiki/Cyborg)
https://review.openstack.org/#/c/448228/

Más contenido relacionado

La actualidad más candente

Red Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed_Hat_Storage
 
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackAdam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackShapeBlue
 
Building a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir MelnikBuilding a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir MelnikShapeBlue
 
The Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red Hat
The Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red HatThe Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red Hat
The Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red HatOpenStack
 
OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?Tim Bell
 
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017Cloud Native Day Tel Aviv
 
Introduction to Container Storage Interface (CSI)
Introduction to Container Storage Interface (CSI)Introduction to Container Storage Interface (CSI)
Introduction to Container Storage Interface (CSI)Idan Atias
 
CloudStack news
CloudStack newsCloudStack news
CloudStack newsShapeBlue
 
Wido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStackWido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStackShapeBlue
 
Paul Angus - what's new in ACS 4.11
Paul Angus - what's new in ACS 4.11Paul Angus - what's new in ACS 4.11
Paul Angus - what's new in ACS 4.11ShapeBlue
 
Speed up Digital Transformation with Openstack Cloud & Software Defined Storage
Speed up Digital Transformation with Openstack Cloud & Software Defined StorageSpeed up Digital Transformation with Openstack Cloud & Software Defined Storage
Speed up Digital Transformation with Openstack Cloud & Software Defined StorageMatthew Sheppard
 
Boyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceBoyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceShapeBlue
 
7 distributed storage_open_stack
7 distributed storage_open_stack7 distributed storage_open_stack
7 distributed storage_open_stackopenstackindia
 
Giles Sirett - welcome and CloudStack news
Giles Sirett - welcome and CloudStack news Giles Sirett - welcome and CloudStack news
Giles Sirett - welcome and CloudStack news ShapeBlue
 
OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...
OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...
OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...OpenNebula Project
 
Building a Microsoft cloud with open technologies
Building a Microsoft cloud with open technologiesBuilding a Microsoft cloud with open technologies
Building a Microsoft cloud with open technologiesAlessandro Pilotti
 
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'OpenStack Korea Community
 
Planning your OpenStack PoC
Planning your OpenStack PoCPlanning your OpenStack PoC
Planning your OpenStack PoCopenstackstl
 
Paul Angus - CloudStack Container Service
Paul  Angus - CloudStack Container ServicePaul  Angus - CloudStack Container Service
Paul Angus - CloudStack Container ServiceShapeBlue
 

La actualidad más candente (20)

Red Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and Future
 
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackAdam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
 
Qct quick stack ubuntu openstack
Qct quick stack ubuntu openstackQct quick stack ubuntu openstack
Qct quick stack ubuntu openstack
 
Building a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir MelnikBuilding a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir Melnik
 
The Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red Hat
The Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red HatThe Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red Hat
The Future of Cloud Software Defined Storage with Ceph: Andrew Hatfield, Red Hat
 
OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?OpenStack Paris 2014 - Federation, are we there yet ?
OpenStack Paris 2014 - Federation, are we there yet ?
 
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
 
Introduction to Container Storage Interface (CSI)
Introduction to Container Storage Interface (CSI)Introduction to Container Storage Interface (CSI)
Introduction to Container Storage Interface (CSI)
 
CloudStack news
CloudStack newsCloudStack news
CloudStack news
 
Wido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStackWido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStack
 
Paul Angus - what's new in ACS 4.11
Paul Angus - what's new in ACS 4.11Paul Angus - what's new in ACS 4.11
Paul Angus - what's new in ACS 4.11
 
Speed up Digital Transformation with Openstack Cloud & Software Defined Storage
Speed up Digital Transformation with Openstack Cloud & Software Defined StorageSpeed up Digital Transformation with Openstack Cloud & Software Defined Storage
Speed up Digital Transformation with Openstack Cloud & Software Defined Storage
 
Boyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceBoyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experience
 
7 distributed storage_open_stack
7 distributed storage_open_stack7 distributed storage_open_stack
7 distributed storage_open_stack
 
Giles Sirett - welcome and CloudStack news
Giles Sirett - welcome and CloudStack news Giles Sirett - welcome and CloudStack news
Giles Sirett - welcome and CloudStack news
 
OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...
OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...
OpenNebula TechDay Boston 2015 - Bringing Private Cloud Computing to HPC and ...
 
Building a Microsoft cloud with open technologies
Building a Microsoft cloud with open technologiesBuilding a Microsoft cloud with open technologies
Building a Microsoft cloud with open technologies
 
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
 
Planning your OpenStack PoC
Planning your OpenStack PoCPlanning your OpenStack PoC
Planning your OpenStack PoC
 
Paul Angus - CloudStack Container Service
Paul  Angus - CloudStack Container ServicePaul  Angus - CloudStack Container Service
Paul Angus - CloudStack Container Service
 

Similar a Building a GPU-enabled OpenStack Cloud for HPC - Blair Bethwaite, Monash University

CSCfi Computing Services 12/2014
CSCfi Computing Services 12/2014CSCfi Computing Services 12/2014
CSCfi Computing Services 12/2014Olli-Pekka Lehto
 
Sven Vogel: Running CloudStack and OpenShift with NetApp on KVM
Sven Vogel: Running CloudStack and OpenShift with NetApp on KVMSven Vogel: Running CloudStack and OpenShift with NetApp on KVM
Sven Vogel: Running CloudStack and OpenShift with NetApp on KVMShapeBlue
 
Red Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftRed Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftKangaroot
 
Ceph Day Amsterdam 2015 - Building your own disaster? The safe way to make C...
Ceph Day Amsterdam 2015 - Building your own disaster?  The safe way to make C...Ceph Day Amsterdam 2015 - Building your own disaster?  The safe way to make C...
Ceph Day Amsterdam 2015 - Building your own disaster? The safe way to make C...Ceph Community
 
Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentationRodrigo Missiaggia
 
Open cloud infrastructure built for the enterprise
Open cloud infrastructure built for the enterpriseOpen cloud infrastructure built for the enterprise
Open cloud infrastructure built for the enterpriseRedHatInc
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Community
 
Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...
Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...
Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...Ceph Community
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AITyrone Systems
 
Cisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStackCisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStackDataStax Academy
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyPeter Clapham
 
HPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...E-Commerce Brasil
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesKamesh Pemmaraju
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Red_Hat_Storage
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
Cloud Strategies for a modern hybrid datacenter - Dec 2015
Cloud Strategies for a modern hybrid datacenter - Dec 2015Cloud Strategies for a modern hybrid datacenter - Dec 2015
Cloud Strategies for a modern hybrid datacenter - Dec 2015Miguel Pérez Colino
 
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Sumeet Singh
 

Similar a Building a GPU-enabled OpenStack Cloud for HPC - Blair Bethwaite, Monash University (20)

CSCfi Computing Services 12/2014
CSCfi Computing Services 12/2014CSCfi Computing Services 12/2014
CSCfi Computing Services 12/2014
 
NSCC Training Introductory Class
NSCC Training Introductory Class NSCC Training Introductory Class
NSCC Training Introductory Class
 
Sven Vogel: Running CloudStack and OpenShift with NetApp on KVM
Sven Vogel: Running CloudStack and OpenShift with NetApp on KVMSven Vogel: Running CloudStack and OpenShift with NetApp on KVM
Sven Vogel: Running CloudStack and OpenShift with NetApp on KVM
 
Red Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftRed Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShift
 
Ceph Day Amsterdam 2015 - Building your own disaster? The safe way to make C...
Ceph Day Amsterdam 2015 - Building your own disaster?  The safe way to make C...Ceph Day Amsterdam 2015 - Building your own disaster?  The safe way to make C...
Ceph Day Amsterdam 2015 - Building your own disaster? The safe way to make C...
 
Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentation
 
Open cloud infrastructure built for the enterprise
Open cloud infrastructure built for the enterpriseOpen cloud infrastructure built for the enterprise
Open cloud infrastructure built for the enterprise
 
NSCC Training - Introductory Class
NSCC Training - Introductory ClassNSCC Training - Introductory Class
NSCC Training - Introductory Class
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 
Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...
Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...
Ceph Day Berlin: Building Your Own Disaster? The Safe Way to Make Ceph Storag...
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
Cisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStackCisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStack
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 
HPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big Data
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
Cloud Strategies for a modern hybrid datacenter - Dec 2015
Cloud Strategies for a modern hybrid datacenter - Dec 2015Cloud Strategies for a modern hybrid datacenter - Dec 2015
Cloud Strategies for a modern hybrid datacenter - Dec 2015
 
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
 

Building a GPU-enabled OpenStack Cloud for HPC - Blair Bethwaite, Monash University

  • 1. Building a GPU-enabled OpenStack Cloud for HPC
     Blair Bethwaite, Lance Wilson (and many others)
     Monash eResearch
  • 2. Monash eResearch Centre: Enabling and Accelerating 21st Century Discovery through the application of advanced computing, data informatics, tools and infrastructure, delivered at scale, and built with a “co-design” principle (researcher + technologist)
  • 3. bought to you by MASSIVE Business Plan 2013 / 2014 DRAFT
     • UniMelb, as lead agent for Nectar, established first Node/site of the Research Cloud in Jan 2012 and opened doors to the research community
     • Now seven Nodes (10+ DCs) and >40k cores around Australia
     • Nectar established an OpenStack ecosystem for research computing in Australia
     • M3 built as first service in a new “monash-03” zone of the Research Cloud focusing on HPC (computing) & HPDA (data-analytics)
  • 5. MASSIVE: Multi-modal Australian ScienceS Imaging and Visualisation Environment. Specialised Facility for Imaging and Visualisation
     HPC: 150 active projects, 1000+ user accounts, 100+ institutions across Australia
     Interactive Vis: 600+ users
     Instrument Integration: integrating with key Australian Instrument Facilities
       – IMBL, XFM
       – CryoEM
       – MBI
       – NCRIS: NIF, AMMRF
     Large cohort of researchers new to HPC
     ~$2M per year funded by partners and national project funding
     Partners: Monash University, Australian Synchrotron, CSIRO
     Affiliate Partners: ARC Centre of Excellence in Integrative Brain Function, ARC Centre of Excellence in Advanced Molecular Imaging
  • 6. M3 at Monash University (including recent upgrade): A Computer for Next-Generation Data Science
     2100 Intel Haswell CPU-cores
     560 Intel Broadwell CPU-cores
     NVIDIA GPU coprocessors for data processing and visualisation:
     • 48 NVIDIA Tesla K80
     • 40 NVIDIA Pascal P100 (16GB PCIe) (upgrade)
     • 8 NVIDIA Grid K1 (32 individual GPUs) for medium and low end visualisation
     A 1.15 petabyte Lustre parallel file system
     100 Gb/s Ethernet Mellanox Spectrum
     Supplied by Dell, Mellanox and NVIDIA
     M3: Steve Oberlin, Chief Technology Officer, Accelerated Computing, NVIDIA; Alan Finkel, Australia’s Chief Scientist
  • 7. The Crossroads of Cloud and HPC: OpenStack for Scientific Research
     Exploring OpenStack cloud computing for scientific workloads
     www.openstack.org/science
  • 8. Why OpenStack
     ‣ Heterogeneous user requirements
     ‣ same underlying infrastructure can be expanded to accommodate multiple distinct and dynamic clusters (e.g. bioinformatics focused, Hadoop)
     ‣ Clusters need provisioning systems anyway
     ‣ Forcing the cluster to be cloud-provisioned and managed makes it easier to leverage other cloud resources, e.g. community science cloud, commercial cloud
     ‣ OpenStack is a big focus of innovation and effort in the industry: benefits of association and osmosis
     ‣ Business function boundaries at the APIs
  • 9. Key tuning for HPC
     ‣ With hardware features & software tuning, HPC on a virtualised cloud is very much possible and performance is almost native
     ‣ CPU host-model / host-passthrough
     ‣ Expose host CPU and NUMA cell topology
     ‣ Pin virtual cores to physical cores
     ‣ Pin virtual memory to physical memory
     ‣ Back guest memory with hugepages
     ‣ Disable kernel consolidation features (e.g. kernel samepage merging)
     ‣ Remove host network overheads for high-performance data
     http://frankdenneman.nl/2015/02/27/memory-deep-dive-numa-data-locality/
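     A minimal sketch of how several of the tuning items on this slide could be expressed as Nova flavor extra specs. The slides do not say which properties M3 actually sets, so the mapping below is an assumption: `hw:cpu_policy`, `hw:numa_nodes` and `hw:mem_page_size` are standard Nova extra specs, while the flavor name `example.hpc.flavor` and both helper functions are hypothetical. The CPU model (host-model / host-passthrough) is a hypervisor-side setting in nova.conf and is not covered here.

```python
import shlex


def tuning_extra_specs(numa_nodes: int, page_size: str = "large") -> dict:
    """Map tuning items to Nova flavor extra specs (assumed mapping)."""
    return {
        "hw:cpu_policy": "dedicated",     # pin virtual cores to physical cores
        "hw:numa_nodes": str(numa_nodes), # expose NUMA cells to the guest
        "hw:mem_page_size": page_size,    # back guest memory with hugepages
    }


def flavor_set_command(flavor: str, specs: dict) -> str:
    """Build an `openstack flavor set` command line for the given specs."""
    args = ["openstack", "flavor", "set", flavor]
    for key, value in sorted(specs.items()):
        args += ["--property", f"{key}={value}"]
    return shlex.join(args)


if __name__ == "__main__":
    # Hypothetical flavor name, two NUMA cells as on a dual-socket host.
    print(flavor_set_command("example.hpc.flavor", tuning_extra_specs(2)))
```

     The helper only prints the command, so it can be reviewed before anything is applied to a real cloud.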
  • 10. https://www.mellanox.com/related-docs/whitepapers/WP_Solving_IO_Bottlenecks.pdf
  • 11. M3 HPFS Integration
     • special flavors for cluster instances which specify a PCI passthrough SR-IOV vNIC
     • hypervisor has NICs with VFs tied to data VLAN(s)
     • data VLAN is RDMA capable so e.g. Lustre can use the o2ib LNET driver
  • 12. HPC-Cloud Interconnect …
  • 13. M3 Compute Performance Snapshot
     • Early system and virtualisation tuning on an m3d node
     • Hardware & hypervisor:
       • Dell R730, 2x E5-2680 v3 (2x 12 cores, HT off), 256GB RAM, 2x NVIDIA K80 cards, Mellanox CX-4 50GbE DP
       • Ubuntu Trusty host with Xenial kernel (4.4) and Mitaka Ubuntu Cloud archive hypervisor (QEMU 2.5 + KVM)
       • (Kernel samepage merging and transparent huge pages disabled to avoid performance noise)
     • Guest:
       • M3 large GPU compute flavor (m3d) - 24 cores, 240GB RAM, 4x K80 GPUs, 1x Mellanox CX-4 Virtual Function
       • CentOS7 guest (3.10 kernel) running High Performance Linpack and Intel Optimised Linpack
  • 16. So, all 👍 ?
     • Early user on-boarding hit some speed bumps with inconsistent to poor performance on particular codes/workloads, e.g., slower than legacy clusters
     • Initial tuning did not include hugepages because…
       • Couldn’t start 240GB RAM guests backed by static hugepages: initial memory allocation in KVM is single threaded and takes longer than 30 secs, after which libvirt gives up and shoots the guest
     • enabled transparent hugepages (THP) for large memory guests, configured 1G static hugepages for everything else and repeated tests for all hosts to ensure no “bad” nodes
     • benchmarks from m3a nodes:
       • Dell C6320, 2x E5-2680 v3 (2x 12 cores, HT off), 128GB RAM, Mellanox CX-4 Lx 25GbE DP
       • Ubuntu Trusty host with Xenial kernel (4.4) and Mitaka Ubuntu Cloud archive hypervisor (QEMU 2.5 + KVM)
       • M3 standard compute flavor (m3a) - 24 cores, 120GB RAM, 1x Mellanox CX-4 Lx Virtual Function
       • CentOS7 guest (3.10 kernel) running High Performance Linpack
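     A rough sketch of the arithmetic behind static hugepage backing, assuming the host reports its pool through the usual /proc/meminfo fields (`HugePages_Total`, `HugePages_Free`, `Hugepagesize`). The sample text and both function names are illustrative and not taken from the M3 hosts; the 120GB guest size matches the m3a flavor on this slide.

```python
import math


def parse_hugepage_info(meminfo_text: str) -> dict:
    """Extract hugepage counters (pages) and page size (kB) from /proc/meminfo text."""
    wanted = {"HugePages_Total", "HugePages_Free", "Hugepagesize"}
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            info[key] = int(rest.split()[0])
    return info


def hugepages_needed(guest_ram_mib: int, hugepage_kib: int) -> int:
    """Number of static hugepages required to back a guest of the given size."""
    return math.ceil(guest_ram_mib * 1024 / hugepage_kib)


# Hypothetical sample of a host configured with 1G static hugepages.
SAMPLE_MEMINFO = """\
MemTotal:       131915264 kB
HugePages_Total:     120
HugePages_Free:      120
Hugepagesize:    1048576 kB
"""

if __name__ == "__main__":
    info = parse_hugepage_info(SAMPLE_MEMINFO)
    needed = hugepages_needed(122880, info["Hugepagesize"])  # 120GB guest
    print(f"need {needed} pages, {info['HugePages_Free']} free")
```

     Comparing the required page count with `HugePages_Free` before booting a guest is one way to spot a host that cannot back the flavor.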
  • 17. m3a nodes: High Performance Linpack (HPL) performance characterisation
     [Chart: Gigaflops (axis 500 to 750) against Linpack Matrix Size (axis 0 to 140,000); series: Hypervisor, Guest Without Hpages, Guest With Hpages]
  • 19. GPU-accelerated OpenStack Instances: How-to?
     1. Confirm hardware capability
        • IOMMU - Intel VT-d, AMD-Vi (common in contemporary servers)
        • GPU support
        • https://etherpad.openstack.org/p/GPU-passthrough-model-success-failure
     2. Prep nova-compute hosts/hypervisors
     3. Configure OpenStack nova-scheduler
     4. Create GPU flavor
  • 20. GPU-accelerated OpenStack Instances
     1. Confirm hardware capability
     2. Prep compute hosts/hypervisors
        1. ensure IOMMU is enabled in BIOS
        2. enable IOMMU in Linux, e.g., for Intel:
             # in /etc/default/grub:
             GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt rd.modules-load=vfio-pci"
             ~$ update-grub
        3. ensure no other drivers/modules claim GPUs, e.g., blacklist nouveau
        4. Configure nova-compute.conf pci_passthrough_whitelist:
             ~$ lspci -nn | grep NVIDIA
             03:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
             82:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
             # in /etc/nova/nova.conf:
             pci_passthrough_whitelist=[{"vendor_id":"10de", "product_id":"15f8"}]
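     As an illustration of step 4, the whitelist line can be derived from `lspci -nn` output instead of being typed by hand. This is only a sketch: the function name is hypothetical, the input is the two lspci lines shown on this slide, and it assumes the bracketed `[vendor:product]` pair is the only colon-separated hex pair on each line.

```python
import json
import re

# Matches the "[vvvv:pppp]" vendor/product pair printed by `lspci -nn`.
PCI_ID = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def whitelist_from_lspci(lspci_output: str, vendor_name: str = "NVIDIA") -> str:
    """Build a pci_passthrough_whitelist line from `lspci -nn` text."""
    entries = []
    for line in lspci_output.splitlines():
        if vendor_name not in line:
            continue
        match = PCI_ID.search(line)
        if match:
            entry = {"vendor_id": match.group(1), "product_id": match.group(2)}
            if entry not in entries:  # identical cards need one entry only
                entries.append(entry)
    return "pci_passthrough_whitelist=" + json.dumps(entries)


LSPCI_SAMPLE = """\
03:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
82:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
"""

if __name__ == "__main__":
    print(whitelist_from_lspci(LSPCI_SAMPLE))
```

     Both GPUs share one vendor/product pair, so the generated list has a single entry, matching the nova.conf line on the slide apart from whitespace.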
  • 21. GPU-accelerated OpenStack Instances
     1. Confirm hardware capability
     2. Prep compute hosts/hypervisors
     3. Configure OpenStack nova-scheduler
        1. On nova-scheduler / cloud-controllers
             # in /etc/nova/nova.conf:
             pci_alias={"vendor_id":"10de", "product_id":"15f8", "name":"P100"}
             scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
             scheduler_available_filters=nova.scheduler.filters.all_filters
             scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
             scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
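     A small consistency check, sketched under the assumption that the `pci_alias` value on the controllers and the `pci_passthrough_whitelist` value on the compute hosts are both available as JSON strings. The function name is hypothetical; the sample values are the ones shown on the slides.

```python
import json


def alias_matches_whitelist(pci_alias: str, whitelist: str) -> bool:
    """True if the alias's vendor/product pair appears in the whitelist."""
    alias = json.loads(pci_alias)
    allowed = json.loads(whitelist)
    return any(
        entry.get("vendor_id") == alias.get("vendor_id")
        and entry.get("product_id") == alias.get("product_id")
        for entry in allowed
    )


ALIAS = '{"vendor_id":"10de", "product_id":"15f8", "name":"P100"}'
WHITELIST = '[{"vendor_id":"10de", "product_id":"15f8"}]'

if __name__ == "__main__":
    print(alias_matches_whitelist(ALIAS, WHITELIST))
```

     An alias whose IDs are missing from the whitelist would leave the scheduler with no matching devices, so a check like this can catch a typo before any instance fails to schedule.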
  • 22. GPU-accelerated OpenStack Instances
     1. Confirm hardware capability
     2. Prep compute hosts/hypervisors
     3. Configure OpenStack nova-scheduler
     4. Create GPU flavor
          ~$ openstack flavor create --ram 122880 --disk 30 --vcpus 24 mon.m3.c24r120.2gpu-p100.mlx
          ~$ openstack flavor set mon.m3.c24r120.2gpu-p100.mlx --property pci_passthrough:alias='P100:2'
  • 23. GPU-accelerated OpenStack Instances
     ~$ openstack flavor show 56cd053c-b6a2-4103-b870-a83dd5d27ec1
     +----------------------------+--------------------------------------------+
     | Field                      | Value                                      |
     +----------------------------+--------------------------------------------+
     | OS-FLV-DISABLED:disabled   | False                                      |
     | OS-FLV-EXT-DATA:ephemeral  | 1000                                       |
     | disk                       | 30                                         |
     | id                         | 56cd053c-b6a2-4103-b870-a83dd5d27ec1       |
     | name                       | mon.m3.c24r120.2gpu-p100.mlx               |
     | os-flavor-access:is_public | False                                      |
     | properties                 | pci_passthrough:alias='P100:2,MlxCX4-VF:1' |
     | ram                        | 122880                                     |
     | rxtx_factor                | 1.0                                        |
     | swap                       |                                            |
     | vcpus                      | 24                                         |
     +----------------------------+--------------------------------------------+
     ~$ openstack server list --all-projects --project d99… --flavor 56c…
     +--------------------------------------+------------+--------+----------------------------------+
     | ID                                   | Name       | Status | Networks                         |
     +--------------------------------------+------------+--------+----------------------------------+
     | 1d77bf12-0099-4580-bf6f-36c42225f2c0 | massive003 | ACTIVE | monash-03-internal=10.16.201.20  |
     +--------------------------------------+------------+--------+----------------------------------+
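     The `pci_passthrough:alias` property is a comma-separated list of `alias:count` pairs. A minimal sketch of reading it back into per-alias device counts follows; the function name is hypothetical and the sample value is the one in the flavor output on this slide.

```python
def parse_pci_alias_spec(spec: str) -> dict:
    """Turn 'P100:2,MlxCX4-VF:1' into {'P100': 2, 'MlxCX4-VF': 1}."""
    requests = {}
    for item in spec.split(","):
        name, _, count = item.strip().partition(":")
        if not name or not count.isdigit():
            raise ValueError(f"malformed alias request: {item!r}")
        requests[name] = requests.get(name, 0) + int(count)
    return requests


if __name__ == "__main__":
    # Two P100 GPUs plus one Mellanox virtual function per instance.
    print(parse_pci_alias_spec("P100:2,MlxCX4-VF:1"))
```

     Read this way, the flavor requests two P100 GPUs and one Mellanox CX-4 virtual function for each instance.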
  • 24. GPU Instances - rough edges
     • Hardware monitoring
       • No OOB interface to monitor GPU hardware when it is assigned to an instance (and doing so would require loading drivers in the host)
     • P2P (peer-to-peer multi-GPU)
       • PCIe topology not available in default guest configuration (not even a PCIe bus on legacy QEMU i440fx machine type)
       • PCIe ACS (Access Control Services - forces transactions through the Root Complex which blocks/disallows P2P for security)
  • 25. GPU Instances - rough edges
     • PCIe security
       • Compromised device could access privileged host memory via PCIe ATS (Address Translation Services)
       • Some special device registers should be blocked/proxied in multi-tenant environment
     • Common to use cloud images for base OS+driver versioning and standardisation, but new NVIDIA driver versions do not support some existing hardware (e.g. K1)
       • Requires multiple images or automated driver deployment/config - no big thing, just inconvenient
  • 26. OpenStack Cyborg - accelerator management
     … aims to provide a general purpose management framework for acceleration resources (i.e. various types of accelerators such as Crypto cards, GPUs, FPGAs, NVMe/NOF SSDs, ODP, DPDK/SPDK and so on) (https://wiki.openstack.org/wiki/Cyborg)
     https://review.openstack.org/#/c/448228/
  • 27. Title: Business Plan for the Multi-modal Australian ScienceS Imaging and Visualisation Environment (MASSIVE) 2013 / 2014
     Document no: MASSIVE-BP-2.3 DRAFT
     Date: June 2013
     Prepared by: Wojtek J Goscinski, MASSIVE Coordinator
     Approved by: MASSIVE Steering Committee
     Open IaaS: Technology: Application layers:
     MyTardis: Automatically stores your instrument data for sharing. MyTardis Tech Group Meeting #3, posted on August 20, 2015 by steve.androulakis: It’s been months since the last one, so a wealth of activity to report on.