2. 2/30Bringing Private Cloud Computing to HPC and Science !
Contents
Building Private Cloud Computing to HPC and Science
This presentation is about:
• The Private HPC Cloud Use Case
• Main Challenges for Private HPC Cloud
• Private HPC Cloud Case Studies
• Private Cloud Trends in Industry
• About Grid and Cloud
3. 3/30Bringing Private Cloud Computing to HPC and Science !
The Private HPC and Science Cloud Use Case
The Pre-cloud Era!
LRMS (LSF, PBS, SGE…)
Grid Middleware
AccessProvision
4. 4/30Bringing Private Cloud Computing to HPC and Science !
The Private HPC and Science Cloud Use Case
OpenNebula as an Infrastructure Tool – Enhanced Capabilities!
Virtual Worker Nodes
LRMS (LSF, PBS, SGE…)
Grid Middleware
AccessProvisionService
• Common interfaces
• Grid integration
• Custom environments
• Dynamic elasticity
• Consolidation of WNs
• Simplified management
• Physical – Virtual WNs
• Dynamic capacity partitioning
• Faster upgrades
Service/Provisioning Decoupling!
5. 5/30Bringing Private Cloud Computing to HPC and Science !
The Private HPC and Science Cloud Use Case
OpenNebula as an Provisioning Tool – Enhanced Capabilities!
Pilot Jobs, SSH…
IaaS Interface
AccessProvisionService
• Simple Provisioning Interface
• Raw/Appliance VMs
• Dynamic scalable computing
• Custom access to capacity
• Not only batch workloads
• Not only scientific workloads
• Improve utilization
• Reduced service management
• Cost efficiency
6. 6/30Bringing Private Cloud Computing to HPC and Science !
Main Challenges for Private HPC Cloud
Main Demands from Engineering, Research and Supercomputing !
Flexible Definition of
Multi-tier Applications
Resource
Management
Scale-out and
Provisioning
Application
Performance
7. 7/30Bringing Private Cloud Computing to HPC and Science !
Main Challenges for Private HPC Cloud
Using the Cloud – Execution of Multi-tiered Applications !
Management of interconnected multi-VM applications:
• Definition of application flows
• Catalog with pre-defined applications
• Sharing between users and groups
• Management of persistent scientific data
• Automatic elasticity
Front-end
Worker Nodes
{ "name": ”Computing_Cluster",
"deployment": "straight",
"roles": [
{
"name": "frontend",
"vm_template": 0
},
{
"name": "worker",
"parents": frontend,
"cardinality": 2,
"vm_template": 3,
"min_vms" : 1,
"max_vms" : 5,
"elasticity_policies" :
{
”expressions" : ”CPU> 90%”,
"type" : "CHANGE",
"adjust" : 2,
"period_number" : 3,
"period" : 10
}, …
8. 8/30Bringing Private Cloud Computing to HPC and Science !
Main Challenges for Private HPC Cloud
Using the Cloud – Performance Penalty as a Small Tax You Have to Pay!
Overhead in Virtualization
• Single processor performance penalty between 1% and 5%
• NASA has reported an overhead between 9% and 25% (HPCC and NPB)1
• Growing number of users demanding containers (OpenVZ and LXC)
Need for Low-Latency High-Bandwidth Interconnection
• Lower performance, 10 GigE typically, used in clouds has a significant
negative (x2-x10, especially latency) impact on HPC applications1
• FermiCloud has reported MPI performance (HPL benchmark) on VMs and
SR-IOV/Infiniband with only a 4% overhead2
• The Center for HPC at CSR has contributed the KVM SR-IOV Drivers for
Infiniband3
(1) An Application-Based Performance Evaluation of Cloud Computing, NASA Ames, 2013
(2) FermiCloud Update, Keith Chadwick!, Fermilab, HePIX Spring Workshop 2013
(3) http://wiki.chpc.ac.za/acelab:opennebula_sr-iov_vmm_driver , 2013
Overhead in Input/Output
• Growing number of Big Data apps
• Support for multiple datastores including automatic scheduling
9. 9/30Bringing Private Cloud Computing to HPC and Science !
Main Challenges for Private HPC Cloud
Operating the Cloud – Resource Management!
Optimal Placement of Virtual Machines
• Automatic placement of VM near input data
• Striping policy to maximize the resources available to VMs
Fair Share of Resources
• Resource quota management to allocate, track and limit resource utilization.
Management of Different Hardware Profiles
• Resource pools (physical clusters) with specific Hw and Sw profiles, or
security levels for different workload profiles (HPC and HTC)
Isolated Execution of Applications
• Full Isolation of performance-sensitive applications
10. 10/30Bringing Private Cloud Computing to HPC and Science !
Main Challenges for Private HPC Cloud
Operating the Cloud – Scale out and Provisioning!
Multi-tier Deployment
• Management of multiple cloud instances
that may be hosted in different sites
Provide VOs with Isolated Cloud Environ
• Automatic provision of Virtual Data Centers
Hybrid Cloud Computing
• Cloudbursting to address peak or fluctuating
demands for no critical and HTC workloads
11. 11/30Bringing Private Cloud Computing to HPC and Science !
Private HPC Cloud Case Studies
One of Our Main User Communities!
Supercomputing Centers
Research Centers
Distributed Computing Infrastructures
Industry
12. 12/30Bringing Private Cloud Computing to HPC and Science !
Private HPC Cloud Case Studies
FermiCloud!
Nodes KVM on 23 nodes (1 TB RAM - 368 cores) Koi Computer
Network Gigabit and Infiniband
Storage CLVM+GFS2 on shared 120TB NexSAN SataBeats
AuthN X509
Linux Scientific Linux
Interface Sunstone Self-service and EC2 API
App Profile Legacy, HTC and MPI HPC
http://www-fermicloud.fnal.gov/
Typical Workloads
• Scientific stakeholders get access to on-
demand VMs
• Developers & integrators of new Grid
applications
13. 13/30Bringing Private Cloud Computing to HPC and Science !
Private HPC Cloud Case Studies
CESGA Cloud!
Nodes KVM on 27 nodes (0.5 TB RAM – 216 cores) HP ProLiant
Network 2 x Gigabit (1G and 10G)
Storage ssh from remote EMC storage server
AuthN X509 and core password
Linux Scientific Linux
Interface Sunstone Self-service and OCCI
App Profile Individual VMs and virtualised computing clusters
Typical Workloads
• 103 users
• Genomic, rendering…
• Grid services on production at CESGA
• Node at FedCloud project
• UMD middleware testing
http://cloud.cesga.es/
14. 14/30Bringing Private Cloud Computing to HPC and Science !
Private HPC Cloud Case Studies
SARA Cloud!
Nodes KVM on 19 HPC nodes (256 GB RAM 608 cores) Dell PowerEdge and 10 “light”
nodes (64 GB RAM 80 cores) Supermicro
Network 4 x Gigabit (10G) with Arista switch
Storage NFS on 400 GB NAS for HPC and ssh for “light”
AuthN Core password
Linux CentOS
Interface Sunstone and OCCI
App Profile MPI clusters, windows clusters and independent VMs
ww.cloud.sara.nl
Typical Workloads
• Ad-hoc clusters with MPI and pilot jobs
• Windows clusters for Windows-bound
software
• Single VMs, sometimes acting as web
servers to disseminate results
15. 15/30Bringing Private Cloud Computing to HPC and Science !
Private HPC Cloud Case Studies
SZTAKI Cloud!
Nodes KVM on 7 nodes (1.8 TB RAM – 448 cores) DELL PowerEdge
Network 2 x Gigabit (1G and 10G)
Storage iSCSI on DELL storage server 72 TB shared
AuthN X509
Linux CentOS
Interface Sunstone Self-service, EC2 and OCCI
App Profile Individual VMs and virtualised computing cluster
http://cloud.sztaki.hu/
.
Typical Workloads
• Run standard and grid services (e.g.:web
servers, grid middlewares…)
• Development and testing of new codes
• Research on performance and
opportunistic computing
16. 16/30Bringing Private Cloud Computing to HPC and Science !
Private HPC Cloud Case Studies
KTh Cloud!
Nodes KVM on 768 cores (768 GB RAM) HP ProLiant
Network Infiniband and Gigabit
Storage NFS and LVM
AuthN X509 and core password
Linux Ubuntu
Interface Sunstone self-service, OCCI and EC2
App Profile Individual VMs and virtualised computing cluster
http://www.pdc.kth.se/
Typical Workloads
• Mainly BIO
• Hadoop, Spark, Galaxy, Cloud Bio Linux…
17. 17/30Bringing Private Cloud Computing to HPC and Science !
Private Cloud Trends in Industry
Experimenting with ARM for the Private Cloud!
Why?
• Decrease power consumption, reduce costs, simplify solutions…
• Mostly managing bare metal and early experiences with virtualization
Tiniest Cloud Ever! (by Citrix and Linaro at LCU 2013)
Ubuntu on Versatile Express
Cortex-A15 Dual core
Ubuntu on Arndale Board
Cortex-A15 Dual core
http://www.youtube.com/watch?v=xZP9YKv3P_E
18. 18/30Bringing Private Cloud Computing to HPC and Science !
Private Cloud Trends in Industry
Cloud for Mission-critical Applications!
Availability and redundancy to keep it running in case of failure
• Cloud services availability => HA Architectures
• Application availability => Failover Solutions
Service Continuity (by European Aeronautic Company)
OpenNebula 4.0
Automatic failover and recovery within 1 minute
KVM
19. 19/30Bringing Private Cloud Computing to HPC and Science !
Private Cloud Trends in Industry
Hybrid Cloud Deployments !
Transparent and automatic access to the public cloud
• Dev&testing to the public cloud
• Security and performance sensitive workloads on the private cloud
Cloudbursting Deployment (by Telecom Company)
Public
Cloud
1
Public
Cloud
2
Local data center
OpenNebula
Private
Cloud
Cloud API is not relevant
20. 20/30Bringing Private Cloud Computing to HPC and Science !
About Grid and Cloud
What is the Difference between a Grid Site and a Cloud Provider?!
Definitions of Grid Site
• “A resource provider is a site which provides services and resources (e.g.
data storage) to this VO“ (GridPP)
• “What makes a grid site a grid site?. A single grid resource (a grid site)
offers compute and/ or storage services to remote users via standardized
interfaces” (GridKa)
• “A typical (minimal) grid site provides computing and storage to supported
Virtual Organizations (VOs) and runs a few services to make those resources
visible on the grid” (StratusLab)
Definitions of Cloud Provider
• “A cloud provider is a service provider that offers storage and compute
resources on a private or public network“ (IBM)
• “(Cloud) Providers offer resources to the customer – either via dedicated
APIs (PaaS), virtual machines and / or direct access to the resources
(IaaS)” (EC Report on The Future of Cloud Computing)
21. 21/30Bringing Private Cloud Computing to HPC and Science !
Virtual CE, WN… Other (web, mail...) Raw machines
LRMS (LSF, PBS…)
Grid Middleware IaaS Interface
Access
• Batch Job Processing
• Custom Execution Environments
• Grid Service Integration
• Industry Applications
• Other WMS (pilots)
• Complete Services (cluster)
Grid Site External Providers
ProvisionService
About Grid and Cloud
The OpenNebula Vision for Grid Sites: Extending the Range of Applications!
22. 22/30Bringing Private Cloud Computing to HPC and Science !
About Grid and Cloud
What is the Difference between a Grid and Cloud Federation?!
Definitions of Grid
• Ian Foster’s definition lists these primary attributes: “Computing resources
are not administered centrally, open standards are used, and nontrivial
quality of service is achieved”
• Plaszczak/Wellner: “The technology that enables resource virtualization, on-
demand provisioning, and service (resource) sharing between organizations”
• IBM: “The ability, using a set of open standards and protocols, to gain access
to applications and data, processing power, storage capacity and a vast
array of other computing resources over the Internet”
• CERN: “A service for sharing computer power and data storage capacity
over the Internet”
Definition of Cloud Federation
• “Cloud federation is the practice of interconnecting the cloud computing
environments of two or more service providers for the purpose of load
balancing traffic and accommodating spikes in demand”, Wikipedia
23. 23/30Bringing Private Cloud Computing to HPC and Science !
Grid Services
Grid API Cloud API Grid API Cloud API
Appliance Repo
MarketPlace
Cloud/Grid Site Cloud/Grid Site
• Sharing existing VM images
• Registry of metadata
• Image are kept elsewhere
• Supports trust
• Federation facilities
• Security
• Grid specific services
• Storage VM images
• Distributed
• Multi-protocol
About Grid and Cloud
The OpenNebula Vision for Grid Infrastructures, October 2008!
24. 24/30Bringing Private Cloud Computing to HPC and Science !
CloudsGridsUsage
Job Processing
Big Batch System
File Sharing Services
Achievements
Federation of Resources
VO Concept
But…
User experience
Complexity
Usage
Raw infrastructure
Elasticity & Pay-per-use
Simple Web Interface
Achievements
Agile Infrastructures
IT is another Utility
But…
Interoperability
Federation
Customize Environments
Uniform Security
Resource Management
Scientific Applications
Resource Sharing
Flexibility & Simplicity
About Grid and Cloud
Grid and Cloud as Complementary Computing Models!
25. 25/30Bringing Private Cloud Computing to HPC and Science !
About Grid and Cloud
EGI Federated Cloud Doing a Pioneering Work in the Field!
26. 26/30Bringing Private Cloud Computing to HPC and Science !
About Grid and Cloud
Different Names for the Same Model? Same Challenges but Different Technologies?!
Grid Computing
Cloud Computing
27. 27/30Bringing Private Cloud Computing to HPC and Science !
Try it Out!
OpenNebula Sandboxes!
● OpenNebula pre-installed in a VM: VirtualBox, KVM, VMware, Amazon
29. 29/30Bringing Private Cloud Computing to HPC and Science !
Visit our Booth!
Interoperability Features in
OpenNebula
Wednesday, 18
12:10 - 12:25
Room: Escudo
Building a OCCI-compatible
Cloud with OpenNebula
Thursday, 19
13:30 - 16:00
Room: Toledo
30. 30/30Bringing Private Cloud Computing to HPC and Science !
Thanks to People and Organizations that Provided Info to Prepare this Presentation!
Questions?
OpenNebula.org @OpenNebula