Cloud Provider built a new cloud architecture using Infiniband to address challenges with their previous architecture as client needs and the cloud market evolved. Their previous architecture using gigabit Ethernet as the storage network became a bottleneck, limiting performance and scalability. Their new architecture using Infiniband switching provides significantly higher performance, lower latency, and allows them to consolidate more VMs per hypervisor, improving resource utilization and reducing costs. Benchmark tests and client feedback show dramatic improvements in storage throughput, access times, and provisioning speeds with the new design.
2. Agenda
• Introduction of Cloud Provider
• Overview of our previous cloud architecture
• Challenges of that architecture due to evolving
cloud market and clients’ needs
• How we built an improved cloud architecture
• Comparison between the old and new setup
3. Introduction: Company
• Cloud hosting and Infrastructure-as-a-Service
(IaaS) provider in the Netherlands
• Founded in 2008, spin-off from a shared
hosting provider
• 2 products, both pay-as-you-go:
– Cloud Servers
– Cloud Apps
• Cloud platform is built using KVM
4. Introduction: Differentiation
• Localized: support in Dutch, datacenter in
Amsterdam
• Ease of use & simplicity: in-house developed
management portal with value-added tools
• Good performance/cost ratio: cloud based on
SSD caching, Infiniband and a low-latency
network, starting at 13.25 euro per month
5. Introduction: Clients
• More than 500 clients and resellers
• Focus on developers, webhosts & ISPs,
e-commerce, high-traffic sites
• Federation with other cloud providers to buy/sell
each other’s cloud capacity
8. Previous cloud architecture (1)
• 4 layers: storage, hypervisors, management, backup
• Hypervisors were connected to the SAN using iSCSI
over gigabit Ethernet
• Storage VLAN: 4 bonded network interfaces on the SAN
and 2 bonded interfaces on each hypervisor to increase
bandwidth
• 3 additional VLANs: public network, internal
management network, and backup network
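The bonding figures above imply a simple upper bound on storage bandwidth. The following is a minimal sketch of that arithmetic, assuming nominal gigabit line rate per interface and ignoring iSCSI/TCP overhead and the bonding mode in use (both assumptions, not stated on the slide); the function names are illustrative only.

```python
# Nominal capacity of the bonded gigabit storage links described above.
# Assumption: every interface runs at 1 Gb/s line rate with no protocol overhead,
# so these numbers are ceilings, not measured throughput.
GIGABIT_LINK_GBPS = 1.0


def bonded_capacity_gbps(links: int, link_gbps: float = GIGABIT_LINK_GBPS) -> float:
    """Aggregate nominal capacity of a bond made of identical links."""
    return links * link_gbps


def gbps_to_megabytes_per_sec(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


san_gbps = bonded_capacity_gbps(4)         # 4 bonded interfaces on the SAN
hypervisor_gbps = bonded_capacity_gbps(2)  # 2 bonded interfaces per hypervisor

# How many hypervisors can drive their bond at full rate before the SAN bond is full.
hypervisors_at_full_rate = san_gbps / hypervisor_gbps

print(f"SAN bond: {san_gbps:.0f} Gb/s ({gbps_to_megabytes_per_sec(san_gbps):.0f} MB/s)")
print(f"Hypervisor bond: {hypervisor_gbps:.0f} Gb/s "
      f"({gbps_to_megabytes_per_sec(hypervisor_gbps):.0f} MB/s)")
print(f"Hypervisors at full rate before the SAN saturates: {hypervisors_at_full_rate:.0f}")
```

Under these assumptions the SAN side is shared by all hypervisors, which is one way to see why the storage network could become a bottleneck as the number of VMs grew.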
9. Previous cloud architecture (2)
[Diagram: Internet, router, hypervisors, SAN, control servers and backup SAN, linked by the public, storage, backup and internal management networks]
10. Cloud market evolving
• Explosion of interest and adoption since 2011
– Number of clients growing, current clients deploy
more and larger VMs
– Expected cloud market growth from US$70.1
billion in 2012 to US$158.8 billion in 2014
• Different workloads are deployed nowadays
– Besides test/dev, also production environments
– The rise of “big data” / high transaction volume
applications and databases
11. Clients’ higher demands
• Better reliability & availability
• Higher performance & lower latency
– External: internet connectivity between client and
the cloud
– Internal: connection between hypervisor and SAN,
local network
• Competitive pricing
12. Challenges of previous cloud architecture (1)
• Internal storage network became a
performance bottleneck
– Clients experienced higher latency (iowait)
– Storage performance was inconsistent
– Creating backups (snapshots) took too much time
– Only 30 VMs per hypervisor to keep performance
under control (= lower ROI)
• Difficult to manage and scale due to large
number of networks and cables
14. Alternative interconnects (1)
• Faster alternatives to gigabit iSCSI:
Fibre Channel, 10 Gig-E iSCSI, Infiniband
• Important selection criteria:
– Performance
– Congestion control & Low latency
– Scalability
– Easy to manage
– Density: number of VMs per hypervisor
– Cost
15. Alternative interconnects (2)
Interconnect | Host connectivity performance | Number of ports | Cost switch | Cost adapter card in hypervisor | Quality of Service
1 Gb iSCSI Ethernet | 2 Gb/s | 5x NIC | 2000 euro (24 ports) | (included) | -
10 Gb iSCSI Ethernet | 10 Gb/s | 1x HBA + 1x NIC | 24500 euro (16 ports) | 530 euro | ✓
Fibre Channel | 2 Gb/s | 1x HBA + 1x NIC | 3000 euro (18 ports) | 175 euro | ✓
Infiniband | 40 Gb/s | 1x HCA | 3500 euro (18 ports) | 500 euro | ✓
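The figures in the comparison above can be turned into a rough cost-per-bandwidth estimate. This is a sketch only: it assumes each hypervisor occupies exactly one switch port and that the "(included)" adapter costs nothing extra, neither of which the slide states, and the `Interconnect` class and helper names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interconnect:
    name: str
    host_gbps: float    # host connectivity performance from the comparison
    switch_eur: float   # switch price
    switch_ports: int   # ports on that switch
    adapter_eur: float  # adapter card price per hypervisor (0 = included)


OPTIONS = [
    Interconnect("1 Gb iSCSI Ethernet", 2, 2000, 24, 0),
    Interconnect("10 Gb iSCSI Ethernet", 10, 24500, 16, 530),
    Interconnect("Fibre Channel", 2, 3000, 18, 175),
    Interconnect("Infiniband", 40, 3500, 18, 500),
]


def cost_per_host(option: Interconnect) -> float:
    """Switch cost for one port plus the adapter card (assumes one port per host)."""
    return option.switch_eur / option.switch_ports + option.adapter_eur


def cost_per_gbps(option: Interconnect) -> float:
    """Euro per Gb/s of host connectivity."""
    return cost_per_host(option) / option.host_gbps


for option in sorted(OPTIONS, key=cost_per_gbps):
    print(f"{option.name:22s} {cost_per_host(option):9.2f} euro/host "
          f"{cost_per_gbps(option):8.2f} euro per Gb/s")

cheapest = min(OPTIONS, key=cost_per_gbps)
print(f"Lowest cost per Gb/s: {cheapest.name}")
```

With these simplifications Infiniband has the lowest cost per Gb/s of host connectivity, even though its cost per host is higher than gigabit Ethernet or Fibre Channel.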
16. Infiniband selected (1)
• Low latency: < 1 usec end-to-end
• High performance: 40 Gb/s host connectivity
• Consolidation: multiple fabrics on single cable
– Up to 8 virtual lanes
– No interdependency between different
traffic flows
• Highly scalable: tens of nodes possible
• Best performance/cost ratio
19. New cloud architecture
[Diagram: Internet, router, hypervisors, SAN, control servers and backup SAN, with Infiniband switch 1 and Infiniband switch 2 as the interconnect]
20. Results of new cloud architecture (1)
• hdparm - timing buffered disk reads:
– Old setup: 16.94 MB/sec
– New setup: 83.09 MB/sec
• Seeker - random access time:
– Old setup: 16.23 ms
– New setup: 4.679 ms
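The benchmark figures above can be expressed as improvement factors. A minimal sketch of that arithmetic follows; the `speedup` helper is illustrative, and the only inputs are the four numbers reported on this slide.

```python
# Benchmark figures reported above.
OLD_READ_MB_S = 16.94   # buffered disk reads, old setup
NEW_READ_MB_S = 83.09   # buffered disk reads, new setup
OLD_ACCESS_MS = 16.23   # random access time, old setup
NEW_ACCESS_MS = 4.679   # random access time, new setup


def speedup(old: float, new: float, higher_is_better: bool) -> float:
    """Improvement factor of the new value over the old one."""
    return new / old if higher_is_better else old / new


read_speedup = speedup(OLD_READ_MB_S, NEW_READ_MB_S, higher_is_better=True)
access_speedup = speedup(OLD_ACCESS_MS, NEW_ACCESS_MS, higher_is_better=False)
access_time_reduction = 1 - NEW_ACCESS_MS / OLD_ACCESS_MS

print(f"Buffered read throughput: {read_speedup:.1f}x higher")
print(f"Random access time: {access_speedup:.1f}x faster "
      f"({access_time_reduction:.0%} lower)")
```

Throughput is a higher-is-better metric and access time a lower-is-better one, which is why the helper takes the direction as a parameter.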
21. Results of new cloud architecture (2)
• Larger number of VMs per hypervisor:
– Old setup: 30 VMs on average
– New setup: up to 120 VMs on average
• Duration for backup creation largely reduced
• Provisioning a new VM is faster
• Easier to manage and scale, fewer
cables
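To illustrate what the higher VM density means for hardware count, here is a small sketch. The fleet size of 1200 VMs is a hypothetical number chosen for the example, not a figure from this presentation; only the 30 and 120 VMs-per-hypervisor densities come from the slide.

```python
import math

OLD_VMS_PER_HYPERVISOR = 30    # old setup, from the slide
NEW_VMS_PER_HYPERVISOR = 120   # new setup (upper figure), from the slide
HYPOTHETICAL_FLEET_VMS = 1200  # assumption for illustration only


def hypervisors_needed(total_vms: int, vms_per_hypervisor: int) -> int:
    """Smallest number of hypervisors that can host total_vms at the given density."""
    return math.ceil(total_vms / vms_per_hypervisor)


old_count = hypervisors_needed(HYPOTHETICAL_FLEET_VMS, OLD_VMS_PER_HYPERVISOR)
new_count = hypervisors_needed(HYPOTHETICAL_FLEET_VMS, NEW_VMS_PER_HYPERVISOR)

print(f"Old setup: {old_count} hypervisors for {HYPOTHETICAL_FLEET_VMS} VMs")
print(f"New setup: {new_count} hypervisors for {HYPOTHETICAL_FLEET_VMS} VMs")
```

The ratio between the two counts follows directly from the density ratio, so the same fleet needs roughly a quarter of the hypervisors when the upper density figure is reached.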
22. Results of new cloud architecture (3)
• Quote from one of our clients:
“We develop high-traffic business websites with the Drupal
CMS. On the old platform, we experienced disk performance
issues, as Drupal needs fast storage access for a large number
of MySQL database queries and for file-based caching.
The new cloud platform based on Infiniband has given a great
performance boost to our clients' websites, making us and our
clients happier.”
- Rick Bosscher, General Manager, Dycon.nl