In this presentation, prepared for the cancelled CENIC 2020 Annual Conference, I give an overview of what is achievable in terms of networking inside the public Clouds and when exchanging data between Cloud resources and on-prem equipment, with an emphasis on hardware hosted at research institutions.
There is growing awareness that public cloud providers offer capabilities not found elsewhere, with elasticity being a major driver, and funding agencies are taking an increasingly positive stance toward public clouds.
The value of elastic scaling is, however, tightly coupled to the capabilities of the networks that connect all involved resources, both in the public clouds and at the various research institutions.
This presentation tries to shed some light on what is possible today.
1. Demonstrating 100 Gbps in and out of the Clouds
by Igor Sfiligoi, UCSD/SDSC (isfiligoi@sdsc.edu)
2. Who cares about clouds?
• Like it or not, public Clouds are gaining popularity
• Especially great for prototyping
• But great for compute spikes and urgent compute activities, too
• Funding agencies are taking notice
• See NSF E-CAS award(s) as an example
• SDSC Expanse has Cloud explicitly in the proposal text
• NSF-funded CloudBank
• IceCube’s Cloud runs funded by an EAGER award
3. Characterizing Cloud Resources
• When choosing which resources to use (and pay for),
it is important to know what to expect from them
• The compute part of public Cloud resources is very well documented
• They all use pretty standard CPUs and GPUs
• And the models used are relatively well documented:
https://aws.amazon.com/ec2/instance-types/
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general
https://cloud.google.com/compute/docs/machine-types
• The networking, in contrast, is hardly documented at all
• Yet networking can make a big difference
when doing distributed computing
4. Structure of this talk
• What Cloud networking is capable of
• Using Storage as a proxy
• Getting data in and out of the cloud
• The cost of Cloud networking
6. In-region networking
[Chart: aggregate throughput in Gbps (0 to 2000) vs. number of compute instances (0 to 1400), one line each for AWS, Azure and GCP]
• All Cloud providers operate large pools of distributed storage
• Fetching data from remote storage moves data over the network
• We did not really hit an upper bound in local (single-region) testing
• Easily exceeding 1 Tbps
• The network is thus the bottleneck for remote transfers
• Characterizing Cloud Storage
is an interesting topic in itself
7. Cross-region measurements
• All cloud providers have multi-100 Gbps networking between major regions
• And around 100 Gbps for connections to lesser regions
• Note: Google tops 1 Tbps pretty much across all tested connections
Storage in US West                                 Storage in US East
Client            Throughput   Ping                Client         Throughput   Ping
AWS East          440 Gbps     75 ms               AWS West       440 Gbps     75 ms
AWS Korea         100 Gbps     124 ms              AWS EU         460 Gbps     65 ms
AWS Australia     65 Gbps      138 ms              AWS Brazil     90 Gbps      120 ms
AWS Brazil        80 Gbps      184 ms
Azure East        190 Gbps     70 ms               Azure West     190 Gbps     70 ms
Azure Korea       110 Gbps     124 ms              Azure EU       185 Gbps     81 ms
Azure Australia   88 Gbps      177 ms              Azure Brazil   100 Gbps     118 ms
GCP East          1060 Gbps    68 ms               GCP West       1060 Gbps    68 ms
GCP Taiwan        940 Gbps     119 ms              GCP EU         980 Gbps     93 ms
8. Networking Between On-Prem equipment and Cloud Storage
On-prem equipment all part of the PRP/TNRP Kubernetes cluster
9. Getting data out of the Clouds
• Few people these days live 100% in the Clouds
• Often the Cloud compute is used to generate data, but final analysis happens on-prem
• Another very important Cloud use case seems to be parking data in Cloud storage and
accessing it on an as-needed basis
• Understanding how long it will take to extract the data is thus essential
• In-cloud Storage bandwidth not a worry
• As shown in previous slides
• Networking between science network backbones and the Cloud is the limiting factor
10. Single node performance
• Not a typical final-user use case, but easier to measure
• In the US, we can easily exceed
• 30 Gbps in the same region
• 20 Gbps coast-to-coast
                        AWS West   Azure West   GCP West   AWS East   Azure East   GCP East
PRP West (California)   35 Gbps    27 Gbps      29 Gbps    23 Gbps    21 Gbps      26 Gbps
PRP Central (Illinois)  33 Gbps    27 Gbps      35 Gbps    35 Gbps    36 Gbps      36 Gbps
PRP East (Virginia)     36 Gbps    29 Gbps      36 Gbps    31 Gbps    n/a          n/a
All in US
All nodes with 100 Gbps NICs
11. Aggregate performance
• Measured from PRP nodes in SoCal
• Using 23 pods on 20 nodes distributed between UCSD, SDSU, UCI and UCSC
• Most nodes had 10 Gbps NICs, a couple had 100 Gbps NICs
• Great connectivity to all Cloud providers
• About 100 Gbps for all tested US Cloud storage
All in US
             AWS West   AWS Central   GCP West   GCP Central   Azure West   Azure S. Central
Throughput   100 Gbps   90 Gbps       100 Gbps   100 Gbps      120 Gbps     120 Gbps
12. Aggregate performance
• A snapshot from PRP Nautilus Grafana monitoring
https://grafana.nautilus.optiputer.net/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?orgId=1&var-datasource=default&var-cluster=&var-namespace=isfiligoi&var-interval=4h&from=1581135300000&to=1581137100000
[Grafana snapshot panels: GCP West, Azure South Central]
Fetching a fixed amount of data per pod, thus a tail
13. Networking Between Cloud and On-Prem equipment
On-prem equipment on PRP/TNRP Kubernetes cluster or operated by IceCube
14. Getting data into the Clouds
• The other side of the equation is importing data into the Cloud in
support of the compute jobs
• Either asynchronously by uploading into Cloud Storage
• Or by just-in-time streaming
This is what we did for the
2nd IceCube Cloud run
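As a sketch of the asynchronous path, data can be staged into Cloud Storage with the providers' standard CLI tools; the bucket names below are hypothetical, for illustration only:
# stage a dataset into AWS S3 and Google Cloud Storage (hypothetical buckets)
aws s3 cp dataset.tar s3://my-science-bucket/dataset.tar
gsutil cp dataset.tar gs://my-science-bucket/dataset.tar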
15. Single server performance
• Still using many clients in the Clouds
• But measuring performance from one Cloud region at a time
• Tested OSG-supported StashCache
• Basically xrootd with HTTP as the transfer protocol
(which happens to be the same protocol used by Cloud storage; see the sketch after the table below)
• Results were a bit uneven
• Inland nodes mostly at 20 Gbps
• The SoCal node ranged widely
                        AWS West         Azure West       AWS East         Azure East
PRP West (California)   37 Gbps, 34 ms   14 Gbps, 32 ms   63 Gbps, 63 ms   66 Gbps, 58 ms
PRP Central (Illinois)  21 Gbps, 71 ms   18 Gbps, 45 ms   21 Gbps, 19 ms   20 Gbps, 67 ms
PRP East (Virginia)     21 Gbps, 75 ms   20 Gbps, 69 ms   23 Gbps, 24 ms   23 Gbps, 69 ms
All in US
All nodes with 100 Gbps NICs
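As a minimal sketch, a single HTTP-speaking StashCache/xrootd server can be probed with a stock HTTP client; the endpoint below is hypothetical, not one of the servers measured here:
# fetch one file, discard it, and print the average download speed in bytes/s
curl -o /dev/null -w '%{speed_download}\n' http://stashcache.example.edu:8000/osgconnect/public/testfile.dat
The actual measurements used many parallel streams from many clients, as described on the client-tools slide later in the deck.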
16. Aggregate performance - IceCube
• IceCube deployed a load-balanced Web cluster at UW-Madison
• In support of the 2nd Cloud run
• 6 nodes, each with a 25 Gbps NIC, Lustre back-end
• Demonstrated sustained 100 Gbps from the Clouds
• Ready for the worst-case scenario
• Even though we ended up using much less during the Cloud run
[Chart: IceCube server monitoring, throughput in bytes per second]
Fetching a fixed amount of data per client, thus a tail
17. Aggregate performance - PRP
• Deployed httpd pods in PRP nodes in SoCal
• Using 20 pods on 20 nodes distributed between UCSD, SDSU, UCI and UCSC
• Most nodes had 10 Gbps NICs, a couple with 100 Gbps NICs
• Small dataset, on PRP’s CEPH backend
• Approaching 100 Gbps here, too
• Lower peak than UW, but shorter tail
[Chart: throughput in bytes per second]
Fetching a fixed amount of data per client, thus a tail
18. Aggregate performance - PRP
• A snapshot from PRP Nautilus Grafana monitoring
• Repeated the test a few times
https://grafana.nautilus.optiputer.net/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?orgId=1&var-datasource=default&var-cluster=&var-namespace=isfiligoi&var-interval=4h&from=1582144500000&to=1582146900000
Client-side traffic distribution, leading to tails
19. Cloud Storage exporting Everywhere
20. Aggregate performance – Google Storage
• What if we served data from Cloud storage?
• Even though the resources came from multiple Clouds (from multiple regions)
• Google Storage behaved best
• Around 600 Gbps overall
• Great connectivity to
other Cloud providers, too
• AWS and Azure significantly slower
• At 150 – 300 Gbps
• All faster than serving from on-prem
• But much more expensive
[Chart: throughput in bytes per second; IceCube's Cloud run use case]
Fetching a fixed amount of data per client, thus a tail
21. Aggregate performance – Cloud vs On-prem
• Quick visual comparison of the various measurements
• Each line ends when all of its data has been read
All tests used 40 instances spread across AWS, Azure and GCP, most in the US but with a few instances for each Cloud provider in both Europe and Asia-Pacific
Fetching a fixed amount of data per client, thus a tail
23. Paying for Cloud networking
• Many people are surprised that they have to pay for networking when using the public Clouds
• Science networks do not charge end users a dime!
• All cloud providers have a similar model
• Incoming network traffic is free
• Most in-Cloud-zone network traffic is free
• Data owners must pay for any data leaving the Cloud
• Moving data between different Cloud regions
also requires payment, although generally much less expensive
24. How expensive?
• Ballpark numbers
• $50/TB – $120/TB egress from Cloud
• $20/TB data movement between Cloud regions
• Ingress free
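• As an illustrative ballpark at these rates: extracting 100 TB from Cloud storage costs roughly $5,000 to $12,000 in egress charges alone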
[Screenshot: charges on an account used for benchmarking, split between "Fetching data from Cloud storage" and "Compute and ingress"]
25. Exemptions for academia
• Most Cloud providers have agreements with most research institutions
to waive network costs up to 15% of the total bill
• So, if you compute a lot in the Cloud, network may actually be free
• E.g., using AWS spot pricing, you get 1 GB of egress traffic free:
• for every 12 hours of a single CPU core
• for every 1 hour of an 18-core/36-thread CPU instance
• for every 30 minutes of a V100 GPU
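• As an illustrative ballpark based on the rates above: an 18-core spot instance running for 24 hours earns about 24 GB of free egress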
27. Client tools
• All tests were carried out using HTTP as the protocol
• All Cloud-native storage is HTTP-based
• OSG StashCache uses HTTP, too
• Using aria2 as the client tool
• Mostly because it allows for easy multi-file, multi-stream downloads
aria2c -j 80 -x 16 -i infile.lst
• Fewer processes to manage
• Also allows for retries; handy when stressing a storage system
aria2c --max-tries=32 --retry-wait=1 --async-dns=false …
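Putting it together, a minimal sketch of a full client invocation; the URLs in the input list are hypothetical, not the actual test endpoints:
# build a download list, then fetch it with many parallel transfers;
# -j caps concurrent downloads, -x caps connections per server
cat > infile.lst <<'EOF'
https://storage.example.com/bench/file0001.dat
https://storage.example.com/bench/file0002.dat
EOF
aria2c -j 80 -x 16 --max-tries=32 --retry-wait=1 --async-dns=false -i infile.lst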
28. Aggregated testing
• The most useful benchmark data come from aggregate measurements
• Few heavy user workloads run on a single node, and even fewer use a single process
• Using HTCondor as a scheduling system
• Flexible - No special network needs, can even work behind firewalls
• Secure – End-to-end authentication and authorization, very few assumptions
• Reliable – submit and forget, takes care of most error conditions, including preemption
• Fast – All jobs start within a second, great for timing
• Fun fact: we hosted the central manager in a small Cloud instance
• no on-prem infrastructure needed
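For illustration, a minimal HTCondor submit sketch for such a fan-out test; all file names are hypothetical, not the actual setup:
# submit 40 identical download jobs; run_fetch.sh would wrap the aria2c
# invocation shown on the client-tools slide
cat > fetch.sub <<'EOF'
executable   = run_fetch.sh
arguments    = infile.lst
output       = fetch.$(Process).out
error        = fetch.$(Process).err
log          = fetch.log
request_cpus = 16
queue 40
EOF
condor_submit fetch.sub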
29. Cloud ingress tests
• Reading from all three major Cloud providers in parallel
• 13 instances in AWS
• 13 instances in GCP
• 14 instances in Azure
• Geographically distributed, although biased
toward the US
• 28 instances in North America
• 5 instances in Europe
• 7 instances in Asia Pacific
• Trying to mimic the access pattern of the IceCube photon propagation simulation used during the Cloud burst runs
• Many small files
• Each about 45 MB in size
• Using beefy Cloud instances with 16 cores
• Each instance is running 24 aria2 processes in parallel
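A rough sketch of that per-instance client load; the list file names are hypothetical:
# launch 24 parallel aria2c workers, each with its own list of ~45 MB files
for i in $(seq 1 24); do
  aria2c -x 16 --max-tries=32 --retry-wait=1 -i "list_${i}.lst" &
done
wait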
31. Conclusion
• Cloud computing is starting to be an important part of many science domains
• And networking has traditionally been
the least well understood part of the public Cloud model
• We show that Cloud providers invested heavily in networking,
and in-Cloud networking is a non-issue
• The largest science network providers also have good connectivity to the Clouds
• 100 Gbps is easy to achieve these days
• Data egress can be expensive
• But it is not a big issue when paired with substantial Cloud compute use
32. Acknowledgements
• This work has been partially sponsored by NSF grants
OAC-1826967, OAC-1541349, OAC-1841530,
OAC-1836650, MPS-1148698, OAC-1941481,
OPP-1600823 and OAC-190444.
• We kindly thank Amazon, Microsoft and Google for
providing Cloud credits that covered a portion
of the incurred Cloud costs
33. Related presentations
• A subset of this data was presented at CHEP19 in Adelaide, Australia
https://indico.cern.ch/event/773049/contributions/3473824/
• A pre-print of the above is available at
https://arxiv.org/abs/2002.04568
34. External material used
• Cloud background downloaded from Pixabay
• https://pixabay.com/photos/cloud-blue-composition-unbelievable-2680242/