In this presentation, prepared for the cancelled CENIC 2020 Annual Conference, I give an overview of what is achievable in terms of networking inside the public Clouds and when exchanging data between Cloud resources and on-prem equipment, with an emphasis on hardware hosted at research institutions.
There is growing awareness that public cloud providers offer capabilities not found elsewhere, with elasticity being a major driver, and funding agencies are taking an increasingly positive stance toward public clouds.
The value of elastic scaling is, however, tightly coupled to the capabilities of the networks that connect all involved resources, both in the public clouds and at the various research institutions.
This presentation tries to shed some light on what is possible today.
1. Demonstrating 100 Gbps in and out of the Clouds
by Igor Sfiligoi, UCSD/SDSC (isfiligoi@sdsc.edu)
2. Who cares about clouds?
• Like it or not, public Clouds are gaining popularity
• Especially great for prototyping
• But great for compute spikes and urgent compute activities, too
• Funding agencies are taking notice
• See NSF E-CAS award(s) as an example
• SDSC Expanse has Cloud explicitly in the proposal text
• NSF-funded CloudBank
• IceCube’s Cloud runs funded by an EAGER award
3. Characterizing Cloud Resources
• When choosing which resources to use (and pay for),
it is important to know what to expect from them
• The compute part of public Cloud resources is very well documented
• They all use pretty standard CPUs and GPUs
• And the models used are relatively well documented:
https://aws.amazon.com/ec2/instance-types/
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general
https://cloud.google.com/compute/docs/machine-types
• The networking, in contrast, is hardly documented at all
• Yet networking can make a big difference
when doing distributed computing
4. Structure of this talk
• What Cloud networking is capable of
• Using Storage as a proxy
• Getting data in and out of the cloud
• The cost of Cloud networking
6. In-region networking
[Chart: aggregate throughput in Gbps (0 to 2000) vs. number of compute instances (0 to 1400), one line each for AWS, Azure and GCP]
• All Cloud providers operate large pools of distributed storage
• Fetching data from remote storage moves data over the network
• We did not really hit an upper bound in local (single-region) testing
• Easily exceeding 1 Tbps
• The network is thus the bottleneck for remote transfers
• Characterizing Cloud Storage
is an interesting topic in itself
7. Cross-region measurements
• All cloud providers have multi-100 Gbps networking between major regions
• And around 100 Gbps for connections to lesser regions
• Note: Google tops 1 Tbps pretty much across all tested connections
Storage in US West                                 Storage in US East
Client            Throughput   Ping                Client         Throughput   Ping
AWS East          440 Gbps     75 ms               AWS West       440 Gbps     75 ms
AWS Korea         100 Gbps     124 ms              AWS EU         460 Gbps     65 ms
AWS Australia     65 Gbps      138 ms              AWS Brazil     90 Gbps      120 ms
AWS Brazil        80 Gbps      184 ms
Azure East        190 Gbps     70 ms               Azure West     190 Gbps     70 ms
Azure Korea       110 Gbps     124 ms              Azure EU       185 Gbps     81 ms
Azure Australia   88 Gbps      177 ms              Azure Brazil   100 Gbps     118 ms
GCP East          1060 Gbps    68 ms               GCP West       1060 Gbps    68 ms
GCP Taiwan        940 Gbps     119 ms              GCP EU         980 Gbps     93 ms
8. Networking Between On-Prem equipment and Cloud Storage
On-prem equipment all part of the PRP/TNRP Kubernetes cluster
9. Getting data out of the Clouds
• Few people these days live 100% in the Clouds
• Often the Cloud compute is used to generate data, but final analysis happens on-prem
• Another very important Cloud use case seems to be parking data in Cloud storage and
accessing it on an as-needed basis
• Understanding how long it will take to extract the data is thus essential
• In-cloud Storage bandwidth not a worry
• As shown in previous slides
• Networking between science network backbones and the Cloud is the limiting factor
10. Single node performance
• Not a typical final-user use case, but easier to measure
• In the US, we can easily exceed
• 30 Gbps in the same region
• 20 Gbps coast-to-coast
                        AWS West   Azure West   GCP West   AWS East   Azure East   GCP East
PRP West (California)   35 Gbps    27 Gbps      29 Gbps    23 Gbps    21 Gbps      26 Gbps
PRP Central (Illinois)  33 Gbps    27 Gbps      35 Gbps    35 Gbps    36 Gbps      36 Gbps
PRP East (Virginia)     36 Gbps    29 Gbps      36 Gbps    31 Gbps    n/a          n/a
All in US
All nodes with 100 Gbps NICs
11. Aggregate performance
• Measured from PRP nodes in SoCal
• Using 23 pods on 20 nodes distributed between UCSD, SDSU, UCI and UCSC
• Most nodes had 10 Gbps NICs, a couple had 100 Gbps NICs
• Great connectivity to all Cloud providers
• About 100 Gbps for all tested US Cloud storage
All in US
             AWS West   AWS Central   GCP West   GCP Central   Azure West   Azure S. Central
Throughput   100 Gbps   90 Gbps       100 Gbps   100 Gbps      120 Gbps     120 Gbps
12. Aggregate performance
• A snapshot from PRP Nautilus Grafana monitoring
https://grafana.nautilus.optiputer.net/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?orgId=1&var-datasource=default&var-cluster=&var-namespace=isfiligoi&var-interval=4h&from=1581135300000&to=1581137100000
[Grafana snapshot panels: GCP West, Azure South Central]
Fetching a fixed amount of data per pod, thus a tail
13. Networking Between Cloud and On-Prem equipment
On-prem equipment on PRP/TNRP Kubernetes cluster or operated by IceCube
14. Getting data into the Clouds
• The other side of the equation is importing data into the Cloud in
support of the compute jobs
• Either asynchronously by uploading into Cloud Storage
• Or by just-in-time streaming
This is what we did for the
2nd IceCube Cloud run
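As a sketch of the asynchronous path, data can be staged into Cloud Storage with the providers' standard CLI tools; the bucket names below are hypothetical, for illustration only:
# stage a dataset into AWS S3 and Google Cloud Storage (hypothetical buckets)
aws s3 cp dataset.tar s3://my-science-bucket/dataset.tar
gsutil cp dataset.tar gs://my-science-bucket/dataset.tar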
15. Single server performance
• Still using many clients in the Clouds
• But measuring performance from one Cloud region at a time
• Tested OSG-supported StashCache
• Basically xrootd with HTTP as the transfer protocol
(which happens to be the same protocol used by Cloud storage; see the sketch after the table below)
• Results were a bit uneven
• Inland nodes mostly at 20 Gbps
• The SoCal node ranged widely
                        AWS West         Azure West       AWS East         Azure East
PRP West (California)   37 Gbps, 34 ms   14 Gbps, 32 ms   63 Gbps, 63 ms   66 Gbps, 58 ms
PRP Central (Illinois)  21 Gbps, 71 ms   18 Gbps, 45 ms   21 Gbps, 19 ms   20 Gbps, 67 ms
PRP East (Virginia)     21 Gbps, 75 ms   20 Gbps, 69 ms   23 Gbps, 24 ms   23 Gbps, 69 ms
All in US
All nodes with 100 Gbps NICs
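As a minimal sketch, a single HTTP-speaking StashCache/xrootd server can be probed with a stock HTTP client; the endpoint below is hypothetical, not one of the servers measured here:
# fetch one file, discard it, and print the average download speed in bytes/s
curl -o /dev/null -w '%{speed_download}\n' http://stashcache.example.edu:8000/osgconnect/public/testfile.dat
The actual measurements used many parallel streams from many clients, as described on the client-tools slide later in the deck.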
16. Aggregate performance - IceCube
• IceCube deployed a load-balanced Web cluster at UW-Madison
• In support of the 2nd Cloud run
• 6 nodes, each with a 25 Gbps NIC, Lustre back-end
• Demonstrated sustained 100 Gbps from the Clouds
• Ready for the worst-case scenario
• Even though we ended up using much less during the Cloud run
[Chart: IceCube server monitoring, throughput in bytes per second]
Fetching a fixed amount of data per client, thus a tail
17. Aggregate performance - PRP
• Deployed httpd pods in PRP nodes in SoCal
• Using 20 pods on 20 nodes distributed between UCSD, SDSU, UCI and UCSC
• Most nodes had 10 Gbps NICs, a couple with 100 Gbps NICs
• Small dataset, on PRP’s CEPH backend
• Approaching 100 Gbps here, too
• Lower peak than UW, but shorter tail
[Chart: throughput in bytes per second]
Fetching a fixed amount of data per client, thus a tail
18. Aggregate performance - PRP
• A snapshot from PRP Nautilus Grafana monitoring
• Repeated the test a few times
https://grafana.nautilus.optiputer.net/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?orgId=1&var-datasource=default&var-cluster=&var-namespace=isfiligoi&var-interval=4h&from=1582144500000&to=1582146900000
Client-side traffic distribution, leading to tails
19. Cloud Storage exporting Everywhere
20. Aggregate performance – Google Storage
• What if we served data from Cloud storage?
• Even though the resources came from multiple Clouds (from multiple regions)
• Google Storage behaved best
• Around 600 Gbps overall
• Great connectivity to
other Cloud providers, too
• AWS and Azure significantly slower
• At 150 – 300 Gbps
• All faster than serving from on-prem
• But much more expensive
[Chart: throughput in bytes per second; IceCube's Cloud run use case]
Fetching a fixed amount of data per client, thus a tail
21. Aggregate performance – Cloud vs On-prem
• Quick visual comparison of the various measurements
• Each line ends when all of its data has been read
All tests used 40 instances spread across AWS, Azure and GCP, most in the US but with a few instances for each Cloud provider in both Europe and Asia-Pacific
Fetching a fixed amount of data per client, thus a tail
23. Paying for Cloud networking
• Many people are surprised that they have to pay for networking when using the public Clouds
• Science networks do not charge end users a dime!
• All cloud providers have a similar model
• Incoming network traffic is free
• Most in-Cloud-zone network traffic is free
• Data owners must pay for any data leaving the Cloud
• Moving data between different Cloud regions
also requires payment, although generally much less expensive
24. How expensive?
• Ballpark numbers
• $50/TB – $120/TB egress from Cloud
• $20/TB data movement between Cloud regions
• Ingress free
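• As an illustrative ballpark at these rates: extracting 100 TB from Cloud storage costs roughly $5,000 to $12,000 in egress charges alone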
[Screenshot: charges on an account used for benchmarking, split between "Fetching data from Cloud storage" and "Compute and ingress"]
25. Exemptions for academia
• Most Cloud providers have agreements with most research institutions
to waive network costs up to 15% of the total bill
• So, if you compute a lot in the Cloud, network may actually be free
• E.g., using AWS spot pricing, you get 1 GB of egress traffic free:
• for every 12 hours of a single CPU core
• for every 1 hour of an 18-core/36-thread CPU instance
• for every 30 minutes of a V100 GPU
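• As an illustrative ballpark based on the rates above: an 18-core spot instance running for 24 hours earns about 24 GB of free egress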
27. Client tools
• All tests were carried out using HTTP as the protocol
• All Cloud-native storage is HTTP-based
• OSG StashCache uses HTTP, too
• Using aria2 as the client tool
• Mostly because it allows for easy multi-file, multi-stream downloads
aria2c -j 80 -x 16 -i infile.lst
• Fewer processes to manage
• Also allows for retries; handy when stressing a storage system
aria2c --max-tries=32 --retry-wait=1 --async-dns=false …
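Putting it together, a minimal sketch of a full client invocation; the URLs in the input list are hypothetical, not the actual test endpoints:
# build a download list, then fetch it with many parallel transfers;
# -j caps concurrent downloads, -x caps connections per server
cat > infile.lst <<'EOF'
https://storage.example.com/bench/file0001.dat
https://storage.example.com/bench/file0002.dat
EOF
aria2c -j 80 -x 16 --max-tries=32 --retry-wait=1 --async-dns=false -i infile.lst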
28. Aggregated testing
• The most useful benchmark data come from aggregate measurements
• Few heavy user workloads run on a single node, and even fewer use a single process
• Using HTCondor as a scheduling system
• Flexible - No special network needs, can even work behind firewalls
• Secure – End-to-end authentication and authorization, very few assumptions
• Reliable – submit and forget, takes care of most error conditions, including preemption
• Fast – All jobs start within a second, great for timing
• Fun fact: we hosted the central manager in a small Cloud instance
• no on-prem infrastructure needed
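For illustration, a minimal HTCondor submit sketch for such a fan-out test; all file names are hypothetical, not the actual setup:
# submit 40 identical download jobs; run_fetch.sh would wrap the aria2c
# invocation shown on the client-tools slide
cat > fetch.sub <<'EOF'
executable   = run_fetch.sh
arguments    = infile.lst
output       = fetch.$(Process).out
error        = fetch.$(Process).err
log          = fetch.log
request_cpus = 16
queue 40
EOF
condor_submit fetch.sub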
29. Cloud ingress tests
• Reading from all three major Cloud providers in parallel
• 13 instances in AWS
• 13 instances in GCP
• 14 instances in Azure
• Geographically distributed, although biased
toward the US
• 28 instances in North America
• 5 instances in Europe
• 7 instances in Asia Pacific
• Trying to mimic the access pattern of the IceCube photon propagation simulation used during the Cloud burst runs
• Many small files
• Each about 45 MB in size
• Using beefy Cloud instances with 16 cores
• Each instance is running 24 aria2 processes in parallel
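A rough sketch of that per-instance client load; the list file names are hypothetical:
# launch 24 parallel aria2c workers, each with its own list of ~45 MB files
for i in $(seq 1 24); do
  aria2c -x 16 --max-tries=32 --retry-wait=1 -i "list_${i}.lst" &
done
wait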
31. Conclusion
• Cloud computing is starting to be an important part of many science domains
• And networking has traditionally been
the least well understood part of the public Cloud model
• We show that Cloud providers invested heavily in networking,
and in-Cloud networking is a non-issue
• The largest science network providers also have good connectivity to the Clouds
• 100 Gbps is easy to achieve these days
• Data egress can be expensive
• But it is not a big issue when paired with substantial Cloud compute use
32. Acknowledgements
• This work has been partially sponsored by NSF grants
OAC-1826967, OAC-1541349, OAC-1841530,
OAC-1836650, MPS-1148698, OAC-1941481,
OPP-1600823 and OAC-190444.
• We kindly thank Amazon, Microsoft and Google for
providing Cloud credits that covered a portion
of the incurred Cloud costs
33. Related presentations
• A subset of this data was presented at CHEP19 in Adelaide, Australia
https://indico.cern.ch/event/773049/contributions/3473824/
• A pre-print of the above is available at
https://arxiv.org/abs/2002.04568
34. External material used
• Cloud background downloaded from Pixabay
• https://pixabay.com/photos/cloud-blue-composition-unbelievable-2680242/