SlideShare una empresa de Scribd logo
1 de 9
Descargar para leer sin conexión
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon
Scalable processors offered better BERT machine
learning performance
vs. M5n instances with 2nd
Gen Intel Xeon Scalable processors and
M6a instances with 3rd
Gen AMD EPYC processors
Many machine learning workloads involve sorting, analyzing, and making relationships
between images, but how can organizations quickly make sense of large amounts of text?
Bidirectional Encoder Representations from Transformers (BERT) is a machine learning
framework for natural language processing (NLP). To analyze text, BERT looks at all the words
around a given word to put it in the correct context. This allows applications such as search
engines to predict sentences, answer questions, or generate conversational responses.
Using Intel optimization for TensorFlow and ZenDNN integrated with TensorFlow, we
compared the BERT machine learning performance of three types of Amazon Web Services
(AWS) EC2 series instances: M6i instances with 3rd
Gen Intel®
Xeon®
Scalable processors
featuring Intel DL Boost with Vector Neural Network Instructions, M5n instances with 2nd
Gen
Intel Xeon Scalable processors, and M6a instances with 3rd
Gen AMD EPYC™
processors.
In tests at multiple instance sizes, AWS M6i instances offered up to 45 percent better BERT
performance on a benchmark from the Intel Model Zoo than the M5n instances with previous-
gen processors and up to 6.4 times the BERT performance compared to M6a instances with
3rd
Gen AMD EPYC processors. This means that organizations running similar BERT workloads
in the cloud could get better performance per instance by choosing M6i instances featuring
3rd
Gen Intel Xeon Scalable processors.
Up to 5.2x the queries
per second
vs. M6a instances
Up to 5.1x the queries
per second
vs. M6a instances
Up to 6.4x the queries
per second
vs. M6a instances
4
vCPUs
8
vCPUs
16
vCPUs
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance June 2022
A Principled Technologies report: Hands-on testing. Real-world results.
Figure 1: Key specifications for each instance size we tested. Source: Principled Technologies.
How we tested
We purchased three sets of instances from three general-purpose AWS EC2 series:
• M6i instances featuring 3rd
Gen Intel Xeon Platinum 8375C processors (Ice Lake)
• M5n instances featuring 2nd
Gen Intel Xeon Platinum 8259CL processors (Cascade Lake)
• M6a instances featuring 3rd
Gen AMD EPYC 7R13 processors (Milan)
We ran each instance in the US East 1 region.
Figure 1 shows the specifications for the instances that we chose. To show how businesses of various sizes with
different machine learning demands can benefit from choosing M6i instances, we tested instances with 4 vCPUs,
8 vCPUs, and 16 vCPUs. To account for different types of datasets, we ran tests using a small batch size of 1
and a large batch size of 32—where batch size is the number of samples that go through the neural network at
a time. In this report, we present the comparisons between M6i and M5n instances first, and then present the
comparisons between M6i and M6a instances. (Note: For additional test results on even larger instances, see the
science behind the report.)
4
vCPUs
8
vCPUs
16
vCPUs
Testing BERT performance in the cloud
The BERT framework, which was trained on text from the English language Wikipedia with over 2.5 million
words, works by turning text into numbers to sort, analyze, and make predictions about that text.1
Depending
on the dataset on which an organization needs to run BERT machine leaning, the size of the AWS instances they
choose will vary. To account for these different needs, we tested using two batch sizes across three different
instance sizes. We used a BERT benchmark from Intel Model Zoo, which offers a range of machine learning
models and tools. At the time of our testing, AMD EPYC processors did not support INT8 precision for BERT,
so we present FP32 precision results for M6i instances as well for comparison. In all three, the M6i instances
enabled by 3rd
Gen Intel Xeon Scalable processors outperformed both the previous-gen M5n instances and the
current-gen M6a instances.
June 2022 | 2
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Why choose M6i instances with 3rd
Gen Intel Xeon Scalable processors?
New M6i instances with 3rd
Gen Intel Xeon Scalable processors offer the following:4
• All-core turbo frequency of up to 3.5 GHz
• Always-on memory encryption with Intel Total Memory Encryption (TME)
• Intel DL Boost with Vector Neural Network Instructions (VNNI) that accelerate INT8 performance
• Intel Advanced Vector Extensions 512 (Intel AVX-512) instructions for demanding machine
learning workloads
• Support for up to 128 vCPUs and 512 GB of memory per instance
• Up to 50Gbps networking
About 3rd
Generation Intel Xeon Scalable processors
According to Intel, 3rd
Generation Intel Xeon Scalable processors are “[o]ptimized for cloud, enterprise, HPC,
network, security, and IoT workloads with 8 to 40 powerful cores and a wide range of frequency, feature, and
power levels.”2
Intel continues to offer many models from the Platinum, Gold, Silver, and Bronze processor lines
that they “designed through decades of innovation for the most common workload requirements.3
For more information, visit http://intel.com/xeonscalable.
June 2022 | 3
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Instances with 4 vCPUs: M6i vs. M5n
First, we compared BERT performance on smaller instances, looking at the relative amount of text the instance
types analyzed on 4vCPU configurations. As Figure 2 shows, M6i instances enabled by 3rd
Gen Intel Xeon
Scalable processors analyzed up to 18 percent more examples per second than the M5n instances with 2nd
Gen
Intel Xeon Scalable processors.
Figure 2: Relative BERT performance for M6i and M5n instances using 4 vCPUs. Higher numbers are better.
Source: Principled Technologies.
Instances with 8 vCPUs: M6i vs. M5n
When we doubled the instance size to 8 vCPUs, M6i instances delivered a similar performance increase over
previous-gen M5n instances. Figure 3 compares the relative amount of text the instance types analyzed on
8vCPU configurations. The M6i instances enabled by 3rd
Gen Intel Xeon Scalable processors analyzed up to 11
percent more examples per second than the M5n instances with 2nd
Gen Intel Xeon Scalable processors.
Figure 3: Relative BERT performance for M6i and M5n instances using 8 vCPUs. Higher numbers are better.
Source: Principled Technologies.
Relative BERT performance of m6i.xlarge vs. m5n.xlarge
Larger is better
0
0.20
0.40
0.60
0.80
1.00
1.40
1.20
1.00
Relative
throughput
(examples/sec)
M6i (INT8) M5n (INT8) M6i (INT8)
Batch size: 1 Batch size: 32
1.13
1.00
1.18
M5n (INT8)
1.60
Relative BERT performance of m6i.2xlarge vs. m5n.2xlarge
Larger is better
0
0.20
0.40
0.60
0.80
1.00
1.40
1.20
1.00
Relative
throughput
(examples/sec)
M6i (INT8) M5n (INT8) M6i (INT8)
Batch size: 1 Batch size: 32
1.11
1.00
1.11
M5n (INT8)
1.60
up to 11%
better throughput
up to 18%
better throughput
4
vCPUs
8
vCPUs
June 2022 | 4
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Instances with 16 vCPUs: M6i vs. M5n
As Figure 4 shows, M6i instances offered the greatest relative BERT performance increase over previous-gen
M5n instances using larger 16vCPU configurations. The M6i instances enabled by 3rd
Gen Intel Xeon Scalable
processors analyzed up to 45 percent more examples per second than the M5n instances with 2nd
Gen Intel Xeon
Scalable processors. By improving textual data analysis throughput by 45 percent, organizations could reduce
the number of instances they need to purchase and manage when they select the M6i instance type.
Figure 4: Relative BERT performance for M6i and M5n instances using 16 vCPUs. Higher numbers are better.
Source: Principled Technologies.
Relative BERT performance of m6i.4xlarge vs. m5n.4xlarge
Larger is better
0
0.20
0.40
0.60
0.80
1.00
1.40
1.20
1.00
Relative
throughput
(examples/sec)
M6i (INT8) M5n (INT8) M6i (INT8)
Batch size: 1 Batch size: 32
1.21
1.00
1.45
M5n (INT8)
1.60
up to 45%
better throughput
16
vCPUs
June 2022 | 5
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Instances with 4 vCPUs: M6i vs. M6a
After comparing BERT performance of M6i instances against that of instances based on previous-gen processors,
we compared those three sizes of M6i instances against M6a instances with AMD EPYC processors. Figure 5
compares the relative amount of text these instance types analyzed on 4vCPU configurations. The M6i instances
enabled by 3rd
Gen Intel Xeon Scalable processors with INT8 precision analyzed data 5.29 times as fast as the
M6a instances with 3rd
Gen AMD EPYC processors using FP32 precision. Note: At the time of testing, INT8
precision—which can improve performance for these types of machine learning—was not available for
BERT workloads on AMD EPYC processors. Using FP32 precision, M6i instances improved performance over
M6a instances by as much as 68 percent.
Figure 5: Relative BERT performance for M6i and M6a instances using 4 vCPUs. Higher numbers are better.
Source: Principled Technologies.
Instances with 8 vCPUs: M6i vs. M6a
When we increased the instance sizes to 8 vCPUs, performance increases were similar to the 4vCPU
configurations. Figure 6 compares the relative amount of text the instance types analyzed on 8vCPU
configurations. The M6i instances enabled by 3rd
Gen Intel Xeon Scalable processors analyzed data up to 5.10
times as fast as the M6a instances with 3rd
Gen AMD EPYC processors.
Figure 6: Relative BERT performance for M6i and M6a instances using 8 vCPUs. Higher numbers are better.
Source: Principled Technologies.
Relative BERT performance of m6i.xlarge vs. m6a.xlarge
Larger is better
0
1.00
2.00
3.00
4.00
5.00
7.00
6.00
1.68
1.00
Relative
throughput
(examples/sec)
M6i (INT8) M6i (FP32) M6a (FP32) M6i (INT8) M6i (FP32) M6a (FP32)
Batch size: 1 Batch size: 32
4.24
1.00
1.57
5.29
Relative BERT performance of m6i.2xlarge vs. m6a.2xlarge
Larger is better
0
1.00
2.00
3.00
4.00
5.00
7.00
6.00
1.64
1.00
Relative
throughput
(examples/sec)
M6i (INT8) M6i (FP32) M6a (FP32) M6i (INT8) M6i (FP32) M6a (FP32)
Batch size: 1 Batch size: 32
4.44
1.00
1.68
5.10
up to 5.10x
the throughput
up to 5.29x
the throughput
4
vCPUs
8
vCPUs
June 2022 | 6
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Instances with 16 vCPUs: M6i vs. M6a
The biggest relative difference in BERT performance occurred in our 16vCPU comparison of M6i and M6a
configurations. Figure 7 compares the relative examples per second the instance types analyzed on 16vCPU
configurations. The M6i instances enabled by 3rd
Gen Intel Xeon Scalable processors analyzed data up to 6.40
times as fast as the M6a instances with 3rd
Gen AMD EPYC processors. These results show that for these types
of BERT workloads, selecting M6i instances that offer INT8 precision over M6a instances that don’t could allow
organizations to complete textual analysis workloads using fewer cloud instances.
Figure 7: Relative BERT performance for M6i and M6a instances using 16 vCPUs. Higher numbers are better.
Source: Principled Technologies.
Relative BERT performance of m6i.4xlarge vs. m6a.4xlarge
Larger is better
0
1.00
2.00
3.00
4.00
5.00
7.00
6.00
1.81
1.00
Relative
throughput
(examples/sec)
M6i (INT8) M6i (FP32) M6a (FP32) M6i (INT8) M6i (FP32) M6a (FP32)
Batch size: 1 Batch size: 32
6.37
1.00
2.24
6.40
up to 6.40x
the throughput
16
vCPUs
June 2022 | 7
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Scaling BERT workloads
Another consideration for assessing BERT performance is to see how the throughput scales as you increase
the size of the instance. Theoretically, performance could double as you double the vCPU count, which would
be perfect linear scaling. While resource allocation makes this unlikely in the real world, the closer an instance
approaches this ideal, the better.
As Figure 8 shows, using results from our batch size: 1 tests, the M6i instance with 3rd
Gen Intel Xeon Scalable
processors had better BERT performance scaling from 8 vCPUs to 16 vCPUs compared to the M6a instance with
AMD EPYC processors, though slightly worse scaling from 4 vCPUs to 8 vCPUs.
Figure 8: How BERT performance scaled across instance sizes, compared to results from the 4vCPU tests with batch size 1.
Higher numbers are better. Source: Principled Technologies.
Figure 9 makes the same comparison, but uses results from our batch size: 32 testing. Again, the M6i
instance with 3rd
Gen Intel Xeon Scalable processors scaled more linearly from 4 to 16 vCPUs compared to
the M6a instance.
Figure 9: How BERT performance scaled across instance sizes, compared to results from the 4vCPU tests with batch size
32. Higher numbers are better. Source: Principled Technologies.
Relative BERT performance scaling compared to 4vCPUs with batch size: 1
Larger is better
0
0.50
1.00
1.50
2.00
2.50
3.50
3.00
Relative
throughput
(examples/sec)
4 vCPUs 8 vCPUs
1.85
1.00
3.40
1.86 1.91
3.16
4.00
4.50
M6i (INT8) M6i (INT32) M6a (FP32)
16vCPUs
3.83
1.00 1.00
Relative BERT performance scaling compared to 4vCPUs with batch size: 32
Larger is better
Relative
throughput
(examples/sec)
0
0.50
1.00
1.50
2.00
2.50
3.50
3.00
4 vCPUs 8 vCPUs
1.86
1.00
1.90
3.52
2.52
1.78
4.00
4.50
M6i (INT8) M6i (INT32) M6a (FP32)
16vCPUs
3.79
1.00 1.00
By selecting M6i instances that offer more linear, predictable performance scaling, organizations could more
reliably fix their cloud operating budgets as textual analysis workloads continue to grow.
June 2022 | 8
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance
Conclusion
Organizations analyzing textual data using NLP through the BERT framework must decide which type of instance
can deliver the BERT performance they need. In our tests, we found that across instance sizes, AWS M6i
instances with 3rd
Gen Intel Xeon Scalable processors outperformed both M5n instances with 2nd
Gen Intel Xeon
Scalable processors and M6a instances with 3rd
Gen AMD EPYC processors for BERT machine learning. Plus, the
M6i instances offered more predictable scaling at 16vCPUs. These performance increases could help you get
quicker insight from textual data to better satisfy consumers and increase revenues.
1.	 TechTarget, “BERT language model,” accessed December 16, 2021,
https://www.techtarget.com/searchenterpriseai/definition/BERT-language-model.
2.	 Intel, “3rd Gen Intel®
Xeon®
Scalable Processors,” accessed December 14, 2021,
https://www.intel.com/content/www/us/en/products/docs/processors/xeon/3rd-gen-xeon-scalable-processors-brief.html.
3.	 Intel, “3rd Gen Intel®
Xeon®
Scalable Processors.”
4.	 Amazon, Amazon EC2 M6i Instances, accessed December 14, 2021,
https://aws.amazon.com/ec2/instance-types/m6i/.
Principled Technologies is a registered trademark of Principled Technologies, Inc.
All other product names are the trademarks of their respective owners.
For additional information, review the science behind this report.
Principled
Technologies®
Facts matter.®
Principled
Technologies®
Facts matter.®
This project was commissioned by Intel.
Read the science behind this report at https://facts.pt/ZymIIA3
June 2022 | 9
AWS EC2 M6i instances featuring 3rd
Gen Intel Xeon Scalable processors offered better BERT machine learning performance

Más contenido relacionado

Similar a AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance

Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...Principled Technologies
 
January 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' Group
January 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' GroupJanuary 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' Group
January 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' GroupDavid McDaniel
 
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...Principled Technologies
 
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...Principled Technologies
 
Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...
Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...
Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...Principled Technologies
 
Elastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS PresentationElastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS PresentationKnoldus Inc.
 
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...Principled Technologies
 
Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...Principled Technologies
 
Boost your MariaDB online transaction processing performance with N2 standard...
Boost your MariaDB online transaction processing performance with N2 standard...Boost your MariaDB online transaction processing performance with N2 standard...
Boost your MariaDB online transaction processing performance with N2 standard...Principled Technologies
 
Open up new possibilities with higher transactional database performance from...
Open up new possibilities with higher transactional database performance from...Open up new possibilities with higher transactional database performance from...
Open up new possibilities with higher transactional database performance from...Principled Technologies
 
Get competitive logistic regression performance with servers with AMD EPYC 75...
Get competitive logistic regression performance with servers with AMD EPYC 75...Get competitive logistic regression performance with servers with AMD EPYC 75...
Get competitive logistic regression performance with servers with AMD EPYC 75...Principled Technologies
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
 
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPIComprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPIijtsrd
 
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Amazon Web Services
 
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...Principled Technologies
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
 
1.multicore processors
1.multicore processors1.multicore processors
1.multicore processorsHebeon1
 
Lesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programsLesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programsPVS-Studio
 

Similar a AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance (20)

Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...
 
January 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' Group
January 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' GroupJanuary 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' Group
January 2020 - re:Invent reCap slides - Denver Amazon Web Services Users' Group
 
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
 
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
 
Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...
Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...
Google Cloud N2 instances featuring 3rd Gen Intel Xeon Scalable processors ex...
 
Elastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS PresentationElastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS Presentation
 
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
 
Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...
 
Boost your MariaDB online transaction processing performance with N2 standard...
Boost your MariaDB online transaction processing performance with N2 standard...Boost your MariaDB online transaction processing performance with N2 standard...
Boost your MariaDB online transaction processing performance with N2 standard...
 
Open up new possibilities with higher transactional database performance from...
Open up new possibilities with higher transactional database performance from...Open up new possibilities with higher transactional database performance from...
Open up new possibilities with higher transactional database performance from...
 
Get competitive logistic regression performance with servers with AMD EPYC 75...
Get competitive logistic regression performance with servers with AMD EPYC 75...Get competitive logistic regression performance with servers with AMD EPYC 75...
Get competitive logistic regression performance with servers with AMD EPYC 75...
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
 
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPIComprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI
 
SRV319 Amazon EC2 Foundations
SRV319 Amazon EC2 FoundationsSRV319 Amazon EC2 Foundations
SRV319 Amazon EC2 Foundations
 
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
 
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
 
EC2 Foundations - Laura Thomson
EC2 Foundations - Laura ThomsonEC2 Foundations - Laura Thomson
EC2 Foundations - Laura Thomson
 
1.multicore processors
1.multicore processors1.multicore processors
1.multicore processors
 
Lesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programsLesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programs
 

Más de Principled Technologies

A comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systemsA comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systemsPrincipled Technologies
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Principled Technologies
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Principled Technologies
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...Principled Technologies
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWSScale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWSPrincipled Technologies
 
Get in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower WorkstationGet in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower WorkstationPrincipled Technologies
 
Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...Principled Technologies
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Principled Technologies
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Principled Technologies
 
Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...Principled Technologies
 
Finding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - SummaryFinding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - SummaryPrincipled Technologies
 
Finding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolioFinding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolioPrincipled Technologies
 
Achieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database HyperscaleAchieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database HyperscalePrincipled Technologies
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Principled Technologies
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Principled Technologies
 
Utilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applicationsUtilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applicationsPrincipled Technologies
 
Build an Azure OpenAI application using your own enterprise data
Build an Azure OpenAI application using your own enterprise dataBuild an Azure OpenAI application using your own enterprise data
Build an Azure OpenAI application using your own enterprise dataPrincipled Technologies
 
Dell Chromebooks: Durable, easy to deploy, and easy to service
Dell Chromebooks: Durable, easy to deploy, and easy to serviceDell Chromebooks: Durable, easy to deploy, and easy to service
Dell Chromebooks: Durable, easy to deploy, and easy to servicePrincipled Technologies
 
Back up and restore data faster with a Dell PowerProtect Data Manager Applian...
Back up and restore data faster with a Dell PowerProtect Data Manager Applian...Back up and restore data faster with a Dell PowerProtect Data Manager Applian...
Back up and restore data faster with a Dell PowerProtect Data Manager Applian...Principled Technologies
 
Back up and restore data faster with a Dell PowerProtect Data Manager Appliance
Back up and restore data faster with a Dell PowerProtect Data Manager ApplianceBack up and restore data faster with a Dell PowerProtect Data Manager Appliance
Back up and restore data faster with a Dell PowerProtect Data Manager AppliancePrincipled Technologies
 

Más de Principled Technologies (20)

A comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systemsA comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systems
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWSScale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWS
 
Get in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower WorkstationGet in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
 
Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...
 
Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...
 
Finding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - SummaryFinding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - Summary
 
Finding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolioFinding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolio
 
Achieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database HyperscaleAchieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database Hyperscale
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
 
Utilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applicationsUtilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applications
 
Build an Azure OpenAI application using your own enterprise data
Build an Azure OpenAI application using your own enterprise dataBuild an Azure OpenAI application using your own enterprise data
Build an Azure OpenAI application using your own enterprise data
 
Dell Chromebooks: Durable, easy to deploy, and easy to service
Dell Chromebooks: Durable, easy to deploy, and easy to serviceDell Chromebooks: Durable, easy to deploy, and easy to service
Dell Chromebooks: Durable, easy to deploy, and easy to service
 
Back up and restore data faster with a Dell PowerProtect Data Manager Applian...
Back up and restore data faster with a Dell PowerProtect Data Manager Applian...Back up and restore data faster with a Dell PowerProtect Data Manager Applian...
Back up and restore data faster with a Dell PowerProtect Data Manager Applian...
 
Back up and restore data faster with a Dell PowerProtect Data Manager Appliance
Back up and restore data faster with a Dell PowerProtect Data Manager ApplianceBack up and restore data faster with a Dell PowerProtect Data Manager Appliance
Back up and restore data faster with a Dell PowerProtect Data Manager Appliance
 

Último

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 

Último (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 

AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance

  • 1. AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance vs. M5n instances with 2nd Gen Intel Xeon Scalable processors and M6a instances with 3rd Gen AMD EPYC processors Many machine learning workloads involve sorting, analyzing, and making relationships between images, but how can organizations quickly make sense of large amounts of text? Bidirectional Encoder Representations from Transformers (BERT) is a machine learning framework for natural language processing (NLP). To analyze text, BERT looks at all the words around a given word to put it in the correct context. This allows applications such as search engines to predict sentences, answer questions, or generate conversational responses. Using Intel optimization for TensorFlow and ZenDNN integrated with TensorFlow, we compared the BERT machine learning performance of three types of Amazon Web Services (AWS) EC2 series instances: M6i instances with 3rd Gen Intel® Xeon® Scalable processors featuring Intel DL Boost with Vector Neural Network Instructions, M5n instances with 2nd Gen Intel Xeon Scalable processors, and M6a instances with 3rd Gen AMD EPYC™ processors. In tests at multiple instance sizes, AWS M6i instances offered up to 45 percent better BERT performance on a benchmark from the Intel Model Zoo than the M5n instances with previous- gen processors and up to 6.4 times the BERT performance compared to M6a instances with 3rd Gen AMD EPYC processors. This means that organizations running similar BERT workloads in the cloud could get better performance per instance by choosing M6i instances featuring 3rd Gen Intel Xeon Scalable processors. Up to 5.2x the queries per second vs. M6a instances Up to 5.1x the queries per second vs. M6a instances Up to 6.4x the queries per second vs. M6a instances 4 vCPUs 8 vCPUs 16 vCPUs AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance June 2022 A Principled Technologies report: Hands-on testing. Real-world results.
  • 2. Figure 1: Key specifications for each instance size we tested. Source: Principled Technologies. How we tested We purchased three sets of instances from three general-purpose AWS EC2 series: • M6i instances featuring 3rd Gen Intel Xeon Platinum 8375C processors (Ice Lake) • M5n instances featuring 2nd Gen Intel Xeon Platinum 8259CL processors (Cascade Lake) • M6a instances featuring 3rd Gen AMD EPYC 7R13 processors (Milan) We ran each instance in the US East 1 region. Figure 1 shows the specifications for the instances that we chose. To show how businesses of various sizes with different machine learning demands can benefit from choosing M6i instances, we tested instances with 4 vCPUs, 8 vCPUs, and 16 vCPUs. To account for different types of datasets, we ran tests using a small batch size of 1 and a large batch size of 32—where batch size is the number of samples that go through the neural network at a time. In this report, we present the comparisons between M6i and M5n instances first, and then present the comparisons between M6i and M6a instances. (Note: For additional test results on even larger instances, see the science behind the report.) 4 vCPUs 8 vCPUs 16 vCPUs Testing BERT performance in the cloud The BERT framework, which was trained on text from the English language Wikipedia with over 2.5 million words, works by turning text into numbers to sort, analyze, and make predictions about that text.1 Depending on the dataset on which an organization needs to run BERT machine leaning, the size of the AWS instances they choose will vary. To account for these different needs, we tested using two batch sizes across three different instance sizes. We used a BERT benchmark from Intel Model Zoo, which offers a range of machine learning models and tools. At the time of our testing, AMD EPYC processors did not support INT8 precision for BERT, so we present FP32 precision results for M6i instances as well for comparison. In all three, the M6i instances enabled by 3rd Gen Intel Xeon Scalable processors outperformed both the previous-gen M5n instances and the current-gen M6a instances. June 2022 | 2 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 3. Why choose M6i instances with 3rd Gen Intel Xeon Scalable processors? New M6i instances with 3rd Gen Intel Xeon Scalable processors offer the following:4 • All-core turbo frequency of up to 3.5 GHz • Always-on memory encryption with Intel Total Memory Encryption (TME) • Intel DL Boost with Vector Neural Network Instructions (VNNI) that accelerate INT8 performance • Intel Advanced Vector Extensions 512 (Intel AVX-512) instructions for demanding machine learning workloads • Support for up to 128 vCPUs and 512 GB of memory per instance • Up to 50Gbps networking About 3rd Generation Intel Xeon Scalable processors According to Intel, 3rd Generation Intel Xeon Scalable processors are “[o]ptimized for cloud, enterprise, HPC, network, security, and IoT workloads with 8 to 40 powerful cores and a wide range of frequency, feature, and power levels.”2 Intel continues to offer many models from the Platinum, Gold, Silver, and Bronze processor lines that they “designed through decades of innovation for the most common workload requirements.3 For more information, visit http://intel.com/xeonscalable. June 2022 | 3 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 4. Instances with 4 vCPUs: M6i vs. M5n First, we compared BERT performance on smaller instances, looking at the relative amount of text the instance types analyzed on 4vCPU configurations. As Figure 2 shows, M6i instances enabled by 3rd Gen Intel Xeon Scalable processors analyzed up to 18 percent more examples per second than the M5n instances with 2nd Gen Intel Xeon Scalable processors. Figure 2: Relative BERT performance for M6i and M5n instances using 4 vCPUs. Higher numbers are better. Source: Principled Technologies. Instances with 8 vCPUs: M6i vs. M5n When we doubled the instance size to 8 vCPUs, M6i instances delivered a similar performance increase over previous-gen M5n instances. Figure 3 compares the relative amount of text the instance types analyzed on 8vCPU configurations. The M6i instances enabled by 3rd Gen Intel Xeon Scalable processors analyzed up to 11 percent more examples per second than the M5n instances with 2nd Gen Intel Xeon Scalable processors. Figure 3: Relative BERT performance for M6i and M5n instances using 8 vCPUs. Higher numbers are better. Source: Principled Technologies. Relative BERT performance of m6i.xlarge vs. m5n.xlarge Larger is better 0 0.20 0.40 0.60 0.80 1.00 1.40 1.20 1.00 Relative throughput (examples/sec) M6i (INT8) M5n (INT8) M6i (INT8) Batch size: 1 Batch size: 32 1.13 1.00 1.18 M5n (INT8) 1.60 Relative BERT performance of m6i.2xlarge vs. m5n.2xlarge Larger is better 0 0.20 0.40 0.60 0.80 1.00 1.40 1.20 1.00 Relative throughput (examples/sec) M6i (INT8) M5n (INT8) M6i (INT8) Batch size: 1 Batch size: 32 1.11 1.00 1.11 M5n (INT8) 1.60 up to 11% better throughput up to 18% better throughput 4 vCPUs 8 vCPUs June 2022 | 4 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 5. Instances with 16 vCPUs: M6i vs. M5n As Figure 4 shows, M6i instances offered the greatest relative BERT performance increase over previous-gen M5n instances using larger 16vCPU configurations. The M6i instances enabled by 3rd Gen Intel Xeon Scalable processors analyzed up to 45 percent more examples per second than the M5n instances with 2nd Gen Intel Xeon Scalable processors. By improving textual data analysis throughput by 45 percent, organizations could reduce the number of instances they need to purchase and manage when they select the M6i instance type. Figure 4: Relative BERT performance for M6i and M5n instances using 16 vCPUs. Higher numbers are better. Source: Principled Technologies. Relative BERT performance of m6i.4xlarge vs. m5n.4xlarge Larger is better 0 0.20 0.40 0.60 0.80 1.00 1.40 1.20 1.00 Relative throughput (examples/sec) M6i (INT8) M5n (INT8) M6i (INT8) Batch size: 1 Batch size: 32 1.21 1.00 1.45 M5n (INT8) 1.60 up to 45% better throughput 16 vCPUs June 2022 | 5 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 6. Instances with 4 vCPUs: M6i vs. M6a After comparing BERT performance of M6i instances against that of instances based on previous-gen processors, we compared those three sizes of M6i instances against M6a instances with AMD EPYC processors. Figure 5 compares the relative amount of text these instance types analyzed on 4vCPU configurations. The M6i instances enabled by 3rd Gen Intel Xeon Scalable processors with INT8 precision analyzed data 5.29 times as fast as the M6a instances with 3rd Gen AMD EPYC processors using FP32 precision. Note: At the time of testing, INT8 precision—which can improve performance for these types of machine learning—was not available for BERT workloads on AMD EPYC processors. Using FP32 precision, M6i instances improved performance over M6a instances by as much as 68 percent. Figure 5: Relative BERT performance for M6i and M6a instances using 4 vCPUs. Higher numbers are better. Source: Principled Technologies. Instances with 8 vCPUs: M6i vs. M6a When we increased the instance sizes to 8 vCPUs, performance increases were similar to the 4vCPU configurations. Figure 6 compares the relative amount of text the instance types analyzed on 8vCPU configurations. The M6i instances enabled by 3rd Gen Intel Xeon Scalable processors analyzed data up to 5.10 times as fast as the M6a instances with 3rd Gen AMD EPYC processors. Figure 6: Relative BERT performance for M6i and M6a instances using 8 vCPUs. Higher numbers are better. Source: Principled Technologies. Relative BERT performance of m6i.xlarge vs. m6a.xlarge Larger is better 0 1.00 2.00 3.00 4.00 5.00 7.00 6.00 1.68 1.00 Relative throughput (examples/sec) M6i (INT8) M6i (FP32) M6a (FP32) M6i (INT8) M6i (FP32) M6a (FP32) Batch size: 1 Batch size: 32 4.24 1.00 1.57 5.29 Relative BERT performance of m6i.2xlarge vs. m6a.2xlarge Larger is better 0 1.00 2.00 3.00 4.00 5.00 7.00 6.00 1.64 1.00 Relative throughput (examples/sec) M6i (INT8) M6i (FP32) M6a (FP32) M6i (INT8) M6i (FP32) M6a (FP32) Batch size: 1 Batch size: 32 4.44 1.00 1.68 5.10 up to 5.10x the throughput up to 5.29x the throughput 4 vCPUs 8 vCPUs June 2022 | 6 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 7. Instances with 16 vCPUs: M6i vs. M6a The biggest relative difference in BERT performance occurred in our 16vCPU comparison of M6i and M6a configurations. Figure 7 compares the relative examples per second the instance types analyzed on 16vCPU configurations. The M6i instances enabled by 3rd Gen Intel Xeon Scalable processors analyzed data up to 6.40 times as fast as the M6a instances with 3rd Gen AMD EPYC processors. These results show that for these types of BERT workloads, selecting M6i instances that offer INT8 precision over M6a instances that don’t could allow organizations to complete textual analysis workloads using fewer cloud instances. Figure 7: Relative BERT performance for M6i and M6a instances using 16 vCPUs. Higher numbers are better. Source: Principled Technologies. Relative BERT performance of m6i.4xlarge vs. m6a.4xlarge Larger is better 0 1.00 2.00 3.00 4.00 5.00 7.00 6.00 1.81 1.00 Relative throughput (examples/sec) M6i (INT8) M6i (FP32) M6a (FP32) M6i (INT8) M6i (FP32) M6a (FP32) Batch size: 1 Batch size: 32 6.37 1.00 2.24 6.40 up to 6.40x the throughput 16 vCPUs June 2022 | 7 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 8. Scaling BERT workloads Another consideration for assessing BERT performance is to see how the throughput scales as you increase the size of the instance. Theoretically, performance could double as you double the vCPU count, which would be perfect linear scaling. While resource allocation makes this unlikely in the real world, the closer an instance approaches this ideal, the better. As Figure 8 shows, using results from our batch size: 1 tests, the M6i instance with 3rd Gen Intel Xeon Scalable processors had better BERT performance scaling from 8 vCPUs to 16 vCPUs compared to the M6a instance with AMD EPYC processors, though slightly worse scaling from 4 vCPUs to 8 vCPUs. Figure 8: How BERT performance scaled across instance sizes, compared to results from the 4vCPU tests with batch size 1. Higher numbers are better. Source: Principled Technologies. Figure 9 makes the same comparison, but uses results from our batch size: 32 testing. Again, the M6i instance with 3rd Gen Intel Xeon Scalable processors scaled more linearly from 4 to 16 vCPUs compared to the M6a instance. Figure 9: How BERT performance scaled across instance sizes, compared to results from the 4vCPU tests with batch size 32. Higher numbers are better. Source: Principled Technologies. Relative BERT performance scaling compared to 4vCPUs with batch size: 1 Larger is better 0 0.50 1.00 1.50 2.00 2.50 3.50 3.00 Relative throughput (examples/sec) 4 vCPUs 8 vCPUs 1.85 1.00 3.40 1.86 1.91 3.16 4.00 4.50 M6i (INT8) M6i (INT32) M6a (FP32) 16vCPUs 3.83 1.00 1.00 Relative BERT performance scaling compared to 4vCPUs with batch size: 32 Larger is better Relative throughput (examples/sec) 0 0.50 1.00 1.50 2.00 2.50 3.50 3.00 4 vCPUs 8 vCPUs 1.86 1.00 1.90 3.52 2.52 1.78 4.00 4.50 M6i (INT8) M6i (INT32) M6a (FP32) 16vCPUs 3.79 1.00 1.00 By selecting M6i instances that offer more linear, predictable performance scaling, organizations could more reliably fix their cloud operating budgets as textual analysis workloads continue to grow. June 2022 | 8 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance
  • 9. Conclusion Organizations analyzing textual data using NLP through the BERT framework must decide which type of instance can deliver the BERT performance they need. In our tests, we found that across instance sizes, AWS M6i instances with 3rd Gen Intel Xeon Scalable processors outperformed both M5n instances with 2nd Gen Intel Xeon Scalable processors and M6a instances with 3rd Gen AMD EPYC processors for BERT machine learning. Plus, the M6i instances offered more predictable scaling at 16vCPUs. These performance increases could help you get quicker insight from textual data to better satisfy consumers and increase revenues. 1. TechTarget, “BERT language model,” accessed December 16, 2021, https://www.techtarget.com/searchenterpriseai/definition/BERT-language-model. 2. Intel, “3rd Gen Intel® Xeon® Scalable Processors,” accessed December 14, 2021, https://www.intel.com/content/www/us/en/products/docs/processors/xeon/3rd-gen-xeon-scalable-processors-brief.html. 3. Intel, “3rd Gen Intel® Xeon® Scalable Processors.” 4. Amazon, Amazon EC2 M6i Instances, accessed December 14, 2021, https://aws.amazon.com/ec2/instance-types/m6i/. Principled Technologies is a registered trademark of Principled Technologies, Inc. All other product names are the trademarks of their respective owners. For additional information, review the science behind this report. Principled Technologies® Facts matter.® Principled Technologies® Facts matter.® This project was commissioned by Intel. Read the science behind this report at https://facts.pt/ZymIIA3 June 2022 | 9 AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offered better BERT machine learning performance