We Implement Big Data.

Webinar: Performance Testing Approach for Big Data Applications
Recorded version available at http://lf1.me/cqb/
Questions and Answers from the session

Q. Could you explain in more detail the technical grounds on which you proposed
the SandStorm tool in the case study presented during the webinar?
A. One of the critical requirements in the project was to determine the maximum throughput of the
Kafka servers and to recommend a production deployment. To achieve this, we had to simulate the incoming
message flow and monitor the Kafka resources. We chose SandStorm because it provides a user interface to
define message types and sizes and to load the Kafka servers with incoming messages. It also
monitors the Kafka servers during test execution, which helped us identify the parameters that were not
optimally tuned. This resulted in identifying an optimum configuration for the production Kafka cluster.
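
The measurement idea behind such a test can be sketched in a few lines of Python. This is only a minimal illustration, not how SandStorm works internally: the function names are hypothetical, and the `send` callable is a stand-in for whatever producer client actually delivers a message to the Kafka servers.

```python
import time
from typing import Callable, Iterable, Iterator


def make_payloads(size_bytes: int, count: int) -> Iterator[bytes]:
    """Yield `count` messages of a fixed size (hypothetical test data)."""
    body = b"x" * size_bytes
    for _ in range(count):
        yield body


def measure_throughput(send: Callable[[bytes], object],
                       payloads: Iterable[bytes]) -> dict:
    """Send every payload and report message and byte throughput."""
    count = 0
    total_bytes = 0
    start = time.perf_counter()
    for payload in payloads:
        send(payload)  # assumption: blocks until the message is handed off
        count += 1
        total_bytes += len(payload)
    # Guard against a zero interval on very fast runs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return {
        "messages": count,
        "bytes": total_bytes,
        "seconds": elapsed,
        "msgs_per_sec": count / elapsed,
        "mb_per_sec": total_bytes / elapsed / 1_000_000,
    }


if __name__ == "__main__":
    sink = []  # stand-in for a real producer; it only collects messages
    for size in (100, 1_000, 10_000):
        stats = measure_throughput(sink.append, make_payloads(size, 10_000))
        print(size, round(stats["msgs_per_sec"]), round(stats["mb_per_sec"], 1))
```

Repeating the run for several message sizes while monitoring the servers shows where throughput stops growing.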

Q. Will the test approaches be any different for wireless platforms?
A. If you want to test applications on wireless platforms, such as mobile applications, one of the critical
factors is simulating the various network conditions in which the application will be used. The test
approach should execute the tests under varying bandwidth and network
conditions, such as 3G and 4G, to measure end-user performance and identify any potential issues in the
infrastructure.
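
To see why the network condition matters, a rough back-of-the-envelope model helps. The sketch below is an assumption-laden illustration: the bandwidth and latency figures are placeholders chosen for the example, not measurements, and the model ignores protocol overhead, packet loss, and server time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkProfile:
    name: str
    bandwidth_kbps: float  # usable bandwidth in kilobits per second
    latency_ms: float      # one network round trip in milliseconds


# Illustrative placeholder values only; real networks vary widely.
PROFILES = [
    NetworkProfile("3G", bandwidth_kbps=1_000, latency_ms=200),
    NetworkProfile("4G", bandwidth_kbps=10_000, latency_ms=50),
]


def estimated_response_ms(profile: NetworkProfile,
                          payload_bytes: int,
                          round_trips: int = 1) -> float:
    """Estimate network time as round-trip latency plus transfer time."""
    # bits divided by kilobits-per-second gives milliseconds
    transfer_ms = payload_bytes * 8 / profile.bandwidth_kbps
    return round_trips * profile.latency_ms + transfer_ms


if __name__ == "__main__":
    for profile in PROFILES:
        print(profile.name, estimated_response_ms(profile, 125_000), "ms")
```

The same request can differ by an order of magnitude between profiles, which is why the tests are executed under each condition rather than on the office network only.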

Q. What is the shortcoming of using traditional tools such as LoadRunner to
model and test application performance?
A. I will answer this question within the scope of big data performance testing. As of today,
traditional tools such as LoadRunner do not support Big Data technologies. These tools provide
record-and-playback functionality: they record the communication (HTTP or another protocol) with a
target application and generate test scripts from it. As presented in the webinar, big data applications
involve multiple technologies and components, which might use different protocols to communicate, so
scripts cannot be recorded. They need to be developed using the API or a user interface, and these tools
do not provide any user interface for such technologies. Hence, we need specific tools to test the
underlying big data components.

© 2013 Impetus Technologies

Q. SandStorm vs. JMeter: any interesting differences?
A. SandStorm provides built-in support for Big Data and mobile applications. It has extensive
monitoring capabilities that cover the different components of the target application to identify
performance bottlenecks. For mobile performance, it can simulate varying network
conditions such as 3G, 4G, and Wi-Fi for realistic testing. It has an intuitive user interface for developing
test scripts and designing test scenarios. Another major difference lies in its extensive reporting
capabilities, which help in identifying performance issues and in detailed test analysis.

Q. I am not sure if "Going to the cloud" can be a good idea - test results from a
shared infrastructure cloud such as Amazon EC2 cannot usually be repeated, so
how would you manage this?
A. One of the biggest challenges that teams face today is setting up a performance environment. It
involves significant cost and effort to maintain the environment. Setting up the environment in the cloud
helps lower the total cost and provides the elasticity to scale the environment up and down depending
on the test results and analysis. I agree that the cloud uses virtualization, but we have seen
repeatable results while running tests in the cloud. As a best practice, we monitor the resource
consumption of our instances and trigger alerts if we see any abnormal activity.
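
One simple way to check repeatability is to run the same test several times and look at the spread of the results. The sketch below uses the coefficient of variation for this; the 5% threshold and the function names are assumptions made for the example, not a rule from the webinar.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(samples: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean (needs 2+ samples)."""
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean of samples is zero")
    return statistics.stdev(samples) / mean


def is_repeatable(samples: Sequence[float], max_cv: float = 0.05) -> bool:
    """Treat runs as repeatable when their spread stays under `max_cv`."""
    return coefficient_of_variation(samples) <= max_cv


if __name__ == "__main__":
    # Hypothetical throughput figures (requests/sec) from repeated runs.
    steady_runs = [100, 102, 98, 101, 99]
    noisy_runs = [100, 150, 60]
    print(is_repeatable(steady_runs), is_repeatable(noisy_runs))
```

Runs that fail such a check are a signal to look at the monitored resource consumption of the instances before trusting the numbers.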

Q. Do we have any profiling tools for Big Data technologies like Hadoop,
Cassandra, Kafka etc.?
A. Yes, there are profiling tools available for different technologies. For example, MongoDB comes with
its own profiling utility that can be used to profile a running MongoDB instance. The database
profiler collects fine-grained data about MongoDB write operations, cursors, and database commands on a
running instance. Similarly, JVM-based technologies such as Apache Hadoop and Kafka can
be profiled using profilers and diagnostic tools like VisualVM. Many APM vendors have started
developing agents for Cassandra and Hadoop that can help in identifying performance bottlenecks in these
components.

Q. What are the critical performance parameters that we should monitor or
keep track of for messaging servers and NoSQL databases?
A. Each technology has its own set of parameters critical for optimum performance. For example, for a
messaging server such as Kafka, the most important server configurations for performance are those that
control the disk flush rate. The more often data is flushed to disk, the more "seek-bound" the server
will be and the lower the throughput. However, very low application flush rates can lead to high latency
when the flush finally does occur (because of the volume of data that must be flushed). You need
sufficient memory to buffer active readers and writers. Disk throughput is important: in general it is
the performance bottleneck, and more disks are better. If you configure multiple data directories,
partitions will be assigned in a round-robin fashion to the data directories. Each partition will reside
entirely in one of the data directories, so if data is not balanced among partitions, this can lead to
load imbalance between disks.
For NoSQL databases, disk writing is usually a bottleneck. Therefore, the write-to-disk frequency and the
initial storage allocation can strongly affect your system performance. Note that delaying disk writes
can affect system recovery. Disabling unused services may help you save some CPU cycles. For Cassandra,
make sure your commit log and data directories (SSTables) are on different disks. Compression maximizes
the storage capacity of Cassandra nodes by reducing the volume of data on disk and disk I/O, particularly
for read-dominated workloads. Cassandra quickly finds the location of rows in the SSTable index and
decompresses the relevant row chunks.
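
The point about data directories can be made concrete with a small model. The sketch below follows the round-robin description given above and is not the broker's actual code; the partition sizes and directory names are hypothetical.

```python
from typing import Dict, List


def assign_round_robin(partition_sizes: Dict[str, int],
                       data_dirs: List[str]) -> Dict[str, int]:
    """Place whole partitions on directories in turn; return bytes per directory."""
    load = {directory: 0 for directory in data_dirs}
    for index, size in enumerate(partition_sizes.values()):
        # A partition lives entirely in one directory, so its full size lands there.
        load[data_dirs[index % len(data_dirs)]] += size
    return load


def imbalance_ratio(load: Dict[str, int]) -> float:
    """Busiest directory relative to the average; 1.0 means perfectly even."""
    mean = sum(load.values()) / len(load)
    return max(load.values()) / mean if mean else 1.0


if __name__ == "__main__":
    # Hypothetical partition sizes in GB: two hot partitions, two cold ones.
    sizes = {"p0": 90, "p1": 10, "p2": 80, "p3": 20}
    load = assign_round_robin(sizes, ["/data1", "/data2"])
    print(load, imbalance_ratio(load))
```

Each directory receives the same number of partitions, yet one disk ends up carrying most of the data, which is the load imbalance to watch for while monitoring disk throughput.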

Q. You mentioned a couple of performance testing solutions, namely YCSB and
SandStorm. How do these two compare?
A. YCSB is a performance benchmark utility developed by Yahoo. It supports multiple NoSQL
databases and comes with pre-built clients. You can define the workload and run the scripts in your
test environment. It generates its own test data and reports the performance statistics.
Impetus SandStorm is an enterprise performance testing tool that supports NoSQL databases as well as
messaging servers, along with web, mobile, and cloud applications. It can be used to create custom test
scripts for your Big Data application and run them with multiple users to measure the real performance
of the application. It also monitors big data applications and helps in quickly identifying
issues with resource consumption in the underlying component or infrastructure.
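
"Defining the workload" in a benchmark of this kind largely means choosing the proportion of each operation type. The sketch below illustrates that idea only; it is not YCSB's implementation, and the function name and mix values are assumptions made for the example.

```python
import random
from collections import Counter
from typing import Dict, List


def generate_operations(mix: Dict[str, float], count: int,
                        seed: int = 0) -> List[str]:
    """Draw `count` operation names according to the proportions in `mix`."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("operation proportions must sum to 1")
    rng = random.Random(seed)  # seeded so a test run can be reproduced
    operations = list(mix)
    weights = [mix[name] for name in operations]
    return rng.choices(operations, weights=weights, k=count)


if __name__ == "__main__":
    # Hypothetical read-heavy mix: 95% reads, 5% updates.
    ops = generate_operations({"read": 0.95, "update": 0.05}, 10_000)
    print(Counter(ops))
```

Each generated operation would then be executed against the database under test while latency and throughput are recorded.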

Write to us at bigdata@impetus.com for more information

