SlideShare una empresa de Scribd logo
1 de 17
Descargar para leer sin conexión
Enterprise Challenges of Test Data
Size, Change, Complexity, Disparity, and Privacy
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 2
Enterprise Challenges of Test Data
For simple applications, representative test data can be
relatively easy
What if you are testing enterprise-scale applications?
In enterprise data centers, dozens or hundreds of
applications co-exist
These applications are of various sizes, complexities, and
criticalities
They operate on various data repositories, in some cases
shared data repositories
In some cases, disparate data repositories hold related
data
Let’s examine the test data options, and the challenges
that arise, at the enterprise scale
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 3
Options for Test Data
Hand generated: Tester creates all test data by hand
Tool generated: Generic (e.g., Excel) or specific tools, open-
source or commercial, used to generate test data according to
tester-specified pattern
Production: all or a subset of production data copied to test
environment
Anonymized production: Tool(s) used to perform irreversible
encryption, substitution, or scrambling of values
Pseudo-anonymized production: Reversible anonymization
In this presentation, I’ll address factors to consider when
choosing these options
At the enterprise level, you’ll probably need one or more tools
More than one option might be useful when creating test data
Whatever options are chosen, be really careful about:
Testing on actual production data in the production environment
Copying test data into the production environment
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 4
Test Data Is Simple When Life Is Like…
So, if you are testing a simple mobile app, you might be
thinking, “Test data, what’s the big deal?”
If the production data is small enough, you can generate
data to populate the database
Or maybe there’s no personal information in the
production data, so you just use that
In some cases, your testing situation might simple, like the
one shown here, but eventually it could evolve as your
application’s abilities grow
But the Enterprise Be Like…
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 5
Actually, it’s usually more
complicated than this…
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 6
Production Data, the Testing Gold Standard
For most large-scale systems, testers cannot create
sufficient and sufficiently diverse test data by hand
Data generation tools can create large datasets, but
the data might lack diversity, similar value
distributions, or other attributes of production data
Production data is ideal, especially when large data
sets have accumulated over years with various
current and past application versions
Production data looks exactly like the data the
application will encounter on release, since it is
But using production data for testing poses many
challenges
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 7
Privacy Concerns: Fly Meets Ointment
Production data often contains personal data
Trying to secure test data handling during testing can
impose undesirable inefficiencies and constraints
Such inefficiencies and constraints apply especially when
using distributed and outsourced testing
Enter the anonymization tool
However, such tools are not magic
Misuse can result in trivially easy reversal
For example, letter substitution could lead to names like
“Kpio Cspxo,” obviously “John Brown”
In addition, Kpio Cspxo (and other nonsense) is not
realistic
For testing you must preserve data meaning and
meaningfulness
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 8
Meaning and Meaningfulness
Substitutions, good and bad
Good: “John Brown” to “Lester Camden”
Bad: “John Brown” to “Charlotte Dostoyevsky” (esp. if we
have gender field)
Queries, views, and joins of anonymized data must
yield same results as on production data
For example, if a production query on “John Brown”
returned 20 records, a test data query for “Lester
Camden” must also return 20 records
Problems here will affect functionality, performance,
reliability, and load testing
In addition, localization can be an issue with
different languages and cultures, especially if
multiple character sets can apply
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 9
Oh, Those Long-Distance Relationships
Data and relationships between data can span databases
For example:
Three applications have years of data describing the same population,
spread across three different databases
The applications interoperate and share data
Data-warehousing and analytical applications access the related data
across databases
Logical records are created with de facto joins across via de facto
foreign keys
Anonymization can break these integrity constraints
Meaningful end-to-end testing of functionality, performance,
throughput, reliability, localization, and security becomes impossible
For many of our clients, the usefulness of the test data for
interoperability testing poses the hardest challenge
This is true whether hand generated, tool generated,
production, pseudo-anonymized, or anonymized
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 10
Generic Test Data Requirements
Keep existing data quality levels
Most production data contains a large number (~25%) of
errors
Error rates must be the same for test data, but, if privacy is
an issue, must not enable reverse engineering of the data
Ensure maintainability
Be able to edit, add, update, and delete data
This includes individual fields and records, whole database,
and across logical records spanning databases
Maximize efficiency of the test data acquisition and
update processes
If production data is to be copied and/or
anonymized, ensure those operations occur on
quiescent data
Test Database Server Costs
Yet another challenge to production-scale test
data is cost
The test data must reside on test database
servers, which can involve:
DBMS licensing costs
Server costs
Disk space costs
Ongoing hardware and software maintenance
A regular backup process (with attendant costs for
backup media as well as labor)
Using open-source DBMS only addresses the first
of these costs
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 11
Storage Virtualization
This allows abstraction of physical storage
space
It can reduce some of the cost issues associated
with test data storage
It’s probably under-utilized by testing groups,
as I hear very little about it online, from clients,
or at conferences
It can help in terms of allocating/de-allocating
production-like storage for testing
The issue of anonymization of the production
data still exists, though
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 12
Coping with Data Size
Enterprise data have gotten huge, due to increases in disk capacity
Unfortunately, disk throughput has not kept up
This creates serious issues with copying, anonymizing, and
managing test data that comes near to or achieves production size
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 13
Figure from tylermuth.wordpress.com
Specialized Application Support
Many enterprises use tools such as SAP to help manage
their businesses
There are test data anonymization tools for the more
common such enterprise applications
If you are using a more obscure application, no such tool
may be available
Similar issues exist with NoSQL and other data formats
Further, the application vendor might not be willing to
share metadata information
This should be a consideration during enterprise
application selection
However, testers are rarely included in such decisions,
and once made, such decisions often don’t change for
years
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 14
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 15
Success with Test Data Tools
Success with test data tools is like any other test tool
Assemble a team
Elicit requirements from users and stakeholders (including
those who understand privacy/security issues)
Identify risks and constraints
Determine tool options (open-source and commercial)
Evaluate tools to identify short-list options
Select the tool
Pilot the tool
Deploy the tool
Short-cutting this process or imposing unrealistic
budget targets can cause significant inefficiencies and
possibly even inability to fulfill basic needs
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 16
Conclusions
Testing enterprise applications poses challenges
beyond what small-scale applications pose
This is especially true for test data
Challenges include sources of data, privacy,
availability of tools, size, cost, and testing
usefulness
Like any complex situation, success requires
understanding the challenges and planning to
meet them
This presentation should help you identify your
specific challenges and build an appropriate plan
Enterprise Challenges of Test Data
www.rbcs-us.com
Copyright (c) 2016 RBCS Page 17
For over twenty years, RBCS has delivered software and hardware testing services,
including consulting, outsourcing, and training. Employing the industry’s most experienced
and recognized consultants, RBCS conducts product testing, builds and improves testing
groups, and hires testing staff for hundreds of clients worldwide. Ranging from Fortune 20
companies to start-ups, RBCS clients save time and money through improved product
development, decreased tech support calls, improved corporate reputation and more. To
learn more about RBCS, visit www.rbcs-us.com.
Address: RBCS, Inc.
31520 Beck Road
Bulverde, TX 78163-3911
USA
Phone: +1 (830) 438-4830
E-mail: info@rbcs-us.com
Web: www.rbcs-us.com
Facebook: RBCS, Inc
Twitter: @RBCS, @LaikaTestDog
…Contact RBCS

Más contenido relacionado

La actualidad más candente

Big Data - Hadoop and MapReduce for QA and testing by Aditya Garg
Big Data - Hadoop and MapReduce for QA and testing by Aditya GargBig Data - Hadoop and MapReduce for QA and testing by Aditya Garg
Big Data - Hadoop and MapReduce for QA and testing by Aditya Garg
QA or the Highway
 
Real-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in ActionReal-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in Action
DataWorks Summit
 
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Spark Summit
 

La actualidad más candente (20)

Big Data - Hadoop and MapReduce for QA and testing by Aditya Garg
Big Data - Hadoop and MapReduce for QA and testing by Aditya GargBig Data - Hadoop and MapReduce for QA and testing by Aditya Garg
Big Data - Hadoop and MapReduce for QA and testing by Aditya Garg
 
Building Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsBuilding Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSets
 
Building a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with RBuilding a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with R
 
Log I am your father
Log I am your fatherLog I am your father
Log I am your father
 
Real Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with SparkReal Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with Spark
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp
 
Big Data Heterogeneous Mixture Learning on Spark
Big Data Heterogeneous Mixture Learning on SparkBig Data Heterogeneous Mixture Learning on Spark
Big Data Heterogeneous Mixture Learning on Spark
 
Real-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in ActionReal-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in Action
 
Optimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemOptimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystem
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeData Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
 
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
 
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsMachine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
 
Insights Without Tradeoffs Using Structured Streaming keynote by Michael Armb...
Insights Without Tradeoffs Using Structured Streaming keynote by Michael Armb...Insights Without Tradeoffs Using Structured Streaming keynote by Michael Armb...
Insights Without Tradeoffs Using Structured Streaming keynote by Michael Armb...
 
TESTING IN BIG DATA WORLD
TESTING IN BIG DATA  WORLDTESTING IN BIG DATA  WORLD
TESTING IN BIG DATA WORLD
 
Big Data Computing Architecture
Big Data Computing ArchitectureBig Data Computing Architecture
Big Data Computing Architecture
 
Enabling Physics and Empirical-Based Algorithms with Spark Using the Integrat...
Enabling Physics and Empirical-Based Algorithms with Spark Using the Integrat...Enabling Physics and Empirical-Based Algorithms with Spark Using the Integrat...
Enabling Physics and Empirical-Based Algorithms with Spark Using the Integrat...
 
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
 

Destacado

ISTQB / ISEB Foundation Exam Practice
ISTQB / ISEB Foundation Exam PracticeISTQB / ISEB Foundation Exam Practice
ISTQB / ISEB Foundation Exam Practice
Yogindernath Gupta
 
KeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test Management
KeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test ManagementKeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test Management
KeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test Management
Keytorc Software Testing Services
 
KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail
KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail
KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail
Keytorc Software Testing Services
 
EbruKazaskeroglu_CV2016_ENG.PDF
EbruKazaskeroglu_CV2016_ENG.PDFEbruKazaskeroglu_CV2016_ENG.PDF
EbruKazaskeroglu_CV2016_ENG.PDF
Ebru Kazaskeroglu
 
ISTQB / ISEB Foundation Exam Practice -1
ISTQB / ISEB Foundation Exam Practice -1ISTQB / ISEB Foundation Exam Practice -1
ISTQB / ISEB Foundation Exam Practice -1
Yogindernath Gupta
 
Keynote Systems - Mobile Solutions Overview Presentation
Keynote Systems - Mobile Solutions Overview PresentationKeynote Systems - Mobile Solutions Overview Presentation
Keynote Systems - Mobile Solutions Overview Presentation
vprathap
 
ISTQB / ISEB Foundation Exam Practice - 2
ISTQB / ISEB Foundation Exam Practice - 2ISTQB / ISEB Foundation Exam Practice - 2
ISTQB / ISEB Foundation Exam Practice - 2
Yogindernath Gupta
 
KeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption Theory
KeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption TheoryKeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption Theory
KeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption Theory
Keytorc Software Testing Services
 
Test Automation - Keytorc Approach
Test Automation - Keytorc Approach Test Automation - Keytorc Approach
Test Automation - Keytorc Approach
Keytorc Software Testing Services
 

Destacado (20)

ISTQB / ISEB Foundation Exam Practice
ISTQB / ISEB Foundation Exam PracticeISTQB / ISEB Foundation Exam Practice
ISTQB / ISEB Foundation Exam Practice
 
KeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test Management
KeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test ManagementKeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test Management
KeytorcTestTalks #11 - Serkan Akoğlanoğlu, Release Management vs Test Management
 
KeytorcTestTalks #11 - Duygu Onaral, Agile QA'in rolü
KeytorcTestTalks #11 - Duygu Onaral, Agile QA'in rolüKeytorcTestTalks #11 - Duygu Onaral, Agile QA'in rolü
KeytorcTestTalks #11 - Duygu Onaral, Agile QA'in rolü
 
KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail
KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail
KeytorcTestTalks #11 - Onur Başkirt, Agile Test Management with Testrail
 
Testistanbul 2016 - Keynote: "The Story of Appium" by Dan Cuellar
Testistanbul 2016 - Keynote: "The Story of Appium" by Dan CuellarTestistanbul 2016 - Keynote: "The Story of Appium" by Dan Cuellar
Testistanbul 2016 - Keynote: "The Story of Appium" by Dan Cuellar
 
EbruKazaskeroglu_CV2016_ENG.PDF
EbruKazaskeroglu_CV2016_ENG.PDFEbruKazaskeroglu_CV2016_ENG.PDF
EbruKazaskeroglu_CV2016_ENG.PDF
 
Testistanbul 2016 - Keynote: "Why Automated Verification Matters" by Kristian...
Testistanbul 2016 - Keynote: "Why Automated Verification Matters" by Kristian...Testistanbul 2016 - Keynote: "Why Automated Verification Matters" by Kristian...
Testistanbul 2016 - Keynote: "Why Automated Verification Matters" by Kristian...
 
ISTQB / ISEB Foundation Exam Practice -1
ISTQB / ISEB Foundation Exam Practice -1ISTQB / ISEB Foundation Exam Practice -1
ISTQB / ISEB Foundation Exam Practice -1
 
Istqb ctfl-series - Black Box Testing
Istqb ctfl-series - Black Box TestingIstqb ctfl-series - Black Box Testing
Istqb ctfl-series - Black Box Testing
 
Keytorc Proje Ekibi Zubizu Sunumu - Emirhan Şen
Keytorc Proje Ekibi Zubizu Sunumu - Emirhan ŞenKeytorc Proje Ekibi Zubizu Sunumu - Emirhan Şen
Keytorc Proje Ekibi Zubizu Sunumu - Emirhan Şen
 
Keynote Systems - Mobile Solutions Overview Presentation
Keynote Systems - Mobile Solutions Overview PresentationKeynote Systems - Mobile Solutions Overview Presentation
Keynote Systems - Mobile Solutions Overview Presentation
 
İyi Bir Test Uzmanı Olmak İçin...
İyi Bir Test Uzmanı Olmak İçin...İyi Bir Test Uzmanı Olmak İçin...
İyi Bir Test Uzmanı Olmak İçin...
 
ISTQB / ISEB Foundation Exam Practice - 2
ISTQB / ISEB Foundation Exam Practice - 2ISTQB / ISEB Foundation Exam Practice - 2
ISTQB / ISEB Foundation Exam Practice - 2
 
KeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption Theory
KeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption TheoryKeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption Theory
KeytorcTestTalks #11 - Berk Dülger, DevOps Tactical Adaption Theory
 
Keytorc Proje Ekibi Zubizu Sunumu - Ozan İlhan
Keytorc Proje Ekibi Zubizu Sunumu - Ozan İlhanKeytorc Proje Ekibi Zubizu Sunumu - Ozan İlhan
Keytorc Proje Ekibi Zubizu Sunumu - Ozan İlhan
 
Introduction to ISTQB & ISEB Certifications
Introduction to ISTQB & ISEB CertificationsIntroduction to ISTQB & ISEB Certifications
Introduction to ISTQB & ISEB Certifications
 
ISTQB Foundation Level: Why, Why Not and How?
ISTQB Foundation Level: Why, Why Not and How?ISTQB Foundation Level: Why, Why Not and How?
ISTQB Foundation Level: Why, Why Not and How?
 
Keytorc Proje Ekibi Zubizu Sunumu - Miray Doğan
Keytorc Proje Ekibi Zubizu Sunumu - Miray DoğanKeytorc Proje Ekibi Zubizu Sunumu - Miray Doğan
Keytorc Proje Ekibi Zubizu Sunumu - Miray Doğan
 
Test Automation - Keytorc Approach
Test Automation - Keytorc Approach Test Automation - Keytorc Approach
Test Automation - Keytorc Approach
 
ISTQB / ISEB Foundation Exam Practice - 6
ISTQB / ISEB Foundation Exam Practice - 6ISTQB / ISEB Foundation Exam Practice - 6
ISTQB / ISEB Foundation Exam Practice - 6
 

Similar a Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black

Discussion 1 The incorrect implementation of databases ou
Discussion 1 The incorrect implementation of databases ouDiscussion 1 The incorrect implementation of databases ou
Discussion 1 The incorrect implementation of databases ou
huttenangela
 
BizDataX White paper Test Data Management
BizDataX White paper Test Data ManagementBizDataX White paper Test Data Management
BizDataX White paper Test Data Management
Dragan Kinkela
 
Testing web applications
Testing web applicationsTesting web applications
Testing web applications
msksaba
 
Magical Performance tuning with Gomez
Magical Performance tuning with GomezMagical Performance tuning with Gomez
Magical Performance tuning with Gomez
mcsaha
 
Are You Ready For More Visitors Cognizant Gomez Jan20
Are You Ready For More Visitors   Cognizant  Gomez Jan20Are You Ready For More Visitors   Cognizant  Gomez Jan20
Are You Ready For More Visitors Cognizant Gomez Jan20
Compuware APM
 
Testing Data & Data-Centric Applications - Whitepaper
Testing Data & Data-Centric Applications - WhitepaperTesting Data & Data-Centric Applications - Whitepaper
Testing Data & Data-Centric Applications - Whitepaper
Ryan Dowd
 
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docxFai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
ssuser454af01
 
DataOps: Nine steps to transform your data science impact Strata London May 18
DataOps: Nine steps to transform your data science impact  Strata London May 18DataOps: Nine steps to transform your data science impact  Strata London May 18
DataOps: Nine steps to transform your data science impact Strata London May 18
Harvinder Atwal
 

Similar a Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black (20)

Critical Preflight Checks for Your EPM Applications
Critical Preflight Checks for Your EPM ApplicationsCritical Preflight Checks for Your EPM Applications
Critical Preflight Checks for Your EPM Applications
 
dq_fail.pdf
dq_fail.pdfdq_fail.pdf
dq_fail.pdf
 
Discussion 1 The incorrect implementation of databases ou
Discussion 1 The incorrect implementation of databases ouDiscussion 1 The incorrect implementation of databases ou
Discussion 1 The incorrect implementation of databases ou
 
Virtual Sandbox for Data Scientists at Enterprise Scale
Virtual Sandbox for Data Scientists at Enterprise ScaleVirtual Sandbox for Data Scientists at Enterprise Scale
Virtual Sandbox for Data Scientists at Enterprise Scale
 
BizDataX White paper Test Data Management
BizDataX White paper Test Data ManagementBizDataX White paper Test Data Management
BizDataX White paper Test Data Management
 
Testing web applications
Testing web applicationsTesting web applications
Testing web applications
 
Magical Performance tuning with Gomez
Magical Performance tuning with GomezMagical Performance tuning with Gomez
Magical Performance tuning with Gomez
 
Are You Ready For More Visitors Cognizant Gomez Jan20
Are You Ready For More Visitors   Cognizant  Gomez Jan20Are You Ready For More Visitors   Cognizant  Gomez Jan20
Are You Ready For More Visitors Cognizant Gomez Jan20
 
Ibm Optim Techical Overview 01282009
Ibm Optim Techical Overview 01282009Ibm Optim Techical Overview 01282009
Ibm Optim Techical Overview 01282009
 
engage 2015 - - 2015 - Infrastructure Assessment - Analyze, Visualize and Op...
engage 2015 -  - 2015 - Infrastructure Assessment - Analyze, Visualize and Op...engage 2015 -  - 2015 - Infrastructure Assessment - Analyze, Visualize and Op...
engage 2015 - - 2015 - Infrastructure Assessment - Analyze, Visualize and Op...
 
Testing Data & Data-Centric Applications - Whitepaper
Testing Data & Data-Centric Applications - WhitepaperTesting Data & Data-Centric Applications - Whitepaper
Testing Data & Data-Centric Applications - Whitepaper
 
White Paper, System Z Dataset Naming Standards
White Paper, System Z Dataset Naming StandardsWhite Paper, System Z Dataset Naming Standards
White Paper, System Z Dataset Naming Standards
 
How to generate Synthetic Data for an effective App Testing strategy.pdf
How to generate Synthetic Data for an effective App Testing strategy.pdfHow to generate Synthetic Data for an effective App Testing strategy.pdf
How to generate Synthetic Data for an effective App Testing strategy.pdf
 
Oracle Coherence: in-memory datagrid
Oracle Coherence: in-memory datagridOracle Coherence: in-memory datagrid
Oracle Coherence: in-memory datagrid
 
Case Study: Nationwide Building Society's CA Test Data Manager Success Story
Case Study: Nationwide Building Society's CA Test Data Manager Success StoryCase Study: Nationwide Building Society's CA Test Data Manager Success Story
Case Study: Nationwide Building Society's CA Test Data Manager Success Story
 
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docxFai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
 
DataOps: Nine steps to transform your data science impact Strata London May 18
DataOps: Nine steps to transform your data science impact  Strata London May 18DataOps: Nine steps to transform your data science impact  Strata London May 18
DataOps: Nine steps to transform your data science impact Strata London May 18
 
Why Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by DenodoWhy Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by Denodo
 
Finance Technologies: Buy or Rent
Finance Technologies: Buy or RentFinance Technologies: Buy or Rent
Finance Technologies: Buy or Rent
 
Cloud Computing Realities - Getting past the hype and setting your cloud stra...
Cloud Computing Realities - Getting past the hype and setting your cloud stra...Cloud Computing Realities - Getting past the hype and setting your cloud stra...
Cloud Computing Realities - Getting past the hype and setting your cloud stra...
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black

  • 1. Enterprise Challenges of Test Data Size, Change, Complexity, Disparity, and Privacy
  • 2. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 2 Enterprise Challenges of Test Data For simple applications, representative test data can be relatively easy What if you are testing enterprise-scale applications? In enterprise data centers, dozens or hundreds of applications co-exist These applications are of various sizes, complexities, and criticalities They operate on various data repositories, in some cases shared data repositories In some cases, disparate data repositories hold related data Let’s examine the test data options, and the challenges that arise, at the enterprise scale
  • 3. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 3 Options for Test Data Hand generated: Tester creates all test data by hand Tool generated: Generic (e.g., Excel) or specific tools, open- source or commercial, used to generate test data according to tester-specified pattern Production: all or a subset of production data copied to test environment Anonymized production: Tool(s) used to perform irreversible encryption, substitution, or scrambling of values Pseudo-anonymized production: Reversible anonymization In this presentation, I’ll address factors to consider when choosing these options At the enterprise level, you’ll probably need one or more tools More than one option might be useful when creating test data Whatever options are chosen, be really careful about: Testing on actual production data in the production environment Copying test data into the production environment
  • 4. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 4 Test Data Is Simple When Life Is Like… So, if you are testing a simple mobile app, you might be thinking, “Test data, what’s the big deal?” If the production data is small enough, you can generate data to populate the database Or maybe there’s no personal information in the production data, so you just use that In some cases, your testing situation might simple, like the one shown here, but eventually it could evolve as your application’s abilities grow
  • 5. But the Enterprise Be Like… Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 5 Actually, it’s usually more complicated than this…
  • 6. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 6 Production Data, the Testing Gold Standard For most large-scale systems, testers cannot create sufficient and sufficiently diverse test data by hand Data generation tools can create large datasets, but the data might lack diversity, similar value distributions, or other attributes of production data Production data is ideal, especially when large data sets have accumulated over years with various current and past application versions Production data looks exactly like the data the application will encounter on release, since it is But using production data for testing poses many challenges
  • 7. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 7 Privacy Concerns: Fly Meets Ointment Production data often contains personal data Trying to secure test data handling during testing can impose undesirable inefficiencies and constraints Such inefficiencies and constraints apply especially when using distributed and outsourced testing Enter the anonymization tool However, such tools are not magic Misuse can result in trivially easy reversal For example, letter substitution could lead to names like “Kpio Cspxo,” obviously “John Brown” In addition, Kpio Cspxo (and other nonsense) is not realistic For testing you must preserve data meaning and meaningfulness
  • 8. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 8 Meaning and Meaningfulness Substitutions, good and bad Good: “John Brown” to “Lester Camden” Bad: “John Brown” to “Charlotte Dostoyevsky” (esp. if we have gender field) Queries, views, and joins of anonymized data must yield same results as on production data For example, if a production query on “John Brown” returned 20 records, a test data query for “Lester Camden” must also return 20 records Problems here will affect functionality, performance, reliability, and load testing In addition, localization can be an issue with different languages and cultures, especially if multiple character sets can apply
  • 9. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 9 Oh, Those Long-Distance Relationships Data and relationships between data can span databases For example: Three applications have years of data describing the same population, spread across three different databases The applications interoperate and share data Data-warehousing and analytical applications access the related data across databases Logical records are created with de facto joins across via de facto foreign keys Anonymization can break these integrity constraints Meaningful end-to-end testing of functionality, performance, throughput, reliability, localization, and security becomes impossible For many of our clients, the usefulness of the test data for interoperability testing poses the hardest challenge This is true whether hand generated, tool generated, production, pseudo-anonymized, or anonymized
  • 10. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 10 Generic Test Data Requirements Keep existing data quality levels Most production data contains a large number (~25%) of errors Error rates must be the same for test data, but, if privacy is an issue, must not enable reverse engineering of the data Ensure maintainability Be able to edit, add, update, and delete data This includes individual fields and records, whole database, and across logical records spanning databases Maximize efficiency of the test data acquisition and update processes If production data is to be copied and/or anonymized, ensure those operations occur on quiescent data
  • 11. Test Database Server Costs Yet another challenge to production-scale test data is cost The test data must reside on test database servers, which can involve: DBMS licensing costs Server costs Disk space costs Ongoing hardware and software maintenance A regular backup process (with attendant costs for backup media as well as labor) Using open-source DBMS only addresses the first of these costs Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 11
  • 12. Storage Virtualization This allows abstraction of physical storage space It can reduce some of the cost issues associated with test data storage It’s probably under-utilized by testing groups, as I hear very little about it online, from clients, or at conferences It can help in terms of allocating/de-allocating production-like storage for testing The issue of anonymization of the production data still exists, though Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 12
  • 13. Coping with Data Size Enterprise data have gotten huge, due to increases in disk capacity Unfortunately, disk throughput has not kept up This creates serious issues with copying, anonymizing, and managing test data that comes near to or achieves production size Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 13 Figure from tylermuth.wordpress.com
  • 14. Specialized Application Support Many enterprises use tools such as SAP to help manage their businesses There are test data anonymization tools for the more common such enterprise applications If you are using a more obscure application, no such tool may be available Similar issues exist with NoSQL and other data formats Further, the application vendor might not be willing to share metadata information This should be a consideration during enterprise application selection However, testers are rarely included in such decisions, and once made, such decisions often don’t change for years Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 14
  • 15. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 15 Success with Test Data Tools Success with test data tools is like any other test tool Assemble a team Elicit requirements from users and stakeholders (including those who understand privacy/security issues) Identify risks and constraints Determine tool options (open-source and commercial) Evaluate tools to identify short-list options Select the tool Pilot the tool Deploy the tool Short-cutting this process or imposing unrealistic budget targets can cause significant inefficiencies and possibly even inability to fulfill basic needs
  • 16. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 16 Conclusions Testing enterprise applications poses challenges beyond what small-scale applications pose This is especially true for test data Challenges include sources of data, privacy, availability of tools, size, cost, and testing usefulness Like any complex situation, success requires understanding the challenges and planning to meet them This presentation should help you identify your specific challenges and build an appropriate plan
  • 17. Enterprise Challenges of Test Data www.rbcs-us.com Copyright (c) 2016 RBCS Page 17 For over twenty years, RBCS has delivered software and hardware testing services, including consulting, outsourcing, and training. Employing the industry’s most experienced and recognized consultants, RBCS conducts product testing, builds and improves testing groups, and hires testing staff for hundreds of clients worldwide. Ranging from Fortune 20 companies to start-ups, RBCS clients save time and money through improved product development, decreased tech support calls, improved corporate reputation and more. To learn more about RBCS, visit www.rbcs-us.com. Address: RBCS, Inc. 31520 Beck Road Bulverde, TX 78163-3911 USA Phone: +1 (830) 438-4830 E-mail: info@rbcs-us.com Web: www.rbcs-us.com Facebook: RBCS, Inc Twitter: @RBCS, @LaikaTestDog …Contact RBCS