SlideShare una empresa de Scribd logo
1 de 15
© Hortonworks Inc. 2014
Designing and Testing
Accumulo Iterators
Josh Elser
Member of Technical Staff
PMC, Apache Accumulo
10, November 2015
Page 1
Apache, Accumulo, and Apache Accumulo are trademarks of the Apache Software Foundation.
© Hortonworks Inc. 2014
Design
Page 2
How do I know if my Iterator works?
What can I do in an Iterator?
How are these methods even called?!
© Hortonworks Inc. 2014
Common Patterns
Only a certain subset of algorithms fit into Accumulo
Iterators well.
(Avoid shoving a square peg into a round hole.)
• Filtering
• Reduction
• Bounded aggregations
–Keep an upper bound on the number of elements being
aggregated to avoid memory issues
• Transformations
–Key sort-order must be retained
–Best limited to the Value only
Page 3
© Hortonworks Inc. 2014
Design
Josh’s Iterator Design Principles:
•Always make forward-progress
•Think functional – Avoid unnecessary state
•Operate only on the data you have
•Do one thing and do it efficiently
Page 4
© Hortonworks Inc. 2014
Design
Page 5
Make
Forward
Progress
Start
End
© Hortonworks Inc. 2014
Think about your Iterator like a function
Unnecessary State
Page 6
def sum(list):
sum = 0
for entry in list:
sum += entry
return sum
• Avoid holding onto state when
at all possible.
• Think in terms of a stream
rather than chunks of data.
• Beware of memory implications
when performing aggregations.
© Hortonworks Inc. 2014
Operate locally
Daily Reminder: Iterators have no calls
for implementing a safe cleanup.
• Iterators cannot properly handle I/O-related issues to
external systems.
• Slow-external calls result in slow Accumulo.
• Some problems are more-safely implemented outside of
an Accumulo Iterators. Not a Coprocessor/Container.
Page 7
© Hortonworks Inc. 2014
Simplicity
Avoid doing multiple things in a
single Iterator.
•Object Oriented Design 101
•Iterators can be tricky to debug on their own
•Configuring multiple iterators are a feature
Page 8
© Hortonworks Inc. 2014
Testing
You should always test your code
before running it in any environment
to ensure that it functions as intended.
Page 9
© Hortonworks Inc. 2014
Testing
HOW?
Page 10
© Hortonworks Inc. 2014
Testing
A framework designed for testing Iterators given input,
a Range, options, and expected output.
Page 11
© Hortonworks Inc. 2014
Testing
Page 12
Test
Test
Test
Test
Test
Iterator Class
Range
Iterator Options
Sorted Input Data
Verification of
output records
OR
True/False check
User-Provided
Framework
© Hortonworks Inc. 2014
Features
•Auto-Discovery of test cases
•JUnit Parameterized test integration
•Provided Generic Tests
–Default Constructor
–Re-Seek (teardown)
–Deep Copy Verification
Page 13
© Hortonworks Inc. 2014
Future Work
•More Iterator tests!
•A final resting place for the code
•Documentation
•Usability testing
Page 14
https://issues.apache.org/jira/browse/ACCUMULO-626
© Hortonworks Inc. 2014
Thanks!
jelser@hortonworks.com
Page 15

Más contenido relacionado

La actualidad más candente

Impala presentation
Impala presentationImpala presentation
Impala presentationtrihug
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala InternalsDavid Groozman
 
LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BIDataWorks Summit
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Cloudera, Inc.
 
(Aaron myers) hdfs impala
(Aaron myers)   hdfs impala(Aaron myers)   hdfs impala
(Aaron myers) hdfs impalaNAVER D2
 
Cloudera Impala, updated for v1.0
Cloudera Impala, updated for v1.0Cloudera Impala, updated for v1.0
Cloudera Impala, updated for v1.0Scott Leberknight
 
Cloudera Impala technical deep dive
Cloudera Impala technical deep diveCloudera Impala technical deep dive
Cloudera Impala technical deep divehuguk
 
Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014alanfgates
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impalamarkgrover
 
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache HadoopCloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache HadoopCloudera, Inc.
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperJeff Smith
 
Sub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scaleSub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scaleYifeng Jiang
 
Hive 3 a new horizon
Hive 3  a new horizonHive 3  a new horizon
Hive 3 a new horizonArtem Ervits
 
Impala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopImpala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopCloudera, Inc.
 
Impala Architecture presentation
Impala Architecture presentationImpala Architecture presentation
Impala Architecture presentationhadooparchbook
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesOReillyStrata
 
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...Cloudera, Inc.
 
SAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasySAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasyDataWorks Summit
 

La actualidad más candente (19)

Impala presentation
Impala presentationImpala presentation
Impala presentation
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
 
LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BI
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013
 
(Aaron myers) hdfs impala
(Aaron myers)   hdfs impala(Aaron myers)   hdfs impala
(Aaron myers) hdfs impala
 
Cloudera Impala, updated for v1.0
Cloudera Impala, updated for v1.0Cloudera Impala, updated for v1.0
Cloudera Impala, updated for v1.0
 
Cloudera Impala technical deep dive
Cloudera Impala technical deep diveCloudera Impala technical deep dive
Cloudera Impala technical deep dive
 
Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impala
 
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache HadoopCloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL Developer
 
Sub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scaleSub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scale
 
Hive 3 a new horizon
Hive 3  a new horizonHive 3  a new horizon
Hive 3 a new horizon
 
Impala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopImpala: Real-time Queries in Hadoop
Impala: Real-time Queries in Hadoop
 
Cloudera impala
Cloudera impalaCloudera impala
Cloudera impala
 
Impala Architecture presentation
Impala Architecture presentationImpala Architecture presentation
Impala Architecture presentation
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
 
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
 
SAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasySAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made Easy
 

Similar a Designing and Testing Accumulo Iterators

The View - Lotusscript coding best practices
The View - Lotusscript coding best practicesThe View - Lotusscript coding best practices
The View - Lotusscript coding best practicesBill Buchan
 
Integration strategies best practices- Mulesoft meetup April 2018
Integration strategies   best practices- Mulesoft meetup April 2018Integration strategies   best practices- Mulesoft meetup April 2018
Integration strategies best practices- Mulesoft meetup April 2018Rohan Rasane
 
Oracle Database Lifecycle Management
Oracle Database Lifecycle ManagementOracle Database Lifecycle Management
Oracle Database Lifecycle ManagementHari Srinivasan
 
Coherence 12.1.3 hidden gems
Coherence 12.1.3 hidden gemsCoherence 12.1.3 hidden gems
Coherence 12.1.3 hidden gemsharvraja
 
Oracle Open World Exadata Monitoring and Management with EM12c
Oracle Open World Exadata Monitoring and Management with EM12cOracle Open World Exadata Monitoring and Management with EM12c
Oracle Open World Exadata Monitoring and Management with EM12cKellyn Pot'Vin-Gorman
 
Using MySQL Enterprise Monitor for Continuous Performance Improvement
Using MySQL Enterprise Monitor for Continuous Performance ImprovementUsing MySQL Enterprise Monitor for Continuous Performance Improvement
Using MySQL Enterprise Monitor for Continuous Performance ImprovementMark Matthews
 
Agile Software Testing the Agilogy Way
Agile Software Testing the Agilogy WayAgile Software Testing the Agilogy Way
Agile Software Testing the Agilogy WayJordi Pradel
 
Hadoop engineering bo_f_final
Hadoop engineering bo_f_finalHadoop engineering bo_f_final
Hadoop engineering bo_f_finalRamya Sunil
 
Improving the Design of Existing Software
Improving the Design of Existing SoftwareImproving the Design of Existing Software
Improving the Design of Existing SoftwareSteven Smith
 
Practical Software Testing Tools
Practical Software Testing ToolsPractical Software Testing Tools
Practical Software Testing ToolsDr Ganesh Iyer
 
Load testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew SiemerLoad testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew SiemerAndrew Siemer
 
QA standup - workload analysis
QA standup  - workload analysisQA standup  - workload analysis
QA standup - workload analysisSerghei Radov
 
EARS: The Easy Approach to Requirements Syntax
EARS: The Easy Approach to Requirements SyntaxEARS: The Easy Approach to Requirements Syntax
EARS: The Easy Approach to Requirements SyntaxTechWell
 
Driving application development through behavior driven development
Driving application development through behavior driven developmentDriving application development through behavior driven development
Driving application development through behavior driven developmentEinar Ingebrigtsen
 
So You Think You Can Write a Test Case - XBOSoft Webinar
So You Think You Can Write a Test Case - XBOSoft WebinarSo You Think You Can Write a Test Case - XBOSoft Webinar
So You Think You Can Write a Test Case - XBOSoft WebinarXBOSoft
 
Putting Compilers to Work
Putting Compilers to WorkPutting Compilers to Work
Putting Compilers to WorkSingleStore
 
Leandro Melendez - Switching Performance Left & Right
Leandro Melendez - Switching Performance Left & RightLeandro Melendez - Switching Performance Left & Right
Leandro Melendez - Switching Performance Left & RightNeotys_Partner
 
Improving The Quality of Existing Software
Improving The Quality of Existing SoftwareImproving The Quality of Existing Software
Improving The Quality of Existing SoftwareSteven Smith
 

Similar a Designing and Testing Accumulo Iterators (20)

The View - Lotusscript coding best practices
The View - Lotusscript coding best practicesThe View - Lotusscript coding best practices
The View - Lotusscript coding best practices
 
Integration strategies best practices- Mulesoft meetup April 2018
Integration strategies   best practices- Mulesoft meetup April 2018Integration strategies   best practices- Mulesoft meetup April 2018
Integration strategies best practices- Mulesoft meetup April 2018
 
Oracle Database Lifecycle Management
Oracle Database Lifecycle ManagementOracle Database Lifecycle Management
Oracle Database Lifecycle Management
 
Coherence 12.1.3 hidden gems
Coherence 12.1.3 hidden gemsCoherence 12.1.3 hidden gems
Coherence 12.1.3 hidden gems
 
Oracle Open World Exadata Monitoring and Management with EM12c
Oracle Open World Exadata Monitoring and Management with EM12cOracle Open World Exadata Monitoring and Management with EM12c
Oracle Open World Exadata Monitoring and Management with EM12c
 
Using MySQL Enterprise Monitor for Continuous Performance Improvement
Using MySQL Enterprise Monitor for Continuous Performance ImprovementUsing MySQL Enterprise Monitor for Continuous Performance Improvement
Using MySQL Enterprise Monitor for Continuous Performance Improvement
 
Developer day v2
Developer day v2Developer day v2
Developer day v2
 
Agile Software Testing the Agilogy Way
Agile Software Testing the Agilogy WayAgile Software Testing the Agilogy Way
Agile Software Testing the Agilogy Way
 
Hadoop engineering bo_f_final
Hadoop engineering bo_f_finalHadoop engineering bo_f_final
Hadoop engineering bo_f_final
 
WMS Overview
WMS OverviewWMS Overview
WMS Overview
 
Improving the Design of Existing Software
Improving the Design of Existing SoftwareImproving the Design of Existing Software
Improving the Design of Existing Software
 
Practical Software Testing Tools
Practical Software Testing ToolsPractical Software Testing Tools
Practical Software Testing Tools
 
Load testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew SiemerLoad testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew Siemer
 
QA standup - workload analysis
QA standup  - workload analysisQA standup  - workload analysis
QA standup - workload analysis
 
EARS: The Easy Approach to Requirements Syntax
EARS: The Easy Approach to Requirements SyntaxEARS: The Easy Approach to Requirements Syntax
EARS: The Easy Approach to Requirements Syntax
 
Driving application development through behavior driven development
Driving application development through behavior driven developmentDriving application development through behavior driven development
Driving application development through behavior driven development
 
So You Think You Can Write a Test Case - XBOSoft Webinar
So You Think You Can Write a Test Case - XBOSoft WebinarSo You Think You Can Write a Test Case - XBOSoft Webinar
So You Think You Can Write a Test Case - XBOSoft Webinar
 
Putting Compilers to Work
Putting Compilers to WorkPutting Compilers to Work
Putting Compilers to Work
 
Leandro Melendez - Switching Performance Left & Right
Leandro Melendez - Switching Performance Left & RightLeandro Melendez - Switching Performance Left & Right
Leandro Melendez - Switching Performance Left & Right
 
Improving The Quality of Existing Software
Improving The Quality of Existing SoftwareImproving The Quality of Existing Software
Improving The Quality of Existing Software
 

Más de Josh Elser

Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBaseJosh Elser
 
Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query ServerJosh Elser
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandJosh Elser
 
Effective Testing of Apache Accumulo Iterators
Effective Testing of Apache Accumulo IteratorsEffective Testing of Apache Accumulo Iterators
Effective Testing of Apache Accumulo IteratorsJosh Elser
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseJosh Elser
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Josh Elser
 
Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Josh Elser
 
De-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerDe-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerJosh Elser
 
Data-Center Replication with Apache Accumulo
Data-Center Replication with Apache AccumuloData-Center Replication with Apache Accumulo
Data-Center Replication with Apache AccumuloJosh Elser
 
RPInventory 2-25-2010
RPInventory 2-25-2010RPInventory 2-25-2010
RPInventory 2-25-2010Josh Elser
 

Más de Josh Elser (10)

Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBase
 
Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query Server
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
 
Effective Testing of Apache Accumulo Iterators
Effective Testing of Apache Accumulo IteratorsEffective Testing of Apache Accumulo Iterators
Effective Testing of Apache Accumulo Iterators
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016
 
Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Calcite meetup-2016-04-20
Calcite meetup-2016-04-20
 
De-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerDe-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServer
 
Data-Center Replication with Apache Accumulo
Data-Center Replication with Apache AccumuloData-Center Replication with Apache Accumulo
Data-Center Replication with Apache Accumulo
 
RPInventory 2-25-2010
RPInventory 2-25-2010RPInventory 2-25-2010
RPInventory 2-25-2010
 

Último

Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 

Último (20)

2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 

Designing and Testing Accumulo Iterators

  • 1. © Hortonworks Inc. 2014 Designing and Testing Accumulo Iterators Josh Elser Member of Technical Staff PMC, Apache Accumulo 10, November 2015 Page 1 Apache, Accumulo, and Apache Accumulo are trademarks of the Apache Software Foundation.
  • 2. © Hortonworks Inc. 2014 Design Page 2 How do I know if my Iterator works? What can I do in an Iterator? How are these methods even called?!
  • 3. © Hortonworks Inc. 2014 Common Patterns Only a certain subset of algorithms fit into Accumulo Iterators well. (Avoid shoving a square peg into a round hole.) • Filtering • Reduction • Bounded aggregations –Keep an upper bound on the number of elements being aggregated to avoid memory issues • Transformations –Key sort-order must be retained –Best limited to the Value only Page 3
  • 4. © Hortonworks Inc. 2014 Design Josh’s Iterator Design Principles: •Always make forward-progress •Think functional – Avoid unnecessary state •Operate only on the data you have •Do one thing and do it efficiently Page 4
  • 5. © Hortonworks Inc. 2014 Design Page 5 Make Forward Progress Start End
  • 6. © Hortonworks Inc. 2014 Think about your Iterator like a function Unnecessary State Page 6 def sum(list): sum = 0 for entry in list: sum += entry return sum • Avoid holding onto state when at all possible. • Think in terms of a stream rather than chunks of data. • Beware of memory implications when performing aggregations.
  • 7. © Hortonworks Inc. 2014 Operate locally Daily Reminder: Iterators have no calls for implementing a safe cleanup. • Iterators cannot properly handle I/O-related issues to external systems. • Slow-external calls result in slow Accumulo. • Some problems are more-safely implemented outside of an Accumulo Iterators. Not a Coprocessor/Container. Page 7
  • 8. © Hortonworks Inc. 2014 Simplicity Avoid doing multiple things in a single Iterator. •Object Oriented Design 101 •Iterators can be tricky to debug on their own •Configuring multiple iterators are a feature Page 8
  • 9. © Hortonworks Inc. 2014 Testing You should always test your code before running it in any environment to ensure that it functions as intended. Page 9
  • 10. © Hortonworks Inc. 2014 Testing HOW? Page 10
  • 11. © Hortonworks Inc. 2014 Testing A framework designed for testing Iterators given input, a Range, options, and expected output. Page 11
  • 12. © Hortonworks Inc. 2014 Testing Page 12 Test Test Test Test Test Iterator Class Range Iterator Options Sorted Input Data Verification of output records OR True/False check User-Provided Framework
  • 13. © Hortonworks Inc. 2014 Features •Auto-Discovery of test cases •JUnit Parameterized test integration •Provided Generic Tests –Default Constructor –Re-Seek (teardown) –Deep Copy Verification Page 13
  • 14. © Hortonworks Inc. 2014 Future Work •More Iterator tests! •A final resting place for the code •Documentation •Usability testing Page 14 https://issues.apache.org/jira/browse/ACCUMULO-626
  • 15. © Hortonworks Inc. 2014 Thanks! jelser@hortonworks.com Page 15