6. April 2018
ABench: Big Data Architecture Stack Benchmark
[Vision Paper]
Todor Ivanov todor@dbis.cs.uni-frankfurt.de
Goethe University Frankfurt am Main, Germany
http://www.bigdata.uni-frankfurt.de/
Rekha Singhal rekha.singhal@tcs.com
TCS Research – Mumbai, India
http://www.tcs.com
Motivation
• Growing number of new Big Data technologies and connectors in Big Data stacks
  → Challenges for solution architects, data engineers, data scientists, developers, etc.
• Missing benchmarks for each technology, connector, or combination of them
• Consequence → Increasing complexity in Big Data architecture stacks
• Our approach → ABench: Big Data Architecture Stack Benchmark
ICPE 2018, Berlin, Germany, April 9-13
ABench Features
• Benchmark framework
  → Data generators or plugins for custom data generators
  → Include data generators or public data sets to simulate workloads that stress the architecture
• Reuse of existing benchmarks
  → Case study using BigBench (in the next slides: Streaming and Machine Learning)
• Open source implementation and extensible design
• Easy to set up and extend
• Supports and combines all four types of benchmarks in ABench
Benchmark Types (adapted from Andersen and Pettersen [1])
1. Generic Benchmarking: checks whether an implementation fulfills given business requirements and specifications (Is the defined business specification implemented accurately?).
2. Competitive Benchmarking: a performance comparison between the best tools on the platform layer that offer similar functionality (e.g., throughput of MapReduce vs. Spark vs. Flink).
3. Functional Benchmarking: a functional comparison of the features of a tool against technologies from the same area (e.g., Spark Streaming vs. Spark Structured Streaming vs. Flink Streaming).
4. Internal Benchmarking: a comparison of different implementations of the same functionality (e.g., Spark Scala vs. Java vs. R vs. PySpark).
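As a rough illustration of internal benchmarking, the sketch below times two implementations of the same functionality and checks that they agree. It is a minimal sketch in plain Python, not part of ABench; the functions `sum_squares_loop`, `sum_squares_builtin`, and `benchmark` are hypothetical names chosen for this example.

```python
import time
from statistics import median


def sum_squares_loop(values):
    """Implementation A (hypothetical): explicit loop."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_builtin(values):
    """Implementation B (hypothetical): generator expression with built-in sum."""
    return sum(v * v for v in values)


def benchmark(func, data, repeats=5):
    """Run func(data) `repeats` times and return (result, median seconds)."""
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(data)
        timings.append(time.perf_counter() - start)
    return result, median(timings)


data = list(range(10_000))
results = {
    func.__name__: benchmark(func, data)
    for func in (sum_squares_loop, sum_squares_builtin)
}

for name, (value, seconds) in results.items():
    print(f"{name}: result={value}, median={seconds:.6f}s")
```

Comparing the results before comparing the timings mirrors the idea that a performance comparison is only meaningful when both implementations compute the same thing.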
ABench Framework
[Figure: ABench framework diagram. Components: Data Model, Data Generation, Data Storage, Workload Generator, Benchmark Control Knobs, Performance Data Collection, Benchmark Validation, Benchmark Metrics, System to Benchmark]
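One way to picture how the framework components could fit together is the small harness below. It is only a sketch under assumptions: the names `ControlKnobs`, `generate_data`, `workload_clicks_per_item`, and `run_benchmark` are hypothetical, and the workload is a stand-in for a real system under test rather than anything ABench defines.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class ControlKnobs:
    """Benchmark control knobs (hypothetical parameters)."""
    num_records: int = 1000
    seed: int = 42


def generate_data(knobs):
    """Data generation: synthetic (user_id, item_id) click records."""
    rng = random.Random(knobs.seed)
    return [(rng.randrange(50), rng.randrange(20)) for _ in range(knobs.num_records)]


def workload_clicks_per_item(records):
    """Workload: count clicks per item; stands in for the system to benchmark."""
    counts = {}
    for _, item in records:
        counts[item] = counts.get(item, 0) + 1
    return counts


def run_benchmark(knobs):
    """Generate data, run the workload, collect metrics, validate the output."""
    records = generate_data(knobs)
    start = time.perf_counter()
    output = workload_clicks_per_item(records)
    elapsed = time.perf_counter() - start
    return {
        # Performance data collection and metrics
        "records": len(records),
        "seconds": elapsed,
        "records_per_second": len(records) / elapsed if elapsed > 0 else float("inf"),
        # Validation: every generated click must be counted exactly once
        "valid": sum(output.values()) == knobs.num_records,
    }


report = run_benchmark(ControlKnobs(num_records=5000))
print(report)
```

Fixing the seed in the knobs keeps the generated data reproducible across runs, which makes results from different systems easier to compare.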
Stream Processing Benchmark – Use Case
• Adding stream processing to BigBench [2,3]
• Reuse of the web click logs in JSON format from BigBench V2 [2]
• Adding new streaming workloads
  → Possibility to execute the queries on a subset of the incoming stream of data
• Provide benchmark implementations based on Spark Streaming and Kafka
• Work in progress: Exploratory Analysis of Spark Structured Streaming, @PABS 2018, Todor Ivanov and Jason Taaffe
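The idea of executing a query on a subset of the incoming stream can be sketched as a sliding window over JSON click events. The example below is plain Python rather than Spark Streaming or Kafka, and the event field `page`, the function `windowed_top_pages`, and the sample events are assumptions made for illustration, not the BigBench V2 schema.

```python
import json
from collections import Counter, deque


def windowed_top_pages(json_lines, window_size, top_n=2):
    """After each event, yield the most-clicked pages among the last
    `window_size` events (a count-based sliding window)."""
    window = deque(maxlen=window_size)  # old events drop out automatically
    for line in json_lines:
        window.append(json.loads(line))
        counts = Counter(event["page"] for event in window)
        yield counts.most_common(top_n)


# Hypothetical click events, one JSON object per line
stream = [
    '{"user": 1, "page": "/home"}',
    '{"user": 2, "page": "/cart"}',
    '{"user": 1, "page": "/cart"}',
    '{"user": 3, "page": "/home"}',
    '{"user": 2, "page": "/cart"}',
]

snapshots = list(windowed_top_pages(stream, window_size=3))
for snapshot in snapshots:
    print(snapshot)
```

A streaming engine would typically define such windows over event time rather than event count; the count-based window here just keeps the sketch short.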
Machine Learning Benchmark – Use Case
• Expanding the types of Machine Learning workloads in BigBench [2]
  → Five (Q5, Q20, Q25, Q26 and Q28) of the 30 queries cover common ML algorithms
• Proposal by Sweta Singh (IBM) [4] for a new workload with Collaborative Filtering, using the Matrix Factorization implementation in Spark MLlib via Alternating Least Squares (ALS)
• Other types of advanced analytics inspired by Gartner [5]:
  → descriptive analytics
  → diagnostic analytics
  → predictive analytics
  → prescriptive analytics
• Introduce new ML metrics for scalability and accuracy
Next Steps
• Build an express version of the benchmark framework
• Provide an open source implementation of the use case benchmarks to stress test existing Big Data architecture stacks
• Enable the comparison of the most popular technologies (e.g., Kafka, Spark)
Thank you for your attention!
This research has been supported by the Research Group of the Standard Performance
Evaluation Corporation (SPEC).
REFERENCES
[1] Bjørn Andersen and P-G Pettersen. 1995. Benchmarking handbook. Chapman & Hall.
[2] Ahmad Ghazal, Todor Ivanov, Pekka Kostamaa, Alain Crolotte, Ryan Voong, Mohammed Al-Kateb, Waleed Ghazal, and Roberto V. Zicari. 2017. BigBench V2: The New and Improved BigBench. In ICDE 2017, San Diego, CA, USA, April 19-22.
[3] Ahmad Ghazal, Tilmann Rabl, Minqing Hu, Francois Raab, Meikel Poess, Alain Crolotte, and Hans-Arno Jacobsen. 2013. BigBench: Towards An Industry Standard Benchmark for Big Data Analytics. In SIGMOD 2013. 1197–1208.
[4] Sweta Singh. 2016. Benchmarking Spark Machine Learning Using BigBench. In 8th TPC Technology Conference, TPCTC 2016, New Delhi, India, September 5-9, 2016.
[5] Gartner. 2017. https://www.gartner.com/doc/3471553/-planning-guide-data-analytics
