SlideShare una empresa de Scribd logo
1 de 13
Descargar para leer sin conexión
“Data! Data! Data!
I Can’t Make Bricks Without
Clay!”*
Shai Fine
Principal Engineer, Advanced Analytics, Intel
(*) Sherlock Holmes, The Adventure in the Copper Beeches
Big Data, Only a Few Years Back …
Executives Believe in Advanced Analytics
Analytics to the Rescue
• “Without big data analytics, companies are blind and deaf, wandering
out onto the web like deer on a freeway”
• Geoffrey Moore, Author of Crossing the Chasm
• … and who will lead the way?!
Big Data's High-Priests of Algorithms
The Wall Street Journal, Aug. 2014
Adoption of Analytics Faces Hurdles
• Developing Analytics solutions
• Far from being an engineering process
• There is a chasm to cross between “traditional” BI and Advanced Analytics
• Consumability of Analytics
• Deploying Analytics solutions is difficult
• Reliability, “Self Maintenance”
• Analytics Workloads are Challenging
• Speed (latency, time-to-solution), Throughput, Scalability, …
The ML Building Blocks Concept
There are “infinite” number of algorithms and datasets
But there are finite set of Building Blocks
Building Blocks:
A finite set of elements that can be mapped into HW and SW primitives and patterns
Building Blocks
Usages
High-level
Libraries
Low-level
Libraries
Hardware
Platforms
Xeon
Xeon Phi
Xeon FPGA
Iris Pro Graphics
Xeon Accel.
New ISA
Tier-1
Cloud
HPC
Enterprise
Academia
Machine Learning Building Blocks
• ML basic building blocks
1. Linear Algebra
2. Measures
3. Special Functions
4. Mathematical Optimization
5. Data Characteristics
6. Data-dependent Compute
7. Memory Access
8. Very large models
9. Hybrid Methods
• ML Meta building blocks
1. Learning Protocols
2. Learning Phases
3. Algorithmic Flow and Structure
Compute
Data
Compute - Data Interplay
Process
Towards a Comprehensive ML Workload Suite
• Workload design should cover elements of
• Compute
• Data Characteristics
• Data – Compute interplay
• Each workload includes
• Multiple data sets x Multiple algorithms
• Coverage of relevant data characteristics
• Coverage of compute patterns
The Building Block concept provides a mean for designing the ML Workload Suite
Machine Learning Workloads Suite
Workload Linear
Algebra
Measure
Calc.
Special
Funcs
Math
Optim.
Data
Characteristics
Data-dep.
Compute
Mem.
Access
large
model
Linear Algebra
Sparse
Dense
X X X
Un/Supervised,
Numeric
Data
Dependency
X X X
Un/Supervised,
Num/Cat
X X
Large Models X X X
Un/Supervised,
Numeric
X
Workload Dataset Type Characteristics
Linear Algebra
Clustered Dense, Numeric
Graphs Sparse, Numeric
Data
Dependency
Bio informatics High Dep - Dense/Sparse
Clustered Dense
Text High Dep – Sparse
Manufacturing High Dep – Numeric, Dense
Large Models Images Dense, Numeric
ALGORITHMS
DATASETS
Machine Learning Workloads Suite
Workload Linear
Algebra
Measure
Calc.
Special
Funcs
Math
Optim.
Data
Characteristics
Data-dep.
Compute
Mem.
Access
large
model
Linear Algebra
Sparse
Dense
X X X
Un/Supervised,
Numeric
Data
Dependency
X X X
Un/Supervised,
Num/Cat
X X
Large Models X X X
Un/Supervised,
Numeric
X
Workload Dataset Type Characteristics
Linear Algebra
Clustered Dense, Numeric
Graphs Sparse, Numeric
Data
Dependency
Bio informatics High Dep - Dense/Sparse
Clustered Dense
Text High Dep – Sparse
Manufacturing High Dep – Numeric, Dense
Large Models Images Dense, Numeric
ALGORITHMS
DATASETS
ML Bench 1.0
• Algorithm X Data
• Reference Models
• Data Generator
The “Dwarfs” Connection
• Phill Collela’s “Seven Dwarfs” (2004) –
• Patterns of computation and communication
that are important for science and engineering
• Berkley’s view (2006) –
• Extended to 13 Dwarfs after examining
the original 7 Dwarfs outside the HPC scope
• US National Research Council’s Committee
“Frontiers in Massive Data Analysis” (2013) –
• Chapter 10: “The Seven Computational Giants of Massive Data Analysis”
• The ML Building Blocks provide a further extension and a different perspective
• Introducing data characteristics and the interplay with compute, communication, memory
“Data! Data! Data!
I Can’t Make Bricks Without Clay!”
Thank You

Más contenido relacionado

La actualidad más candente

Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLPaco Nathan
 
Graph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media AnalyticsGraph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media AnalyticsNYC Predictive Analytics
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 
ML DL AI DS BD - An Introduction
ML DL AI DS BD - An IntroductionML DL AI DS BD - An Introduction
ML DL AI DS BD - An IntroductionDony Riyanto
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onDony Riyanto
 
Azure Machine Learning and ML on Premises
Azure Machine Learning and ML on PremisesAzure Machine Learning and ML on Premises
Azure Machine Learning and ML on PremisesIvo Andreev
 
Introduction to Azure Machine Learning
Introduction to Azure Machine LearningIntroduction to Azure Machine Learning
Introduction to Azure Machine LearningPaul Prae
 
Azure Machine Learning 101
Azure Machine Learning 101Azure Machine Learning 101
Azure Machine Learning 101Andrew Badera
 
Artificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisArtificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisManuel Martín
 
Microsoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningMicrosoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningSetu Chokshi
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Interpretable machine learning
Interpretable machine learningInterpretable machine learning
Interpretable machine learningSri Ambati
 
Machine Learning and Applications
Machine Learning and ApplicationsMachine Learning and Applications
Machine Learning and ApplicationsGeeta Arora
 
Machine learning 101 dkom 2017
Machine learning 101 dkom 2017Machine learning 101 dkom 2017
Machine learning 101 dkom 2017fredverheul
 
4 pillars of visualization & communication by Noah Iliinsky
4 pillars of visualization & communication by Noah Iliinsky4 pillars of visualization & communication by Noah Iliinsky
4 pillars of visualization & communication by Noah Iliinskyiliinsky
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learningsafa cimenli
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya
 
A Friendly Introduction to Machine Learning
A Friendly Introduction to Machine LearningA Friendly Introduction to Machine Learning
A Friendly Introduction to Machine LearningHaptik
 
Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21Gülden Bilgütay
 

La actualidad más candente (20)

Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
 
Graph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media AnalyticsGraph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media Analytics
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
ML DL AI DS BD - An Introduction
ML DL AI DS BD - An IntroductionML DL AI DS BD - An Introduction
ML DL AI DS BD - An Introduction
 
What is Machine Learning
What is Machine LearningWhat is Machine Learning
What is Machine Learning
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
 
Azure Machine Learning and ML on Premises
Azure Machine Learning and ML on PremisesAzure Machine Learning and ML on Premises
Azure Machine Learning and ML on Premises
 
Introduction to Azure Machine Learning
Introduction to Azure Machine LearningIntroduction to Azure Machine Learning
Introduction to Azure Machine Learning
 
Azure Machine Learning 101
Azure Machine Learning 101Azure Machine Learning 101
Azure Machine Learning 101
 
Artificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisArtificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data Analysis
 
Microsoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningMicrosoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Interpretable machine learning
Interpretable machine learningInterpretable machine learning
Interpretable machine learning
 
Machine Learning and Applications
Machine Learning and ApplicationsMachine Learning and Applications
Machine Learning and Applications
 
Machine learning 101 dkom 2017
Machine learning 101 dkom 2017Machine learning 101 dkom 2017
Machine learning 101 dkom 2017
 
4 pillars of visualization & communication by Noah Iliinsky
4 pillars of visualization & communication by Noah Iliinsky4 pillars of visualization & communication by Noah Iliinsky
4 pillars of visualization & communication by Noah Iliinsky
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
 
A Friendly Introduction to Machine Learning
A Friendly Introduction to Machine LearningA Friendly Introduction to Machine Learning
A Friendly Introduction to Machine Learning
 
Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21
 

Destacado

Judging from its first charges under Law 975, the Prosecutor’s Office does no...
Judging from its first charges under Law 975, the Prosecutor’s Office does no...Judging from its first charges under Law 975, the Prosecutor’s Office does no...
Judging from its first charges under Law 975, the Prosecutor’s Office does no...Comisión Colombiana de Juristas
 
The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...
The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...
The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...Comisión Colombiana de Juristas
 
Practica 5
Practica 5Practica 5
Practica 5vir_97
 
Operaciones frecuente usuarios
Operaciones frecuente usuariosOperaciones frecuente usuarios
Operaciones frecuente usuarioseduenlasiberia
 
Developers Are People, Too
Developers Are People, TooDevelopers Are People, Too
Developers Are People, Toomor
 
Useful Online Software
Useful Online Software Useful Online Software
Useful Online Software bibliotecaria
 
Are you ready to start your business?
Are you ready to start your business?Are you ready to start your business?
Are you ready to start your business?Marshanda Powell
 

Destacado (15)

Witness named at “free-version” hearing assassinated
Witness named at “free-version” hearing assassinatedWitness named at “free-version” hearing assassinated
Witness named at “free-version” hearing assassinated
 
Judging from its first charges under Law 975, the Prosecutor’s Office does no...
Judging from its first charges under Law 975, the Prosecutor’s Office does no...Judging from its first charges under Law 975, the Prosecutor’s Office does no...
Judging from its first charges under Law 975, the Prosecutor’s Office does no...
 
Som luz 8ano
Som luz 8anoSom luz 8ano
Som luz 8ano
 
The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...
The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...
The Colombian Government Wants to Circumvent the Constitutional Court’s Rulin...
 
La Información
La InformaciónLa Información
La Información
 
Paulz2
Paulz2Paulz2
Paulz2
 
Practica 5
Practica 5Practica 5
Practica 5
 
Operaciones frecuente usuarios
Operaciones frecuente usuariosOperaciones frecuente usuarios
Operaciones frecuente usuarios
 
Componentes de un sistema de Información
Componentes de un sistema de Información Componentes de un sistema de Información
Componentes de un sistema de Información
 
La ira
La ira La ira
La ira
 
The Speech Report
The Speech ReportThe Speech Report
The Speech Report
 
Developers Are People, Too
Developers Are People, TooDevelopers Are People, Too
Developers Are People, Too
 
Useful Online Software
Useful Online Software Useful Online Software
Useful Online Software
 
Are you ready to start your business?
Are you ready to start your business?Are you ready to start your business?
Are you ready to start your business?
 
Presentación Primer Encuentro
Presentación Primer EncuentroPresentación Primer Encuentro
Presentación Primer Encuentro
 

Similar a Data! Data! Analytics Building Blocks

Lunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataLunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataMelissa Hornbostel
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphP. Taylor Goetz
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphDataWorks Summit
 
An overview of modern scalable web development
An overview of modern scalable web developmentAn overview of modern scalable web development
An overview of modern scalable web developmentTung Nguyen
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureIvo Andreev
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 
Prepare your data for machine learning
Prepare your data for machine learningPrepare your data for machine learning
Prepare your data for machine learningIvo Andreev
 
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systemsTraditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systemsGanesan Narayanasamy
 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...Mihai Criveti
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?Ivo Andreev
 
Big data berlin
Big data berlinBig data berlin
Big data berlinkammeyer
 
02-Lifecycle.pptx
02-Lifecycle.pptx02-Lifecycle.pptx
02-Lifecycle.pptxShree Shree
 
The Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the MassesThe Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the MassesAlice Zheng
 
1559 mathematical and visualization software
1559 mathematical and visualization software1559 mathematical and visualization software
1559 mathematical and visualization softwareDr Fereidoun Dejahang
 
Tour of Big Data
Tour of Big DataTour of Big Data
Tour of Big DataRaymond Yu
 

Similar a Data! Data! Analytics Building Blocks (20)

Lunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataLunch & Learn Intro to Big Data
Lunch & Learn Intro to Big Data
 
Bigdata analytics
Bigdata analyticsBigdata analytics
Bigdata analytics
 
DA_01_Intro.pptx
DA_01_Intro.pptxDA_01_Intro.pptx
DA_01_Intro.pptx
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
 
An overview of modern scalable web development
An overview of modern scalable web developmentAn overview of modern scalable web development
An overview of modern scalable web development
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with Azure
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 
Prepare your data for machine learning
Prepare your data for machine learningPrepare your data for machine learning
Prepare your data for machine learning
 
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systemsTraditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?
 
Big data berlin
Big data berlinBig data berlin
Big data berlin
 
02-Lifecycle.pptx
02-Lifecycle.pptx02-Lifecycle.pptx
02-Lifecycle.pptx
 
The Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the MassesThe Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the Masses
 
1559 mathematical and visualization software
1559 mathematical and visualization software1559 mathematical and visualization software
1559 mathematical and visualization software
 
Big Data on The Cloud
Big Data on The CloudBig Data on The Cloud
Big Data on The Cloud
 
Tour of Big Data
Tour of Big DataTour of Big Data
Tour of Big Data
 
Ds01 data science
Ds01   data scienceDs01   data science
Ds01 data science
 
00-01 DSnDA.pdf
00-01 DSnDA.pdf00-01 DSnDA.pdf
00-01 DSnDA.pdf
 

Más de Turi, Inc.

Webinar - Analyzing Video
Webinar - Analyzing VideoWebinar - Analyzing Video
Webinar - Analyzing VideoTuri, Inc.
 
Webinar - Patient Readmission Risk
Webinar - Patient Readmission RiskWebinar - Patient Readmission Risk
Webinar - Patient Readmission RiskTuri, Inc.
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Turi, Inc.
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Turi, Inc.
 
Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Turi, Inc.
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Turi, Inc.
 
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge DatasetsScaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge DatasetsTuri, Inc.
 
Pattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataPattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataTuri, Inc.
 
Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsTuri, Inc.
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine LearningTuri, Inc.
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab CreateTuri, Inc.
 
Machine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesMachine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesTuri, Inc.
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinTuri, Inc.
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data scienceTuri, Inc.
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Turi, Inc.
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender SystemsTuri, Inc.
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in productionTuri, Inc.
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringTuri, Inc.
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with DatoTuri, Inc.
 

Más de Turi, Inc. (20)

Webinar - Analyzing Video
Webinar - Analyzing VideoWebinar - Analyzing Video
Webinar - Analyzing Video
 
Webinar - Patient Readmission Risk
Webinar - Patient Readmission RiskWebinar - Patient Readmission Risk
Webinar - Patient Readmission Risk
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
 
Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)
 
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge DatasetsScaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
 
Pattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataPattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log Data
 
Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning Toolkits
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine Learning
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
 
Machine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesMachine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive Services
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos Guestrin
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data science
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender Systems
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
SFrame
SFrameSFrame
SFrame
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
 

Último

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Último (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

Data! Data! Analytics Building Blocks

  • 1. “Data! Data! Data! I Can’t Make Bricks Without Clay!”* Shai Fine Principal Engineer, Advanced Analytics, Intel (*) Sherlock Holmes, The Adventure in the Copper Beeches
  • 2. Big Data, Only a Few Years Back …
  • 3. Executives Believe in Advanced Analytics
  • 4. Analytics to the Rescue • “Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway” • Geoffrey Moore, Author of Crossing the Chasm • … and who will lead the way?! Big Data's High-Priests of Algorithms The Wall Street Journal, Aug. 2014
  • 5. Adoption of Analytics Faces Hurdles • Developing Analytics solutions • Far from being an engineering process • There is a chasm to cross between “traditional” BI and Advanced Analytics • Consumability of Analytics • Deploying Analytics solutions is difficult • Reliability, “Self Maintenance” • Analytics Workloads are Challenging • Speed (latency, time-to-solution), Throughput, Scalability, …
  • 6. The ML Building Blocks Concept There are “infinite” number of algorithms and datasets But there are finite set of Building Blocks Building Blocks: A finite set of elements that can be mapped into HW and SW primitives and patterns Building Blocks Usages High-level Libraries Low-level Libraries Hardware Platforms Xeon Xeon Phi Xeon FPGA Iris Pro Graphics Xeon Accel. New ISA Tier-1 Cloud HPC Enterprise Academia
  • 7. Machine Learning Building Blocks • ML basic building blocks 1. Linear Algebra 2. Measures 3. Special Functions 4. Mathematical Optimization 5. Data Characteristics 6. Data-dependent Compute 7. Memory Access 8. Very large models 9. Hybrid Methods • ML Meta building blocks 1. Learning Protocols 2. Learning Phases 3. Algorithmic Flow and Structure Compute Data Compute - Data Interplay Process
  • 8. Towards a Comprehensive ML Workload Suite • Workload design should cover elements of • Compute • Data Characteristics • Data – Compute interplay • Each workload includes • Multiple data sets x Multiple algorithms • Coverage of relevant data characteristics • Coverage of compute patterns The Building Block concept provides a mean for designing the ML Workload Suite
  • 9. Machine Learning Workloads Suite Workload Linear Algebra Measure Calc. Special Funcs Math Optim. Data Characteristics Data-dep. Compute Mem. Access large model Linear Algebra Sparse Dense X X X Un/Supervised, Numeric Data Dependency X X X Un/Supervised, Num/Cat X X Large Models X X X Un/Supervised, Numeric X Workload Dataset Type Characteristics Linear Algebra Clustered Dense, Numeric Graphs Sparse, Numeric Data Dependency Bio informatics High Dep - Dense/Sparse Clustered Dense Text High Dep – Sparse Manufacturing High Dep – Numeric, Dense Large Models Images Dense, Numeric ALGORITHMS DATASETS
  • 10. Machine Learning Workloads Suite Workload Linear Algebra Measure Calc. Special Funcs Math Optim. Data Characteristics Data-dep. Compute Mem. Access large model Linear Algebra Sparse Dense X X X Un/Supervised, Numeric Data Dependency X X X Un/Supervised, Num/Cat X X Large Models X X X Un/Supervised, Numeric X Workload Dataset Type Characteristics Linear Algebra Clustered Dense, Numeric Graphs Sparse, Numeric Data Dependency Bio informatics High Dep - Dense/Sparse Clustered Dense Text High Dep – Sparse Manufacturing High Dep – Numeric, Dense Large Models Images Dense, Numeric ALGORITHMS DATASETS ML Bench 1.0 • Algorithm X Data • Reference Models • Data Generator
  • 11. The “Dwarfs” Connection • Phill Collela’s “Seven Dwarfs” (2004) – • Patterns of computation and communication that are important for science and engineering • Berkley’s view (2006) – • Extended to 13 Dwarfs after examining the original 7 Dwarfs outside the HPC scope • US National Research Council’s Committee “Frontiers in Massive Data Analysis” (2013) – • Chapter 10: “The Seven Computational Giants of Massive Data Analysis” • The ML Building Blocks provide a further extension and a different perspective • Introducing data characteristics and the interplay with compute, communication, memory
  • 12. “Data! Data! Data! I Can’t Make Bricks Without Clay!”