SlideShare a Scribd company logo
1 of 11
Download to read offline
Manuel Martín Salvador 
@draxus 
msalvador@bournemouth.ac.uk 
OpenML workshop 
Eindhoven 21/10/2014
Background 
● MSc. Computer Engineering 
● Master in Soft Computing and Intelligent Systems 
Currently 
● PhD Student – Automatic and adaptive pre-processing for building 
predictive models 
● Teaching – Data Mining lab
Data preparation and pre-processing
Data preparation and pre-processing
Data preparation and pre-processing 
Labour 
intensive 
tasks 
(up to 80% of 
a data mining 
process)
Automating pre-processing 
A lot of available techniques 
No free lunch 
Multiple combinations 
Order of pre-processing methods matters 
No semantic → some approaches use ontologies 
Meta-learning → needs a good database of 
experiments
Scientific workflow platforms and 
repositories with experiments 
Software Repository Applications 
DiscoveryNet (inactive) - 
Kepler - Various 
Taverna MyExperiment (open) Bioinformatics 
Pegasus - Various 
Galaxy - Biomedical 
Pipeline Pilot Accelrys (commercial) 
* MLComp (“open”) Machine Learning 
Weka,MOA,R,RapidMiner OpenML (open) Machine Learning
OpenML statistics 
Datasets: 1042 
Tasks: 3025 
Flows: 640 
Runs: 31540 
Valid: 24410 
With errors: 7130 
Datasets: 300 
Individual components: 136 
Paired components: 635 
“Flow size”: 1 – 8198 
2 – 12178 
3 – 1993 
4 – 1533 
5 – 502 
6 – 6 
4500 
4000 
3500 
3000 
2500 
2000 
1500 
1000 
500 
0 
Distribution of components 
1600 
1400 
1200 
1000 
800 
600 
400 
200 
0 
Distribution of datasets 
Only 3 Weka filters: 
Principal Components, Discretize, PLSFilter
TO DO 
How to increase the number of pre-processing methods in OpenML? 
- The only way right now is using FilteredClassifier in Weka 
- What about R, MOA, RapidMiner? 
Improving flow representation 
- Right now is difficult to see how components are connected 
- Clear distinction of parameters 
- What about including Weka flows (XML based) and ADAMS flows? 
- PMML support? 
Statistics for available data, tasks, flows and runs 
Flow recommendation system for a given dataset 
[dataset, data characteristics, prediction accuracy, flow_id] 
Flow validation before executing it 
[dataset, data characteristics, flow characteristics, failure]
A little bit further 
Adapting flows while processing data streams 
- Detecting changes in data characteristics 
- Locally checking input/output in each flow component 
- Change propagation 
- Reducing cost of adaptation
Photos CC by Cristina Granados 
Visit us! 
Data Science Institute @ Bournemouth University

More Related Content

What's hot

Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational Data
Benjamin Bengfort
 
Concept Drift Identification using Classifier Ensemble Approach
Concept Drift Identification using Classifier Ensemble Approach  Concept Drift Identification using Classifier Ensemble Approach
Concept Drift Identification using Classifier Ensemble Approach
IJECEIAES
 
Outlier detection for high dimensional data
Outlier detection for high dimensional dataOutlier detection for high dimensional data
Outlier detection for high dimensional data
Parag Tamhane
 
activelearning.ppt
activelearning.pptactivelearning.ppt
activelearning.ppt
butest
 

What's hot (20)

Journals analysis ppt
Journals analysis pptJournals analysis ppt
Journals analysis ppt
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
A Beginner's Guide to Machine Learning with Scikit-Learn
A Beginner's Guide to Machine Learning with Scikit-LearnA Beginner's Guide to Machine Learning with Scikit-Learn
A Beginner's Guide to Machine Learning with Scikit-Learn
 
Materials Informatics and Python
Materials Informatics and PythonMaterials Informatics and Python
Materials Informatics and Python
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Machine learning
Machine learning Machine learning
Machine learning
 
resume_MH
resume_MHresume_MH
resume_MH
 
My
MyMy
My
 
IEEE 2014 JAVA DATA MINING PROJECTS Searching dimension incomplete databases
IEEE 2014 JAVA DATA MINING PROJECTS Searching dimension incomplete databasesIEEE 2014 JAVA DATA MINING PROJECTS Searching dimension incomplete databases
IEEE 2014 JAVA DATA MINING PROJECTS Searching dimension incomplete databases
 
Disease prediction using machine learning
Disease prediction using machine learningDisease prediction using machine learning
Disease prediction using machine learning
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational Data
 
Data repository for sensor network a data mining approach
Data repository for sensor network  a data mining approachData repository for sensor network  a data mining approach
Data repository for sensor network a data mining approach
 
Machine learning in the life sciences with knime
Machine learning in the life sciences with knimeMachine learning in the life sciences with knime
Machine learning in the life sciences with knime
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning Project
 
Concept Drift Identification using Classifier Ensemble Approach
Concept Drift Identification using Classifier Ensemble Approach  Concept Drift Identification using Classifier Ensemble Approach
Concept Drift Identification using Classifier Ensemble Approach
 
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
 
Computer Assisted Data Analysis (Hands-on Practice)
Computer Assisted Data Analysis (Hands-on Practice)Computer Assisted Data Analysis (Hands-on Practice)
Computer Assisted Data Analysis (Hands-on Practice)
 
Outlier detection for high dimensional data
Outlier detection for high dimensional dataOutlier detection for high dimensional data
Outlier detection for high dimensional data
 
activelearning.ppt
activelearning.pptactivelearning.ppt
activelearning.ppt
 

Viewers also liked

Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011
Manuel Martín
 
Introducción a GNU/Linux
Introducción a GNU/LinuxIntroducción a GNU/Linux
Introducción a GNU/Linux
Manuel Martín
 
2012-07-03 Fisser Voogt Bom IFIP Taaltreffers
2012-07-03 Fisser Voogt Bom IFIP Taaltreffers2012-07-03 Fisser Voogt Bom IFIP Taaltreffers
2012-07-03 Fisser Voogt Bom IFIP Taaltreffers
Petra Fisser
 
Operaciones Colectivas en MPI
Operaciones Colectivas en MPIOperaciones Colectivas en MPI
Operaciones Colectivas en MPI
Manuel Martín
 

Viewers also liked (9)

Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011
 
Introducción a GNU/Linux
Introducción a GNU/LinuxIntroducción a GNU/Linux
Introducción a GNU/Linux
 
Minería de secuencias de datos
Minería de secuencias de datosMinería de secuencias de datos
Minería de secuencias de datos
 
Decompiladores
DecompiladoresDecompiladores
Decompiladores
 
Minería de secuencias de datos
Minería de secuencias de datosMinería de secuencias de datos
Minería de secuencias de datos
 
2012-07-03 Fisser Voogt Bom IFIP Taaltreffers
2012-07-03 Fisser Voogt Bom IFIP Taaltreffers2012-07-03 Fisser Voogt Bom IFIP Taaltreffers
2012-07-03 Fisser Voogt Bom IFIP Taaltreffers
 
Operaciones Colectivas en MPI
Operaciones Colectivas en MPIOperaciones Colectivas en MPI
Operaciones Colectivas en MPI
 
Artificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisArtificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data Analysis
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...
 

Similar to Quick presentation for the OpenML workshop in Eindhoven 2014

Python + MPP Database = Large Scale AI/ML Projects in Production Faster
Python + MPP Database = Large Scale AI/ML Projects in Production FasterPython + MPP Database = Large Scale AI/ML Projects in Production Faster
Python + MPP Database = Large Scale AI/ML Projects in Production Faster
Paige_Roberts
 
Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
VMware Tanzu
 
Big Data Analytics-Open Source Toolkits
Big Data Analytics-Open Source ToolkitsBig Data Analytics-Open Source Toolkits
Big Data Analytics-Open Source Toolkits
DataWorks Summit
 

Similar to Quick presentation for the OpenML workshop in Eindhoven 2014 (20)

Python + MPP Database = Large Scale AI/ML Projects in Production Faster
Python + MPP Database = Large Scale AI/ML Projects in Production FasterPython + MPP Database = Large Scale AI/ML Projects in Production Faster
Python + MPP Database = Large Scale AI/ML Projects in Production Faster
 
AzureML TechTalk
AzureML TechTalkAzureML TechTalk
AzureML TechTalk
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
 
A survey on Machine Learning In Production (July 2018)
A survey on Machine Learning In Production (July 2018)A survey on Machine Learning In Production (July 2018)
A survey on Machine Learning In Production (July 2018)
 
An Analytics Platform for Connected Vehicles
An Analytics Platform for Connected VehiclesAn Analytics Platform for Connected Vehicles
An Analytics Platform for Connected Vehicles
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
 
Agile data science with scala
Agile data science with scalaAgile data science with scala
Agile data science with scala
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
 
Dev Ops Training
Dev Ops TrainingDev Ops Training
Dev Ops Training
 
Power of the Run Graph
Power of the Run GraphPower of the Run Graph
Power of the Run Graph
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
 
Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
Big Data Analytics-Open Source Toolkits
Big Data Analytics-Open Source ToolkitsBig Data Analytics-Open Source Toolkits
Big Data Analytics-Open Source Toolkits
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
 
Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013
 

More from Manuel Martín

Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...
Manuel Martín
 
Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010
Manuel Martín
 

More from Manuel Martín (19)

Hogar (Des)Conectado
Hogar (Des)ConectadoHogar (Des)Conectado
Hogar (Des)Conectado
 
Automatizando el aprendizaje basado en datos
Automatizando el aprendizaje basado en datosAutomatizando el aprendizaje basado en datos
Automatizando el aprendizaje basado en datos
 
Modelling Multi-Component Predictive Systems as Petri Nets
Modelling Multi-Component Predictive Systems as Petri NetsModelling Multi-Component Predictive Systems as Petri Nets
Modelling Multi-Component Predictive Systems as Petri Nets
 
Brand engagement with mobile gamification apps from a developer perspective
Brand engagement with mobile gamification apps from a developer perspectiveBrand engagement with mobile gamification apps from a developer perspective
Brand engagement with mobile gamification apps from a developer perspective
 
Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...
 
Improving transport timetables usability for mobile devices
Improving transport timetables usability for mobile devicesImproving transport timetables usability for mobile devices
Improving transport timetables usability for mobile devices
 
Towards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsTowards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive Systems
 
Online Detection of Shutdown Periods in Chemical Plants: A Case Study
Online Detection of Shutdown Periods in Chemical Plants: A Case StudyOnline Detection of Shutdown Periods in Chemical Plants: A Case Study
Online Detection of Shutdown Periods in Chemical Plants: A Case Study
 
Handling concept drift in data stream mining
Handling concept drift in data stream miningHandling concept drift in data stream mining
Handling concept drift in data stream mining
 
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de AndalucíaAndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
 
Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010
 
Presentación Gnome 3.0 en Granada
Presentación Gnome 3.0 en GranadaPresentación Gnome 3.0 en Granada
Presentación Gnome 3.0 en Granada
 
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de AndalucíaAndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
 
Pintando gráficas con Python
Pintando gráficas con PythonPintando gráficas con Python
Pintando gráficas con Python
 
Charla de Introducción a Git
Charla de Introducción a GitCharla de Introducción a Git
Charla de Introducción a Git
 
Taller de introducción a Google App Engine
Taller de introducción a Google App EngineTaller de introducción a Google App Engine
Taller de introducción a Google App Engine
 
Presentación AndalucíaPeople en las BYMC6
Presentación AndalucíaPeople en las BYMC6Presentación AndalucíaPeople en las BYMC6
Presentación AndalucíaPeople en las BYMC6
 
Web 2.0 y Empresa 2.0
Web 2.0 y Empresa 2.0Web 2.0 y Empresa 2.0
Web 2.0 y Empresa 2.0
 
Taller de introducción al sistema operativo GNU/Linux
Taller de introducción al sistema operativo GNU/LinuxTaller de introducción al sistema operativo GNU/Linux
Taller de introducción al sistema operativo GNU/Linux
 

Recently uploaded

👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
amitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 

Recently uploaded (20)

Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 

Quick presentation for the OpenML workshop in Eindhoven 2014

  • 1. Manuel Martín Salvador @draxus msalvador@bournemouth.ac.uk OpenML workshop Eindhoven 21/10/2014
  • 2. Background ● MSc. Computer Engineering ● Master in Soft Computing and Intelligent Systems Currently ● PhD Student – Automatic and adaptive pre-processing for building predictive models ● Teaching – Data Mining lab
  • 3. Data preparation and pre-processing
  • 4. Data preparation and pre-processing
  • 5. Data preparation and pre-processing Labour intensive tasks (up to 80% of a data mining process)
  • 6. Automating pre-processing A lot of available techniques No free lunch Multiple combinations Order of pre-processing methods matters No semantic → some approaches use ontologies Meta-learning → needs a good database of experiments
  • 7. Scientific workflow platforms and repositories with experiments Software Repository Applications DiscoveryNet (inactive) - Kepler - Various Taverna MyExperiment (open) Bioinformatics Pegasus - Various Galaxy - Biomedical Pipeline Pilot Accelrys (commercial) * MLComp (“open”) Machine Learning Weka,MOA,R,RapidMiner OpenML (open) Machine Learning
  • 8. OpenML statistics Datasets: 1042 Tasks: 3025 Flows: 640 Runs: 31540 Valid: 24410 With errors: 7130 Datasets: 300 Individual components: 136 Paired components: 635 “Flow size”: 1 – 8198 2 – 12178 3 – 1993 4 – 1533 5 – 502 6 – 6 4500 4000 3500 3000 2500 2000 1500 1000 500 0 Distribution of components 1600 1400 1200 1000 800 600 400 200 0 Distribution of datasets Only 3 Weka filters: Principal Components, Discretize, PLSFilter
  • 9. TO DO How to increase the number of pre-processing methods in OpenML? - The only way right now is using FilteredClassifier in Weka - What about R, MOA, RapidMiner? Improving flow representation - Right now is difficult to see how components are connected - Clear distinction of parameters - What about including Weka flows (XML based) and ADAMS flows? - PMML support? Statistics for available data, tasks, flows and runs Flow recommendation system for a given dataset [dataset, data characteristics, prediction accuracy, flow_id] Flow validation before executing it [dataset, data characteristics, flow characteristics, failure]
  • 10. A little bit further Adapting flows while processing data streams - Detecting changes in data characteristics - Locally checking input/output in each flow component - Change propagation - Reducing cost of adaptation
  • 11. Photos CC by Cristina Granados Visit us! Data Science Institute @ Bournemouth University