Abed Ajraou – Director of Data & Insights & Lead Data Scientist @First Utility
Putting Data Science in Your Business: a First Utility Feedback
First Utility – Putting customers in control; saving them money
● Cheaper tariffs
● Great service
● More knowledge
Driving the Success of DS Solutions : Skills, Roles and Responsibilities
Source: https://whatsthebigdata.com/2016/05/01/data-scientists-spend-most-of-their-time-cleaning-data/
What have we missed here … ?
Right Technology
Data – THE NEW POWER
Internal Data
● Allow us to deliver a better service for our customers
● Allow us to optimise the business and give a better price to our customers
● Allow us to give more knowledge to our customers
Industry Data
Individual Transaction-Level Data
Internal Data
● Better agility
● Data Lake and Data Warehousing in the same platform
● Enable data discovery
● Collect more data
● Analyse the data with high performance
● Next generation of data visualisation on top of Hadoop
Right Mind-set
Start with a business problem
Not considering the business outcome is actually the number one reason for project failure!
Starting with the data and not with the question … ?
Right Methodology
Explore the data
● Exploratory Analysis by Visualizing the data
Feature engineering
● The creative part, involving a lot of trial and error.
● Andrew Fogg won the competition by categorising the colours of cars.
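The colour example above can be sketched roughly as follows. This is only an illustration: the slides do not say how the colours were grouped, so the `COLOUR_GROUPS` mapping, the group names and the sample cars are all hypothetical.

```python
from collections import Counter
from typing import List

# Hypothetical mapping from raw colour names to coarse groups (illustrative only).
COLOUR_GROUPS = {
    "black": "neutral", "white": "neutral", "grey": "neutral", "silver": "neutral",
    "red": "bright", "orange": "bright", "yellow": "bright",
    "blue": "cool", "green": "cool",
}


def colour_group(raw_colour: str) -> str:
    """Map a free-text colour to a coarse category; unknown values become 'other'."""
    return COLOUR_GROUPS.get(raw_colour.strip().lower(), "other")


def one_hot(value: str, categories: List[str]) -> List[int]:
    """Encode a category as a 0/1 vector so that a model can consume it."""
    return [1 if value == category else 0 for category in categories]


# Made-up raw values, including messy casing, whitespace and an unseen colour.
cars = ["Red", " silver", "BLUE", "orange", "teal"]

groups = [colour_group(colour) for colour in cars]
categories = sorted(set(COLOUR_GROUPS.values()) | {"other"})
features = [one_hot(group, categories) for group in groups]
group_counts = Counter(groups)
```

The design choice here is to collapse many raw values into a few groups before encoding, which keeps the feature vector short and gives unseen colours a defined place.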
Models to predict
● ML is often used in DS.
● Currently, the trending ML tool is XGBoost, which most of the time gives better results than traditional Random Forests and Neural Networks.
● Reasons for its success? More accurate, more efficient, easy to use, customisable and distributed.
● Less time needs to be spent on feature engineering, but some creativity is still needed.
Models to predict: gradient boosting
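To make the idea of gradient boosting concrete, here is a minimal pure-Python sketch for squared-error regression on a single feature. It is not XGBoost's implementation (which adds regularisation, deeper trees and distributed training); the function names, the toy data and the default `n_rounds` and `learning_rate` values are assumptions chosen for illustration.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, value_if_left, value_if_right)


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Find the single split on x that minimises squared error on the residuals.

    Assumes xs contains at least two distinct values.
    """
    best, best_err = None, float("inf")
    for threshold in sorted(set(xs))[:-1]:  # largest x would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = mean(left), mean(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if err < best_err:
            best_err, best = err, (threshold, lv, rv)
    return best


def predict_stump(stump: Stump, x: float) -> float:
    threshold, lv, rv = stump
    return lv if x <= threshold else rv


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x: float) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data with a step between x=4 and x=5 (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = fit_boosting(xs, ys)
baseline_mse = mean([(y - mean(ys)) ** 2 for y in ys])
boosted_mse = mean([(y - predict(model, x)) ** 2 for x, y in zip(xs, ys)])
```

Each round corrects a fraction of the remaining error, so the boosted model's training error ends up well below that of the constant baseline.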
Evaluation - validations
● Overfitting/underfitting is the biggest fear of a data scientist.
● Cross-validation is one way to guard the model against overfitting.
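As a rough sketch of how cross-validation exposes overfitting, the example below compares the training error of a model that memorises its data (a 1-nearest-neighbour regressor) with its k-fold cross-validated error. The model choice, the toy data and all function names are assumptions made for illustration, not something taken from the slides.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict_1nn(train: List[Point], x: float) -> float:
    """1-nearest-neighbour regressor: return the y of the closest training x."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def mse(train: List[Point], test: List[Point]) -> float:
    return sum((y - predict_1nn(train, x)) ** 2 for x, y in test) / len(test)


def cross_validate(data: List[Point], k: int = 5, seed: int = 0) -> float:
    """Average held-out error: each fold is scored by a model that never saw it."""
    scores = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train = [p for i, p in enumerate(data) if i not in held]
        test = [data[i] for i in held_out]
        scores.append(mse(train, test))
    return sum(scores) / k


# Toy data: y = 2x plus reproducible noise (made up for illustration).
rng = random.Random(42)
data = [(float(x), 2.0 * x + rng.uniform(-1.0, 1.0)) for x in range(20)]

folds = k_fold_indices(len(data), 5)
train_error = mse(data, data)         # scored on the very points it memorised
cv_error = cross_validate(data, k=5)  # scored on points held out from training
```

The training error looks perfect because the model simply recalls each point, while the cross-validated error reveals how it behaves on unseen data. Note that cross-validation detects overfitting rather than preventing it; the fix still has to come from the model or the features.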
Feedback loop
● An ML algorithm is a living system: like any living thing, it needs care!
● Learning from its mistakes is the only way to progress and to fit a real AI model.
Bad Methodology
Main reasons:
• No clear business case
• Trying to create the most accurate model from the start
• No agility
• No code version control
An iterative delivery is key
Sprint 1
Sprint 2
Main takeaways:
• Agility is required
• Weekly delivery is highly recommended to avoid falling into the “tunnel effect”
Going forward: AML (Automated Machine Learning)
Gartner Says “More Than 40 Percent of Data Science Tasks Will Be Automated by 2020”
Source: https://www.gartner.com/newsroom/id/3570917
Automation in Machine Learning is starting
Gain in Efficiency
● In the old days of the BI world, we gained efficiency by using ETL tools rather than scripting code.
● However, ML is often associated with R/Python/Scala coding.
Dataiku Flow => enable AML
My favorite app
The Collaborative Data Science Platform: Dataiku
Data Science is nothing without a team
Data Science is a range of skills!
It’s quite rare to find them all in a single person
Source: Dsradar.com
Thank you for your attention
Any questions?
Keep in touch: @AAjraou
