Big Data
explanations & use cases in industrial sector
September 2015
Nicolas SARRAMAGNA
https://fr.linkedin.com/pub/nicolas-sarramagna/19/941/587
CONTENTS
 What’s Big Data ?
1. Definition, 3 V
2. General use cases
3. Technologies used
4. Market Overview
 Big Data in Industrial sector
1. What for ?
2. Vision
3. Demo Poc / PoV
COMPAGNIE PLASTIC OMNIUM
CONFIDENTIAL
What’s Big Data – 3V
 BIG DATA :
 New contexts on data -> 3V
 New business ambitions, new technologies
 VOLUME : MASSIFICATION AND AUTOMATION OF DATA EXCHANGES
 80% of the data was created in the last 12 months
 30 billion pieces of content on Facebook each month, 5 billion pages on Flickr, 2 billion videos watched on YouTube each day
 VARIETY : MULTIPLICATION OF SOURCES AND TYPES
 Mails, documents, logs (applications, networks, systems), databases, sensor data, open data, social networks,
blogs, forums, articles, browsing history, geolocation data, …
 Structured data (DB), semi-structured (html page, tweet, xml), unstructured (mail content, excel, ppt, video, audio)
 VELOCITY : NEED TO COLLECT AND PROCESS DATA IN REAL TIME
 Risk management (fraud, security of the information system – SIEM)
 Real time route optimization
 Personalized advertising
What’s Big Data – new technologies
 BIG DATA :
 More efficient components and higher I/O throughput -> grid architecture
 New technological know-how : storage of large volumes of data in a cluster at a lower cost, distributed computing,
industrialized data mining, on-demand IT architecture with the cloud
 ORIGIN OF BIG DATA
 Web indexing and search engines at Google and Yahoo, around 2006
What’s Big Data - general use cases IT
 COMPLETE THE ARCHITECTURE OF THE DATA
 Vision of a Data lake / Enterprise data hub
 Bring applications closer to the data instead of duplicating data for each application
 "Deliver" managed data
 REDUCE STORAGE COSTS AND COMPUTING COSTS
 Big Data technologies use commodity hardware and / or cloud and parallel computing
 STRONG TECHNICAL CONSTRAINTS
 Handle more than 1,000 transactions per second
 Collect a flow of more than 1,000 events per second
 Compute with more than 10 threads per CPU core
 Store data sets of more than 10 TB for processing
Without Big Data technologies, these constraints require major hardware and software adaptations
What’s Big Data - general use cases business
 END-USER CENTRIC
 Product recommendations
 Optimization of ads
 PROCESS CENTRIC
 Detection of unexpected events : fraud, network, predictive maintenance
 Path optimization
 DIVERSIFICATION OF THE BUSINESS MODEL
 Orange : resale of geolocation data
What’s Big Data – misconceptions
 Only used for unstructured data -> Used with structured and unstructured data
 Only needed for massive data sets -> Stores and analyses data of all sizes
 Only available from open-source -> All the major vendors are on board
 Replaces my current BI platform -> It is complementary to our existing BI strategy and investments
Big Data will become essential for Business Intelligence
What’s Big Data – BD completes the architecture of the data
What’s Big Data – BI opportunities
THE PAST - BI
BIG DATA ANALYTICS
What’s Big Data - technologies under the hood - standard Hadoop
HADOOP PLATFORM
What’s Big Data - technologies under the hood
 COLLECT
 Spark, Flume, Sqoop
 Ingest data into HDFS and NoSQL databases : command line, REST API, Java API, streaming ingestion, bulk ingestion,
ingestion from an RDBMS
 STORAGE
 Cloud, Hadoop -> distributed file system HDFS (large and small data sets)
 NoSQL (not only SQL) : distributed, schema-less databases ; CAP theorem ; key-value, column, document and graph-oriented stores
What’s Big Data - technologies under the hood
 ANALYSIS
 Data Science, Map / Reduce, Spark
 Analyze and clean the data
 Goal : build a model
 Machine Learning : one subset to train the model (67% of the data set), one subset to evaluate it (33%)
 VISUALIZATION
 DataViz : all the visual representation techniques that support data mining
 Build indicators that make decisions easier
 Provide indicators whatever the size or type of the data
 Innovate : give new perspectives to discover new opportunities
 Tableau, QlikView, Power Pivot
 Access data through an ODBC connector, a JDBC connector, a REST API or the native connector of the DataViz tool
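The 67% / 33% split mentioned under Machine Learning can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_train_test`, the fixed seed and the toy records are assumptions made for the example and do not come from the deck.

```python
import random


def split_train_test(records, train_ratio=0.67, seed=42):
    """Shuffle the records and split them into a training and an evaluation set."""
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Toy data set: 100 hypothetical records.
data = list(range(100))
train, test = split_train_test(data)
print(len(train), len(test))
```

Shuffling before cutting avoids training only on the oldest records when the data set is ordered by time or by plant.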
What’s Big Data - technologies under the hood
 CONCEPTS OF A BIG DATA ARCHITECTURE
 Distributed data and processing : the file system, jobs (Map/Reduce, Spark, …), databases (NoSQL)
 Co-location of data and processing : replication, processing strategy in Hadoop
 Horizontal elasticity : master / nodes architecture
 Shared nothing : when a node breaks down, no data is lost. Each node is independent.
 Design for failure : when a node breaks down, the cluster continues to work.
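The Map/Reduce job model named above can be illustrated with a word count, written here as a single-process sketch. It only mimics the three phases (map, shuffle, reduce); the function names are hypothetical and nothing here uses the Hadoop API. On a cluster, each input block would be mapped on the node that stores it.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Each string stands for a block that a different node could process.
blocks = ["big data big value", "data lake", "big lake"]
counts = word_count(blocks)
print(counts)
```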
What’s Big Data - technologies under the hood
 HDFS : HADOOP DISTRIBUTED FILE SYSTEM
 Name node : master of the system. Maintains and manages the blocks present on the data nodes
 Data nodes : slaves deployed on each machine that provide the actual storage. They serve read and write requests from
clients
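The division of roles between name node and data nodes can be sketched as a block map: a file is cut into blocks, and each block is replicated on several data nodes. The 128 MB block size and the replication factor of 3 are assumed values (common HDFS settings), the round-robin placement is a simplification of the real rack-aware policy, and all function and node names are hypothetical.

```python
import math


def place_blocks(file_size_mb, data_nodes, block_size_mb=128, replication=3):
    """Return a name-node-style map: block index -> data nodes holding a replica."""
    if replication > len(data_nodes):
        raise ValueError("replication factor exceeds the number of data nodes")
    n_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(n_blocks):
        # Round-robin placement; real HDFS uses a rack-aware policy instead.
        placement[block] = [
            data_nodes[(block + r) % len(data_nodes)] for r in range(replication)
        ]
    return placement


def readable_after_failure(placement, failed_node):
    """A file stays readable if every block keeps at least one live replica."""
    return all(any(n != failed_node for n in nodes) for nodes in placement.values())


nodes = ["dn1", "dn2", "dn3", "dn4"]
block_map = place_blocks(file_size_mb=500, data_nodes=nodes)
print(block_map)
print(readable_after_failure(block_map, "dn2"))
```

With three replicas per block, losing one data node leaves at least two copies of every block, which is what lets the cluster keep working after a node failure.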
What’s Big Data – technologies under the hood - storage costs
 USE COMMODITY HARDWARE
 In Big Data, the data center is not a collection of servers but a collection of co-located CPUs, RAM and local disks
 1 MILLION $ GETS ->
What’s Big Data - market Overview
 COTS DISTRIBUTION
 Cloudera, n°1
 Hortonworks, n°2
 MapR, n°3
 CLOUD (BASED ON A DISTRIB)
 Microsoft – Azure
 Amazon - AWS
 APPLIANCE VENDORS, COSTS++
 Teradata
 Oracle
leaders
Big Data – positioning of the distributions
 CLOUDERA
 Business model : software vendor, 5-6k€ / year / node
 Amazon deploys Cloudera
 More mature than the other distributions
 HORTONWORKS
 Free, business model based on support : 15k€ / year / slot of 4 nodes or per slot of 50 TB
 Azure and Amazon deploy Hortonworks
 Less mature than Cloudera on security and administration
 MAPR
 Business model : software vendor
 Diverges from standard Hadoop
[Chart : Cloudera, Hortonworks, MapR on a 0 to 100 scale]
Between distributions, ratio 1 to 4
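As a worked example of the pricing figures quoted on this slide, the sketch below compares yearly costs for a hypothetical 12-node cluster. The cluster size, the function names and the assumption that Hortonworks support is billed per started slot of 4 nodes are illustrative choices; the 50 TB alternative is ignored.

```python
import math


def cloudera_cost_keur(nodes, per_node_keur=(5, 6)):
    """Yearly cost range in k EUR, from the 5-6 k EUR / year / node figure."""
    low, high = per_node_keur
    return nodes * low, nodes * high


def hortonworks_cost_keur(nodes, per_slot_keur=15, slot_size=4):
    """Yearly support cost in k EUR, assuming billing per started slot of 4 nodes."""
    return math.ceil(nodes / slot_size) * per_slot_keur


cluster_nodes = 12  # hypothetical cluster size
print(cloudera_cost_keur(cluster_nodes))
print(hortonworks_cost_keur(cluster_nodes))
```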
CONTENTS
 What’s Big Data ?
1. Definition, 3 V
2. Use cases
3. Technologies under the hood
4. Market Overview
 Big Data in Industrial sector
1. What for ?
2. Vision
3. Demo Poc / PoV
Big Data in Industrial sector – What for ? - use cases IT
 BUILD A DATA LAKE
 Reduce costs : move cold data out of the data warehouse
 Break down data silos
 Store raw data and work (data mining) on all of the data
 Open up the data and enrich it with metadata
 LOG ANALYSIS AND MONITORING - SIEM
 Monitoring of applications, networks, systems logs -> Splunk
 PREDICTIVE MAINTENANCE
 Monitor sensor data, predict breakdowns across plants
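One simple way to monitor sensor data for early signs of a breakdown is to flag readings that deviate strongly from their recent history. The sketch below uses a trailing-window z-score; the technique, the window size, the threshold and the temperature values are all assumptions chosen for illustration, not the method used in the deck.

```python
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Return the indexes of readings that deviate from the trailing window
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical temperature readings from one machine; index 8 is a spike.
temperatures = [70.1, 70.3, 69.9, 70.2, 70.0, 70.1, 70.2, 69.8, 78.5, 70.0]
print(detect_anomalies(temperatures))
```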
Big Data in Industrial sector – What for ? - use cases HR
 SKILLS VISION AND MANAGEMENT
 Cross-reference information from professional networks (Viadeo, LinkedIn) with internal HR information : build a map of the
skills in PO
 Build and manage groups of skills, enrich internal HR tools
 E-REPUTATION
 Follow in real time the data about your brand, about the competitors, the customers
 Monitoring of social networks (Twitter, Facebook), press news, financial news, forums, blogs, …
 React quickly to the results if necessary
Big Data in Industrial sector – What for ? - use cases Marketing
 360° VIEW OF CUSTOMERS, SUPPLIERS, COMPETITORS
 Gather as much information as possible about a company : social, legal, financial, competitive position.
 Evaluate the risks and opportunities of working together
 VISION OF THE ROI OF PLANTS
 Real-time indicators from plants : investment, number of bumpers, tanks
 Rank the plants, predict gains
Big Data in Industrial sector – Vision & Roadmap
 2016 : BEGIN TO BUILD A DATA LAKE
 Make the data directly available for BI and Data Science, and / or transfer it to a data warehouse
 Collect data and manage it (who has access, metadata)
 Infrastructure : hybrid with cloud / on premise / appliance ?
 2016 : CREATE A NEW CROSS-DIVISION SERVICE AROUND THE DATA
 DataViz : creates reports, uses the current DataViz tools -> current BI analyst, no change
 Data IS : knows its data and can provide metadata to classify it -> current IS, no change
 Data engineer : uses collection tools, codes jobs, transforms data -> new skills
 Data Administrator IT : Big Data architecture integration and monitoring -> new skills
 Data Analysis & data mining : cross-analyzes the data, applies models, designs indicators for the DataViz -> new skills
 2016+ : IMPLEMENT OTHER USE CASES
 Start small and accelerate
Big Data in Industrial sector – Data Lake
 DATA LAKE / ENTERPRISE DATA HUB / DATA RESERVOIR
 Low cost storage of heterogeneous data (structured, semi-structured and unstructured data)
 Raw data storage but data enriched and classified by metadata – a data reservoir, not a SWAMP
 Used for data exploration, analysis and data mining
 Schema on read : ELT replaces the former ETL
 Can be directly used for BI (ELT mode)
 DATA LAKE AND DATA WAREHOUSE
 Complete the sources of the data warehouse
 Can store cold data from the Data Warehouse
 Feed the Data Warehouse
 DATA LAKE VISION
 Stores aggregated data, can store all the data
 Data Lake centric vision : bring applications to Data and not copy Data to Applications
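Schema on read, as listed above, means the raw records are stored untouched and a schema is applied only when a consumer reads them. A minimal sketch, assuming JSON lines as the raw format; the field names, the sample records and the function `read_with_schema` are hypothetical.

```python
import json

# Raw events are stored as-is in the lake (here: JSON lines in a list).
raw_lake = [
    '{"plant": "A", "part": "bumper", "count": "120"}',
    '{"plant": "B", "part": "tank", "count": "80", "shift": "night"}',
    '{"plant": "A", "part": "tank"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: keep only the declared fields and cast them."""
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# The schema belongs to the reader, not to the storage layer.
production_schema = {"plant": str, "part": str, "count": int}
rows = list(read_with_schema(raw_lake, production_schema))
print(rows)
```

A second consumer could read the same raw lines with a different schema, for instance one that keeps `shift`, without reloading or transforming the stored data.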
Big Data in Industrial sector – Data Lake - infrastructure
 BIG DATA INFRASTRUCTURE
 Hybrid with cloud : NO if you want to keep your data in-house (security) ; also requires network effort and cloud skills
 Appliance : infrastructure, license, deployment -> TCO ++
 On-premise : best compromise between cost, convenience of deployment and usages.
 CHOICE : ON-PREMISE INFRASTRUCTURE
 Go for Cloudera (better administration and security functionalities, ‘real-time’ module : Impala) or Hortonworks
 Send your IT staff to training : development, administration, data mining
Big Data in Industrial sector – Proof of Concept – Proof of Value
 SUBJECT : E-REPUTATION
 GOALS
 Set up e-reputation indicators for your enterprise / competitors / suppliers / customers
from various sources : news, social networks
 Experiment with Big Data tools
 INDICATORS
 Who is talking about you ? In what tone (positive, negative, neutral) ? What is the content ? Where in the world ? From what
source ?
 Different views of e-Reputation : financial, HR, societal, commercial
 DEMO
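The "how" and "from what source" indicators above can be sketched as a count of mentions per source and tone. The word lists, the sample mentions and the function names are hypothetical, and the word-matching rule is a deliberately naive stand-in for a real sentiment model; this is not the demo shown with the deck.

```python
from collections import Counter

# Tiny illustrative lexicon; a real system would use a trained sentiment model.
POSITIVE = {"great", "innovative", "reliable"}
NEGATIVE = {"recall", "delay", "defect"}


def sentiment(text):
    """Classify a mention as positive, negative or neutral from word matches."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def indicators(mentions):
    """Count mentions per (source, sentiment) pair."""
    return Counter((m["source"], sentiment(m["text"])) for m in mentions)


# Hypothetical mentions of a brand collected from two sources.
mentions = [
    {"source": "twitter", "text": "Innovative and reliable bumper design"},
    {"source": "news", "text": "Supplier announces recall after defect"},
    {"source": "twitter", "text": "New plant opens next year"},
]
print(indicators(mentions))
```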
QUESTIONS ?
Nicolas SARRAMAGNA https://fr.linkedin.com/pub/nicolas-sarramagna/19/941/587

Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

Big data presentation, explanations and use cases in industrial sector

  • 1. Big Data: explanations & use cases in industrial sector. September 2015. Nicolas SARRAMAGNA, https://fr.linkedin.com/pub/nicolas-sarramagna/19/941/587
  • 2. CONTENTS
      What’s Big Data?
        1. Definition, 3 V
        2. General use cases
        3. Technologies used
        4. Market overview
      Big Data in industrial sector
        1. What for?
        2. Vision
        3. Demo PoC / PoV
  • 3. What’s Big Data – 3V (COMPAGNIE PLASTIC OMNIUM CONFIDENTIAL, SEPTEMBER 2015)
      BIG DATA
        - New contexts for data -> the 3 Vs
        - New business ambitions, new technologies
      VOLUME: MASSIFICATION AND AUTOMATION OF DATA EXCHANGES
        - 80% of data was created in the last 12 months
        - 30 billion pieces of content on Facebook each month, 5 billion pages on Flickr, 2 billion videos viewed on YouTube each day
      VARIETY: MULTIPLICATION OF SOURCES AND TYPES
        - Mails, documents, logs (applications, networks, systems), databases, sensor data, open data, social networks, blogs, forums, articles, browsing history, geolocation data, …
        - Structured data (DB), semi-structured (HTML page, tweet, XML), unstructured (mail content, Excel, PPT, video, audio)
      VELOCITY: NEED TO COLLECT AND PROCESS DATA IN REAL TIME
        - Risk management (fraud, security of the information system – SIEM)
        - Real-time route optimization
        - Personalized advertising
  • 4. What’s Big Data – new technologies
      BIG DATA
        - More efficient components, but I/O throughput is a bottleneck -> grid architecture
        - New technological know-how: storage of large volumes of data in a cluster at a lower cost, distributed computing, industrialized data mining, on-demand IT architecture with the cloud
      ORIGIN OF BIG DATA
        - Web indexing and search engines at Google and Yahoo, around 2006
  • 5. What’s Big Data – general use cases IT
      COMPLETE THE DATA ARCHITECTURE
        - Vision of a data lake / enterprise data hub
        - Bring applications closer to the data rather than duplicating data for each application
        - "Deliver" managed data
      REDUCE STORAGE AND COMPUTING COSTS
        - Big Data technologies use commodity hardware and/or the cloud, and parallel computing
      STRONG TECHNICAL CONSTRAINTS
        - Manage 1,000+ transactions per second
        - Collect a flow of 1,000+ events per second
        - Compute with 10+ threads per CPU core
        - Store data sets of 10+ TB to be processed
        - Without Big Data technologies, these require major hardware and software adaptations
  • 6. What’s Big Data – general use cases business
      END-USER CENTRIC
        - Product recommendations
        - Optimization of ads
      PROCESS CENTRIC
        - Detection of unexpected events: fraud, network, predictive maintenance
        - Path optimization
      DIVERSIFICATION OF THE BUSINESS MODEL
        - Orange: resale of geolocation data
  • 7. What’s Big Data – misconceptions
      Misconceptions
        - Only used for unstructured data
        - Only needed for massive data sets
        - Only available from open source
        - Replaces my current BI platform
      Reality
        - Used with structured and unstructured data
        - Stores and analyses data of all sizes
        - All major software vendors are on board
        - It is complementary to our existing BI strategy and investments
        - Big Data will become essential for Business Intelligence
  • 8. What’s Big Data – BD completes the architecture of the data
  • 9. What’s Big Data – BI opportunities
      Figure labels: THE PAST - BI; BIG DATA ANALYTICS
  • 10. What’s Big Data – technologies under the hood – standard Hadoop
      Figure label: HADOOP PLATFORM
  • 11. What’s Big Data – technologies under the hood
      COLLECT
        - Spark, Flume, Sqoop
        - Inject data into HDFS and NoSQL databases: command line, REST API, Java API, streaming ingestion, bulk ingestion, ingestion from an RDBMS
      STORAGE
        - Cloud, Hadoop -> distributed file system HDFS (large and small data sets)
        - NoSQL ("not only SQL"): distributed, schema-less databases; CAP theorem; key-value, column, document and graph-oriented databases
  • 12. What’s Big Data – technologies under the hood
      ANALYSIS
        - Data science, Map/Reduce, Spark
        - Analyse and clean the data
        - Goal: build a model
        - Machine learning: one data set to train the model (67% of the data), one data set to evaluate it (33%)
      VISUALIZATION
        - DataViz: all the visual representation techniques used for data mining
        - Build indicators that make decisions easier
        - Provide indicators whatever the size or type of data
        - Innovate: give new perspectives to discover new opportunities
        - Tableau, QlikView, Power Pivot
        - Fetch data through an ODBC connector, a JDBC connector, a REST API, or the native connector of the DataViz tool
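
The 67% / 33% split mentioned on slide 12 can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `train_test_split`, the seed and the toy data are assumptions made for the example, not something taken from the slides.

```python
import random


def train_test_split(rows, train_fraction=0.67, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test).

    train_fraction=0.67 mirrors the 67% / 33% split on the slide;
    the seed is only there to make the shuffle reproducible.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data set of 100 observations.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))
```

The model is fitted on `train` only; `test` is kept aside so that the evaluation is done on data the model has never seen.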
  • 13. What’s Big Data – technologies under the hood
      CONCEPTS OF A BIG DATA ARCHITECTURE
        - Distributed data and processing: the file system, jobs (Map/Reduce, Spark, …), databases (NoSQL)
        - Co-location of data and processing: replication, processing strategy in Hadoop
        - Horizontal elasticity: master / nodes architecture
        - Shared nothing: when a node breaks down, no data is lost. Each node is independent.
        - Design for failure: when a node breaks down, the cluster continues to work.
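
To make the Map/Reduce job model named on slide 13 more concrete, here is a single-process sketch of its three steps (map, shuffle, reduce) on a word count. It is a teaching toy under stated assumptions: the function names are hypothetical, and nothing here is distributed, whereas a real Hadoop job runs the map and reduce steps on the nodes that hold the data.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word of one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "big cluster"])
print(counts)
```

Because each call to `map_phase` only looks at its own line and each call to `reduce_phase` only looks at its own key, both steps can be run in parallel on different nodes.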
  • 14. What’s Big Data – technologies under the hood
      HDFS: HADOOP DISTRIBUTED FILE SYSTEM
        - Name node: master of the system. Maintains and manages the blocks present on the data nodes
        - Data nodes: slaves deployed on each machine, which provide the actual storage. They serve read and write requests from clients
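
The division of labour between the name node and the data nodes can be illustrated with a toy placement table. The sketch below is an assumption-laden simplification: `place_blocks` and `surviving_replicas` are hypothetical names, and the round-robin placement is chosen for readability (real HDFS uses a rack-aware policy). The replication factor of 3 matches the HDFS default.

```python
def place_blocks(file_size, block_size, data_nodes, replication=3):
    """Return {block_index: [data node names]}, as a name node might record it.

    Placement is plain round-robin (an illustrative assumption).
    """
    if replication > len(data_nodes):
        raise ValueError("not enough data nodes for the replication factor")
    n_blocks = -(-file_size // block_size)  # ceiling division
    return {
        block: [data_nodes[(block + r) % len(data_nodes)] for r in range(replication)]
        for block in range(n_blocks)
    }


def surviving_replicas(placement, failed_node):
    """Replicas still readable for each block after one data node fails."""
    return {
        block: [node for node in nodes if node != failed_node]
        for block, nodes in placement.items()
    }


# Hypothetical 500 MB file, 128 MB blocks, four data nodes.
placement = place_blocks(500, 128, ["dn1", "dn2", "dn3", "dn4"])
after_failure = surviving_replicas(placement, "dn2")
print(placement)
print(after_failure)
```

Every block keeps at least two readable replicas after the loss of `dn2`, which is the property that the "shared nothing" and "design for failure" concepts rely on.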
  • 15. What’s Big Data – technologies under the hood – storage costs
      USE COMMODITY HARDWARE
        - In Big Data, the data center is not a collection of servers but a collection of co-located CPUs, RAM and local disks
      1 MILLION $ GETS ->
  • 16. What’s Big Data – market overview
      COTS DISTRIBUTIONS
        - Cloudera, n°1
        - Hortonworks, n°2
        - MapR, n°3
      CLOUD (BASED ON A DISTRIBUTION)
        - Microsoft – Azure
        - Amazon – AWS
      APPLIANCE VENDORS, COSTS++
        - Teradata
        - Oracle
      Figure label: leaders
  • 17. Big Data – positioning of the distributions
      CLOUDERA
        - Software-vendor business model, 5-6 k€ / year / node
        - Amazon deploys Cloudera
        - More mature than the other distributions
      HORTONWORKS
        - Free; business model based on support: 15 k€ / year per slot of 4 nodes or per slot of 50 TB
        - Azure and Amazon deploy Hortonworks
        - Less mature than Cloudera on security and administration
      MAPR
        - Software-vendor business model
        - Diverges from standard Hadoop
      Chart: Cloudera, Hortonworks, MapR (scale 0 to 100). Between distributions, ratio 1 to 4
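
The two pricing models quoted on slide 17 can be compared with some simple arithmetic. The sketch below uses only the figures from the slide (5-6 k€ per node per year for Cloudera, 15 k€ per year per slot of 4 nodes for Hortonworks). It assumes that Hortonworks support is billed in whole slots, rounded up, and it ignores the alternative 50 TB slot; both points are assumptions, as are the function names.

```python
import math


def cloudera_cost_keur(nodes, per_node_keur=(5, 6)):
    """Yearly cost range in k€: (low, high), priced per node."""
    low, high = per_node_keur
    return nodes * low, nodes * high


def hortonworks_cost_keur(nodes, per_slot_keur=15, slot_size=4):
    """Yearly support cost in k€, assuming billing in whole slots of 4 nodes."""
    return math.ceil(nodes / slot_size) * per_slot_keur


# Hypothetical 10-node cluster.
nodes = 10
cloudera = cloudera_cost_keur(nodes)
hortonworks = hortonworks_cost_keur(nodes)
print(cloudera, hortonworks)
```

Under these assumptions a 10-node cluster costs 50 to 60 k€ per year with Cloudera and 45 k€ per year with Hortonworks (three slots).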
  • 18. CONTENTS
      What’s Big Data?
        1. Definition, 3 V
        2. Use cases
        3. Technologies under the hood
        4. Market overview
      Big Data in industrial sector
        1. What for?
        2. Vision
        3. Demo PoC / PoV
  • 19. Big Data in industrial sector – What for? – use cases IT
      BUILD A DATA LAKE
        - Reduce costs, move cold data out of the data warehouse
        - Break down the storage of data in silos
        - Store raw data so that data mining can work on all of the data
        - Open up the data and enrich it with metadata
      LOG ANALYSIS AND MONITORING - SIEM
        - Monitoring of application, network and system logs -> Splunk
      PREDICTIVE MAINTENANCE
        - Monitoring of sensor data, prediction of breakdowns across plants
  • 20. Big Data in industrial sector – What for? – use cases HR
      SKILLS VISION AND MANAGEMENT
        - Cross information from professional networks (Viadeo, LinkedIn) with internal HR information to build a map of the skills in PO
        - Build and manage groups of skills, enrich internal HR tools
      E-REPUTATION
        - Follow in real time the data about your brand, your competitors and your customers
        - Monitoring of social networks (Twitter, Facebook), press news, financial news, forums, blogs, …
        - React quickly according to the results if necessary
  • 21. Big Data in industrial sector – What for? – use cases Marketing
      360° VISION OF CUSTOMERS, SUPPLIERS, COMPETITORS
        - Gather as much information as possible about a company: social, legal, financial, competitive position
        - Evaluate the risk and the opportunity of working together
      VISION OF THE ROI OF PLANTS
        - Real-time indicators from plants: investment, number of bumpers, tanks
        - Rank the plants, predict gains
  • 22. Big Data in industrial sector – Vision & Roadmap
      2016: BEGIN TO BUILD A DATA LAKE
        - Make the data directly available for BI and data science, and/or transfer it into a data warehouse
        - Collect data and manage it (access rights, metadata)
        - Infrastructure: hybrid with cloud / on-premise / appliance?
      2016: CREATE A NEW CROSS-DIVISION SERVICE AROUND THE DATA
        - DataViz: create reporting with your current DataViz tools -> current BI analysts, no change
        - Data IS: know their data and can provide metadata to classify it -> current IS, no change
        - Data engineer: uses collection tools, codes jobs, transforms data -> new skills
        - Data administrator (IT): Big Data architecture integration and monitoring -> new skills
        - Data analysis & data mining: cross-analyse the data, apply models, design indicators for the DataViz -> new skills
      2016+: IMPLEMENT OTHER USE CASES
        - Start small and accelerate
  • 23. Big Data in industrial sector – Data Lake
      DATA LAKE / ENTERPRISE DATA HUB / DATA RESERVOIR
        - Low-cost storage of heterogeneous data (structured, semi-structured and unstructured)
        - Raw data storage, but with data enriched and classified by metadata: a data reservoir, not a SWAMP
        - Used for data exploration, analysis and data mining
        - Data schema on read: old ETL, new ELT
        - Can be used directly for BI (ELT mode)
      DATA LAKE AND DATA WAREHOUSE
        - Completes the sources of the data warehouse
        - Can store cold data from the data warehouse
        - Feeds the data warehouse
      DATA LAKE VISION
        - Stores aggregated data, can store all the data
        - Data-lake-centric vision: bring applications to the data rather than copying data to the applications
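
"Schema on read" means that raw data is stored as it arrives and a schema is applied only when the data is read (ELT), rather than being enforced before loading (ETL). A minimal sketch, assuming the raw data is kept as JSON lines; the records, field names and `read_with_schema` helper are all hypothetical.

```python
import json

# Raw lines are kept as-is in the lake, including malformed ones.
RAW_LINES = [
    '{"plant": "A", "temp_c": "71.5", "ts": "2015-09-01T10:00:00"}',
    '{"plant": "B", "temp_c": "n/a"}',
    "not json at all",
]

# The schema lives with the reader, not with the storage.
SCHEMA = {"plant": str, "temp_c": float}


def read_with_schema(raw_lines, schema):
    """Yield only the records that fit `schema`, cast to the expected types."""
    for line in raw_lines:
        try:
            record = json.loads(line)
            yield {field: cast(record[field]) for field, cast in schema.items()}
        except (ValueError, KeyError, TypeError):
            continue  # the raw line stays in the lake for other readers


records = list(read_with_schema(RAW_LINES, SCHEMA))
print(records)
```

A different consumer can read the same raw lines with another schema, for example one that also keeps the `ts` field, without any reloading of the data.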
  • 24. Big Data in industrial sector – Data Lake – infrastructure
      BIG DATA INFRASTRUCTURE
        - Hybrid with cloud: NO if you want to keep your data in-house (security); it also requires network effort and cloud skills
        - Appliance: infrastructure, license, deployment -> TCO ++
        - On-premise: best compromise between cost, ease of deployment and usage
      CHOICE: ON-PREMISE INFRASTRUCTURE
        - Go for Cloudera (better administration and security features, ‘real-time’ module: Impala) or Hortonworks
        - Send your IT staff to training: dev, admin, data mining
  • 25. Big Data in industrial sector – Proof of Concept – Proof of Value
      SUBJECT: E-REPUTATION
      GOALS
        - Put in place e-reputation indicators for your enterprise, competitors, suppliers and customers from various sources: news, social networks
        - Experiment with Big Data tools
      INDICATORS
        - Who is speaking about it? How (positive, negative, neutral)? What is the content? Where in the world? From what source?
        - Different views of e-reputation: financial, HR, societal, commercial
      DEMO
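
The indicators listed on slide 25 (how, where, from what source) amount to counting mentions along a few dimensions. The sketch below is not the demo from the presentation. It assumes that each mention has already been labelled with a sentiment upstream, and the sample mentions, field names and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical mentions, already labelled by an upstream sentiment step.
mentions = [
    {"source": "twitter", "country": "FR", "sentiment": "positive"},
    {"source": "twitter", "country": "US", "sentiment": "negative"},
    {"source": "news", "country": "FR", "sentiment": "positive"},
    {"source": "blog", "country": "DE", "sentiment": "neutral"},
]


def sentiment_share(items):
    """'How?': share of positive, negative and neutral mentions."""
    counts = Counter(m["sentiment"] for m in items)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in ("positive", "negative", "neutral")}


def count_by(items, field):
    """'Where?' / 'From what source?': number of mentions per value of `field`."""
    return Counter(m[field] for m in items)


shares = sentiment_share(mentions)
by_source = count_by(mentions, "source")
by_country = count_by(mentions, "country")
print(shares, by_source, by_country)
```

The same two helpers can be applied to mentions filtered by theme (financial, HR, societal, commercial) to obtain the different views of e-reputation.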
  • 26. QUESTIONS ? Nicolas SARRAMAGNA https://fr.linkedin.com/pub/nicolas-sarramagna/19/941/587

Editor's notes

  1. "Big Data": a term denoting a break with traditional data processing. Big Data makes it possible to solve new problems, or to solve old ones in a better way.
  2. Bottleneck on disk read/write access: disk throughput does not keep up with the growth of storage capacity.
  3. Big Data does not replace the existing BI architecture; it completes it and reorients it: applications go to the data, rather than data (and its duplicates) going to the applications.
  4. Descriptive, diagnostic: look at the past and find the reasons for a success or a failure -> BI. Predictive: derive a model that gives future trends -> BIG DATA. Prescriptive: under various constraints, determine the best way to reach the goal -> BIG DATA.
  5. Tell the life cycle of the data in chronological order, from the data source to the reporting. ODS: operational data. EDW: data warehouse, aggregated data. Data mart: subset of a warehouse. HDFS: distributed file system. Event -> Kafka (distributed messaging system) -> Storm (real-time processing of the message, optional) -> NoSQL.
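
The event flow in note 5 (Event -> Kafka -> Storm -> NoSQL) can be mimicked in a single process to show the role of each stage. This is only a sketch with in-memory stand-ins: a `queue.Queue` plays the Kafka topic, a plain function plays the Storm processing step, and a dictionary plays the NoSQL key-value store. No real Kafka or Storm API is used, and the event fields and the alert threshold are hypothetical.

```python
import queue


def enrich(event, threshold=100):
    """Stand-in for the real-time processing step: flag high values."""
    return {**event, "alert": event["value"] > threshold}


def run_pipeline(events):
    topic = queue.Queue()  # stand-in for a distributed message topic
    store = {}             # stand-in for a key-value NoSQL store

    for event in events:   # producer side: publish events
        topic.put(event)

    while not topic.empty():  # consumer side: process and persist
        event = topic.get()
        store[event["id"]] = enrich(event)
    return store


store = run_pipeline([
    {"id": "e1", "value": 120},
    {"id": "e2", "value": 80},
])
print(store)
```

The queue decouples the producers from the processing step, which is the role the note gives to Kafka; the processing step is marked optional in the note, so events could also be written to the store unchanged.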