ALTIC Big Data Stack
Charly Clairmont, ALTIC
@egwada
charly.clairmont@altic.org
http://www.altic.org
smart #OpenSource Software
#BusinessIntelligence

assembler

www.ow2.org

Twitter #ow2con @egwada
Our historical tools

• ETL : Talend
• Reporting : JasperReports, Birt
• OLAP : Mondrian, Palo
• BI platform : SpagoBI

Smart assembling
Innovation & customers' needs
● Identify when applied research is an opportunity for us, our solutions and our customers.
● Understand the business process of our customer & assess the impact of Open IT on their activities
● Offer a project approach that is both technical and operational

Altic projects
➔ Allow our customers to optimize their business processes
➔ Take the customer's job into account
➔ Offer long-lasting solutions
➔ Follow the customer's present needs and not the vendors' agenda

Identify Big Data potential / Hadoop

Our first Big Data project at Altic
● eFraudBox project (2010 – 2013)
● Goal : predict fraud on the Internet
● Context :
  – Customer : GIE carte bancaire
  – European Research and Development project
  – Many industrial and academic partners
● Data :
  – Type : Banking transactions
  – Volume : One GB per day

How did we start our first BigData project ?

« In data mining, processing is done line by line »
… [ it's not about a data volume issue ]

But we have too much data !

Let's have a look at Hadoop ?
● Open Source
● MPP compute platform
  ● Distributed file system
  ● MapReduce processing
● Cost efficient
● Fault tolerant
● Infinite scale
● Enterprise Information System ready
● Continuous Improvement
● Growing community

« Even transactions are possible on Hadoop - it's inevitable that ALL kinds of workloads will move there in the future »
Doug CUTTING
Hadoop Creator
October 2013
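
To make the "MapReduce processing" bullet concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code and not what the eFraudBox project ran: the record layout (`card_id,amount`) and all function names are assumptions made for illustration. It only shows the three steps (map, shuffle, reduce) that Hadoop spreads across a cluster.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: turn one input line into (key, value) pairs.

    The "card_id,amount" layout is a hypothetical example format.
    """
    card_id, amount = record.split(",")
    yield card_id, float(amount)


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


transactions = ["card_a,10.0", "card_b,5.5", "card_a,2.5"]
totals = run_job(transactions)
```

On a real cluster each map call sees only its own slice of the input, which is why the per-line processing style quoted earlier fits this model well.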

How do we query Hadoop ?

Java
● Very optimised
● Very customisable

Pig Latin
● Easy syntax
● Supports unstructured data

SQL like
● Easy development
How do we query Hadoop ?

Java
● Need to code everything

Pig Latin
● Why not ?

SQL like
● We already know SQL !
Ok, we have our storage and
computation engine, but how can we
manage data ?
By using our Swiss Army Knife !

Now our Hadoop / Hive platform is filled with Big Data,
but it's a little bit too slow to query for end users...

http://ih2.redbubble.net/image.13088996.5766/sticker,375x360.png

Aggregate data
Process data with Hive and store the results in fast databases
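
A rough sketch of this aggregate-then-serve pattern, using Python's built-in `sqlite3` as a stand-in for both sides. The slides do not name the fast database, and the table and column names below are hypothetical. On the real platform the GROUP BY would run in Hive over the full data, and only the small summary table would be copied to the fast store that end users query.

```python
import sqlite3

# In-memory database standing in for the large raw store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (day TEXT, merchant TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2013-01-01", "shop_a", 10.0),
        ("2013-01-01", "shop_b", 20.0),
        ("2013-01-02", "shop_a", 5.0),
    ],
)

# Heavy step: aggregate once, in batch, into a small summary table.
conn.execute(
    """
    CREATE TABLE daily_summary AS
    SELECT day, COUNT(*) AS n_tx, SUM(amount) AS total
    FROM transactions
    GROUP BY day
    """
)

# Light step: end users only read the pre-computed summary.
summary = conn.execute(
    "SELECT day, n_tx, total FROM daily_summary ORDER BY day"
).fetchall()
```

The design choice is to pay the aggregation cost once in batch, so that interactive queries touch a few summary rows instead of every raw transaction.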

Ok, now we have our fast queryable
datasets, but how can we visualize these ?
To manage users and visualizations

To quickly get a view of your data

To go deeper in your visualizations

BigData and Datamining : tMahout

[logo] + [logo] + [logo] = tMahout
BigData and Datamining v2
● Spark : new in-memory data processing framework
  ● Very appropriate for Machine learning
● MLBase : Machine learning library
● Spark-clustering : Implementation of SOM algorithm
● Proof Of Concept : Analysis of mobile telecommunications
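
For readers who have not met SOM (self-organizing map), here is a tiny pure-Python sketch of its training loop. It is not Spark code and not the Spark-clustering implementation: the units sit on a simple 1-D line, and the parameter names, decay schedule and sample data are all assumptions chosen to keep the example short.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, n_units, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D self-organizing map and return its weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays
        radius = max(radius0 * (1.0 - frac), 1e-3)   # neighbourhood shrinks
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Units close to the winner on the line move more.
                influence = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * influence * (x[d] - w[d])
    return weights


# Two well-separated toy clusters (made-up data).
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
weights = train_som(data, n_units=2)
low = best_matching_unit(weights, [0.0, 0.0])
high = best_matching_unit(weights, [1.0, 1.0])
```

After training, points from the two clusters map to different units, which is the clustering behaviour a SOM is used for.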

We now have a Big Data stack !

BI & Big Data for Altic
● In the end, we still do BI as usual
● Tools evolve :
  – New storage and processing
  – We do not change our tools; fortunately THEY progress for us, and we contribute
● The fundamentals do not really change, only the technologies do
  – Hadoop
  – Spark
We improve our Big Data stack and its approach...
And support our customers' Big Analytics projects

Our Big Data Stack

Our Big Data Approach

Questions ?
Thanks !

Charly CLAIRMONT
CTO at ALTIC
@egwada
charly.clairmont@altic.org
http://altic.org
www.ow2.org

Twitter #ow2con @egwada

Más contenido relacionado

La actualidad más candente

Traveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsTraveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsRendy Bambang Junior
 
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia SeahorseWizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia SeahorseData Science Warsaw
 
Big Data in the Cloud - Montreal April 2015
Big Data in the Cloud - Montreal April 2015Big Data in the Cloud - Montreal April 2015
Big Data in the Cloud - Montreal April 2015Cindy Gross
 
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Neo4j
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataTrieu Nguyen
 
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...Data Science Warsaw
 
Converging Big Data and Application Infrastructure by Steven Poutsy
Converging Big Data and Application Infrastructure by Steven PoutsyConverging Big Data and Application Infrastructure by Steven Poutsy
Converging Big Data and Application Infrastructure by Steven PoutsyBig Data Spain
 
Webinar - SpagoBI 5: here comes the Social Network analysis
Webinar - SpagoBI 5: here comes the Social Network analysis Webinar - SpagoBI 5: here comes the Social Network analysis
Webinar - SpagoBI 5: here comes the Social Network analysis SpagoWorld
 
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...SpagoWorld
 
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDBMongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDBMongoDB
 
Druid meetup 2018-03-13
Druid meetup 2018-03-13Druid meetup 2018-03-13
Druid meetup 2018-03-13gianmerlino
 
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!SpagoWorld
 
Openhab Grafana and Influxdb
Openhab Grafana and InfluxdbOpenhab Grafana and Influxdb
Openhab Grafana and InfluxdbCode-House
 
Webinar: BI Mobile with SpagoBI: be aware everywhere!
Webinar: BI Mobile with SpagoBI: be aware everywhere!Webinar: BI Mobile with SpagoBI: be aware everywhere!
Webinar: BI Mobile with SpagoBI: be aware everywhere!SpagoWorld
 

La actualidad más candente (15)

Traveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsTraveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analytics
 
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia SeahorseWizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
 
Big Data in the Cloud - Montreal April 2015
Big Data in the Cloud - Montreal April 2015Big Data in the Cloud - Montreal April 2015
Big Data in the Cloud - Montreal April 2015
 
ASPgems - kappa architecture
ASPgems - kappa architectureASPgems - kappa architecture
ASPgems - kappa architecture
 
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
 
Converging Big Data and Application Infrastructure by Steven Poutsy
Converging Big Data and Application Infrastructure by Steven PoutsyConverging Big Data and Application Infrastructure by Steven Poutsy
Converging Big Data and Application Infrastructure by Steven Poutsy
 
Webinar - SpagoBI 5: here comes the Social Network analysis
Webinar - SpagoBI 5: here comes the Social Network analysis Webinar - SpagoBI 5: here comes the Social Network analysis
Webinar - SpagoBI 5: here comes the Social Network analysis
 
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
 
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDBMongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
 
Druid meetup 2018-03-13
Druid meetup 2018-03-13Druid meetup 2018-03-13
Druid meetup 2018-03-13
 
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
 
Openhab Grafana and Influxdb
Openhab Grafana and InfluxdbOpenhab Grafana and Influxdb
Openhab Grafana and Influxdb
 
Webinar: BI Mobile with SpagoBI: be aware everywhere!
Webinar: BI Mobile with SpagoBI: be aware everywhere!Webinar: BI Mobile with SpagoBI: be aware everywhere!
Webinar: BI Mobile with SpagoBI: be aware everywhere!
 

Destacado

IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big dataIEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big dataIEEEMEMTECHSTUDENTPROJECTS
 
Building k-nn Graphs From Large Text Data
Building k-nn Graphs From Large Text DataBuilding k-nn Graphs From Large Text Data
Building k-nn Graphs From Large Text DataThibault Debatty
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementationSandip Tipayle Patil
 
Big Data Benchmarking Tutorial
Big Data Benchmarking TutorialBig Data Benchmarking Tutorial
Big Data Benchmarking TutorialTilmann Rabl
 
BigData - Hadoop -by 侯圣文@secooler
BigData - Hadoop -by 侯圣文@secooler BigData - Hadoop -by 侯圣文@secooler
BigData - Hadoop -by 侯圣文@secooler Shengwen HOU(侯圣文)
 
IBM Bluemix Paris Meetup #20 - 20161214
IBM Bluemix Paris Meetup #20 - 20161214IBM Bluemix Paris Meetup #20 - 20161214
IBM Bluemix Paris Meetup #20 - 20161214IBM France Lab
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysisPoonam Kshirsagar
 
Textual Robot programming
Textual Robot programmingTextual Robot programming
Textual Robot programmingCHEMGLOBE
 
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris. BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris. OW2
 
Indexing Still and Moving Images
Indexing Still and Moving ImagesIndexing Still and Moving Images
Indexing Still and Moving ImagesIan Davis
 
Jaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business IntelligenceJaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business IntelligenceOW2
 
Mobile integration
Mobile integrationMobile integration
Mobile integrationwall530
 
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...OW2
 
Palacio Gobierno del Ecuador
Palacio Gobierno del EcuadorPalacio Gobierno del Ecuador
Palacio Gobierno del EcuadorPablo Guaña
 
OS Approach Industrializing Research Tools
OS Approach Industrializing Research ToolsOS Approach Industrializing Research Tools
OS Approach Industrializing Research ToolsOW2
 
Logic Circuit Project Final Presentation
Logic Circuit Project Final PresentationLogic Circuit Project Final Presentation
Logic Circuit Project Final PresentationMatthew Chang
 
Jasmine Probe, OW2con11, Nov 24-25, Paris
Jasmine Probe, OW2con11, Nov 24-25, ParisJasmine Probe, OW2con11, Nov 24-25, Paris
Jasmine Probe, OW2con11, Nov 24-25, ParisOW2
 
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris. DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris. OW2
 

Destacado (20)

IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big dataIEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
 
Building k-nn Graphs From Large Text Data
Building k-nn Graphs From Large Text DataBuilding k-nn Graphs From Large Text Data
Building k-nn Graphs From Large Text Data
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
Big Data Benchmarking Tutorial
Big Data Benchmarking TutorialBig Data Benchmarking Tutorial
Big Data Benchmarking Tutorial
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
BigData - Hadoop -by 侯圣文@secooler
BigData - Hadoop -by 侯圣文@secooler BigData - Hadoop -by 侯圣文@secooler
BigData - Hadoop -by 侯圣文@secooler
 
IBM Bluemix Paris Meetup #20 - 20161214
IBM Bluemix Paris Meetup #20 - 20161214IBM Bluemix Paris Meetup #20 - 20161214
IBM Bluemix Paris Meetup #20 - 20161214
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysis
 
Textual Robot programming
Textual Robot programmingTextual Robot programming
Textual Robot programming
 
A data analyst view of Bigdata
A data analyst view of Bigdata A data analyst view of Bigdata
A data analyst view of Bigdata
 
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris. BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
 
Indexing Still and Moving Images
Indexing Still and Moving ImagesIndexing Still and Moving Images
Indexing Still and Moving Images
 
Jaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business IntelligenceJaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business Intelligence
 
Mobile integration
Mobile integrationMobile integration
Mobile integration
 
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
 
Palacio Gobierno del Ecuador
Palacio Gobierno del EcuadorPalacio Gobierno del Ecuador
Palacio Gobierno del Ecuador
 
OS Approach Industrializing Research Tools
OS Approach Industrializing Research ToolsOS Approach Industrializing Research Tools
OS Approach Industrializing Research Tools
 
Logic Circuit Project Final Presentation
Logic Circuit Project Final PresentationLogic Circuit Project Final Presentation
Logic Circuit Project Final Presentation
 
Jasmine Probe, OW2con11, Nov 24-25, Paris
Jasmine Probe, OW2con11, Nov 24-25, ParisJasmine Probe, OW2con11, Nov 24-25, Paris
Jasmine Probe, OW2con11, Nov 24-25, Paris
 
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris. DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
 

Similar a ALTIC's Journey to Building a Big Data Stack

Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance ComputingLuciano Mammino
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsMark Rittman
 
Satisfaction hadoop meetup presentation
Satisfaction hadoop meetup presentationSatisfaction hadoop meetup presentation
Satisfaction hadoop meetup presentationJerome Banks
 
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...Marcin Bielak
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudTreasure Data, Inc.
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudEduardo Silva Pereira
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...Mark Rittman
 
Data Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch FixData Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch FixStefan Krawczyk
 
Google BigQuery for Everyday Developer
Google BigQuery for Everyday DeveloperGoogle BigQuery for Everyday Developer
Google BigQuery for Everyday DeveloperMárton Kodok
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningProvectus
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogC4Media
 
When it all GOes right
When it all GOes rightWhen it all GOes right
When it all GOes rightPavlo Golub
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointInside Analysis
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeInside Analysis
 
Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dan Lynn
 
Building Modern Data Pipelines on GCP via a FREE online Bootcamp
Building Modern Data Pipelines on GCP via a FREE online BootcampBuilding Modern Data Pipelines on GCP via a FREE online Bootcamp
Building Modern Data Pipelines on GCP via a FREE online BootcampData Con LA
 
Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance ComputingLuciano Mammino
 
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch FixData Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch FixStefan Krawczyk
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the tradeFangda Wang
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 

Similar a ALTIC's Journey to Building a Big Data Stack (20)

Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance Computing
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
 
Satisfaction hadoop meetup presentation
Satisfaction hadoop meetup presentationSatisfaction hadoop meetup presentation
Satisfaction hadoop meetup presentation
 
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the Cloud
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the Cloud
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
 
Data Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch FixData Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch Fix
 
Google BigQuery for Everyday Developer
Google BigQuery for Everyday DeveloperGoogle BigQuery for Everyday Developer
Google BigQuery for Everyday Developer
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
When it all GOes right
When it all GOes rightWhen it all GOes right
When it all GOes right
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On Time
 
Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016
 
Building Modern Data Pipelines on GCP via a FREE online Bootcamp
Building Modern Data Pipelines on GCP via a FREE online BootcampBuilding Modern Data Pipelines on GCP via a FREE online Bootcamp
Building Modern Data Pipelines on GCP via a FREE online Bootcamp
 
Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance Computing
 
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch FixData Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the trade
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 

Más de OW2

OW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in Roma
OW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in RomaOW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in Roma
OW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in RomaOW2
 
The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...
The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...
The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...OW2
 
GLPi v.10, les fonctionnalités principales et l'offre cloud
GLPi v.10, les fonctionnalités principales et l'offre cloudGLPi v.10, les fonctionnalités principales et l'offre cloud
GLPi v.10, les fonctionnalités principales et l'offre cloudOW2
 
Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...
Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...
Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...OW2
 
FusionIAM : la gestion des identités et des accés open source
FusionIAM : la gestion des identités et des accés open sourceFusionIAM : la gestion des identités et des accés open source
FusionIAM : la gestion des identités et des accés open sourceOW2
 
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...OW2
 
SFScon'20 Bringing the User into the Equation
SFScon'20 Bringing the User into the EquationSFScon'20 Bringing the User into the Equation
SFScon'20 Bringing the User into the EquationOW2
 
Towards a sustainable solution to open source sustainability, OW2online20, Ju...
Towards a sustainable solution to open source sustainability, OW2online20, Ju...Towards a sustainable solution to open source sustainability, OW2online20, Ju...
Towards a sustainable solution to open source sustainability, OW2online20, Ju...OW2
 
Advanced proactive and polymorphing cloud application adaptation with MORPHEM...
Advanced proactive and polymorphing cloud application adaptation with MORPHEM...Advanced proactive and polymorphing cloud application adaptation with MORPHEM...
Advanced proactive and polymorphing cloud application adaptation with MORPHEM...OW2
 
Open Source governance and the Eclipse Foundation, OW2online, June 2020
Open Source governance and the Eclipse Foundation, OW2online, June 2020Open Source governance and the Eclipse Foundation, OW2online, June 2020
Open Source governance and the Eclipse Foundation, OW2online, June 2020OW2
 
Open source contribution policies, OW2online, June 2020
Open source contribution policies, OW2online, June 2020Open source contribution policies, OW2online, June 2020




ALTIC's Journey to Building a Big Data Stack

  • 1. ALTIC Big Data Stack Charly Clairmont, ALTIC @egwada charly.clairmont@altic.org http://www.altic.org
  • 3. Our historical tools • ETL : Talend • Reporting : JasperReports, Birt • OLAP : Mondrian, Palo • BI platform : SpagoBI www.ow2.org Twitter #ow2con @egwada
  • 4. Smart assembling. Innovation & customers' needs: ● Identify when applied research is an opportunity for us, our solutions and our customers ● Understand our customers' business processes and assess the impact of open IT on their activities ● Offer a project approach that is both technical and operational. Altic projects: ➔ Allow our customers to optimize their business processes ➔ Take the customer's job into account ➔ Offer lasting solutions ➔ Follow the customer's present needs and not the vendors' agenda
  • 5. Identify Big Data potential / Hadoop
  • 6. Our first Big Data project at Altic ● eFraudBox project (2010 – 2013) ● Goal: predict fraud on the Internet ● Context: – Customer: GIE carte bancaire – European research and development project – Many industrial and academic partners ● Data: – Type: banking transactions – Volume: one GB per day
  • 7. How did we start our first Big Data project?
  • 8. « In data mining, processing is done line by line » … [ it's not about a data volume issue ]
  • 9. But we have too much data!
  • 10. Let's have a look at Hadoop ● Open Source ● MPP compute platform: – Distributed file system – MapReduce processing ● Cost efficient ● Fault tolerant ● Infinite scale ● Enterprise Information System ready ● Continuous improvement ● Growing community. « Even transactions are possible on Hadoop - it's inevitable that ALL kinds of workloads will move there in the future » (Doug CUTTING, Hadoop creator, October 2013)
  • 11. How do we query Hadoop? ● Java ● Very optimised ● Very customisable | ● Pig Latin ● Easy syntax ● Supports unstructured data | ● SQL-like ● Easy development
  • 12. How do we query Hadoop? ● Need to code everything ● Why not? ● We already know SQL!
  • 13. OK, we have our storage and computation engine, but how can we manage data? By using our Swiss Army knife!
  • 14. Now our Hadoop / Hive platform is filled with Big Data, but it's a little bit too slow to query for end users... http://ih2.redbubble.net/image.13088996.5766/sticker,375x360.png
  • 15. Aggregate data: process data with Hive and store the results in fast databases
  • 16. OK, now we have our fast queryable datasets, but how can we visualize them? ● To manage users and visualizations ● To quickly get a view of your data ● To go deeper into your visualizations
  • 17. BigData and Datamining: tMahout (figure: three logos combined as "+ + = tMahout")
  • 18. BigData and Datamining v2 ● Spark: new in-memory data processing framework ● Very appropriate for machine learning ● MLBase: machine learning library ● Spark-clustering: implementation of the SOM (self-organizing map) algorithm ● Proof of concept: analysis of mobile telecommunications
  • 19. We now have a Big Data stack!
  • 20. BI & Big Data for Altic ● In the end, we still do BI as usual ● Tools evolve: – New storage and processing – We do not change our tools; fortunately THEY progress for us, and we contribute ● The fundamentals do not really change, only the technologies do: – Hadoop – Spark
  • 21. We keep improving our Big Data stack and our approach... and support our customers' Big Analytics projects | Our Big Data Stack | Our Big Data Approach
  • 22. Questions? Thanks! Charly CLAIRMONT, CTO at ALTIC @egwada charly.clairmont@altic.org http://altic.org
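
A minimal sketch of the pattern on slide 15 (aggregate the raw data, then store the results in a fast database for end users). It is an illustration under assumptions, not ALTIC's implementation: the transaction fields, the `daily_totals` table and the function names are hypothetical, plain Python stands in for the Hive aggregation step, and an in-memory SQLite database stands in for the unnamed "fast database".

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw transactions: (day, merchant, amount).
# On the platform described in the slides, these rows would sit in Hive on Hadoop.
RAW_TRANSACTIONS = [
    ("2013-10-01", "shop_a", 20.0),
    ("2013-10-01", "shop_a", 35.5),
    ("2013-10-01", "shop_b", 10.0),
    ("2013-10-02", "shop_a", 5.0),
]


def aggregate_daily(rows):
    """Group rows by (day, merchant); return (day, merchant, count, total) tuples."""
    totals = defaultdict(lambda: [0, 0.0])
    for day, merchant, amount in rows:
        entry = totals[(day, merchant)]
        entry[0] += 1        # number of transactions
        entry[1] += amount   # summed amount
    return [
        (day, merchant, count, total)
        for (day, merchant), (count, total) in sorted(totals.items())
    ]


def store_aggregates(conn, aggregates):
    """Write the small aggregated table into the (stand-in) fast database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_totals ("
        "day TEXT, merchant TEXT, tx_count INTEGER, total_amount REAL, "
        "PRIMARY KEY (day, merchant))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO daily_totals VALUES (?, ?, ?, ?)", aggregates
    )
    conn.commit()


def top_merchants(conn, day):
    """End-user query: runs against the aggregates, never against the raw data."""
    cursor = conn.execute(
        "SELECT merchant, total_amount FROM daily_totals "
        "WHERE day = ? ORDER BY total_amount DESC",
        (day,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    store_aggregates(connection, aggregate_daily(RAW_TRANSACTIONS))
    print(top_merchants(connection, "2013-10-01"))
```

On the real platform the grouping step would presumably be a Hive `GROUP BY` query. Only the small aggregated table is exposed to reporting tools, which is what keeps end-user queries fast.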