SlideShare una empresa de Scribd logo
1 de 67
Descargar para leer sin conexión
Lambda Architecture
and Open Source Tools for
Real-time Big Data
● Concepts & Techniques “Thinking with Lambda”
● Case studies in Practice
Trieu Nguyen - http://nguyentantrieu.info or @tantrieuf31
Principal Engineer at eClick Data Analytics team, FPT Online
All contents and thoughts in this slide are my subjective ideas and compiled from
Communities
Just a little introduction
● 2008 Java Developer, developed Social
Trading Network for a small startup (Yopco)
● 2011 worked at FPT Online, software engineer
in Banbe Project, Restful API for VnExpress
Mobile App
● 2012 joined Greengar Studios in 6 months,
scaling backend API mobile games (iOS, Android)
● 2013 back to FPT Online, R&D about Big Data
& Analytics, developing the new core
Analytics Platform (on JVM Platform)
Contents for this talk
●
●
●
●
●
●
●
●

The lessons from history
Problems In Practice
What is the Lambda Architecture?
Why lambda architecture for real-time big
data ?
Open Source Technology Stack
Lambda in Practice (Mobile Data and Web Data)
Lessons I have learned
Questions & Answers
History ?
The best way to predict the future is
looking at the past and now ?
Big data is a buzzword for
old problems
Explaining Big Data
http://www.youtube.com/watch?v=7D1CQ_LOizA
Learning ?
Working ?
Big Data + Old History
http://www.youtube.com/watch?v=tp4y-_VoXdA
This is Big DATA

This is most valuable things!
We can't solve problems
by using the same kind of
thinking we used when we
created them.
Albert Einstein
Think more with
Lambda and Reactive
Where Big Data
can be used
BBC Horizon 2013 The Age of Big Data
http://www.youtube.com/watch?v=RE0ITQ7XQjM
Google’s mission is to

organize

the world’s information and make it
universally accessible and useful.
Organize the world’s
information?
How did Google scale their search engine ?
How does Hadoop really work ?
http://stackoverflow.com/questions/6087834/howscalable-is-mapreduce-in-the-original-functionallanguages
Trends of Now and the Future
MapReduce Programming
Reactive Programming
Functional Programming
Streaming Computation
=> All just the special cases of Lambda

●
●
●
●
So what is the λ
(Lambda)
Architecture ?
the Lambda Architecture:
● apply the (λ) Lambda philosophy in designing big data
system
● equation “query = function(all data)” which is the basis of
all data systems
● proposed by Nathan Marz (http://nathanmarz.com/), a
software engineer from Twitter in his “Big Data” book.
● is based on three main design principles:
○ human fault-tolerance – the system is unsusceptible to data loss or data
corruption because at scale it could be irreparable. (BUGS ?)
○ data immutability – store data in it’s rawest form immutable and for
perpetuity. (INSERT/ SELECT/DELETE but no UPDATE !)
○ recomputation – with the two principles above it is always possible to
(re)-compute results by running a function on the raw data.
Lambda In Practice
2 case studies from my experiences
Case Study 1:
Mobile Data
Monitor API Backend + System KPI
Problem:

Inside “mobile data”,
What's the most
valuable piece of
information
I applied
“Lambda”
here

Backend System for mobile app
Web vs Mobile App
Web
Visitors
Visits
Pageviews
Events

Mobile App
Users
Sessions
Events
Metrics: Cause and Effect
●
●
●
●
●
●
●

Screen Size => App Design, UI/UX, Usability
App version => Deployment, Marketing
Connectivity => Code, User Experience
Location => Marketing, User Behaviour
OS => Marketing, Cost, Development
Memory => User Experience
Feature Session => How to engage app users
The data and the size, not too big for a small
startup!

Where is the lambda ?
I used Groovy + GPars (Groovy Parallel Systems) + MongoDB for fast
parallel computation (actor model) on statistical data
http://gpars.codehaus.org/
The GPars framework offers Java developers intuitive and safe ways to handle
Java or Groovy tasks concurrently.
Support:
●
●
●
●
●
●
●
●

Dataflow concurrency
Actor programming model
CSP
Agent - an thread-safe reference to mutable state
Concurrent collection processing
Composable asynchronous functions
Fork/Join
STM (Software Transactional Memory)
Mobile Apps => Backend APIs =>
Statistics => Find the Trends & Insights?
Reactive Data
Analytics for
Mobile Apps
It means real-time recommendation
by:
➔ context (location, time)
➔ user profile (preferences, level,
...)
Big Data on Small Devices: Data Science goes Mobile
http://strataconf.com/strata2013/public/schedule/detail/27605
Case Study 2:
Web Data
● Real-time Data Analytics
● Monitoring Stream Data (Reactive)

http://eclick.vn
at eClick we must
check campaigns in
near-real-time
(seconds) !

at eClick we have
30~40 GB Logs in Stream
10~20 GB Bandwidth
just for tracking user
actions (click,
impression,...)
in ONE day !
at eClick we have many types of log (video, web,
mobile, system logs, ad-campaign, articles, … )
“lambda architecture”
proposed by @nathanmarz
Internet

Netty Http
Server

TCP Connection

Kafka
Akka Workers
Hadoop Tools

Storm

Redis

Redis

KPI Report
the open-source lambda architecture at eClick
The big-data technology stack
● Netty (http://netty.io/) a framework using reactive programming
pattern for scaling HTTP system easier, by JBoss http://www.jboss.org
● Kafka (http://kafka.apache.org/) a publish-subscribe messaging
rethought as a distributed commit log, open sourced by Linkedin
● Storm (http://storm-project.net/) the framework for distributed
realtime computation system, by Twitter
● Redis (http://redis.io/) a advanced key-value in-memory NoSQL
database, all fast statistical computations in here.
● Groovy for scripting layer on JVM, ad-hoc query on Redis
● Hadoop ecosystem: HDFS, Hive, HBase for batch processing
● RxJava https://github.com/Netflix/RxJava a library for composing
asynchronous and event-based programs
● Hystrix https://github.com/Netflix/Hystrix : for Latency and Fault
Tolerance for Distributed Systems
My new ideas for the future
Connecting the active functor pattern + reactive programming +
stream computation + in-memory computing to make:
● real-time data analytics easier
● better recommendation system
● build more profitable in big data
More Information:
● http://activefunctor.blogspot.com/ (a special case of Lambda
that actively search best connections to form optimal
topology) - from ideas when internship at DRD with my
advisor.
● Can a function be persistent (stored as data), distributed in a
cluster (cloud), reactive to right data (best value in network) ?
● http://www.reactivemanifesto.org/ (reactive pattern)
Lessons
What I have learned from Lambda and Big
Data World
What I have learned
●
●
●
●
●

Study about lambda and read some books
Ask questions=> analytics=> Profit & Value
Collect any data you can, learn inside !
Implement it! Just right tools for right jobs.
Turn your data into the things everyone can
"look & feel"
read papers
Study the “lambda”
I studied Haskell in 2007 with Dr.Peter Gammie http://peteg.org/ when
internship at DRD (a non-profit organization).
● Imperative programs will always be vulnerable to data races because
they contain mutable variables.
● There are no data races in purely functional languages because they
don't have mutable variables.
Reading some books
Improve your business knowledge !
=> read the Behavioral Economics Books

http://www.goodreads.com/shelf/show/behavioral-economics
Collect the data ?
Use your imagination is more than just
knowledge you have
Think more about Butterfly Effect!
Z;
om A to
fr
l get you you
il
“Logic w n will get
in
ginatio - Albert Einste
ima
.”
ywhere
ever
Use you
r
with da imagination
ta
just log analytics, not
ic

Learn Data
Visualization
Questions & Answers
The link of this slide is here:
● http://nguyentantrieu.info/blog/lambda-architecture-andopen-source-tools-for-real-time-big-data/
More useful resources:
● http://nguyentantrieu.info/blog
● http://www.mc2ads.com

Más contenido relacionado

La actualidad más candente

UXにもの申す (黒須正明さん)
UXにもの申す (黒須正明さん)UXにもの申す (黒須正明さん)
UXにもの申す (黒須正明さん)DevLOVE
 
Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...
Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...
Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...Shohei Okada
 
Mesh Bakerのご紹介 どんなアセット?
Mesh Bakerのご紹介 どんなアセット?Mesh Bakerのご紹介 どんなアセット?
Mesh Bakerのご紹介 どんなアセット?onotchi_
 
障害対応・運用におけるトリアージ的対応とZabbixの活用
障害対応・運用におけるトリアージ的対応とZabbixの活用障害対応・運用におけるトリアージ的対応とZabbixの活用
障害対応・運用におけるトリアージ的対応とZabbixの活用Masahito Zembutsu
 
Kurs nurkowy specjalizacja advanced recreational trimix iantd
Kurs nurkowy specjalizacja  advanced recreational trimix iantdKurs nurkowy specjalizacja  advanced recreational trimix iantd
Kurs nurkowy specjalizacja advanced recreational trimix iantdAdrianGaosz
 
Issueの書き方と伝え方
Issueの書き方と伝え方Issueの書き方と伝え方
Issueの書き方と伝え方Rina Fukuda
 
ナイーブベイズによる言語判定
ナイーブベイズによる言語判定ナイーブベイズによる言語判定
ナイーブベイズによる言語判定Shuyo Nakatani
 
CEDEC2015 サブディビジョンサーフェスの すべてがわかる
CEDEC2015 サブディビジョンサーフェスの すべてがわかるCEDEC2015 サブディビジョンサーフェスの すべてがわかる
CEDEC2015 サブディビジョンサーフェスの すべてがわかるTakahito Tejima
 
暗号技術入門 秘密の国のアリス 総集編
暗号技術入門 秘密の国のアリス 総集編暗号技術入門 秘密の国のアリス 総集編
暗号技術入門 秘密の国のアリス 総集編京大 マイコンクラブ
 
本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門
本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門
本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門jyouryuusui
 
SmartARの使い方(基本編)
SmartARの使い方(基本編)SmartARの使い方(基本編)
SmartARの使い方(基本編)Takashi Yoshinaga
 
つぶやきGLSLのススメ
つぶやきGLSLのススメつぶやきGLSLのススメ
つぶやきGLSLのススメnotargs
 
OSSを利用したプロジェクト管理
OSSを利用したプロジェクト管理OSSを利用したプロジェクト管理
OSSを利用したプロジェクト管理Tadashi Miyazato
 
第1回UE4名古屋勉強会
第1回UE4名古屋勉強会第1回UE4名古屋勉強会
第1回UE4名古屋勉強会Masahiko Nakamura
 
ゲームの中の人工知能
ゲームの中の人工知能ゲームの中の人工知能
ゲームの中の人工知能Youichiro Miyake
 
一般的なチートの手法と対策について
一般的なチートの手法と対策について一般的なチートの手法と対策について
一般的なチートの手法と対策について優介 黒河
 
Uwpアプリケーション開発入門
Uwpアプリケーション開発入門Uwpアプリケーション開発入門
Uwpアプリケーション開発入門Makoto Nishimura
 
シリコンスタジオの最新テクノロジーデモ技術解説
シリコンスタジオの最新テクノロジーデモ技術解説シリコンスタジオの最新テクノロジーデモ技術解説
シリコンスタジオの最新テクノロジーデモ技術解説Silicon Studio Corporation
 
自律走行ロボットをプログラミングするということ ~ETロボコンの場合~
自律走行ロボットをプログラミングするということ ~ETロボコンの場合~自律走行ロボットをプログラミングするということ ~ETロボコンの場合~
自律走行ロボットをプログラミングするということ ~ETロボコンの場合~Shin-ya Koga
 

La actualidad más candente (20)

UXにもの申す (黒須正明さん)
UXにもの申す (黒須正明さん)UXにもの申す (黒須正明さん)
UXにもの申す (黒須正明さん)
 
Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...
Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...
Laravel × レイヤードアーキテクチャを実践して得られた知見と反省 / Practice of Laravel with layered archi...
 
Mesh Bakerのご紹介 どんなアセット?
Mesh Bakerのご紹介 どんなアセット?Mesh Bakerのご紹介 どんなアセット?
Mesh Bakerのご紹介 どんなアセット?
 
障害対応・運用におけるトリアージ的対応とZabbixの活用
障害対応・運用におけるトリアージ的対応とZabbixの活用障害対応・運用におけるトリアージ的対応とZabbixの活用
障害対応・運用におけるトリアージ的対応とZabbixの活用
 
Kurs nurkowy specjalizacja advanced recreational trimix iantd
Kurs nurkowy specjalizacja  advanced recreational trimix iantdKurs nurkowy specjalizacja  advanced recreational trimix iantd
Kurs nurkowy specjalizacja advanced recreational trimix iantd
 
Issueの書き方と伝え方
Issueの書き方と伝え方Issueの書き方と伝え方
Issueの書き方と伝え方
 
ナイーブベイズによる言語判定
ナイーブベイズによる言語判定ナイーブベイズによる言語判定
ナイーブベイズによる言語判定
 
CEDEC2015 サブディビジョンサーフェスの すべてがわかる
CEDEC2015 サブディビジョンサーフェスの すべてがわかるCEDEC2015 サブディビジョンサーフェスの すべてがわかる
CEDEC2015 サブディビジョンサーフェスの すべてがわかる
 
暗号技術入門 秘密の国のアリス 総集編
暗号技術入門 秘密の国のアリス 総集編暗号技術入門 秘密の国のアリス 総集編
暗号技術入門 秘密の国のアリス 総集編
 
ローカライズって何?(UE4 Localization Deep Dive)
ローカライズって何?(UE4 Localization Deep Dive)ローカライズって何?(UE4 Localization Deep Dive)
ローカライズって何?(UE4 Localization Deep Dive)
 
本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門
本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門
本当に無駄な仕事をしたくない人のためのHoudiniプロシージャル入門
 
SmartARの使い方(基本編)
SmartARの使い方(基本編)SmartARの使い方(基本編)
SmartARの使い方(基本編)
 
つぶやきGLSLのススメ
つぶやきGLSLのススメつぶやきGLSLのススメ
つぶやきGLSLのススメ
 
OSSを利用したプロジェクト管理
OSSを利用したプロジェクト管理OSSを利用したプロジェクト管理
OSSを利用したプロジェクト管理
 
第1回UE4名古屋勉強会
第1回UE4名古屋勉強会第1回UE4名古屋勉強会
第1回UE4名古屋勉強会
 
ゲームの中の人工知能
ゲームの中の人工知能ゲームの中の人工知能
ゲームの中の人工知能
 
一般的なチートの手法と対策について
一般的なチートの手法と対策について一般的なチートの手法と対策について
一般的なチートの手法と対策について
 
Uwpアプリケーション開発入門
Uwpアプリケーション開発入門Uwpアプリケーション開発入門
Uwpアプリケーション開発入門
 
シリコンスタジオの最新テクノロジーデモ技術解説
シリコンスタジオの最新テクノロジーデモ技術解説シリコンスタジオの最新テクノロジーデモ技術解説
シリコンスタジオの最新テクノロジーデモ技術解説
 
自律走行ロボットをプログラミングするということ ~ETロボコンの場合~
自律走行ロボットをプログラミングするということ ~ETロボコンの場合~自律走行ロボットをプログラミングするということ ~ETロボコンの場合~
自律走行ロボットをプログラミングするということ ~ETロボコンの場合~
 

Destacado

100 blue mix days technical training
100 blue mix days technical training100 blue mix days technical training
100 blue mix days technical trainingAjit Yohannan
 
Cloud Camp: Infrastructure as a service advance workloads
Cloud Camp: Infrastructure as a service advance workloadsCloud Camp: Infrastructure as a service advance workloads
Cloud Camp: Infrastructure as a service advance workloadsAsaf Nakash
 
De Persgroep Big Data Expo
De Persgroep Big Data ExpoDe Persgroep Big Data Expo
De Persgroep Big Data ExpoBigDataExpo
 
Delivering Quality Open Data by Chelsea Ursaner
Delivering Quality Open Data by Chelsea UrsanerDelivering Quality Open Data by Chelsea Ursaner
Delivering Quality Open Data by Chelsea UrsanerData Con LA
 
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...Lucidworks
 
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...SPS Paris
 
VMs All the Way Down (BSides Delaware 2016)
VMs All the Way Down (BSides Delaware 2016)VMs All the Way Down (BSides Delaware 2016)
VMs All the Way Down (BSides Delaware 2016)John Hubbard
 
Global Azure Bootcamp - Azure OMS
Global Azure Bootcamp - Azure OMSGlobal Azure Bootcamp - Azure OMS
Global Azure Bootcamp - Azure OMSBruno Lopes
 
(SEC320) Leveraging the Power of AWS to Automate Security & Compliance
(SEC320) Leveraging the Power of AWS to Automate Security & Compliance(SEC320) Leveraging the Power of AWS to Automate Security & Compliance
(SEC320) Leveraging the Power of AWS to Automate Security & ComplianceAmazon Web Services
 
KD2017_System Center in the "cloud first" era
KD2017_System Center in the "cloud first" eraKD2017_System Center in the "cloud first" era
KD2017_System Center in the "cloud first" eraTomica Kaniski
 
Big data for cio 2015
Big data for cio 2015Big data for cio 2015
Big data for cio 2015Zohar Elkayam
 
Bioocean1 :Introduction to Biological Oceanography
Bioocean1 :Introduction to Biological Oceanography Bioocean1 :Introduction to Biological Oceanography
Bioocean1 :Introduction to Biological Oceanography Gazi Abdullah
 
Generalized B2B Machine Learning by Andrew Waage
Generalized B2B Machine Learning by Andrew WaageGeneralized B2B Machine Learning by Andrew Waage
Generalized B2B Machine Learning by Andrew WaageData Con LA
 
SRE Study Notes - CH2,3,4
SRE Study Notes - CH2,3,4SRE Study Notes - CH2,3,4
SRE Study Notes - CH2,3,4Rick Hwang
 
Docker containerization cookbook
Docker containerization cookbookDocker containerization cookbook
Docker containerization cookbookPascal Louis
 
Cleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VA
Cleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VACleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VA
Cleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VAClearedJobs.Net
 
Walmart Big Data Expo
Walmart Big Data ExpoWalmart Big Data Expo
Walmart Big Data ExpoBigDataExpo
 

Destacado (20)

Voetsporen 38
Voetsporen 38Voetsporen 38
Voetsporen 38
 
100 blue mix days technical training
100 blue mix days technical training100 blue mix days technical training
100 blue mix days technical training
 
Cloud Camp: Infrastructure as a service advance workloads
Cloud Camp: Infrastructure as a service advance workloadsCloud Camp: Infrastructure as a service advance workloads
Cloud Camp: Infrastructure as a service advance workloads
 
De Persgroep Big Data Expo
De Persgroep Big Data ExpoDe Persgroep Big Data Expo
De Persgroep Big Data Expo
 
Delivering Quality Open Data by Chelsea Ursaner
Delivering Quality Open Data by Chelsea UrsanerDelivering Quality Open Data by Chelsea Ursaner
Delivering Quality Open Data by Chelsea Ursaner
 
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
 
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group...
 
VMs All the Way Down (BSides Delaware 2016)
VMs All the Way Down (BSides Delaware 2016)VMs All the Way Down (BSides Delaware 2016)
VMs All the Way Down (BSides Delaware 2016)
 
Global Azure Bootcamp - Azure OMS
Global Azure Bootcamp - Azure OMSGlobal Azure Bootcamp - Azure OMS
Global Azure Bootcamp - Azure OMS
 
(SEC320) Leveraging the Power of AWS to Automate Security & Compliance
(SEC320) Leveraging the Power of AWS to Automate Security & Compliance(SEC320) Leveraging the Power of AWS to Automate Security & Compliance
(SEC320) Leveraging the Power of AWS to Automate Security & Compliance
 
KD2017_System Center in the "cloud first" era
KD2017_System Center in the "cloud first" eraKD2017_System Center in the "cloud first" era
KD2017_System Center in the "cloud first" era
 
Big data for cio 2015
Big data for cio 2015Big data for cio 2015
Big data for cio 2015
 
Bioocean1 :Introduction to Biological Oceanography
Bioocean1 :Introduction to Biological Oceanography Bioocean1 :Introduction to Biological Oceanography
Bioocean1 :Introduction to Biological Oceanography
 
Generalized B2B Machine Learning by Andrew Waage
Generalized B2B Machine Learning by Andrew WaageGeneralized B2B Machine Learning by Andrew Waage
Generalized B2B Machine Learning by Andrew Waage
 
SRE Study Notes - CH2,3,4
SRE Study Notes - CH2,3,4SRE Study Notes - CH2,3,4
SRE Study Notes - CH2,3,4
 
Docker containerization cookbook
Docker containerization cookbookDocker containerization cookbook
Docker containerization cookbook
 
Understanding big data
Understanding big dataUnderstanding big data
Understanding big data
 
Greach 2014 Sesamestreet Grails2 Workshop
Greach 2014 Sesamestreet Grails2 Workshop Greach 2014 Sesamestreet Grails2 Workshop
Greach 2014 Sesamestreet Grails2 Workshop
 
Cleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VA
Cleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VACleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VA
Cleared Job Fair Job Seeker Handbook June 15, 2017, Dulles, VA
 
Walmart Big Data Expo
Walmart Big Data ExpoWalmart Big Data Expo
Walmart Big Data Expo
 

Similar a Lambda Architecture and open source technology stack for real time big data

Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big dataTrieu Nguyen
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineTrieu Nguyen
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017Rendy Bambang Junior
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxTrayan Iliev
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futuremarkgrover
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesKarthik Murugesan
 
London atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesLondon atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesRudiger Wolf
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
Keynote at-icpc-2020
Keynote at-icpc-2020Keynote at-icpc-2020
Keynote at-icpc-2020Ralf Laemmel
 
Is Spark the right choice for data analysis ?
Is Spark the right choice for data analysis ?Is Spark the right choice for data analysis ?
Is Spark the right choice for data analysis ?Ahmed Kamal
 
UX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentUX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentTrieu Nguyen
 
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...HostedbyConfluent
 
MongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDB
MongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDBMongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDB
MongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDBMongoDB
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Demi Ben-Ari
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinTuri, Inc.
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learningVinoth Kannan
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Demi Ben-Ari
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Codemotion
 

Similar a Lambda Architecture and open source technology stack for real time big data (20)

Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFlux
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
London atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesLondon atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slides
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier Dominguez
 
Keynote at-icpc-2020
Keynote at-icpc-2020Keynote at-icpc-2020
Keynote at-icpc-2020
 
Is Spark the right choice for data analysis ?
Is Spark the right choice for data analysis ?Is Spark the right choice for data analysis ?
Is Spark the right choice for data analysis ?
 
UX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentUX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product Development
 
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
 
MongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDB
MongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDBMongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDB
MongoDB .local Houston 2019: Wide Ranging Analytical Solutions on MongoDB
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learning
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
 

Más de Trieu Nguyen

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfTrieu Nguyen
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessTrieu Nguyen
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Trieu Nguyen
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPTrieu Nguyen
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDPTrieu Nguyen
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch DeckTrieu Nguyen
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022Trieu Nguyen
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnTrieu Nguyen
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Trieu Nguyen
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data PlatformTrieu Nguyen
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkTrieu Nguyen
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyTrieu Nguyen
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Trieu Nguyen
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformTrieu Nguyen
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0Trieu Nguyen
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataTrieu Nguyen
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Trieu Nguyen
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content PlatformTrieu Nguyen
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisTrieu Nguyen
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Trieu Nguyen
 

Más de Trieu Nguyen (20)

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDP
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch Deck
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sản
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data Platform
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA framework
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technology
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation Platform
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content Platform
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Lambda Architecture and open source technology stack for real time big data

  • 1. Lambda Architecture and Open Source Tools for Real-time Big Data ● Concepts & Techniques “Thinking with Lambda” ● Case studies in Practice Trieu Nguyen - http://nguyentantrieu.info or @tantrieuf31 Principal Engineer at eClick Data Analytics team, FPT Online All contents and thoughts in this slide are my subjective ideas and compiled from Communities
  • 2. Just a little introduction ● 2008 Java Developer, developed Social Trading Network for a small startup (Yopco) ● 2011 worked at FPT Online, software engineer in Banbe Project, Restful API for VnExpress Mobile App ● 2012 joined Greengar Studios in 6 months, scaling backend API mobile games (iOS, Android) ● 2013 back to FPT Online, R&D about Big Data & Analytics, developing the new core Analytics Platform (on JVM Platform)
  • 3. Contents for this talk ● ● ● ● ● ● ● ● The lessons from history Problems In Practice What is the Lambda Architecture? Why lambda architecture for real-time big data ? Open Source Technology Stack Lambda in Practice (Mobile Data and Web Data) Lessons I have learned Questions & Answers
  • 4. History ? The best way to predict the future is looking at the past and now ?
  • 5. Big data is a buzzword for old problems
  • 9.
  • 10. Big Data + Old History http://www.youtube.com/watch?v=tp4y-_VoXdA
  • 11. This is Big DATA This is most valuable things!
  • 12.
  • 13. We can't solve problems by using the same kind of thinking we used when we created them. Albert Einstein Think more with Lambda and Reactive
  • 14.
  • 15.
  • 17.
  • 18. BBC Horizon 2013 The Age of Big Data http://www.youtube.com/watch?v=RE0ITQ7XQjM
  • 19.
  • 20.
  • 21.
  • 22.
  • 23. Google’s mission is to organize the world’s information and make it universally accessible and useful.
  • 24.
  • 26.
  • 27. How did Google scale their search engine ? How does Hadoop really work ?
  • 28.
  • 30. Trends of Now and the Future MapReduce Programming Reactive Programming Functional Programming Streaming Computation => All just the special cases of Lambda ● ● ● ●
  • 31.
  • 32. So what is the λ (Lambda) Architecture ?
  • 33.
  • 34.
  • 35. the Lambda Architecture: ● apply the (λ) Lambda philosophy in designing big data system ● equation “query = function(all data)” which is the basis of all data systems ● proposed by Nathan Marz (http://nathanmarz.com/), a software engineer from Twitter in his “Big Data” book. ● is based on three main design principles: ○ human fault-tolerance – the system is unsusceptible to data loss or data corruption because at scale it could be irreparable. (BUGS ?) ○ data immutability – store data in it’s rawest form immutable and for perpetuity. (INSERT/ SELECT/DELETE but no UPDATE !) ○ recomputation – with the two principles above it is always possible to (re)-compute results by running a function on the raw data.
  • 36. Lambda In Practice 2 case studies from my experiences
  • 37. Case Study 1: Mobile Data Monitor API Backend + System KPI
  • 38. Problem: Inside “mobile data”, What's the most valuable piece of information
  • 40. Web vs Mobile App Web Visitors Visits Pageviews Events Mobile App Users Sessions Events
  • 41. Metrics: Cause and Effect ● ● ● ● ● ● ● Screen Size => App Design, UI/UX, Usability App version => Deployment, Marketing Connectivity => Code, User Experience Location => Marketing, User Behaviour OS => Marketing, Cost, Development Memory => User Experience Feature Session => How to engage app users
  • 42. The data and the size, not too big for a small startup! Where is the lambda ? I used Groovy + GPars (Groovy Parallel Systems) + MongoDB for fast parallel computation (actor model) on statistical data http://gpars.codehaus.org/ The GPars framework offers Java developers intuitive and safe ways to handle Java or Groovy tasks concurrently. Support: ● ● ● ● ● ● ● ● Dataflow concurrency Actor programming model CSP Agent - an thread-safe reference to mutable state Concurrent collection processing Composable asynchronous functions Fork/Join STM (Software Transactional Memory)
  • 43. Mobile Apps => Backend APIs => Statistics => Find the Trends & Insights?
  • 44. Reactive Data Analytics for Mobile Apps It means real-time recommendation by: ➔ context (location, time) ➔ user profile (preferences, level, ...)
  • 45. Big Data on Small Devices: Data Science goes Mobile http://strataconf.com/strata2013/public/schedule/detail/27605
  • 46. Case Study 2: Web Data ● Real-time Data Analytics ● Monitoring Stream Data (Reactive) http://eclick.vn
  • 47. at eClick we must check campaigns in near-real-time (seconds) ! at eClick we have 30~40 GB Logs in Stream 10~20 GB Bandwidth just for tracking user actions (click, impression,...) in ONE day ! at eClick we have many types of log (video, web, mobile, system logs, ad-campaign, articles, … )
  • 48.
  • 49.
  • 51. Internet Netty Http Server TCP Connection Kafka Akka Workers Hadoop Tools Storm Redis Redis KPI Report the open-source lambda architecture at eClick
  • 52. The big-data technology stack ● Netty (http://netty.io/) a framework using reactive programming pattern for scaling HTTP system easier, by JBoss http://www.jboss.org ● Kafka (http://kafka.apache.org/) a publish-subscribe messaging rethought as a distributed commit log, open sourced by Linkedin ● Storm (http://storm-project.net/) the framework for distributed realtime computation system, by Twitter ● Redis (http://redis.io/) a advanced key-value in-memory NoSQL database, all fast statistical computations in here. ● Groovy for scripting layer on JVM, ad-hoc query on Redis ● Hadoop ecosystem: HDFS, Hive, HBase for batch processing ● RxJava https://github.com/Netflix/RxJava a library for composing asynchronous and event-based programs ● Hystrix https://github.com/Netflix/Hystrix : for Latency and Fault Tolerance for Distributed Systems
  • 53. My new ideas for the future Connecting the active functor pattern + reactive programming + stream computation + in-memory computing to make: ● real-time data analytics easier ● better recommendation system ● build more profitable in big data More Information: ● http://activefunctor.blogspot.com/ (a special case of Lambda that actively search best connections to form optimal topology) - from ideas when internship at DRD with my advisor. ● Can a function be persistent (stored as data), distributed in a cluster (cloud), reactive to right data (best value in network) ? ● http://www.reactivemanifesto.org/ (reactive pattern)
  • 54. Lessons What I have learned from Lambda and Big Data World
  • 55.
  • 56. What I have learned ● ● ● ● ● Study about lambda and read some books Ask questions=> analytics=> Profit & Value Collect any data you can, learn inside ! Implement it! Just right tools for right jobs. Turn your data into the things everyone can "look & feel"
  • 58. Study the “lambda” I studied Haskell in 2007 with Dr.Peter Gammie http://peteg.org/ when internship at DRD (a non-profit organization). ● Imperative programs will always be vulnerable to data races because they contain mutable variables. ● There are no data races in purely functional languages because they don't have mutable variables.
  • 60.
  • 61.
  • 62. Improve your business knowledge ! => read the Behavioral Economics Books http://www.goodreads.com/shelf/show/behavioral-economics
  • 64. Use your imagination is more than just knowledge you have
  • 65. Think more about Butterfly Effect!
  • 66. Z; om A to fr l get you you il “Logic w n will get in ginatio - Albert Einste ima .” ywhere ever Use you r with da imagination ta just log analytics, not ic Learn Data Visualization
  • 67. Questions & Answers The link of this slide is here: ● http://nguyentantrieu.info/blog/lambda-architecture-andopen-source-tools-for-real-time-big-data/ More useful resources: ● http://nguyentantrieu.info/blog ● http://www.mc2ads.com