SlideShare una empresa de Scribd logo
1 de 48
Descargar para leer sin conexión
Copyright © 2017 The Nielsen Company (US), LLC. Confidential and proprietary. Do not distribute.
The Big
Web
Theory
The Big
Web
Theory
2
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
AIRFLOWAIRFLOW
3
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
● Tal Sharon - Software Architect
● Aviel Buskila - DevOps Tech Lead
● Max Peres - Data Engineer
Who we are?
4
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
What will you learn today ?
• Airflow and how it solved our problems
• How you can deploy Airflow to production
• Best practices for annoying data problems
5
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Nielsen Marketing Cloud
● eXelate - acquired by Nielsen in 2015
● Marketing data cloud service
● Creating targeting profiles
● VERY BIG DATA
● Machine learning
6
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Nielsen Marketing Cloud - main challenge
How many unique users of a certain profile can we
reach?
e.g. campaign for young women who love tech
7
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
What we had...
8
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Data pipeline UI
Multiple workflow parameters
are missing - duration,
successes & failures, ..
9
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
And this is how we had to configure it ...
Not all
configurations
are visible
The Workflow
is hidden
10
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
We wanted something else..
● Configuration and workflow visibility
● Monitoring and statistics
● Share common configuration/code between our
workflows
● Ability to have only 1 concurrent execution of a
workflow
11
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
The Alternatives
Community Main Purpose Flow
Definition
UI Auto
scheduling
Smart
Scheduling
Airflow
(Apache)
Very Active General Purpose Python Rich V V
Luigi
(Spotify)
Active General Purpose Python Limited X X
Oozie
(Apache)
Active Hadoop Job
Scheduling
XML Limited V X
Azkaban
(LinkedIn)
Not very active Hadoop Job
Scheduling
Custom DSL Rich V X
12
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
What is Airflow ?
‘A platform to programmatically author, schedule, and monitor workflows’
Each workflow is described by a DAG(Directed Acyclic Graph) which is constructed by -
● Operators: determine what actually gets done
● Sensors: monitor the job and report success/failure
13
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Airflow overview
Webserver
Accepts HTTP requests and allows the
user to interact with it. It provides the
ability to act on the DAG status (pause,
unpause, trigger).
Scheduler
Monitors the DAGs and periodically
inspects tasks to see if they can be
triggered.
Worker
Daemons that actually execute
the logic of tasks.
14
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Airflow UI
15
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Taking it to Production
16
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Challenges
1. Code dependency management
2. Automated deployment from dev to prod
3. Long running tasks infrastructure
4. Scaling airflow workers
5. Deploy to kubernetes
17
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Code Dependency management - DAG Structure
DAG Dependency
DAG Dependency
DAG Helpers
DAG File
18
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Code Dependency management - DAG Build
19
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Automated deployment from dev to prod
Package artifact with
a version and push
Deploy a DAG file
with a specific
version
20
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Function as a service (FaaS)
FaaS is a category of cloud computing services that
provides a platform allowing customers to develop, run,
and manage application functionalities without the
complexity of building and maintaining the infrastructure
typically associated with developing and launching an app.
Source:
https://en.wikipedia.org/wiki/Function_as_a_service
21
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
OpenFaaS - Long running tasks infrastructure
OpenFaaS (Functions as a Service) is a framework for building Serverless
functions with Docker and Kubernetes which has first-class support for
metrics. Any process can be packaged as a function enabling you to consume
a range of web events without repetitive boiler-plate coding.
22
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Airflow and OpenFaaS
+ =
23
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Open source donations
24
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Scaling airflow workers
Airflow has 3 types of executors:
1. Sequential Executor - One task & one worker
2. Local Executor - Parallel tasks & one worker
3. Celery Executor - Parallel tasks & multiple workers
25
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Celery executor setup
26
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Deployment to kubernetes
27
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
So what did we achieve?
✓ Dependency management
✓ Automated deployment from dev to prod
✓ Long running tasks infrastructure
✓ Scalable airflow workers
✓ Deploy to kubernetes
28
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Annoying
Problems
Annoying
Problems
29
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Annoying Data Maintenance Problems
✓ Troubleshooting broken pipelines
✓ Rerunning parts of a pipeline
30
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Classical Cron Scheduling
31
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Troubleshooting broken pipelines
32
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Understanding a Pipeline (DAG)
33
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Sub Pipeline (sub-DAG)
Create_EMR_Cluster Create_EMR_Step Step_Sensor
34
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Rerunning a Task
Create_EMR_Cluster Create_EMR_Step Step_Sensor
Failed Task
35
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Cool Features
36
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Scaling Out
37
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Scaling Out
38
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Message Exchange (Xcom)
Allows sharing data between tasks
(example: configs)
39
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Sample DAG
40
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Sample DAG
41
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
42
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Summary
● Maintenance becomes a BRE E Z E …...
● Recovery is a NO BRAINER !
● Lots of cool & convenient features
43
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
● Amazing tool for data pipelines
● Open source community
● Cool UI & API
AIRFLOWAIRFLOW
Make you life EASIER !!!Make you life EASIER !!!
44
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Questions?
45
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Come and join us!
https://www.comeet.co/jobs/nielsen/33.000
46
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
Thank you!
47
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
BACKUP SLIDES
48
Copyright©2018TheNielsenCompany(US),LLC.Confidentialandproprietary.Donotdistribute.
{
"Data-In-Dpu_us": {
"bucket": "xl8emr-assets",
"file": "data-in/config/dpu_airflow.json"
},
"Data-In-Dpu_eu": {
"bucket": "xl8emr-assets-eu",
"file": "data-in/config/dpu_airflow.json"
},
"Data-In-Dpu_ap": {
"bucket": "xl8emr-assets-eu",
"file": "data-in/config/dpu_airflow.json"
}
}

Más contenido relacionado

La actualidad más candente

Performance Stability, Tips and Tricks and Underscores
Performance Stability, Tips and Tricks and UnderscoresPerformance Stability, Tips and Tricks and Underscores
Performance Stability, Tips and Tricks and UnderscoresJitendra Singh
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & DeltaDatabricks
 
Oracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinarOracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinarMinnie Seungmin Cho
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateMichael Rainey
 
Postgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuitePostgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuiteEDB
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseSnowflake Computing
 
Migration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLMigration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLPGConf APAC
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data EngineeringHarald Erb
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxHong Ong
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureDatabricks
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Amazon Web Services
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 

La actualidad más candente (20)

Performance Stability, Tips and Tricks and Underscores
Performance Stability, Tips and Tricks and UnderscoresPerformance Stability, Tips and Tricks and Underscores
Performance Stability, Tips and Tricks and Underscores
 
Snowflake Overview
Snowflake OverviewSnowflake Overview
Snowflake Overview
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
 
Migrating Oracle to PostgreSQL
Migrating Oracle to PostgreSQLMigrating Oracle to PostgreSQL
Migrating Oracle to PostgreSQL
 
Oracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinarOracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinar
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
 
Elastic Data Warehousing
Elastic Data WarehousingElastic Data Warehousing
Elastic Data Warehousing
 
Big query
Big queryBig query
Big query
 
Postgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuitePostgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster Suite
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
 
Migration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLMigration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQL
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data Engineering
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptx
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data Capture
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
 
Architecting a datalake
Architecting a datalakeArchitecting a datalake
Architecting a datalake
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 

Similar a From AWS Data Pipeline to Airflow - managing data pipelines in Nielsen Marketing Cloud

MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...MongoDB
 
Cross Section and Deep Dive into GE Predix
Cross Section and Deep Dive into GE PredixCross Section and Deep Dive into GE Predix
Cross Section and Deep Dive into GE PredixAltoros
 
How to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring TodayHow to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring TodayDevOps.com
 
9 Hyperion Performance Myths and How to Debunk Them
9 Hyperion Performance Myths and How to Debunk Them9 Hyperion Performance Myths and How to Debunk Them
9 Hyperion Performance Myths and How to Debunk ThemDatavail
 
Robert Murphy Driving Value from Smart Manufacturing
Robert Murphy Driving Value from Smart ManufacturingRobert Murphy Driving Value from Smart Manufacturing
Robert Murphy Driving Value from Smart ManufacturingRockwell Automation
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data SnapLogic
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
Big Data Monitoring Cockpit
Big Data Monitoring CockpitBig Data Monitoring Cockpit
Big Data Monitoring CockpitStefan Bergstein
 
Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...
Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...
Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...InfluxData
 
Office 365 Monitoring Best Practices
Office 365 Monitoring Best PracticesOffice 365 Monitoring Best Practices
Office 365 Monitoring Best PracticesThousandEyes
 
Webinar: Maximizing the ROI of IT by Simplifying Technology Complexity
Webinar: Maximizing the ROI of IT by Simplifying Technology ComplexityWebinar: Maximizing the ROI of IT by Simplifying Technology Complexity
Webinar: Maximizing the ROI of IT by Simplifying Technology ComplexityFlexera
 
Who Broke My Cloud? SaaS Monitoring Best Practices
Who Broke My Cloud? SaaS Monitoring Best PracticesWho Broke My Cloud? SaaS Monitoring Best Practices
Who Broke My Cloud? SaaS Monitoring Best PracticesThousandEyes
 
In-Memory Data Management Goes Mainstream - OpenSlava 2015
In-Memory Data Management Goes Mainstream - OpenSlava 2015In-Memory Data Management Goes Mainstream - OpenSlava 2015
In-Memory Data Management Goes Mainstream - OpenSlava 2015Software AG
 
Improving Adobe Experience Cloud Services Dependability with Machine Learning
Improving Adobe Experience Cloud Services Dependability with Machine LearningImproving Adobe Experience Cloud Services Dependability with Machine Learning
Improving Adobe Experience Cloud Services Dependability with Machine LearningNicolas Brousse
 
Jazz for Service Management
Jazz for Service ManagementJazz for Service Management
Jazz for Service ManagementIBM Danmark
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsAll Things Open
 
Building Microservices with the 12 Factor App Pattern on AWS - Tony Pujals
Building Microservices with the 12 Factor App Pattern on AWS - Tony PujalsBuilding Microservices with the 12 Factor App Pattern on AWS - Tony Pujals
Building Microservices with the 12 Factor App Pattern on AWS - Tony PujalsAmazon Web Services
 
Taking IT Analytics to the Next Level
Taking IT Analytics to the Next LevelTaking IT Analytics to the Next Level
Taking IT Analytics to the Next LevelCA Technologies
 

Similar a From AWS Data Pipeline to Airflow - managing data pipelines in Nielsen Marketing Cloud (20)

MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
 
Cross Section and Deep Dive into GE Predix
Cross Section and Deep Dive into GE PredixCross Section and Deep Dive into GE Predix
Cross Section and Deep Dive into GE Predix
 
How to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring TodayHow to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring Today
 
9 Hyperion Performance Myths and How to Debunk Them
9 Hyperion Performance Myths and How to Debunk Them9 Hyperion Performance Myths and How to Debunk Them
9 Hyperion Performance Myths and How to Debunk Them
 
Robert Murphy Driving Value from Smart Manufacturing
Robert Murphy Driving Value from Smart ManufacturingRobert Murphy Driving Value from Smart Manufacturing
Robert Murphy Driving Value from Smart Manufacturing
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Big Data Monitoring Cockpit
Big Data Monitoring CockpitBig Data Monitoring Cockpit
Big Data Monitoring Cockpit
 
Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...
Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...
Scaling Prometheus Metrics in Kubernetes with Telegraf | Chris Goller | Influ...
 
Office 365 Monitoring Best Practices
Office 365 Monitoring Best PracticesOffice 365 Monitoring Best Practices
Office 365 Monitoring Best Practices
 
Webinar: Maximizing the ROI of IT by Simplifying Technology Complexity
Webinar: Maximizing the ROI of IT by Simplifying Technology ComplexityWebinar: Maximizing the ROI of IT by Simplifying Technology Complexity
Webinar: Maximizing the ROI of IT by Simplifying Technology Complexity
 
Industrial IoT bootcamp
Industrial IoT bootcampIndustrial IoT bootcamp
Industrial IoT bootcamp
 
Who Broke My Cloud? SaaS Monitoring Best Practices
Who Broke My Cloud? SaaS Monitoring Best PracticesWho Broke My Cloud? SaaS Monitoring Best Practices
Who Broke My Cloud? SaaS Monitoring Best Practices
 
In-Memory Data Management Goes Mainstream - OpenSlava 2015
In-Memory Data Management Goes Mainstream - OpenSlava 2015In-Memory Data Management Goes Mainstream - OpenSlava 2015
In-Memory Data Management Goes Mainstream - OpenSlava 2015
 
Improving Adobe Experience Cloud Services Dependability with Machine Learning
Improving Adobe Experience Cloud Services Dependability with Machine LearningImproving Adobe Experience Cloud Services Dependability with Machine Learning
Improving Adobe Experience Cloud Services Dependability with Machine Learning
 
Jazz for Service Management
Jazz for Service ManagementJazz for Service Management
Jazz for Service Management
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source tools
 
Building Microservices with the 12 Factor App Pattern on AWS - Tony Pujals
Building Microservices with the 12 Factor App Pattern on AWS - Tony PujalsBuilding Microservices with the 12 Factor App Pattern on AWS - Tony Pujals
Building Microservices with the 12 Factor App Pattern on AWS - Tony Pujals
 
Taking IT Analytics to the Next Level
Taking IT Analytics to the Next LevelTaking IT Analytics to the Next Level
Taking IT Analytics to the Next Level
 

Más de Itai Yaffe

Mastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingMastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingItai Yaffe
 
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationSolving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationItai Yaffe
 
Lessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsLessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsItai Yaffe
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Itai Yaffe
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Itai Yaffe
 
Evaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesEvaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesItai Yaffe
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
Data Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsData Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsItai Yaffe
 
Unleashing the Power of your Data
Unleashing the Power of your DataUnleashing the Power of your Data
Unleashing the Power of your DataItai Yaffe
 
Data Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesData Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesItai Yaffe
 
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Itai Yaffe
 
DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidDevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidItai Yaffe
 
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Itai Yaffe
 
Introducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsIntroducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsItai Yaffe
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapItai Yaffe
 
Scalable Incremental Index for Druid
Scalable Incremental Index for DruidScalable Incremental Index for Druid
Scalable Incremental Index for DruidItai Yaffe
 
Funnel Analysis with Spark and Druid
Funnel Analysis with Spark and DruidFunnel Analysis with Spark and Druid
Funnel Analysis with Spark and DruidItai Yaffe
 
The benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerThe benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerItai Yaffe
 
Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Itai Yaffe
 
Scheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureScheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureItai Yaffe
 

Más de Itai Yaffe (20)

Mastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingMastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data Processing
 
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationSolving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
 
Lessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsLessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark Applications
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"
 
Evaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesEvaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening Notes
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Data Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsData Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management Monoliths
 
Unleashing the Power of your Data
Unleashing the Power of your DataUnleashing the Power of your Data
Unleashing the Power of your Data
 
Data Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesData Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening Notes
 
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
 
DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidDevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
 
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
 
Introducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsIntroducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom Connectors
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
 
Scalable Incremental Index for Druid
Scalable Incremental Index for DruidScalable Incremental Index for Druid
Scalable Incremental Index for Druid
 
Funnel Analysis with Spark and Druid
Funnel Analysis with Spark and DruidFunnel Analysis with Spark and Druid
Funnel Analysis with Spark and Druid
 
The benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerThe benefits of running Spark on your own Docker
The benefits of running Spark on your own Docker
 
Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?
 
Scheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureScheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructure
 

Último

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 

Último (20)

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 

From AWS Data Pipeline to Airflow - managing data pipelines in Nielsen Marketing Cloud